Show HN: Qwen-2.5-32B is now the best open source OCR model
Published on: 2025-05-19 20:00:49
Omni OCR Benchmark
A benchmarking tool that compares the OCR and data extraction capabilities of large multimodal models such as gpt-4o, evaluating both text and JSON extraction accuracy. The goal is to publish a comprehensive comparison of OCR accuracy across traditional OCR providers and multimodal language models. The evaluation dataset and methodologies are all open source, and we encourage expanding this benchmark to cover additional providers.
Open Source LLM Benchmark Results (Mar 2025) | Dataset
Benchmark Results (Feb 2025) | Dataset
Methodology
The primary goal is to evaluate JSON extraction from documents. To do so, the Omni benchmark runs Document ⇒ OCR ⇒ Extraction, measuring how well a model can OCR a page and return that content in a format that an LLM can parse.
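The Document ⇒ OCR ⇒ Extraction flow can be pictured as two chained stages. The following is a minimal sketch only, not the benchmark's actual code: `run_pipeline`, `OcrFn`, `ExtractFn`, and the stand-in stages `fake_ocr` and `fake_extract` are all hypothetical names chosen for illustration, and a real run would call an OCR provider and an LLM instead.

```python
from typing import Any, Callable, Dict

# Hypothetical type aliases; neither name comes from the benchmark itself.
OcrFn = Callable[[bytes], str]                                # document bytes -> text
ExtractFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]   # text + schema -> JSON


def run_pipeline(document: bytes, schema: Dict[str, Any],
                 ocr: OcrFn, extract: ExtractFn) -> Dict[str, Any]:
    """Run Document => OCR => Extraction and return the predicted JSON."""
    text = ocr(document)          # stage 1: turn the page into text
    return extract(text, schema)  # stage 2: pull structured fields out of that text


# Stand-in stages so the sketch runs without any external service.
def fake_ocr(document: bytes) -> str:
    # Assumption: pretend the "document" is already UTF-8 text.
    return document.decode("utf-8")


def fake_extract(text: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    # Assumption: parse "key: value" lines and keep only keys named in the schema.
    result: Dict[str, Any] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in schema:
            result[key.strip()] = value.strip()
    return result


predicted = run_pipeline(
    b"invoice_number: 123\ntotal: 45.00\nnotes: n/a",
    {"invoice_number": "string", "total": "string"},
    fake_ocr,
    fake_extract,
)
print(predicted)
```

Keeping the two stages as separate callables mirrors the idea that OCR quality is judged by what a downstream extractor can recover from its output.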
Evaluation Metrics
JSON accuracy
We use a modified json-diff to identify differences between predicted and ground truth JSON objects. You can review the evaluation/j
...
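To illustrate the JSON accuracy idea, here is a rough sketch of counting field-level differences between a predicted and a ground truth object. It is not the benchmark's modified json-diff: the helper names (`count_fields`, `count_differences`, `json_accuracy`), the index-by-index comparison of arrays, and the scoring formula (one minus differing fields over total ground truth fields, floored at zero) are assumptions made for this example.

```python
from typing import Any


def count_fields(value: Any) -> int:
    """Count leaf values in a JSON-like structure."""
    if isinstance(value, dict):
        return sum(count_fields(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_fields(v) for v in value)
    return 1


def count_differences(truth: Any, predicted: Any) -> int:
    """Count leaf fields that are missing, extra, or changed."""
    if isinstance(truth, dict) and isinstance(predicted, dict):
        diffs = 0
        for key in truth.keys() | predicted.keys():
            if key not in predicted:
                diffs += count_fields(truth[key])       # field missing from prediction
            elif key not in truth:
                diffs += count_fields(predicted[key])   # unexpected extra field
            else:
                diffs += count_differences(truth[key], predicted[key])
        return diffs
    if isinstance(truth, list) and isinstance(predicted, list):
        # Assumption: arrays are compared position by position.
        diffs = 0
        for i in range(max(len(truth), len(predicted))):
            if i >= len(predicted):
                diffs += count_fields(truth[i])
            elif i >= len(truth):
                diffs += count_fields(predicted[i])
            else:
                diffs += count_differences(truth[i], predicted[i])
        return diffs
    if isinstance(truth, (dict, list)) or isinstance(predicted, (dict, list)):
        # Shape mismatch (for example object vs. scalar): charge the larger side.
        return max(1, count_fields(truth), count_fields(predicted))
    return 0 if truth == predicted else 1


def json_accuracy(truth: Any, predicted: Any) -> float:
    """Assumed score: 1 - differing fields / ground truth fields, floored at 0."""
    total = count_fields(truth)
    diffs = count_differences(truth, predicted)
    if total == 0:
        return 1.0 if diffs == 0 else 0.0
    return max(0.0, 1.0 - diffs / total)


truth = {"name": "Ada", "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}
predicted = {"name": "Ada", "items": [{"sku": "A1", "qty": 3}, {"sku": "B2"}]}
# One changed value (qty) and one missing field (qty) out of five ground truth fields.
print(count_differences(truth, predicted), json_accuracy(truth, predicted))
```

Counting at the leaf level means a single wrong value inside a large nested object costs one field rather than failing the whole object.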