Show HN: Benchmarking VLMs vs. Traditional OCR
Published on: 2025-07-11 20:49:29
OmniAI OCR Benchmark
Using Structured Outputs to evaluate OCR accuracy
Published Feb 20, 2025
Overview
Are LLMs a total replacement for traditional OCR models? It's an increasingly hot topic, especially with models like Gemini 2.0 becoming cost-competitive with traditional OCR.
To answer this, we run a benchmark comparing OCR accuracy between traditional OCR providers and Vision Language Models (VLMs). The benchmark uses a wide variety of real-world documents, including the complex, messy, low-quality scans you might expect to see in the wild.
The evaluation dataset and methodologies are entirely open source. You can run the benchmark yourself using the benchmark repository on GitHub, and you can view the raw data from the benchmark in the Hugging Face repository. The following results evaluate the top VLMs and OCR providers on 1,000 documents. We measure accuracy, cost, and latency for each provider.
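To make the idea of scoring OCR through structured outputs concrete, here is a minimal sketch of one way such a comparison could work. It assumes each document has a ground-truth JSON object and that a provider's output has been extracted into the same shape; accuracy is then the fraction of ground-truth fields reproduced exactly. The function names `flatten` and `field_accuracy` and the sample invoice are hypothetical illustrations, not the benchmark's actual code, and the real scoring rules may differ.

```python
from typing import Any


def flatten(obj: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into {"path.to.field": value} pairs."""
    if isinstance(obj, dict):
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in obj.items()]
    elif isinstance(obj, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(obj)]
    else:
        # Leaf value: record it under the path built so far.
        return {prefix: obj}
    flat: dict[str, Any] = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def field_accuracy(truth: dict, predicted: dict) -> float:
    """Fraction of ground-truth fields matched exactly (assumed metric)."""
    truth_flat = flatten(truth)
    pred_flat = flatten(predicted)
    if not truth_flat:
        return 1.0  # nothing to get wrong
    matches = sum(
        1
        for path, value in truth_flat.items()
        if path in pred_flat and pred_flat[path] == value
    )
    return matches / len(truth_flat)


# Hypothetical document: the predicted total is misread, everything else matches.
truth = {
    "invoice_number": "INV-001",
    "total": 125.50,
    "line_items": [{"sku": "A1", "qty": 2}],
}
predicted = {
    "invoice_number": "INV-001",
    "total": 125.00,
    "line_items": [{"sku": "A1", "qty": 2}],
}

print(field_accuracy(truth, predicted))  # 3 of 4 fields match -> 0.75
```

Exact matching is the strictest possible choice. A real evaluation might normalize whitespace or number formats before comparing, which would change the scores.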
Providers compared: OmniAI, Gemini 2.0 Flash, Azure, GPT-4o, AWS Textract, Claude Sonnet 3.5, Google D ...