Find the best open-source OCR models in one place at Papers with Code [P]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
Hi, I've created an overview of the most important OCR benchmarks, along with the top open models and links to their papers and code: https://paperswithcode.co/tasks/ocr.
This week, new OCR models were released by Baidu and Mistral.
Baidu released Unlimited OCR, a 3B-parameter model that introduces a key innovation called Reference Sliding Window Attention (R-SWA) and builds on top of DeepSeek OCR. Mistral released OCR 4, which is available via an API.
OCR, or Optical Character Recognition, is the task of digitizing PDFs or scanned documents. There is, of course, huge interest in this task, as it enables ingestion of all company data for agentic use cases. AI agents love Markdown, so it can be valuable to turn all those messy PDF documents into a standardized, machine-readable format. This enables use cases like agentic RAG (retrieval-augmented generation), which powers chatbots, both internally and for external customer support.
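To illustrate why Markdown output is convenient for RAG ingestion, here is a minimal sketch that splits a Markdown string (standing in for an OCR model's output) into one chunk per heading, ready to be embedded and indexed. The function name `split_markdown_by_heading` and the sample text are made up for this example and are not part of any OCR model's API. It also ignores edge cases such as `#` lines inside fenced code blocks.

```python
import re

def split_markdown_by_heading(markdown: str) -> list[dict]:
    """Split Markdown text into chunks, one per heading-delimited section.

    Hypothetical helper for illustration; real pipelines usually also
    cap chunk size and handle tables and code fences.
    """
    chunks = []
    title = None  # heading of the section being collected (None before the first heading)
    body = []     # lines collected for the current section

    def flush():
        # Store the current section if it has a heading or any non-blank text.
        if title is not None or any(line.strip() for line in body):
            chunks.append({"title": title, "text": "\n".join(body).strip()})

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            flush()  # a new heading closes the previous section
            title = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()  # close the final section
    return chunks

# Made-up sample standing in for OCR output.
sample = "# Invoice\nTotal: 42 EUR\n\n## Terms\nPay within 30 days.\n"
chunks = split_markdown_by_heading(sample)
for chunk in chunks:
    print(chunk["title"], "->", chunk["text"])
```

Each chunk keeps its heading as metadata, which a retriever can use to show where an answer came from.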
With a large number of OCR releases on Hugging Face over the last few months, it may be hard to know which one to use.
Hence, I've built this page, which lists the major OCR benchmarks, along with the top-performing models and links to their code. It's available on Papers with Code, the website I maintain (a revival of the old website, which was taken down).
The top recommended benchmarks are OlmOCRBench, created by Ai2, and OmniDocBench, created by Shanghai AI Laboratory.
Current top recommendations are Chandra OCR 2 by Datalab and Mistral OCR 4. The former is openly available, so you can either self-host it or use Datalab's serverless API.
Let me know which other tasks you'd like to see major benchmarks for!
Cheers,
Niels
open-source @ HF