Don't bother parsing: Just use images for RAG
At Morphik, we build RAG tools that give developers accurate search over complex documents. In this article, we explain why we operate on images of pages instead of running OCR or parsing. If you’ve ever tried to extract information from a complex PDF (one with charts, diagrams, and tables mixed with text), you know the pain. That invoice with a nested table showing quarterly breakdowns? The research paper whose intricate figures actually contain the key findings? The technical manual where the