News

Oct 17, 2025

Agentic Document Extraction: Automation for processes

by

Marc Challandes

CMO

published

Oct 17, 2025

As digitalization advances and workloads in companies grow denser, the manual creation and processing of individual customer documents ties up more and more resources. The solution lies in automation and intelligent AI integration, applied precisely where efficiency gains are most urgently needed: individualized document creation.

Agentic Document Extraction uses advanced AI agents that not only automate repetitive work but also integrate all relevant data sources, from internal uploads to external web research, into the document creation process. The system is designed to extract, sort, and understand unstructured content and deliver it either back to the customer's systems (e.g., Abacus, SAP, Microsoft, or a proprietary ERP) or as an editable PowerPoint or Word file.

The following comparison contrasts traditional Optical Character Recognition (OCR), the previous approach, with the new agentic approach.

| | Traditional OCR | Agentic Document Extraction |
| --- | --- | --- |
| Way of thinking | "I see letters." | "I understand documents." |
| Competence | Recognizing | Understanding + structuring + acting |
| Architecture | Linear process | Multi-step agentic flow with LLM reasoning |
| Output | Raw text | Clean, structured JSON data, often with validation |
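
To make the difference in architecture and output more concrete, here is a minimal Python sketch of a multi-step flow with validation. It is an illustration under stated assumptions, not the production system: the field names, the `rule_based_extractor` function (a regex stand-in for the LLM reasoning step), and the retry limit are all hypothetical.

```python
import json
import re
from typing import Callable

# Hypothetical schema: fields every extracted record must contain.
REQUIRED_FIELDS = ("invoice_number", "total", "currency")


def ocr_style_output(page_text: str) -> str:
    """Traditional OCR stops here: it hands back the raw text unchanged."""
    return page_text


def rule_based_extractor(page_text: str, feedback: list[str]) -> dict:
    """Stand-in for the LLM reasoning step (a real flow would call a model).

    `feedback` carries the validation errors of the previous attempt so a
    model could correct itself; this simple stand-in ignores it.
    """
    number = re.search(r"Invoice\s+No\.?\s*:?\s*(\S+)", page_text)
    total = re.search(r"Total\s*:?\s*([A-Z]{3})\s*(\d+(?:\.\d+)?)", page_text)
    record: dict = {}
    if number:
        record["invoice_number"] = number.group(1)
    if total:
        record["currency"] = total.group(1)
        record["total"] = float(total.group(2))
    return record


def validate(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    total = record.get("total")
    if total is not None and (not isinstance(total, (int, float)) or total < 0):
        errors.append("total must be a non-negative number")
    return errors


def agentic_extract(
    page_text: str,
    extractor: Callable[[str, list[str]], dict],
    max_attempts: int = 3,
) -> str:
    """Extract, validate, and retry with feedback until the record is valid."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        record = extractor(page_text, feedback)
        feedback = validate(record)
        if not feedback:
            return json.dumps(record, sort_keys=True)
    raise ValueError(f"extraction failed validation: {feedback}")


sample = "Invoice No: INV-1042\nTotal: CHF 1250.50\n"
print(ocr_style_output(sample))                        # raw text only
print(agentic_extract(sample, rule_based_extractor))   # validated JSON
```

The point of the sketch is the loop in `agentic_extract`: instead of one linear pass, each attempt is checked against the schema, and the errors are fed back into the next attempt before anything is handed to a downstream system.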

Our Approach to Document Extraction

Our client Sobrado processes thousands of PDFs every month, some of them unstructured. The goal was to read the relevant data from these documents, structure it, and convert it into a machine-readable format for import back into the end systems. The new automation significantly increases efficiency while improving quality.
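
As a rough illustration of the read, structure, and convert steps described above, the following Python sketch turns a batch of document texts into a CSV import file and sets aside anything it cannot structure. It assumes the text has already been extracted from the PDFs; the column layout, the field patterns, and the function names are hypothetical and do not describe the actual Sobrado setup.

```python
import csv
import io
import re
from typing import Optional

# Hypothetical column layout expected by the target system's importer.
COLUMNS = ["source_file", "order_id", "amount"]


def structure(source_file: str, text: str) -> Optional[dict]:
    """Pull the relevant values out of one document's text.

    Returns a row dict, or None when the document cannot be structured
    and should go to manual review instead.
    """
    order = re.search(r"Order\s*#\s*(\d+)", text)
    amount = re.search(r"Amount\s*:\s*(\d+(?:\.\d{1,2})?)", text)
    if not (order and amount):
        return None
    return {
        "source_file": source_file,
        "order_id": order.group(1),
        "amount": f"{float(amount.group(1)):.2f}",  # normalise to two decimals
    }


def build_import_file(documents: dict[str, str]) -> tuple[str, list[str]]:
    """Convert a batch of document texts into CSV plus a manual-review list."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    needs_review: list[str] = []
    for name, text in documents.items():
        row = structure(name, text)
        if row is None:
            needs_review.append(name)
        else:
            writer.writerow(row)
    return buffer.getvalue(), needs_review


batch = {
    "a.pdf": "Order #1001\nAmount: 99.5",
    "b.pdf": "scanned letter without the expected fields",
}
csv_text, review = build_import_file(batch)
print(csv_text)
print("manual review:", review)
```

Keeping a separate review list is a deliberate choice in this sketch: documents that cannot be structured reliably are held back rather than imported with missing values.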

 

Image: System architecture PoC / Silvan Mühlemann

In another project, the objective is to extract specific content from various documents (mainly PDFs) for each customer order and make it available in an open, editable file (see image above). The aim is a significant efficiency gain in the existing process.

This approach can be adapted to any industry and problem. In this way, we turn the possibilities of Agentic Document Extraction into practical workflows for our customers. Reading complex graphics, integrating multiple data sources, and delivering editable end products meets the current demand for automation and customization.

The process combines various technologies, including OpenAI, LandingAI, Python, Haystack, and LangChain.

Outlook

Agentic Document Extraction is an innovative field that allows companies to automate processes that previously could not be automated. Together with our customers, we are developing the next generations of AI-supported document processes and thus remain at the forefront of the digital transformation for productive B2B workflows.
