Subhajit Bhar - IDP Engineer

Subhajit Bhar

LLM Engineer — Intelligent Document Processing

If your team is manually copying data from PDFs, lab reports, invoices, or document attachments into spreadsheets — that’s an engineering problem. I fix it.

I build production-grade document processing pipelines that eliminate that manual work entirely. Real documents, messy layouts, edge cases included — delivered as a complete system your team can use from day one.

Based in Durham, UK. Working with consultancies, operations teams, and logistics businesses across the UK and beyond.

Book a Diagnostic Session ($250 fixed price) → Share your documents → Get a written action plan within 3 days.

Recent

Azure Document Intelligence Alternatives

·956 words·5 mins
Azure Document Intelligence (formerly Form Recognizer) is Microsoft’s managed IDP service. It handles invoices, receipts, purchase orders, and ID documents well — out of the box, with no custom training required for standard formats. For many use cases, it’s a reasonable starting point. For many production workflows, it’s not enough.

Extracting Tables from PDFs in Python: The Complete Guide

·1073 words·6 mins
Extracting tables from PDFs is one of the most common requirements in document automation, and one of the easiest places to introduce subtle errors if you do it carelessly. This guide covers table extraction with pdfplumber, the most capable Python library for this, including how it works, when it works, and what to do when it doesn’t.