AI2SEE
Case Study · Document AI · Agentic AI · Manufacturing
Glass Manufacturing · Pune, India · Document AI & Agentic Workflow

From handwritten order to verified invoice — in under 40 seconds.

How we built an agentic AI pipeline for SA Glass that ingests any order format — hand-sketched drawings, WhatsApp photos, Excel sheets, CAD files, or typed text — and produces a validated Proforma Invoice ready for ERP import.

10×: Faster order-to-PI turnaround vs. manual processing
96%: Field-level extraction accuracy on first pass
8 wks: Kickoff → live in production at SA Glass

Delivery timeline
Wk 0: Order audit & format mapping
Wk 3: Extraction agent live on samples
Wk 6: Full pipeline + human-in-the-loop (HITL) review UI
Wk 8: Live · 10× faster PIs

SA Glass's orders arrived in every format except a consistent one.

SA Glass, a glass manufacturing company based in Pune, receives customer orders in a wide variety of formats — hand-drawn dimension sketches sent via WhatsApp, scanned handwritten order forms, Excel sheets with varying column structures, typed PDFs, and occasionally digital CAD drawings. Every incoming order had to be read, interpreted, manually re-keyed into their system, and converted into a Proforma Invoice before production could begin.

A single order took 25–40 minutes of staff time to process. During high-demand periods, the backlog stretched to 2–3 days — meaning customers waited days for a PI that should take minutes, and production planning couldn't start until the invoice was issued. The bottleneck wasn't capacity; it was the sheer variety of formats that made automation seem impossible.

The challenge wasn't OCR — it was understanding. A handwritten sketch showing glass dimensions with annotations, arrows, and shorthand requires a system that can reason about the document, not just read the pixels. Rules-based automation had been tried and abandoned; the format variation defeated every template-matching approach they'd tested.


An agentic pipeline that reads like a human, validates like a system.

We built a multi-agent workflow that breaks the problem into discrete reasoning steps, each handled by a specialised agent with a specific role. No single model tries to do everything. The agents hand off structured data to one another and escalate to a human reviewer only when confidence in a field falls below a defined threshold.

What the pipeline accepts

✍️ Handwritten Orders: Scanned or photographed forms, any handwriting style
📐 Dimension Sketches: Hand-drawn glass specs with measurements and annotations
📊 Excel / Spreadsheet: Variable column layouts, merged cells, informal headers
📄 Typed / PDF Orders: Digital text orders in any layout or template
🔧 CAD / Technical Drawings: DXF or image exports of engineering dimension drawings
💬 WhatsApp Photos: Camera-phone images of physical order documents

The agentic pipeline

1. Ingestion Agent

Format Detection & Pre-processing

Identifies the input type (sketch, form, spreadsheet, CAD, photo), applies the appropriate normalisation strategy, and routes to the correct extraction path. Handles orientation correction, deskew, and noise reduction before any extraction happens.

image · pdf · xlsx · dxf → normalised input
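
The format-detection step described above can be sketched roughly as follows. This is a minimal illustration, not SA Glass's implementation: the route names, the extension table, and the `detect_route` function are all hypothetical, and a real ingestion agent would also need image-content classification to tell a sketch from a form.

```python
from pathlib import Path

# Hypothetical mapping from file extension to extraction path.
ROUTES = {
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".pdf": "pdf",
    ".xlsx": "spreadsheet", ".xls": "spreadsheet", ".csv": "spreadsheet",
    ".dxf": "cad",
    ".txt": "text",
}

# Leading bytes that identify a few formats even when the extension is
# missing or wrong (common for files forwarded through chat apps).
MAGIC = [
    (b"%PDF", "pdf"),
    (b"\xff\xd8\xff", "image"),        # JPEG
    (b"\x89PNG\r\n\x1a\n", "image"),   # PNG
    (b"PK\x03\x04", "spreadsheet"),    # ZIP container; assumed here to be .xlsx
]


def detect_route(filename: str, head: bytes = b"") -> str:
    """Pick an extraction path from the first bytes, falling back to the extension."""
    for magic, route in MAGIC:
        if head.startswith(magic):
            return route
    return ROUTES.get(Path(filename).suffix.lower(), "unknown")
```

Checking the leading bytes before the extension is a deliberate choice in this sketch: photos forwarded from a phone often arrive with generic or missing file names.
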
2. Extraction Agent

Multimodal Field Extraction

A vision-language model reads the normalised input and extracts all order fields — dimensions (length, width, thickness, shape), quantity, glass type, finish, special requirements, delivery details, and customer reference. For sketches, it interprets dimension lines, annotation arrows, and shorthand notation the same way an experienced estimator would.

dimensions · glass spec · quantity · finish · delivery → structured JSON
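
To make "structured JSON" concrete, here is one possible shape for the extraction output. The field list follows the paragraph above, but the class names, the millimetre units, and the `to_json` helper are assumptions made for illustration; the actual schema is not published.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class LineItem:
    # Units are assumed to be millimetres for this sketch.
    length_mm: float
    width_mm: float
    thickness_mm: float
    shape: str
    quantity: int
    glass_type: str
    finish: Optional[str] = None


@dataclass
class ExtractedOrder:
    customer_reference: str
    items: List[LineItem] = field(default_factory=list)
    special_requirements: Optional[str] = None
    delivery_details: Optional[str] = None


def to_json(order: ExtractedOrder) -> str:
    """Serialise the order so the next agent receives plain JSON."""
    return json.dumps(asdict(order), indent=2)
```

Fields the model could not read stay `None` rather than being guessed, which gives the downstream validation step something explicit to flag.
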
3. Validation Agent

Business-Rule & Catalogue Checking

Checks extracted fields against SA Glass's product catalogue (glass types, thickness ranges, available finishes), validates dimensional feasibility, flags impossible measurements, and cross-references against the customer master for pricing tier and credit terms. Each field gets a confidence score.

product catalogue · customer master · pricing rules → validated fields + confidence
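
A catalogue check of the kind described above might look like the sketch below. The catalogue entries, size limits, and the rule that a failed check forces a field's confidence to zero are all invented for illustration; the real system queries a live catalogue and its scoring method is not published.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical catalogue snapshot with made-up values. In the pipeline
# described, this data would come from a live catalogue query.
CATALOGUE = {
    "toughened": {"thickness_mm": {5, 6, 8, 10, 12}, "max_side_mm": 3000},
    "float": {"thickness_mm": {4, 5, 6, 8}, "max_side_mm": 3200},
}


@dataclass
class FieldCheck:
    name: str
    ok: bool
    confidence: float
    reason: str = ""


def validate_item(item: Dict, model_conf: Dict[str, float]) -> List[FieldCheck]:
    """Check one line item; a failed rule forces that field's confidence to 0."""
    spec = CATALOGUE.get(item["glass_type"])
    if spec is None:
        return [FieldCheck("glass_type", False, 0.0, "glass type not in catalogue")]

    def check(name: str, ok: bool, reason: str) -> FieldCheck:
        conf = model_conf.get(name, 0.0) if ok else 0.0
        return FieldCheck(name, ok, conf, "" if ok else reason)

    return [
        check("glass_type", True, ""),
        check("thickness_mm", item["thickness_mm"] in spec["thickness_mm"],
              "thickness not offered for this glass type"),
        check("length_mm", 0 < item["length_mm"] <= spec["max_side_mm"],
              "length outside feasible range"),
        check("width_mm", 0 < item["width_mm"] <= spec["max_side_mm"],
              "width outside feasible range"),
    ]
```
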
4. Routing Agent

Confidence-Gated Routing

If all fields clear the confidence threshold, the order proceeds directly to PI generation. If any field is uncertain, the routing agent surfaces only those fields — highlighted in context — to a human reviewer. The reviewer sees the original document alongside the extracted data and corrects specific fields rather than re-keying the whole order.

high confidence → auto · low confidence → human review
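
The gating rule above reduces to a small function. The `route` name and the 0.9 default threshold are assumptions for this sketch; the production thresholds were tuned with the operations team and are not published.

```python
from typing import Dict, List, Tuple


def route(field_confidence: Dict[str, float],
          threshold: float = 0.9) -> Tuple[str, List[str]]:
    """Return the route plus only the fields a reviewer needs to look at."""
    uncertain = sorted(
        name for name, conf in field_confidence.items() if conf < threshold
    )
    if uncertain:
        return "human_review", uncertain
    return "auto", []
```

Returning the list of uncertain fields, rather than a bare yes/no, is what lets the review UI highlight specific values instead of asking for the whole order to be re-keyed.
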
5. Generation Agent

Proforma Invoice Generation

Produces a formatted PI in SA Glass's standard template — line items, pricing, GST calculation, payment terms, and delivery schedule — and pushes it to their ERP system via API. The PI is also emailed to the customer automatically with the order reference attached.

PI document · ERP push · customer email → done ✓
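
The pricing and GST arithmetic behind a PI line can be sketched as below. Pricing per square metre, the 18% GST rate, and the function names are assumptions chosen for illustration; the actual pricing rules and tax treatment come from SA Glass's own systems.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, List

TWO_PLACES = Decimal("0.01")


def line_total(length_mm: int, width_mm: int, quantity: int,
               rate_per_sqm: Decimal) -> Decimal:
    """Price one line item by area (assumed pricing basis for this sketch)."""
    area_sqm = Decimal(length_mm) * Decimal(width_mm) / Decimal(1_000_000)
    return (area_sqm * quantity * rate_per_sqm).quantize(TWO_PLACES, ROUND_HALF_UP)


def pi_totals(lines: List[Decimal],
              gst_rate: Decimal = Decimal("0.18")) -> Dict[str, Decimal]:
    """Sum line totals and add GST; the 18% default is an illustrative assumption."""
    subtotal = sum(lines, Decimal("0"))
    gst = (subtotal * gst_rate).quantize(TWO_PLACES, ROUND_HALF_UP)
    return {"subtotal": subtotal, "gst": gst, "total": subtotal + gst}
```

For example, four panes of 1200 × 800 mm at a hypothetical rate of 1500 per square metre come to 3.84 m² and a subtotal of 5760.00; at the assumed 18% rate, GST is 1036.80 and the total is 6796.80. `Decimal` is used instead of floats so that invoice amounts round predictably.
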

The human-in-the-loop layer

Enterprise AI without governance is a liability. We built the review workflow as a first-class part of the system, not an afterthought.

Auto-approved

~78% of orders

High-confidence extractions proceed without human intervention. Average PI generation time: 38 seconds from document receipt.

Flagged for review

~22% of orders

Ambiguous sketches or unusual specs. Reviewer sees highlighted uncertain fields only — average review time: 3 minutes, not 35.

Phase 01

Order Audit

Catalogued 6 months of historical orders across all format types. Identified 11 distinct input patterns and the failure modes of prior automation attempts.

Phase 02

Extraction Pipeline

Multimodal extraction agent built and tested on 400 historical orders. Confidence scoring calibrated against ground-truth PIs from the same period.

Phase 03

Validation & Routing

Business-rule validation connected to live catalogue. Confidence thresholds tuned to balance automation rate vs. error rate with the operations team.
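
The trade-off mentioned above can be explored with a simple threshold sweep over labelled historical orders. This is a sketch under assumptions: the `sweep` function and the sample format (a confidence score paired with whether the extraction matched the ground-truth PI) are hypothetical, and the actual tuning procedure is not described in detail.

```python
from typing import List, Tuple


def sweep(samples: List[Tuple[float, bool]],
          thresholds: List[float]) -> List[Tuple[float, float, float]]:
    """For each threshold, return (threshold, automation rate, error rate).

    samples: (confidence, was_correct) pairs from labelled historical orders.
    Error rate is measured only among orders that would be auto-approved.
    """
    rows = []
    for t in thresholds:
        auto = [correct for conf, correct in samples if conf >= t]
        automation_rate = len(auto) / len(samples)
        error_rate = auto.count(False) / len(auto) if auto else 0.0
        rows.append((t, automation_rate, error_rate))
    return rows
```

Raising the threshold lowers the error rate among auto-approved orders but sends more orders to review, so the table this produces is a natural artefact to walk through with an operations team.
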

Phase 04

PI Generation & Integration

PI template engine, ERP API integration, customer email automation, and the human review UI shipped and tested with the SA Glass team in parallel.


35 minutes became 38 seconds — for 78% of orders, automatically.

10×: Faster order-to-PI turnaround · 35 min → under 40 sec (auto)
96%: Field-level extraction accuracy on first pass across all input formats
78%: Orders fully automated, zero human touch required
3 min: Average human review time on flagged orders (was 35+ minutes)
"Our team was spending half their day re-typing orders from WhatsApp photos. Now they spend 20 minutes reviewing edge cases. The rest is handled. That's the difference."
Operations Head, SA Glass, Pune

Within four weeks of going live, SA Glass had processed over 800 orders through the pipeline. The production planning team reported that PI backlog — previously 2–3 days during peak periods — had dropped to same-day for all auto-approved orders. The review queue for flagged orders clears in under an hour, compared to the previous overnight backlog.

"The agents don't just read the document. They understand it — the same way our estimators do."

Five agents, one pipeline, any input format.

We don't publish model names or integration specifics — those are a competitive advantage for SA Glass. The structural pattern, however, applies to any document-heavy manufacturing or B2B workflow.

Foundation

Multimodal Vision-Language Model

A frontier multimodal model handles extraction across all input types — the same model reads a typed PDF and a hand-sketched drawing. Domain-specific prompting guides it to SA Glass's product vocabulary and notation conventions.

Orchestration

Agent Workflow Engine

A lightweight orchestration layer routes documents between agents, manages state between steps, and enforces the confidence-gating logic. Each agent has a defined input schema, output schema, and escalation condition.
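
One way to picture an orchestration layer with per-agent schemas and escalation conditions is the sketch below. The `Agent` structure, the key-set form of the schemas, and `run_pipeline` are hypothetical; the engine actually used is not named in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass
class Agent:
    name: str
    requires: FrozenSet[str]                 # input schema: keys that must be present
    provides: FrozenSet[str]                 # output schema: keys the agent must add
    run: Callable[[Dict], Dict]              # transforms the shared state
    should_escalate: Callable[[Dict], bool]  # escalation condition


def run_pipeline(agents: List[Agent], state: Dict) -> Dict:
    """Run agents in order, enforcing schemas and stopping on escalation."""
    for agent in agents:
        missing = agent.requires - set(state)
        if missing:
            raise ValueError(f"{agent.name}: missing input keys {sorted(missing)}")
        state = agent.run(state)
        absent = agent.provides - set(state)
        if absent:
            raise ValueError(f"{agent.name}: missing output keys {sorted(absent)}")
        if agent.should_escalate(state):
            return {**state, "status": "escalated", "stopped_at": agent.name}
    return {**state, "status": "done"}
```

Keeping the schemas as plain key sets is a simplification; a fuller version might validate types as well.
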

Data Layer

Live Catalogue & Customer Master

The validation agent queries SA Glass's product catalogue and customer master in real time — not a static snapshot. Price changes, new product lines, and customer updates are reflected immediately without a pipeline retrain.

Human-in-the-Loop

Contextual Review Interface

A lightweight web UI that shows the original document alongside extracted fields, with uncertain values highlighted in context. Corrections feed back into the confidence model to improve routing over time.

Agentic AI · Multimodal LLM · Document AI · OCR · ERP Integration · Human-in-the-Loop · Workflow Orchestration · CAD Parsing

AI2SEE · Proven in weeks, not years

Drowning in document formats?

We build agentic document AI pipelines for manufacturing, logistics, and B2B services — any input format, ERP-ready output, human-in-the-loop where it matters. Start with a free 30-minute scoping call.


We respond within 24 hours. First call is free.