Citation-grounded answers from 8,000 documents a month
TIFIN
The challenge
Each client generated thousands of complex financial documents every month. Manually extracting and connecting information across these documents was slow and didn't scale, limiting how quickly advisors could get grounded answers.
Our approach
We built a scalable PDF ingestion pipeline that processes 8,000 documents per month per client. A Smart OCR stage (AWS Textract plus multimodal GPT-4o/Claude) extracts the content, which is chunked with LlamaIndex and indexed in per-client Pinecone namespaces and a Neo4j knowledge graph. Hybrid retrieval combines dense vector search with graph queries produced by natural-language-to-Cypher translation. The workload runs on event-driven ECS Fargate workers (S3 to SNS to SQS to Fargate), with DynamoDB idempotency tracking so that redelivered events are not reprocessed. We evaluated 12+ models and selected Claude Haiku for 3x cost efficiency.
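To illustrate the event-driven path and the idempotency check, here is a minimal sketch rather than the production code. It assumes the SQS queue receives the standard SNS envelope (raw message delivery off) wrapping an S3 event notification. The names `InMemoryIdempotencyStore`, `extract_s3_objects`, and `handle_message` are hypothetical, and the in-memory store stands in for a DynamoDB table, where the same "first writer wins" behaviour would typically come from a conditional write.

```python
import json
from urllib.parse import unquote_plus


class InMemoryIdempotencyStore:
    """Hypothetical stand-in for a DynamoDB table.

    With DynamoDB, claim() would typically be a put_item call using a
    ConditionExpression such as attribute_not_exists(...) on the key.
    """

    def __init__(self):
        self._seen = set()

    def claim(self, key: str) -> bool:
        """Return True only the first time a key is claimed."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def extract_s3_objects(sqs_body: str):
    """Unwrap SQS body -> SNS envelope -> S3 event records."""
    envelope = json.loads(sqs_body)               # SNS notification envelope
    s3_event = json.loads(envelope["Message"])    # S3 event is a JSON string
    objects = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        etag = record["s3"]["object"].get("eTag", "")
        objects.append((bucket, key, etag))
    return objects


def handle_message(sqs_body: str, store, process):
    """Process each new document once; skip redelivered duplicates."""
    processed = []
    for bucket, key, etag in extract_s3_objects(sqs_body):
        # Assumption: bucket + key + ETag identifies one document version.
        idempotency_key = f"{bucket}/{key}#{etag}"
        if not store.claim(idempotency_key):
            continue  # already handled by an earlier delivery
        process(bucket, key)
        processed.append(key)
    return processed
```

SQS standard queues deliver at least once, so a worker can see the same message more than once. Claiming the key before processing means the second delivery is skipped.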
Results
- 8,000 documents processed per client, per month
- Citation-grounded responses via hybrid dense + graph retrieval
- 3x cost efficiency after evaluating 12+ models for the retrieval pipeline
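As a rough illustration of how dense and graph results can be combined while keeping citations attached, the sketch below merges two ranked hit lists and formats them as numbered, source-tagged context for the model. The fusion method is an assumption: it uses reciprocal rank fusion, which is a common choice and not necessarily what this pipeline uses. `Hit`, `fuse`, and `build_context` are hypothetical names, and the hit lists stand in for results from Pinecone and Neo4j.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One retrieved passage with the source details needed for a citation."""
    doc_id: str
    page: int
    text: str


def fuse(dense_hits, graph_hits, k=60, top_n=3):
    """Merge two ranked lists with reciprocal rank fusion (assumed method)."""
    scores = {}
    for hits in (dense_hits, graph_hits):
        for rank, hit in enumerate(hits, start=1):
            # A hit found by both retrievers accumulates score from each list.
            scores[hit] = scores.get(hit, 0.0) + 1.0 / (k + rank)
    ranked = sorted(scores, key=lambda h: scores[h], reverse=True)
    return ranked[:top_n]


def build_context(hits):
    """Number each passage so the answer can cite it as [1], [2], ..."""
    lines = []
    for i, hit in enumerate(hits, start=1):
        lines.append(f"[{i}] ({hit.doc_id}, p.{hit.page}) {hit.text}")
    return "\n".join(lines)
```

Passages returned by both retrievers rise to the top, and every passage passed to the model carries its document ID and page, so each claim in the answer can point back to a source.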