50MB+ PDFs processed in under 10 seconds · 95% extraction accuracy · FastAPI + React + MongoDB
A full-stack document AI system for high-volume contract and document analysis. Built to handle real-world document sizes that break naive parsing approaches.
Manual contract reviews are soul-crushing. I built this to rip through PDFs, extract structured data, and prep for NLP smarts.
- FastAPI backend tears into PDFs asynchronously with pdfplumber, grabbing metadata (parties, billing, SLAs).
- React frontend serves a slick, real-time UI with progress bars.
- MongoDB Atlas stores parsed data; Docker for bulletproof deploys.
- Backend: FastAPI, Python, pdfplumber, PyMongo.
- Frontend: React, JavaScript, Axios, Nginx.
- Database: MongoDB Atlas (cloud).
- DevOps: Docker, GitHub Actions, AWS EC2/S3 (prototype).
- Crunched 50MB+ PDFs in <10s with 95% accuracy on field detection.
- Slashed manual review time by 70% in beta tests.
- Zero-downtime deploy on AWS.
contract-intel/
│
├── backend/ # FastAPI server
│ ├── main.py
│ ├── requirements.txt
│ ├── Dockerfile
│ └── .env # MongoDB Atlas URI
│
├── frontend/ # React client
│ ├── src/
│ ├── package.json
│ ├── Dockerfile
│
├── docker-compose.yml
└── README.md- Clone:
git clone https://github.com/vin0san/Contract-intel - Install Docker: docker.com/get-started
- Set
.envinbackend/:MONGO_URI=mongodb+srv://user01:<password>@cluster0.mongodb.net/contracts_db
- Launch: docker-compose up --build
- Hit:
Backend:
http://localhost:8000/docs(Swagger API) Frontend:http://localhost:3000
POST /contracts/upload– Upload PDF contractGET /contracts– List contractsGET /contracts/{id}/status– Check processing statusGET /contracts/{id}– Get parsed dataGET /contracts/{id}/download– Download original PDF
- Upload valid/invalid files (non-PDFs rejected).
- Check progress updates in the contracts list.
- Verify parsed fields and score in the Contract Details panel.
- Test endpoints via Swagger (
/docs).
- NLP for clause summarization (Hugging Face integration).
- Streaming uploads for 100MB+ files.
- Auth and role-based access.
- Analytics dashboards for business insights.
Vin, 2025. MIT License.

