$ open projects/pharma-rag
Advanced RAG for Pharmaceutical Compliance
Lead engineer · Final Year Project · Aug 2024 — Present
RAG pipeline that automates regulatory-document retrieval for pharma compliance teams. The project moved from Llama 3 to Mistral 7B, replaced naive retrieval with SBERT embeddings, and tuned ChromaDB for production-scale corpora.
- Built document chunking and semantic retrieval on SBERT (~400 MB model) for low-latency embedding
- Migrated the LLM from Llama 3 to Mistral 7B for ~2× faster generation at lower cost
- ChromaDB-backed vector store; FAISS migration in progress for further performance gains
- Designed an evaluation harness for retrieval precision on regulatory corpora
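
The chunking and retrieval-precision steps listed above can be illustrated with a minimal sketch. This is not the project's actual code: the function names `chunk_words` and `precision_at_k`, the word-window chunking strategy, and the default sizes are all assumptions chosen for illustration, and the embedding and vector-store calls are deliberately left out.

```python
from typing import Iterable, Sequence


def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words.

    Hypothetical helper: the project's real chunking rules are not stated in the source.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


def precision_at_k(retrieved: Sequence[str], relevant: Iterable[str], k: int) -> float:
    """Fraction of the top-k retrieved chunk IDs that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant_set)
    return hits / k


if __name__ == "__main__":
    # Toy inputs only; the IDs and text are made up for the example.
    print(chunk_words("a b c d e f g", size=4, overlap=2))
    print(precision_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3", "d9"}, k=3))
```

In a harness of this shape, each query would carry a hand-labeled set of relevant chunk IDs, and `precision_at_k` would be averaged across queries to compare retrieval configurations.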