Generative AIJan 2024 - Mar 2024

Construction Knowledge RAG System

Domain-specific RAG system for construction documentation with hybrid search, streaming responses, and context-aware chat history management — built on Flask and LangGraph.

Architecture Flow

data flow · live
User Query
Embedding
Vector rep
Hybrid Search
Pinecone + BM25
LangGraph
Orchestrator
LLM
Grounded answer
Streamed Reply
Token by token

Key Achievements

  • Implemented hybrid search combining dense vector embeddings and sparse BM25 keyword retrieval for higher answer precision
  • Built real-time token-by-token streaming responses using Flask and LangGraph for a seamless conversational UX
  • Engineered chat history management with context summarization and tokenization for coherent multi-turn conversations
  • Designed Admin portal for end-to-end document ingestion, chunking, and embedding pipeline management
  • Enabled natural language Q&A over complex construction documentation, eliminating manual lookup entirely
  • Supported multiple document formats including PDF and DOCX within a reusable ingestion pipeline

Core Challenge

Construction teams were spending excessive time manually searching through hundreds of pages of technical documents, compliance reports, and contracts — with no intelligent system to surface precise answers quickly.

Solution

Built a LangGraph-orchestrated RAG pipeline on Flask with hybrid retrieval (dense embeddings + BM25), token-streaming for real-time responses, and a context summarization layer to manage long multi-turn conversations without losing coherence.

Timeline
Jan 2024 - Mar 2024
Team
Lead Engineer
Status
Production Ready
Category
Generative AI
Live Preview View Code

Deep Dive

Developed a domain-specific Retrieval-Augmented Generation (RAG) system tailored for the construction industry, addressing the challenge of extracting precise, contextual answers from large volumes of unstructured technical documentation including blueprint specs, safety manuals, compliance reports, and project contracts.

The platform features a dual-interface architecture: an Admin portal where authorized users upload construction PDFs and documents which are then parsed, chunked, and stored as vector embeddings — and a User-facing query interface where natural language questions are answered by retrieving the most semantically relevant content through a hybrid search strategy combining dense vector similarity with sparse keyword matching.

Built on Flask with LangGraph orchestrating the RAG pipeline, the system delivers token-by-token streaming responses for a fluid conversational experience. A robust chat history management layer handles context summarization and tokenization to maintain coherent multi-turn conversations without exceeding context window limits.

Tangible Impact

Delivered a fully functional knowledge assistant that allows site engineers and project managers to query the entire document base in natural language, with accurate streamed responses and persistent conversational context across sessions.

Tech Stack

PythonFlaskLangGraphLangChainOpenAI EmbeddingsBM25PineconePostgreSQL

© 2024 NIKHIL

BACK TO TOP ↑