Generative AI · Jan 2024 - Mar 2024

Construction Knowledge RAG System

Domain-specific RAG system for construction documentation with hybrid search, streaming responses, and context-aware chat history management — built on Flask and LangGraph.

Architecture Flow

User Query → Embedding (vector representation) → Hybrid Search (Pinecone + BM25) → LangGraph (orchestrator) → LLM (grounded answer) → Streamed Reply (token by token)

Key Achievements

Implemented hybrid search combining dense vector embeddings and sparse BM25 keyword retrieval for higher answer precision
Built real-time token-by-token streaming responses using Flask and LangGraph for a seamless conversational UX
Engineered chat history management with context summarization and tokenization for coherent multi-turn conversations
Designed an Admin portal for end-to-end document ingestion, chunking, and embedding pipeline management
Enabled natural language Q&A over complex construction documentation, reducing reliance on manual lookup
Supported multiple document formats including PDF and DOCX within a reusable ingestion pipeline
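
Hybrid search has to merge two ranked result lists, one from the dense retriever and one from BM25. The project description does not say how the lists were combined, so the sketch below uses reciprocal rank fusion as one common option, not as the method actually used. The function name, the chunk IDs, and the constant k=60 are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a value
    taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs standing in for results from the two retrievers.
dense_hits = ["spec-12", "safety-03", "contract-07"]   # e.g. vector search
sparse_hits = ["safety-03", "permit-21", "spec-12"]    # e.g. BM25

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Rank-based fusion avoids having to normalize cosine similarities and BM25 scores onto a common scale, which is one reason it is a frequent choice when combining dense and sparse retrievers.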

Core Challenge

Construction teams were spending excessive time manually searching through hundreds of pages of technical documents, compliance reports, and contracts — with no intelligent system to surface precise answers quickly.

Solution

Built a LangGraph-orchestrated RAG pipeline on Flask with hybrid retrieval (dense embeddings + BM25), token-streaming for real-time responses, and a context summarization layer to manage long multi-turn conversations without losing coherence.
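
Token streaming can be illustrated with a generator that emits one Server-Sent Events frame per token. This is a minimal sketch using only the standard library. The `fake_token_stream` helper stands in for the real model client, and the JSON payload shape and the `[DONE]` end marker are assumptions rather than details from the project.

```python
import json


def fake_token_stream(answer):
    """Stand-in for an LLM client that yields tokens as they are generated."""
    for word in answer.split(" "):
        yield word + " "


def sse_events(tokens):
    """Wrap each token in a Server-Sent Events frame as soon as it arrives."""
    for token in tokens:
        # JSON-encode the payload so newlines inside a token cannot break the frame.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # hypothetical end-of-stream marker


frames = list(sse_events(fake_token_stream("example answer text")))
for frame in frames:
    print(frame, end="")
```

In a Flask view, a generator like `sse_events(...)` could be returned as `flask.Response(generator, mimetype="text/event-stream")`, so the browser receives each frame as it is produced instead of waiting for the full answer.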

Timeline

Jan 2024 - Mar 2024

Team

Lead Engineer

Status

Production Ready

Deep Dive

Developed a domain-specific Retrieval-Augmented Generation (RAG) system tailored for the construction industry, addressing the challenge of extracting precise, contextual answers from large volumes of unstructured technical documentation including blueprint specs, safety manuals, compliance reports, and project contracts.

The platform has a dual-interface architecture. In the Admin portal, authorized users upload construction PDFs and other documents, which are parsed, chunked, and stored as vector embeddings. In the user-facing query interface, natural language questions are answered by retrieving the most relevant content through a hybrid search strategy that combines dense vector similarity with sparse keyword matching.
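
The chunking step in the ingestion path can be sketched as a sliding window with overlap, so that a sentence cut at a chunk boundary still appears whole in a neighbouring chunk. The splitting strategy, the word-based units, and the default sizes below are assumptions for illustration, since the project description does not specify them.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks where neighbours share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Tiny synthetic document: ten placeholder words w0..w9.
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)
```

Each chunk would then be embedded and written to the vector index, and the same chunk text can feed the BM25 index so that both retrievers return comparable units.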

Built on Flask with LangGraph orchestrating the RAG pipeline, the system delivers token-by-token streaming responses for a fluid conversational experience. A robust chat history management layer handles context summarization and tokenization to maintain coherent multi-turn conversations without exceeding context window limits.
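
One way to keep a long conversation inside a context window is to keep the newest turns verbatim and fold everything older into a summary. The sketch below assumes a whitespace word count as a rough token estimate and a placeholder `summarize` function. A production system would more likely use the model's own tokenizer and an LLM call, and all names here are hypothetical.

```python
def count_tokens(text):
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def summarize(messages):
    """Placeholder summarizer; a real system would likely call an LLM here."""
    return "Summary of %d earlier messages" % len(messages)


def fit_history(messages, budget):
    """Keep the newest messages within `budget` tokens and summarize the rest.

    `messages` is a list of (role, text) tuples, oldest first.
    """
    kept, used = [], 0
    for role, text in reversed(messages):
        cost = count_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()
    older = messages[:len(messages) - len(kept)]
    if older:
        # The summary itself is not counted against the budget in this sketch.
        kept.insert(0, ("system", summarize(older)))
    return kept


history = [
    ("user", "first question about the site plan"),
    ("assistant", "first answer"),
    ("user", "follow up question"),
    ("assistant", "second answer with more detail"),
]
trimmed = fit_history(history, budget=8)
print(trimmed)
```

Walking the history from newest to oldest means the most recent turns, which usually matter most for a follow-up question, are the last to be dropped.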

Tangible Impact

Delivered a fully functional knowledge assistant that allows site engineers and project managers to query the entire document base in natural language, with accurate streamed responses and persistent conversational context across sessions.

Tech Stack

Python, Flask, LangGraph, LangChain, OpenAI Embeddings, BM25, Pinecone, PostgreSQL