AI/MLNot on GitHub

ScoutGPT

Agentic AI pipeline over 200M+ row OLAP dataset

Parcyl.ai

7 technologies
3 key decisions
4 results

Problem

Problem

Parcyl.ai needed a way for non-technical users to query and derive insights from a 200M+ row analytical dataset that spanned multiple data domains. Traditional BI tools required SQL expertise and couldn't handle follow-up questions or multi-step reasoning. The challenge was building an agent that could decompose natural language queries into DuckDB-optimized SQL, handle ambiguity, and return explainable results — all with sub-3-second latency on a dataset this size.

Approach

Approach

I built ScoutGPT as a full agentic loop: a LangChain agent with a DuckDB "tool" that generates and executes SQL queries, a Dagster orchestration layer that ingests and transforms data from Snowflake on a schedule, and a Next.js + FastAPI frontend/backend that streams responses. The key insight was using DuckDB's OLAP column store locally rather than querying Snowflake directly — this cut query latency from 8–12s to under 1s for most analytical queries. The agent uses a chain-of-thought approach: it first understands intent, then generates SQL, executes it, interprets results, and formats them for a non-technical audience.

Architecture

Architecture

ScoutGPT — system diagram

Next.js UIFastAPILangChain AgentDuckDB OLAPDagster PipelineSnowflake

Key Technical Decisions

Key Technical Decisions

Assembly Instructions — 3 Steps
01

DuckDB over direct Snowflake querying

Snowflake queries on 200M rows averaged 8–12s even with clustering keys. DuckDB's columnar in-process store brought this to <1s for the analytical patterns the agent generates. The trade-off is a daily Dagster sync to keep the local store fresh — acceptable for the use case.

02

LangChain ReAct agent over a fine-tuned model

A fine-tuned model would need frequent retraining as the schema evolved. ReAct + tool use with a strong system prompt proved more maintainable and more robust on novel query patterns. The schema is injected into the system prompt on every call.

03

Streaming responses via Server-Sent Events

Agent reasoning steps take 2–4s. Streaming the chain-of-thought incrementally made the UI feel responsive. Users see 'Thinking: converting to SQL → executing query → interpreting results' as it happens rather than waiting for a single response blob.

Results

Results

  • Query latency reduced from 8–12s (Snowflake direct) to <1s (DuckDB local)
  • Non-technical users can derive insights without writing SQL
  • Agent correctly interprets follow-up questions with conversation history
  • Dagster pipeline runs nightly with zero manual intervention

Tech Stack

Tech Stack

LangChainDuckDBDagsterSnowflakeFastAPINext.jsPython

Links

Not on GitHub (internal product)