Live V2

AI Research Assistant

Agentic Knowledge Graph Pipeline

Python, FastAPI, Google ADK, Kuzu, Qdrant, OpenAI TTS, Cloudflare R2

Motivation

I wanted to read more research papers on emerging AI trends but didn't know where to start. So I built this pipeline to research, document, and surface AI research through knowledge graph connections. It gives me one place that tells me not only what was published today, but also what's worth paying attention to.

What's New in V2

Cloudflare R2 Storage

Migrated off Render's filesystem to Cloudflare R2. Writes go through to R2 (write-through cache), reads follow a cache-aside pattern, and the service restores its full state from R2 on startup, so a redeployment picks up from the same state as the previous one.
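A minimal sketch of how the write-through, cache-aside, and restore-on-startup pieces might fit together. `InMemoryBucket` and `CachedStore` are hypothetical names used only for illustration; the bucket is an in-memory stand-in, whereas the real system talks to Cloudflare R2.

```python
import tempfile
from pathlib import Path


class InMemoryBucket:
    """Hypothetical stand-in for an R2 bucket (no network calls)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> list[str]:
        return sorted(self._objects)


class CachedStore:
    """Local disk cache in front of a durable bucket."""

    def __init__(self, bucket: InMemoryBucket, cache_dir: Path) -> None:
        self.bucket = bucket
        self.cache_dir = cache_dir

    def _path(self, key: str) -> Path:
        path = self.cache_dir / key
        path.parent.mkdir(parents=True, exist_ok=True)
        return path

    def write(self, key: str, data: bytes) -> None:
        # Write-through: persist to the bucket, then update the local copy.
        self.bucket.put(key, data)
        self._path(key).write_bytes(data)

    def read(self, key: str) -> bytes:
        # Cache-aside: serve from disk if present, else fetch and populate.
        path = self._path(key)
        if path.exists():
            return path.read_bytes()
        data = self.bucket.get(key)
        path.write_bytes(data)
        return data

    def restore(self) -> int:
        # Startup restore: copy every object onto the fresh local disk.
        keys = self.bucket.keys()
        for key in keys:
            self._path(key).write_bytes(self.bucket.get(key))
        return len(keys)


bucket = InMemoryBucket()
first_deploy = CachedStore(bucket, Path(tempfile.mkdtemp()))
first_deploy.write("reports/example.json", b'{"papers": 3}')

# Simulate a redeploy: a new instance with an empty local filesystem.
second_deploy = CachedStore(bucket, Path(tempfile.mkdtemp()))
restored = second_deploy.restore()
```

Because every write reaches the bucket before the local copy is updated, the local disk can be treated as disposable: losing it only costs a re-download.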

Qdrant RAG Integration

Plugged in Qdrant vector search as a RAG layer. Research findings are embedded via OpenAI's text-embedding-3-small. At report time, trending concepts from the Kuzu knowledge graph drive semantic retrieval, so the LLM writes from the most relevant evidence rather than just the most recent.
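To illustrate the idea of concept-driven retrieval, here is a toy sketch in plain Python. It is not the Qdrant or OpenAI API: `embed` is a bag-of-words stand-in for a real embedding model, the linear scan stands in for a vector index, and the concept and finding strings are made up for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(trending: list[str], findings: list[str], k: int = 2) -> list[str]:
    """Return up to k findings most similar to any trending concept."""
    vectors = [(text, embed(text)) for text in findings]
    best: dict[str, float] = {}
    for concept in trending:
        query = embed(concept)
        for text, vector in vectors:
            # Keep each finding's best score across all trending concepts.
            best[text] = max(best.get(text, 0.0), cosine(query, vector))
    ranked = sorted(best, key=lambda text: best[text], reverse=True)
    return [text for text in ranked[:k] if best[text] > 0]


# Made-up inputs: concepts would come from the graph, findings from the store.
trending_concepts = ["sparse attention", "mixture of experts"]
findings = [
    "sparse attention reduces memory in long context models",
    "a new benchmark for protein folding",
    "mixture of experts routing improves throughput",
]
evidence = retrieve(trending_concepts, findings, k=2)
```

The point of the design is that the query comes from the graph's trend signals rather than from a publication date, so an older finding can still be selected when it matches a concept that is currently rising.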

Weekly AI Research Podcast

Every Monday, the pipeline auto-generates a ~15-minute two-host podcast covering concepts from the week's most impactful papers. Gemini writes the conversational script, OpenAI TTS HD renders it, and episodes stream directly from R2. The UI has a dedicated browse page and an inline player embedded in daily report headers.
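One step that sits between the generated script and the rendered audio is turning a two-host script into per-speaker segments. The sketch below shows one way that could look. The `HOST_A`/`HOST_B` labels, the voice mapping, and the `MAX_CHARS` cap are all assumptions made for illustration, and no TTS call is made here.

```python
from dataclasses import dataclass
from datetime import date

# Assumed host labels and voice names; the real mapping is not documented here.
VOICES = {"HOST_A": "alloy", "HOST_B": "onyx"}
MAX_CHARS = 4000  # assumed per-request cap; check the TTS provider's limit


@dataclass
class Segment:
    voice: str
    text: str


def is_episode_day(today: date) -> bool:
    return today.weekday() == 0  # Monday


def split_text(text: str, limit: int) -> list[str]:
    # Greedy word wrap: chunks stay within the limit unless one word exceeds it.
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > limit and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def script_to_segments(script: str, limit: int = MAX_CHARS) -> list[Segment]:
    segments: list[Segment] = []
    for line in script.splitlines():
        speaker, sep, text = line.partition(":")
        if not sep or speaker.strip() not in VOICES:
            continue  # skip blank lines and anything that is not a host turn
        for chunk in split_text(text, limit):
            segments.append(Segment(VOICES[speaker.strip()], chunk))
    return segments


script = "HOST_A: Welcome back to the show.\n\nHOST_B: Glad to be here today.\n[music]"
segments = script_to_segments(script)
```

Each `Segment` could then be rendered separately and the audio concatenated in order, which keeps the two hosts on distinct voices.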

How it Works

Agentic Pipeline Architecture

A SequentialAgent orchestrator (Google ADK) coordinates five specialised agents that work together to process, analyse, and connect research papers.

01. Ingestion Agent: Fetches new papers from research sources, parsing metadata, abstracts, and key concepts. Processes hundreds of papers per day.

02. Analysis Agents: Extract concepts, methods, datasets, problems, and institutional data from each paper.

03. Knowledge Graph Construction: Builds and enriches a Kuzu knowledge graph tracking relationships between concepts, authors, institutions, and research trends.

04. Signal Detection: Identifies accelerating concepts, convergence signals, bridge papers, and unresolved problems gaining attention.

05. Report Generation: Produces a daily intelligence briefing with recommended reads, institution leaderboards, and rising authors. RAG retrieval via Qdrant ensures reports reference the most relevant evidence.
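The five stages above can be pictured as functions that each read and extend a shared state, run strictly in order. This is a plain-Python sketch of that sequential pattern, not Google ADK code: the stage functions, the state keys, and the two sample papers are hypothetical, and each stage is heavily simplified.

```python
from collections import Counter
from typing import Any, Callable

State = dict[str, Any]
Stage = Callable[[State], State]


def ingest(state: State) -> State:
    # Hypothetical fetch; a real agent would pull from research sources.
    papers = [
        {"id": "p1", "abstract": "sparse attention for long context"},
        {"id": "p2", "abstract": "sparse mixture of experts"},
    ]
    return {**state, "papers": papers}


def analyse(state: State) -> State:
    # Toy concept extraction: the first two words of each abstract.
    concepts = {p["id"]: p["abstract"].split()[:2] for p in state["papers"]}
    return {**state, "concepts": concepts}


def build_graph(state: State) -> State:
    # Paper-to-concept edges, standing in for the knowledge graph.
    edges = [(pid, c) for pid, cs in state["concepts"].items() for c in cs]
    return {**state, "edges": edges}


def detect_signals(state: State) -> State:
    # Toy signal: the single most frequently mentioned concept.
    counts = Counter(concept for _, concept in state["edges"])
    return {**state, "signals": [c for c, _ in counts.most_common(1)]}


def write_report(state: State) -> State:
    return {**state, "report": "Trending: " + ", ".join(state["signals"])}


def run_sequential(stages: list[Stage], state: State) -> State:
    # Each stage sees everything the earlier stages produced.
    for stage in stages:
        state = stage(state)
    return state


final = run_sequential(
    [ingest, analyse, build_graph, detect_signals, write_report], {}
)
```

Keeping the stages in a fixed order means each one can rely on the keys written before it, which is the main property a sequential orchestrator provides.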

Key Capabilities

Daily Ingestion

Processes hundreds of papers per day (475 were ingested in a single day), extracting structured data and building connections.

Concept Tracking

Tracks accelerating concepts, newly introduced ideas, and convergence signals across research fields.
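How "accelerating" is measured is not spelled out here, so the following is only one plausible definition: compare a concept's mentions in the latest window with the window before it. The function names, the 7-day window, the 2.0 threshold, and the sample counts are all assumptions for illustration.

```python
def growth_ratio(daily_counts: list[int], window: int = 7) -> float:
    """Mentions in the latest window divided by mentions in the one before."""
    recent = sum(daily_counts[-window:])
    previous = sum(daily_counts[-2 * window:-window])
    # Add-one smoothing avoids division by zero for brand-new concepts.
    return (recent + 1) / (previous + 1)


def accelerating(
    history: dict[str, list[int]], threshold: float = 2.0, window: int = 7
) -> list[str]:
    """Concepts whose growth ratio meets the threshold, fastest first."""
    scores = {c: growth_ratio(counts, window) for c, counts in history.items()}
    hits = [c for c, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda c: scores[c], reverse=True)


# Made-up daily mention counts over 14 days (oldest first).
history = {
    "concept_a": [1] * 7 + [3] * 7,                # steady rise
    "concept_b": [2] * 14,                         # flat
    "concept_c": [0] * 7 + [0, 0, 0, 0, 1, 2, 3],  # newly introduced
}
rising = accelerating(history)
```

With smoothing, a concept that appears for the first time still gets a finite score, so newly introduced ideas and accelerating ones can be ranked on the same scale.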

Knowledge Graph

The graph holds 3,400+ papers, 14,500+ authors, 10,300+ concepts, 6,000+ methods, 2,000+ datasets, and 1,400+ institutions, all interconnected in Kuzu.

Intelligence Briefings

Daily reports covering accelerating concepts, benchmark trends, institution leaderboards, rising authors, and recommended reads.

Weekly Podcast

Auto-generated ~15-minute episodes with two AI hosts discussing the week's most impactful research findings.

Semantic RAG Retrieval

Qdrant vector search grounds reports and podcasts in the most relevant research evidence, not just the most recent.

Tech Stack

Python, FastAPI, Google ADK, Gemini, Kuzu Graph DB, Qdrant, OpenAI TTS HD, Cloudflare R2, Render