AI Research Assistant
Agentic Knowledge Graph Pipeline
Motivation
When I wanted to read more research papers on emerging AI trends, I didn't know where to start. So I built this pipeline to collect, document, and surface AI research through knowledge graph connections. It gives me one place that shows not only what was published today, but also what is worth paying attention to.
What's New in V2
Cloudflare R2 Storage
Migrated storage off Render's filesystem to Cloudflare R2. The system uses a write-through cache for writes and a cache-aside pattern for reads, and it restores its full state from R2 on startup, so state now survives redeployments.
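The two cache patterns can be sketched as follows. This is a minimal illustration, not the project's actual storage code: `InMemoryObjectStore` is a hypothetical stand-in for an R2 bucket (real code would use an S3-compatible client), and `CachedStore`, its method names, and the example key are assumptions made for this sketch.

```python
import tempfile
from pathlib import Path


class InMemoryObjectStore:
    """Hypothetical stand-in for an R2 bucket; keeps objects in a dict."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> list:
        return list(self._objects)


class CachedStore:
    """Local-disk cache in front of a remote object store (assumed design)."""

    def __init__(self, remote: InMemoryObjectStore, cache_dir: Path):
        self.remote = remote
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Flatten the key into one file name (simplification for the sketch).
        return self.cache_dir / key.replace("/", "__")

    def write(self, key: str, data: bytes) -> None:
        # Write-through: persist remotely first, then update the local cache.
        self.remote.put(key, data)
        self._path(key).write_bytes(data)

    def read(self, key: str) -> bytes:
        # Cache-aside: serve from local disk, fall back to remote on a miss.
        path = self._path(key)
        if path.exists():
            return path.read_bytes()
        data = self.remote.get(key)
        path.write_bytes(data)
        return data

    def restore(self) -> int:
        # Startup restore: copy every remote object into the local cache.
        restored = 0
        for key in self.remote.keys():
            self._path(key).write_bytes(self.remote.get(key))
            restored += 1
        return restored


remote = InMemoryObjectStore()
store = CachedStore(remote, Path(tempfile.mkdtemp()))
store.write("reports/example.md", b"daily report")

# Simulate a redeploy: the local disk is empty, the remote store is intact.
fresh = CachedStore(remote, Path(tempfile.mkdtemp()))
restored_count = fresh.restore()
```

Writing to the remote store before the local cache means a crash between the two steps leaves the durable copy as the source of truth, which is what makes the startup restore safe.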
Qdrant RAG Integration
Added Qdrant vector search as a RAG layer. Research findings are embedded with OpenAI's text-embedding-3-small. At report time, trending concepts from the Kuzu knowledge graph drive semantic retrieval, so the LLM writes from the most relevant evidence rather than only the most recent.
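The idea of concept-driven retrieval can be illustrated with a small, dependency-free sketch. Everything here is an assumption for illustration: the three-dimensional toy vectors stand in for text-embedding-3-small embeddings, the linear scan stands in for a Qdrant search, and the function names and sample findings are hypothetical.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(concept_vectors: dict, findings: list, top_k: int = 2) -> list:
    """Rank findings by their best similarity to any trending concept."""
    scored = []
    for finding in findings:
        score = max(
            (cosine(finding["vector"], vec) for vec in concept_vectors.values()),
            default=0.0,
        )
        scored.append((score, finding["text"]))
    # Highest similarity first; recency plays no role in this ranking.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy data: one trending concept and three stored findings.
trending = {"sparse attention": [1.0, 0.0, 0.0]}
findings = [
    {"text": "Sparse attention cuts memory use", "vector": [0.9, 0.1, 0.0]},
    {"text": "New protein folding benchmark", "vector": [0.0, 1.0, 0.0]},
    {"text": "Attention variants for long context", "vector": [0.7, 0.0, 0.7]},
]

evidence = retrieve(trending, findings, top_k=2)
```

In this toy run the two attention-related findings are returned and the unrelated one is dropped, regardless of the order in which they were stored.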
Weekly AI Research Podcast
Every Monday, the pipeline auto-generates a ~15-minute two-host podcast covering concepts from the week's most impactful papers. Gemini writes the conversational script, OpenAI TTS HD renders it, and episodes stream directly from R2. The UI has a dedicated browse page and an inline player embedded in daily report headers.
How it Works
Agentic Pipeline Architecture
A SequentialAgent orchestrator (Google ADK) coordinates five specialised agents that work together to process, analyse, and connect research papers.
Ingestion Agent — Fetches new papers from research sources, parsing metadata, abstracts, and key concepts. Processes hundreds of papers per day.
Analysis Agents — Extract concepts, methods, datasets, problems, and institutional data from each paper.
Knowledge Graph Construction — Builds and enriches a Kuzu knowledge graph tracking relationships between concepts, authors, institutions, and research trends.
Signal Detection — Identifies accelerating concepts, convergence signals, bridge papers, and unresolved problems gaining attention.
Report Generation — Produces a daily intelligence briefing with recommended reads, institution leaderboards, and rising authors. RAG retrieval via Qdrant ensures reports reference the most relevant evidence.
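The five stages above run in a fixed order, each consuming what the previous one produced. The sketch below shows that sequential hand-off in plain Python. It does not use the Google ADK API: `run_pipeline` is a hypothetical stand-in for the SequentialAgent orchestrator, and the stage functions, state keys, and sample paper are assumptions made for illustration.

```python
from typing import Callable, Optional

State = dict
Stage = Callable[[State], State]


def ingest(state: State) -> State:
    # Stand-in for fetching papers from research sources.
    state["papers"] = [
        {"title": "Example paper", "abstract": "We study sparse attention."}
    ]
    return state


def analyse(state: State) -> State:
    # Stand-in for LLM-based extraction: a naive keyword match.
    for paper in state["papers"]:
        found = "sparse attention" in paper["abstract"]
        paper["concepts"] = ["sparse attention"] if found else []
    return state


def build_graph(state: State) -> State:
    # Represent graph edges as (paper, relation, concept) triples.
    state["edges"] = [
        (paper["title"], "MENTIONS", concept)
        for paper in state["papers"]
        for concept in paper["concepts"]
    ]
    return state


def detect_signals(state: State) -> State:
    # Count how many papers mention each concept.
    counts = {}
    for _, _, concept in state["edges"]:
        counts[concept] = counts.get(concept, 0) + 1
    state["signals"] = counts
    return state


def write_report(state: State) -> State:
    lines = [f"{c}: {n} paper(s)" for c, n in sorted(state["signals"].items())]
    state["report"] = "\n".join(lines)
    return state


def run_pipeline(stages: list, state: Optional[State] = None) -> State:
    """Run each stage in order, passing one shared state dict along."""
    state = {} if state is None else state
    for stage in stages:
        state = stage(state)
    return state


PIPELINE = [ingest, analyse, build_graph, detect_signals, write_report]
result = run_pipeline(PIPELINE)
```

Keeping each stage as a function over a shared state makes it easy to test stages in isolation or to rerun only the later ones against saved state.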
Key Capabilities
Daily Ingestion
Processes hundreds of papers per day (for example, 475 ingested in a single day), extracting structured data and building connections.
Concept Tracking
Tracks accelerating concepts, newly introduced ideas, and convergence signals across research fields.
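One simple way to flag an accelerating concept is to compare its mention count in the latest window with the previous one. The sketch below uses that growth-ratio definition purely as an assumption; the source does not state how acceleration is measured, and the function name, thresholds, and sample counts are hypothetical.

```python
def accelerating_concepts(
    weekly_counts: dict, min_growth: float = 2.0, min_mentions: int = 3
) -> list:
    """Return (concept, growth) pairs whose latest week grew by min_growth or more.

    weekly_counts maps a concept to its mention counts per week, oldest first.
    """
    flagged = []
    for concept, counts in weekly_counts.items():
        if len(counts) < 2:
            continue  # need two windows to compare
        previous, latest = counts[-2], counts[-1]
        if latest < min_mentions:
            continue  # ignore concepts with too few mentions to matter
        # Treat a previous count of 0 as 1 to avoid division by zero.
        growth = latest / max(previous, 1)
        if growth >= min_growth:
            flagged.append((concept, growth))
    # Fastest-growing concepts first.
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged


sample = {
    "test-time compute": [2, 3, 9],
    "diffusion": [10, 11, 12],
    "new idea": [0, 4],
}
rising = accelerating_concepts(sample)
```

With these sample counts, the steadily popular concept is not flagged, while the two concepts whose latest week at least doubled are returned in order of growth.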
Knowledge Graph
3,400+ papers, 14,500+ authors, 10,300+ concepts, 6,000+ methods, 2,000+ datasets, and 1,400+ institutions — all interconnected in Kuzu.
Intelligence Briefings
Daily reports covering accelerating concepts, benchmark trends, institution leaderboards, rising authors, and recommended reads.
Weekly Podcast
Auto-generated ~15-minute episodes with two AI hosts discussing the week's most impactful research findings.
Semantic RAG Retrieval
Qdrant vector search ensures reports and podcasts are grounded in the most relevant research evidence, not just recency.