This page is taken from the examples/research/ directory and refers to the code there.
Research Agent Example 🧠
Automate Information Gathering and Synthesis with GenAI Processors.
This example demonstrates how to build a multi-step "Research Agent" using the
genai-processors library. The agent takes a user query, breaks it down into
researchable topics, gathers information using AI tools, and then synthesizes a
comprehensive answer.
✨ Key Features
This example highlights several key capabilities of the genai-processors
library:
- Modular Agent Design: Decomposes a complex research task into distinct, reusable processors (TopicGenerator, TopicResearcher, TopicVerbalizer).
- Structured Data Flow: Utilizes custom dataclasses (interfaces.Topic) embedded within ProcessorParts to pass structured information through the pipeline.
- Search Tool Integration: Leverages the GenAI API's Google Search tool-use capability within the TopicResearcher for dynamic information retrieval.
- Dynamic Content Generation: Generates research topics and synthesized reports based on runtime user input and intermediate findings.
- Pipeline Composition: Chains custom-built processors with core library processors (like GenaiModel and Preamble) to create a complete, end-to-end workflow.
- Configuration Management: Uses a dedicated Config object (interfaces.Config) to manage model names, prompt parameters, and tool configurations.
- Asynchronous Processing: Inherits the asynchronous and concurrent nature of the genai-processors framework for efficient operation.
⚙️ How it Works
The Research Agent follows a structured pipeline:
- Input: Receives a user's research query as a stream of ProcessorParts.
- Topic Generation (TopicGenerator):
  - A GenaiModel is prompted (using prompts.TOPIC_GENERATION_PREAMBLE) to analyze the user's query and generate a list of distinct research topics.
  - Outputs these topics as ProcessorParts, each containing a Topic dataclass (initially without research text).
- Topic Research (TopicResearcher):
  - For each Topic part, another GenaiModel (configured with tools like Google Search) is prompted (using prompts.TOPIC_RESEARCH_PREAMBLE) to find relevant information. Because PartProcessors are used, these calls are all made concurrently, minimizing Time to First Token (TTFT).
  - The research findings are added to the research_text field of the Topic dataclass.
- Verbalization (TopicVerbalizer):
  - Each researched Topic part is transformed into a human-readable Markdown string, summarizing the topic, its relation to the original query, and the research findings. The verbalization is done with a Jinja2 template processor.
- Synthesis (within ResearchAgent):
  - All verbalized research texts are collected.
  - A final GenaiModel is prompted (using prompts.SYNTHESIS_PREAMBLE) to synthesize these individual research pieces into a single, coherent response that addresses the user's original query.
- Output: Streams the final synthesized research report.
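The control flow of these stages can be sketched in plain asyncio. This is only an illustration under stated assumptions: it does not import genai-processors, every model call is replaced by a stub, and the function names, the Topic field names, and the f-string used in place of the Jinja2 template are all hypothetical. What it does show is the shape of the pipeline: topics are generated once, researched concurrently, verbalized one by one, and then synthesized.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Topic:
    """Illustrative stand-in for interfaces.Topic (field names are assumptions)."""
    topic: str
    relationship_to_query: str
    research_text: Optional[str] = None


async def generate_topics(query: str, num_topics: int) -> List[Topic]:
    """Stub for the topic-generation model call."""
    return [
        Topic(
            topic=f"{query}: aspect {i + 1}",
            relationship_to_query=f"Covers part {i + 1} of the query",
        )
        for i in range(num_topics)
    ]


async def research_topic(topic: Topic) -> Topic:
    """Stub for the tool-enabled research model call."""
    await asyncio.sleep(0.01)  # Simulates the latency of a remote call.
    return replace(topic, research_text=f"Findings about {topic.topic}")


def verbalize(topic: Topic) -> str:
    """Renders a researched topic as Markdown (stands in for the Jinja2 template)."""
    return (
        f"## {topic.topic}\n"
        f"*Relation to query:* {topic.relationship_to_query}\n\n"
        f"{topic.research_text}\n"
    )


async def synthesize(query: str, sections: List[str]) -> str:
    """Stub for the final synthesis model call."""
    return f"# Report: {query}\n\n" + "\n".join(sections)


async def run_research(query: str, num_topics: int = 3) -> str:
    topics = await generate_topics(query, num_topics)
    # Research all topics concurrently; gather returns results in input order.
    researched = await asyncio.gather(*(research_topic(t) for t in topics))
    sections = [verbalize(t) for t in researched]
    return await synthesize(query, sections)


if __name__ == "__main__":
    print(asyncio.run(run_research("solar panel efficiency")))
```

Unlike this sketch, which returns the report as one string, the actual agent streams its output as ProcessorParts.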
🧩 Key Components
- agent.py:
  - ResearchAgent: The main processor that orchestrates the entire pipeline by chaining the sub-processors.
- interfaces.py:
  - Topic: Dataclass defining the structure for a research topic (topic string, relationship, research text).
  - Config: Dataclass for configuring the agent (model names, number of topics, enabled tools).
- prompts.py:
  - Contains the string preambles used to instruct the GenAI models at each stage (topic generation, research, synthesis).
- processors/:
  - topic_generator.py: Implements TopicGenerator for identifying research sub-topics.
  - topic_researcher.py: Implements TopicResearcher for gathering information on each topic using tools.
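To make the Topic description concrete, here is a minimal sketch of such a dataclass and of carrying it between stages as serialized text. The field names are guesses based on the description above, and to_payload and from_payload are hypothetical helpers using JSON. The library's own mechanism for embedding a dataclass in a ProcessorPart is not shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Topic:
    # Field names are assumptions, not copied from interfaces.py.
    topic: str
    relationship_to_user_content: str
    research_text: Optional[str] = None  # Empty until the research stage runs.


def to_payload(topic: Topic) -> str:
    """Serializes a Topic to JSON text so it can travel as a part's content."""
    return json.dumps(asdict(topic))


def from_payload(payload: str) -> Topic:
    """Restores a Topic from its JSON payload."""
    return Topic(**json.loads(payload))


# A topic as it might leave the generation stage: no research text yet.
fresh = Topic(
    topic="Battery recycling",
    relationship_to_user_content="Covers end-of-life handling",
)

# A downstream stage restores the dataclass and fills in its findings.
restored = from_payload(to_payload(fresh))
restored.research_text = "Summary of findings goes here."
```

Keeping the payload a typed dataclass rather than free text lets each stage read and update individual fields without re-parsing model output.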
🛠️ Configuration
The behavior of the ResearchAgent can be customized through the
interfaces.Config object, allowing you to specify:
- topic_generator_model_name
- topic_researcher_model_name
- research_synthesizer_model_name
- num_topics (number of topics to generate)
- excluded_topics (list of topics to avoid)
- enabled_research_tools (list of GenAI tools for the researcher, e.g., Google Search)
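As a rough picture of what such a configuration object could look like, the sketch below declares the six fields listed above on a dataclass. Only the field names come from this page. The types, the default values, the placeholder model names, and the num_topics check are assumptions made for illustration and may differ from the real interfaces.Config.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Config:
    # Placeholder model names: substitute models available to your API key.
    topic_generator_model_name: str = "your-topic-model"
    topic_researcher_model_name: str = "your-research-model"
    research_synthesizer_model_name: str = "your-synthesis-model"
    # Assumed default; the real default may differ.
    num_topics: int = 5
    # default_factory gives every Config its own list instance.
    excluded_topics: List[str] = field(default_factory=list)
    enabled_research_tools: List[Any] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Illustrative validation, not taken from the example's code.
        if self.num_topics < 1:
            raise ValueError("num_topics must be at least 1")


# Override only what differs from the defaults.
config = Config(num_topics=3, excluded_topics=["pricing"])
```

Collecting these settings in one object means the agent and its sub-processors can share a single source of truth instead of each taking separate keyword arguments.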
📚 Example Notebook
An example notebook can be found here.
📜 License
This example is licensed under the Apache License, Version 2.0.