Building an agentic RAG pipeline

Traditional RAG systems work as a linear pipeline: query encoding, vector similarity search, top-k document retrieval, and response generation. This approach works reasonably well for simple information retrieval but fails when dealing with complex technical queries that require multi-faceted reasoning, domain knowledge, and coordinated retrieval strategies.

In enterprise technical domains, where engineers look for components that satisfy exact specifications such as "400VAC input, 28VDC output at 9kW power supply," traditional RAG fails because it cannot distinguish between quantitative specs and qualitative requirements, lacks domain-specific reasoning abilities, and cannot coordinate a hybrid retrieval approach based on query complexity.

Agentic RAG changes this landscape by using specialized agents that collaborate to solve complex queries through intelligent orchestration. In contrast to a monolithic approach, the system employs domain-specific agents, each specialized for specific aspects of retrieval and reasoning, that can adapt their strategies based on query features and intermediate results. This architecture combines structured data precision (SQL databases for exact specification matching) with semantic understanding (vector databases for contextual relationships), enabling the system to handle technical complexity in a way that resembles human expert reasoning while maintaining the scalability advantages of automation.

Case Study: Enterprise Power Supply Discovery

Let me guide you through our implementation, which achieved 95% accuracy in technical product discovery. The system handles queries from procurement teams, engineers, and technical specialists searching our catalog of industrial power supplies, a domain where precision matters and specification mismatches can cost thousands in procurement mistakes or project delays.

When a user queries "I need a power supply with 400VAC input and 28VDC output at 9kW," our agentic system shows its advantage over traditional RAG through intelligent query decomposition and multi-strategy retrieval, which I explain in the architecture breakdown.

Agentic RAG system architecture: Multi-agent orchestration for technical search

Before diving into the technical implementation, let me explain how our system handles a real-world query. Consider a scenario I encountered: a procurement engineer needs to find a power supply for a new industrial control system and queries: "I need a power supply with 400VAC input and 28VDC output at 9kW."

Figure: System architecture of multi-agent orchestration

Our agentic RAG system, built with CrewAI, processes this query through a multi-agent pipeline designed specifically for technical product discovery. The architecture employs CrewAI's crew-based approach, where four specialized agents collaborate to solve complex queries.

from typing import List

from crewai import Crew


class AgenticRagCrewPipeline(Crew):
    # The four factory functions below are defined in the agent sections of this article.
    def _create_agents(self) -> List:
        return [
            create_answer_generator_agent(),
            create_product_answer_generator_agent(),
            create_reqs_retriever(),
            create_products_retriever()
        ]

This modular approach provides several advantages:

Maintainability: Individual agents can be updated without affecting the entire system.
Scalability: New agents can be added for additional functionality.
Testability: Each agent can be tested in isolation.
Performance: Agents can be optimized for their specific tasks.

Intent Classification Layer

The system begins with intelligent intent classification that determines query routing. This is crucial for performance optimization and ensures that each query is processed by the most appropriate agent:

Product-specific queries: "What are the specs for Model-X123?"
Route to: Products Retriever Agent
Strategy: Direct product lookup with specification extraction
Requirement-based queries: "I need a 9kW AC-DC converter with isolation"
Route to: Requirements Retriever Agent
Strategy: Multi-criteria matching with fallback mechanisms
Pricing queries: "What is the cost of Model-Y456?"
Route to: Human support consultation
Strategy: Redirecting to appropriate department
Irrelevant queries: general questions outside domain scope
Route to: Polite decline with guidance
Strategy: Boundary enforcement to maintain system focus

The intent classification agent uses a combination of keyword matching, semantic similarity, and learned patterns to achieve high accuracy in routing decisions.
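The keyword-matching part of this routing can be sketched in a few lines. This is a minimal illustration rather than our production classifier: the `Intent` enum, the regular expressions, and the `classify_intent` function are hypothetical names introduced here, and the semantic-similarity and learned-pattern components are left out.

```python
import re
from enum import Enum


class Intent(Enum):
    PRODUCT = "products_retriever"
    REQUIREMENT = "requirements_retriever"
    PRICING = "human_support"
    IRRELEVANT = "polite_decline"


# Hypothetical patterns; a real deployment would tune these to its own catalog.
PRICING_PATTERN = re.compile(r"\b(?:price|pricing|cost|quote)\b", re.IGNORECASE)
MODEL_PATTERN = re.compile(r"\b(?:model-)?[a-z]{1,4}\d{3,}(?:-\d+)?\b", re.IGNORECASE)
SPEC_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s?(?:kw|w|vac|vdc|v|hz)\b", re.IGNORECASE)
DOMAIN_PATTERN = re.compile(
    r"\b(?:power supply|converter|rectifier|inverter|isolat\w*)\b", re.IGNORECASE
)


def classify_intent(query: str) -> Intent:
    """Route a query using keyword rules only (no semantic or learned signals)."""
    # Pricing is checked first so that "cost of Model-Y456" is not treated as a lookup.
    if PRICING_PATTERN.search(query):
        return Intent.PRICING
    if MODEL_PATTERN.search(query):
        return Intent.PRODUCT
    if SPEC_PATTERN.search(query) or DOMAIN_PATTERN.search(query):
        return Intent.REQUIREMENT
    return Intent.IRRELEVANT


route = classify_intent("I need a 9kW AC-DC converter with isolation")
```

The order of the checks matters: a pricing question that mentions a model number should still go to human support, so the pricing rule runs before the model-number rule.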

Hybrid data architecture

The agentic RAG system employs a dual-storage approach that combines the precision of structured data (a SQL database) with the flexibility of semantic search (a vector database).

Structured data layer

The structured layer uses Presto in watsonx.data to provide a SQL database that stores quantitative specifications extracted from product datasheets using large language models. This enables precise filtering on specifications such as voltage ratings, power ratings, and technical standards.

CREATE TABLE products (
    id VARCHAR(50) PRIMARY KEY,
    name VARCHAR(200),
    input_voltage_min DECIMAL(10,2),
    input_voltage_max DECIMAL(10,2),
    output_voltage DECIMAL(10,2),
    power_rating DECIMAL(10,2),
    efficiency DECIMAL(5,2),
    temperature_min DECIMAL(5,2),
    temperature_max DECIMAL(5,2),
    isolation_rating VARCHAR(50),
    standards_compliance TEXT[]
);

This structured approach enables:

Precise filtering: Exact matches on numerical specifications.
Range queries: Finding products within specific parameter ranges.
Complex joins: Combining multiple specification criteria.
Performance optimization: Indexed queries for fast retrieval.
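To make the filtering concrete, here is a small sketch of a combined exact-match and range query for the "400VAC input, 28VDC output at 9kW" example. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Presto, keeps only a subset of the columns, and assumes `power_rating` is stored in watts. The three product rows and the `find_products` helper are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the Presto-backed table (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE products (
           id TEXT PRIMARY KEY,
           name TEXT,
           input_voltage_min REAL,
           input_voltage_max REAL,
           output_voltage REAL,
           power_rating REAL  -- assumed to be in watts
       )"""
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("P-001", "Hypothetical PSU A", 360.0, 440.0, 28.0, 9000.0),
        ("P-002", "Hypothetical PSU B", 180.0, 264.0, 28.0, 9000.0),
        ("P-003", "Hypothetical PSU C", 360.0, 440.0, 48.0, 9000.0),
    ],
)


def find_products(connection, input_v, output_v, power_w):
    """Range match on input voltage, exact match on output voltage and power."""
    rows = connection.execute(
        """SELECT id FROM products
           WHERE ? BETWEEN input_voltage_min AND input_voltage_max
             AND output_voltage = ?
             AND power_rating = ?
           ORDER BY id""",
        (input_v, output_v, power_w),
    ).fetchall()
    return [row[0] for row in rows]


matches = find_products(conn, 400.0, 28.0, 9000.0)
```

Only the first row satisfies all three criteria: the second has the wrong input range and the third has the wrong output voltage.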

Semantic data layer (Vector Database)

The vector database stores semantic embeddings that capture contextual information and nuanced relationships between concepts. This maintains full context while enabling fuzzy matching and handling variations in terminology.

# Example embedding structure
{
    "document_id": "product_123_datasheet",
    "chunk_id": "section_2_specifications",
    "embedding": [0.123, -0.456, 0.789, ...],  # 768-dimensional vector
    "metadata": {
        "product_category": "power_supplies",
        "technical_domain": "electrical_engineering",
        "content_type": "specifications"
    }
}

The semantic layer provides:

Contextual understanding: Captures nuanced relationships between concepts.
Fuzzy matching: Handles variations in terminology and phrasing.
Conceptual similarity: Finds related products even without exact matches.
Multilingual support: Can handle queries in different languages.
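A vector lookup over records of this shape can be sketched as follows. This is a toy illustration under stated assumptions: it uses 3-dimensional vectors instead of 768-dimensional ones, computes cosine similarity in plain Python instead of calling a vector database, and the `search` helper and sample records are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_embedding, records, top_k=2, content_type=None):
    """Optionally filter on metadata, then return the top_k most similar chunk ids."""
    candidates = [
        r for r in records
        if content_type is None or r["metadata"]["content_type"] == content_type
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query_embedding, r["embedding"]),
        reverse=True,
    )
    return [r["chunk_id"] for r in ranked[:top_k]]


# Toy records; real embeddings would come from an embedding model.
records = [
    {"chunk_id": "specs", "embedding": [1.0, 0.0, 0.0],
     "metadata": {"content_type": "specifications"}},
    {"chunk_id": "overview", "embedding": [0.0, 1.0, 0.0],
     "metadata": {"content_type": "overview"}},
    {"chunk_id": "derating", "embedding": [0.8, 0.6, 0.0],
     "metadata": {"content_type": "specifications"}},
]

results = search([0.9, 0.1, 0.0], records, top_k=2, content_type="specifications")
```

Filtering on metadata before scoring keeps unrelated chunk types out of the ranking, which is one reason the metadata fields are stored alongside each embedding.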

Technical deep dive: Agent specialization

Let’s take a closer look at each of the agents in our implementation.

Products Retriever agent

The Products Retriever agent is responsible for handling product-related queries with smart query cleaning and retrieval.

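from crewai import Agent

# get_llm() is assumed to be a project helper that returns the configured LLM.


def create_products_retriever():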
    return Agent(
        role="Intelligent Document Retrieval Orchestrator",
        goal="Retrieve the most relevant and complete document chunks",
        backstory="""You are a powerful reasoning agent with access to the products_retriever_tool.
        Extract search terms from the query e.g., 'M123', '400V AC input', '28V DC output', 
        '9kW power supply', 'AC to DC', 'Isolated'

        Then pass the user query and these search terms to products_retriever_tool to get 
        the most contextually relevant and complete set of chunks.""",
        verbose=True,
        allow_delegation=False,
        llm=get_llm()
    )

The Products Retriever agent employs advanced query processing techniques.

It strips greetings and signatures, leaving only the search terms. For example, "Hello, I'm looking for information on M12345-101, thanks!" is simplified to just "M12345-101." (M12345-101 is a made-up model name.)
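In our system this cleaning is performed by the LLM-backed agent, but the idea can be approximated with regular expressions. The sketch below is a deterministic stand-in, not the agent's actual logic; the patterns and the `clean_query` function are hypothetical.

```python
import re

# Hypothetical patterns for greetings, sign-offs, and model numbers.
GREETING = re.compile(
    r"^\s*(?:hello|hi|hey|good (?:morning|afternoon|evening))\b[\s,!.]*",
    re.IGNORECASE,
)
SIGN_OFF = re.compile(
    r"[\s,.!]*\b(?:thanks|thank you|best regards|regards|cheers)\b.*$",
    re.IGNORECASE | re.DOTALL,
)
MODEL_NUMBER = re.compile(r"\b[A-Z]{1,4}\d{3,}(?:-\d+)?\b")


def clean_query(query: str) -> str:
    """Strip greeting and sign-off; if model numbers remain, return only those."""
    text = GREETING.sub("", query)
    text = SIGN_OFF.sub("", text)
    models = MODEL_NUMBER.findall(text)
    if models:
        return " ".join(models)
    return text.strip()


cleaned = clean_query("Hello, I'm looking for information on M12345-101, thanks!")
```

When no model number is found, the function falls back to the query with only the greeting and sign-off removed, so requirement-style questions are not reduced to an empty string.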

The agent employs multiple retrieval strategies in conjunction:

Exact match retrieval: Direct search of product database.
Semantic search: Vector similarity search for semantic matches.
Specification filtering: SQL queries for technical specifications.
Fuzzy matching: Handles typos and name variations in product names.

Requirements Retriever agent

The Requirements Retriever agent specializes in complex requirement-based queries with advanced term categorization.

def create_reqs_retriever():
    return Agent(
        role="Intelligent Document Retrieval Orchestrator",
        goal="Retrieve relevant chunks for requirement-based queries",
        backstory="""You are a powerful reasoning agent with access to the reqs_retriever_tool.
        Extract two types of search terms from the query:
        - Quantitative (e.g., '400V AC input', '28V DC output', '9kW power supply')
        - Qualitative (e.g., 'AC to DC', 'Isolated', 'Bidirectional')

        Then pass the user query and these search terms to reqs_retriever_tool.""",
        verbose=True,
        allow_delegation=False,
        llm=get_llm()
    )

The Requirements Retriever agent implements advanced natural language processing to extract and categorize search terms into quantitative specifications (for example, "400VAC," "28VDC," and "9kW") and qualitative specifications (for example, "AC to DC converter" and "isolated power supply").

The Requirements Retriever agent uses an advanced multi-stage retrieval process with a smart fallback strategy:
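The categorization itself is done by the LLM, but a rule-based approximation shows what the output looks like. In this sketch the unit list, the qualitative vocabulary, and the `extract_terms` function are assumptions made for illustration.

```python
import re

# Number followed by an electrical unit, e.g. "400VAC", "28 VDC", "9kW".
QUANTITATIVE = re.compile(
    r"\b\d+(?:\.\d+)?\s?(?:kW|W|VAC|VDC|V|A|Hz)\b", re.IGNORECASE
)
# Hypothetical controlled vocabulary of qualitative descriptors.
QUALITATIVE_TERMS = ("ac to dc", "dc to dc", "isolated", "bidirectional")


def extract_terms(query: str) -> dict:
    """Split a query into quantitative specs and known qualitative descriptors."""
    quantitative = [m.group(0) for m in QUANTITATIVE.finditer(query)]
    lowered = query.lower()
    qualitative = [term for term in QUALITATIVE_TERMS if term in lowered]
    return {"quantitative": quantitative, "qualitative": qualitative}


terms = extract_terms(
    "I need an isolated AC to DC power supply with 400VAC input "
    "and 28VDC output at 9kW"
)
```

A fixed vocabulary like this cannot recognize unseen descriptors, which is the main reason an LLM handles this step in practice.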

Primary strategy: Attempts to retrieve matches from structured SQL database based on quantitative terms.
Fallback strategy: Falls back to semantic search based on quantitative and qualitative terms if SQL returns none.
Hybrid approach: If SQL returns results, performs semantic search for additional context and identifies overlapping documents for priority ranking.

This makes retrieval robust when users do not use the exact terminology found in the documentation. If a user asks for a "switching power supply" but the datasheet calls it a "switched-mode power converter," the semantic understanding bridges that gap.
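The control flow of the primary, fallback, and hybrid strategies can be sketched as a single function. This is a simplified outline under assumptions: `sql_search` and `semantic_search` are hypothetical callables that return ordered lists of document ids, and the hybrid branch is assumed to pass both term types to the semantic search.

```python
from typing import Callable, List, Tuple

SearchFn = Callable[[List[str]], List[str]]


def retrieve(
    quantitative: List[str],
    qualitative: List[str],
    sql_search: SearchFn,
    semantic_search: SearchFn,
) -> Tuple[str, List[str]]:
    """Try SQL first; fall back to semantic search, or merge both when SQL hits."""
    sql_hits = sql_search(quantitative)
    semantic_hits = semantic_search(quantitative + qualitative)
    if not sql_hits:
        return "fallback", semantic_hits

    # Documents found by both strategies are ranked first.
    sql_set = set(sql_hits)
    overlap = [doc for doc in semantic_hits if doc in sql_set]
    overlap_set = set(overlap)
    sql_only = [doc for doc in sql_hits if doc not in overlap_set]
    semantic_only = [doc for doc in semantic_hits if doc not in sql_set]
    return "hybrid", overlap + sql_only + semantic_only


# Stub search functions standing in for the real SQL and vector back ends.
strategy, docs = retrieve(
    ["400VAC", "28VDC", "9kW"],
    ["isolated"],
    sql_search=lambda terms: ["doc-b", "doc-a"],
    semantic_search=lambda terms: ["doc-c", "doc-a"],
)
```

With the stubs above, `doc-a` is returned by both strategies and therefore moves to the front of the merged list.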

Answer generator agents

The system employs two specialized agents to handle different forms of answers, each designed to respond to a particular type of query.

def create_answer_generator_agent():
    return Agent(
        role="Final Answer Synthesizer for Product Queries",
        goal="generate complete, accurate, and well-structured responses",
        backstory="""You are an AI assistant specialized in technical product information.
        Your job is to format retrieved information into clear, professional responses.

        Critical Rules:
        1. When users mention "output", interpret as output voltage unless power is explicitly mentioned
        2. Use tables for specifications only when explicitly requested
        3. Keep responses concise unless detailed comparisons are requested
        4. Always include source citations
        5. Never hallucinate information not present in source documents
        """,
        verbose=True,
        allow_delegation=False,
        llm=get_llm()
    )

The Answer generator agent excels at identifying requirement matches, suggesting alternatives when there are no exact matches, and making straightforward, concise recommendations with technical rationale.

Technical deep dive: The retrieval pipeline

The system uses an advanced mechanism for selecting retrieval strategies from query attributes and intermediate results. The hybrid retrieval pipeline uses a combination of techniques that complement each other to enable complete coverage.

Product-specific query processing

When a user asks for a specific model like "What are the specs for M2786-101?", the system performs the following steps:

Query cleaning: Removes greetings and extracts clean search keywords.
Hybrid retrieval: Queries SQL database for exact matches and performs semantic vector search in Milvus.
Result prioritization: Results that appear in both result sets are prioritized, since they are most likely to be exactly what the user is looking for.
Re-ranking: When there are too many results, an intelligent re-ranking mechanism selects the most appropriate document chunks.
Answer generation: Provides concise responses with proper source attribution, never hallucinating information.
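The re-ranking step can be illustrated with a deliberately simple scorer. The article does not specify the re-ranking mechanism, so the sketch below swaps in a plain lexical-overlap count as a stand-in; the `rerank` function, the `max_chunks` threshold, and the sample chunks are all hypothetical.

```python
def rerank(chunks, search_terms, max_chunks=3):
    """Keep all chunks if few enough; otherwise keep those matching the most terms."""
    if len(chunks) <= max_chunks:
        return list(chunks)

    terms = [term.lower() for term in search_terms]

    def score(chunk):
        text = chunk["text"].lower()
        return sum(1 for term in terms if term in text)

    # sorted() is stable, so chunks with equal scores keep their retrieval order.
    ranked = sorted(chunks, key=score, reverse=True)
    return ranked[:max_chunks]


chunks = [
    {"id": "c1", "text": "Mechanical drawing and mounting holes"},
    {"id": "c2", "text": "M2786-101 input 400VAC output 28VDC"},
    {"id": "c3", "text": "M2786-101 ordering information"},
    {"id": "c4", "text": "Warranty terms"},
]

top = rerank(chunks, ["M2786-101", "400VAC", "28VDC"], max_chunks=2)
top_ids = [chunk["id"] for chunk in top]
```

A production re-ranker would more likely use a cross-encoder or an LLM-based relevance score; the lexical count here only shows where the step sits in the pipeline.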

Requirement-based query processing

For requirement-based queries like "I need a power supply that can handle 400VAC input and deliver 28VDC output at 9kW," the system demonstrates advanced intelligence:

Term extraction: Identifies quantitative specs ("400VAC," "28VDC," "9kW") and qualitative descriptors ("AC to DC converter").
Smart fallback strategy: First attempts structured SQL database search, then falls back to semantic search if needed.
Engineering logic: The system knows that power requirements typically mean "at least as much power," rather than exactly that much.
Contextual understanding: Knows domain-specific vocabulary and applies engineering domain expertise.
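The "at least as much power" rule can be expressed as a small matching function. This is a sketch under assumptions: the field names follow the products table shown earlier, power is assumed to be stored in watts, and the `satisfies` helper and sample products are invented for illustration.

```python
def satisfies(product: dict, required: dict) -> bool:
    """Check a product against a requirement using engineering semantics.

    - input voltage: the requested value must fall inside the supported range
    - output voltage: must match exactly
    - power: the product must deliver at least the requested power
    """
    return (
        product["input_voltage_min"] <= required["input_voltage"] <= product["input_voltage_max"]
        and product["output_voltage"] == required["output_voltage"]
        and product["power_rating"] >= required["power"]
    )


required = {"input_voltage": 400.0, "output_voltage": 28.0, "power": 9000.0}

# Hypothetical products; power ratings are in watts.
ten_kw_unit = {
    "input_voltage_min": 360.0, "input_voltage_max": 440.0,
    "output_voltage": 28.0, "power_rating": 10000.0,
}
small_unit = {
    "input_voltage_min": 360.0, "input_voltage_max": 440.0,
    "output_voltage": 28.0, "power_rating": 7500.0,
}

ten_kw_ok = satisfies(ten_kw_unit, required)
small_ok = satisfies(small_unit, required)
```

An exact-match filter would reject the 10 kW unit even though it comfortably covers a 9 kW requirement, which is the kind of mismatch this rule avoids.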

Conclusion

The agentic RAG architecture represents a significant advancement over traditional RAG systems, particularly for advanced technical domains. By using specialized agents, hybrid data structures, and intelligent orchestration, we achieved 95% accuracy while significantly improving user experience and operational efficiency.

Success factors included:

Domain specialization: Domain-specific agents and retrieval strategies.
Hybrid data architecture: Combination of structured and semantic search for comprehensive coverage.
Intelligent orchestration: Context-aware task routing based on query attributes.
Quality control: Strong validation to prevent hallucination and ensure correctness.

The system demonstrates that with careful architectural design and domain expertise, AI systems can achieve human-level performance in complex technical tasks while providing the scalability and consistency advantages of automation.

The architectural principles demonstrated in this power supply discovery system are highly transferable across diverse scientific and technical domains, including:

Chemical compound discovery, where agents can match molecular structures and reaction conditions
Pharmaceutical applications connecting compounds with therapeutic targets and clinical trial eligibility
Aerospace component specification navigating complex MIL-STD requirements
Semiconductor IC selection based on electrical characteristics and package constraints
Mechanical engineering applications matching bearings, fluid systems, and manufacturing equipment to precise load ratings and environmental specifications.

This broad applicability stems from the system's core strength: the ability to combine structured quantitative data (molecular weights, voltage ratings, pressure specifications) with semantic understanding of qualitative requirements (biocompatibility, aerospace-grade, chemical compatibility). Any technical domain with complex specification matching can adapt the multi-agent orchestration framework to its own terminology, reasoning patterns, and validation requirements, and aim for accuracy comparable to the 95% reported here.

For companies considering similar implementations, the key is to start with a solid foundation in domain-specific requirements and invest in good-quality data preparation. The agentic approach provides a powerful framework for building advanced, reliable AI systems that can handle the complexity of real-world technical applications.

Learn more about agentic RAG: