Develop an AI-powered regulatory compliance assistant with Quarkus, Docling, and LangChain4j
Modern financial institutions operate in one of the most heavily regulated environments in the world. Every month, regulators such as the SEC, FINRA, FCA, ESMA, and OCC publish Regulatory Change Bulletins announcing new rules, amendments, reporting obligations, and enforcement guidance. These bulletins are dense, technical, and filled with cross-references to prior regulations, effective dates, and jurisdiction-specific requirements.
Today, compliance teams must manually read dozens of these documents, determine which changes apply to their organization, track deadlines, and assess operational impact. This process is slow, error-prone, and costly. Missing a single requirement can result in fines, audit findings, or reputational damage.
This tutorial shows how to transform this manual workflow into an AI-assisted experience by building an enterprise RAG (retrieval-augmented generation) application: a Regulatory Change Impact Assistant. This assistant can instantly retrieve, summarize, and reason over regulatory bulletins with audit-grade traceability.
RAG is a technique that enhances LLM responses by retrieving relevant information from a knowledge base before answers are generated. Unlike traditional chatbots that rely solely on their training data, RAG-based assistants can query documents that were added after the model was trained, reference specific sources (which is crucial for regulatory compliance), reduce hallucinations by grounding responses in actual document content, and maintain audit trails by tracking which documents informed each answer.
The IBM Granite family of foundation models provides state-of-the-art capabilities for both text generation and embeddings, which makes them a great fit for enterprise-grade RAG applications like this Regulatory Change Impact Assistant. The Granite models are optimized for regulatory and compliance tasks.
In this tutorial, you'll learn about:
How to build a RAG application with Quarkus
Integrating LangChain4j for AI/LLM capabilities
Using the IBM Granite family of foundation models locally through Ollama for privacy and enterprise model fidelity
Using pgvector for vector similarity search
Processing regulatory documents with Docling
Building REST APIs for document-based AI assistants
To learn more about using Quarkus, check out the Quarkus Basics learning path.
(Optional) Podman (or Docker), for Quarkus Dev Services
The application uses Quarkus Dev Services, which will automatically start Docling and PostgreSQL in containers. It can also start your model in a llama.cpp container. Some developers find it easier to use a native Ollama installation running locally, though.
rest-jackson: Provides RESTEasy Reactive framework with Jackson for JSON serialization. This extension enables building fast, non-blocking REST APIs that can handle concurrent requests efficiently.
langchain4j-pgvector: Integrates LangChain4j with PostgreSQL's pgvector extension. This extension allows storing and searching document embeddings directly in your database, eliminating the need for a separate vector database.
langchain4j-ollama: Connects LangChain4j to Ollama, enabling you to use local LLMs and embedding models. This extension keeps your data private and reduces API costs while maintaining enterprise-grade model quality.
io.quarkiverse.docling:quarkus-docling: Provides document processing capabilities through IBM's Docling service. Docling excels at extracting structured content from PDFs, DOCX files, and HTML while preserving document structure, tables, and metadata.
hibernate-orm-panache: Simplifies database operations with an active record pattern. While we don't use it extensively in this tutorial, it's included for potential future enhancements like storing document metadata or user queries.
jdbc-postgresql: PostgreSQL JDBC driver required for database connectivity. Quarkus Dev Services will automatically start a PostgreSQL container with pgvector enabled.
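One way to scaffold the project with these extensions is the Quarkus CLI. The command below is a sketch; the exact extension coordinates (particularly the Quarkiverse LangChain4j artifacts) may differ across versions, so verify them against the Quarkiverse catalog:
quarkus create app com.ibm:regulatory-change-assistant \
    --extension=rest-jackson,hibernate-orm-panache,jdbc-postgresql \
    --extension=io.quarkiverse.langchain4j:quarkus-langchain4j-pgvector \
    --extension=io.quarkiverse.langchain4j:quarkus-langchain4j-ollama \
    --extension=io.quarkiverse.docling:quarkus-docling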
After creating the Quarkus project, navigate to it:
cd regulatory-change-assistant
Step 2. Review the project structure
Over the course of this tutorial, you will build the following project structure.
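Based on the classes used throughout this tutorial, the layout looks roughly like this (the package placement of the loader, converter, and REST resource is an assumption):
src/main/
├── java/com/ibm/
│   ├── ai/
│   │   ├── RetrievalAugmentorSupplier.java
│   │   ├── CustomContentInjector.java
│   │   └── RegulatoryChangeImpactAssistant.java
│   ├── retrieval/
│   │   └── DocumentRetriever.java
│   ├── ingestion/
│   │   └── DocumentLoader.java
│   ├── processing/
│   │   └── DoclingConverter.java
│   └── rest/
│       └── RegulatoryChangeResource.java
└── resources/
    ├── application.properties
    └── documents/
Step 3. Configure the application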
Create or update src/main/resources/application.properties:
# Database configuration is optional in development. In production, you'd set the following properties:
# quarkus.datasource.db-kind=postgresql
# quarkus.datasource.username=quarkus
# quarkus.datasource.password=quarkus
# quarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/regulatory_rag

# Hibernate configuration
quarkus.hibernate-orm.schema-management.strategy=drop-and-create

# LangChain4j - Ollama configuration
# Quarkus auto-detects or starts Ollama for you. If you are running it on a different host/port, change it here:
# quarkus.langchain4j.ollama.base-url=http://localhost:11434
quarkus.langchain4j.ollama.chat-model.model-id=granite4:latest
quarkus.langchain4j.ollama.embedding-model.model-id=granite-embedding:latest
quarkus.langchain4j.ollama.timeout=PT60S

# Model temperature should be 0.1-0.3 for regulatory work
quarkus.langchain4j.ollama.chat-model.temperature=0.2

# LangChain4j - pgvector configuration
quarkus.langchain4j.pgvector.dimension=384

# Docling configuration
# Quarkus spins up Docling for you. If you are running it on a different host/port, change it here:
# quarkus.docling.service.url=http://localhost:8081
Code Walkthrough
Now, let's walk through the code for the Regulatory Change Impact Assistant.
Retrieval Augmentor
First, create the retrieval augmentor supplier that connects the document retriever to the AI service.
The RetrievalAugmentorSupplier:
Implements Supplier<RetrievalAugmentor> to provide the retrieval augmentor
Injects the DocumentRetriever to use for content retrieval
Builds a CustomContentInjector with a PromptTemplate that lays out the retrieved findings for the LLM.
package com.ibm.ai;

import static java.util.Arrays.asList;

import java.util.List;
import java.util.function.Supplier;

import com.ibm.retrieval.DocumentRetriever;

import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.injector.ContentInjector;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class RetrievalAugmentorSupplier implements Supplier<RetrievalAugmentor> {

    @Inject
    DocumentRetriever documentRetriever;

    @Override
    public RetrievalAugmentor get() {
        PromptTemplate promptTemplate = PromptTemplate.from(
                """
                {{userMessage}}

                Answer using the following information:
                {{contents}}

                When citing sources, use the Document Information provided with each content block.
                Format citations as: [Document: doc_id, Page: page_number]""");

        List<String> metadataKeys = asList(
                "doc_id",
                "page_number",
                "document_type",
                "file_name",
                "retrieval_method",
                "similarity_score",
                "retrieval_timestamp");

        ContentInjector contentInjector = new CustomContentInjector(promptTemplate, metadataKeys);

        return DefaultRetrievalAugmentor.builder()
                .contentRetriever(documentRetriever)
                .contentInjector(contentInjector)
                .build();
    }
}
Next, we need to create the custom ContentInjector that extends DefaultContentInjector to control metadata formatting. The PromptTemplate only receives {{userMessage}} and {{contents}}; metadata formatting happens in the format() method of the CustomContentInjector.
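The injector below is a minimal sketch; it assumes that DefaultContentInjector exposes a protected format(Content) hook and that Metadata.toMap() is available (both true in recent LangChain4j releases):
package com.ibm.ai;

import java.util.List;

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.rag.content.Content;
import dev.langchain4j.rag.content.injector.DefaultContentInjector;

public class CustomContentInjector extends DefaultContentInjector {

    private final List<String> metadataKeys;

    public CustomContentInjector(PromptTemplate promptTemplate, List<String> metadataKeys) {
        super(promptTemplate, metadataKeys);
        this.metadataKeys = metadataKeys;
    }

    @Override
    protected String format(Content content) {
        TextSegment segment = content.textSegment();
        // Emit a labeled "Document Information" header so the LLM can cite sources
        StringBuilder sb = new StringBuilder("Document Information:\n");
        for (String key : metadataKeys) {
            Object value = segment.metadata().toMap().get(key);
            if (value != null) {
                sb.append(key).append(": ").append(value).append('\n');
            }
        }
        return sb.append("Content:\n").append(segment.text()).toString();
    }
}
With retrieval and injection in place, register the AI service that ties the system prompt to the retrieval augmentor. The RegulatoryChangeImpactAssistant interface is all that's needed; Quarkus generates the implementation.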
package com.ibm.ai;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService(retrievalAugmentor = RetrievalAugmentorSupplier.class)
public interface RegulatoryChangeImpactAssistant {

    @SystemMessage("""
            You are a specialized Regulatory Change Impact Assistant for financial institutions.

            Your responsibilities:
            - Answer questions about regulatory changes and compliance requirements from Regulatory Change Bulletins
            - Analyze the impact of regulatory bulletins on business processes and operations
            - Provide guidance on compliance obligations, deadlines, and required actions
            - Explain complex regulatory language in clear, actionable terms
            - Identify cross-references to other regulations and related requirements

            Response format:
            1. Direct answer with regulatory context
            2. Supporting evidence from regulatory bulletins (with citations including document ID and page number)
            3. Impact assessment and compliance recommendations
            4. Relevant deadlines or effective dates if mentioned

            Important guidelines:
            - Always cite your sources using the format: [Document: doc_id, Page: page_number]
            - If you cannot find relevant information in the provided bulletins, clearly state that
            - Refuse questions about topics outside regulatory compliance (e.g., general business advice, product recommendations)
            - Be precise about regulatory requirements and avoid speculation
            - Highlight any jurisdiction-specific requirements or exceptions
            """)
    String chat(@UserMessage String userQuestion);
}
Why page-level segmentation? Regulatory bulletins often reference specific pages, and compliance teams need to verify information at the source. Page-level granularity enables precise citations while keeping context intact.
Why the similarity score threshold of 0.7? This threshold balances relevance with recall. Lower values (0.5-0.6) return more results but might include irrelevant content. Higher values (0.8-0.9) ensure high precision but might miss relevant documents. For regulatory work, precision is critical, so 0.7 provides a good balance.
Why max 5 results? Regulatory questions often benefit from multiple perspectives, but too many results can confuse the LLM or include irrelevant information. Five segments typically provide sufficient context without overwhelming the model's context window.
Why enrich metadata at retrieval time? Adding retrieval metadata (similarity score, timestamp) at query time rather than ingestion time allows you to track how well each query performed and enables future analytics on retrieval quality.
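The full DocumentRetriever is not reproduced here; the following is a minimal sketch consistent with the design notes above, assuming LangChain4j's EmbeddingStore and EmbeddingModel APIs (Metadata.put is available in recent versions):
package com.ibm.retrieval;

import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.Content;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.query.Query;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.EmbeddingStore;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class DocumentRetriever implements ContentRetriever {

    static final int MAX_RESULTS = 5;
    static final double MIN_SCORE = 0.7;

    @Inject
    EmbeddingStore<TextSegment> embeddingStore;

    @Inject
    EmbeddingModel embeddingModel;

    @Override
    public List<Content> retrieve(Query query) {
        // Embed the user question with the same model used at ingestion time
        Embedding queryEmbedding = embeddingModel.embed(query.text()).content();

        EmbeddingSearchResult<TextSegment> result = embeddingStore.search(
                EmbeddingSearchRequest.builder()
                        .queryEmbedding(queryEmbedding)
                        .maxResults(MAX_RESULTS)
                        .minScore(MIN_SCORE)
                        .build());

        // Enrich each match with retrieval metadata for auditability
        return result.matches().stream()
                .map(match -> {
                    TextSegment segment = match.embedded();
                    segment.metadata().put("retrieval_method", "vector_similarity");
                    segment.metadata().put("similarity_score", String.valueOf(match.score()));
                    segment.metadata().put("retrieval_timestamp", Instant.now().toString());
                    return Content.from(segment);
                })
                .collect(Collectors.toList());
    }
}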
Exercise 3
Modify the MAX_RESULTS and MIN_SCORE constants to see how they affect retrieval quality.
Document ingestion
The DocumentLoader (see the sketch after this list):
Runs a @PostConstruct method at application startup, after dependency injection
Processes all Regulatory Change Bulletins in src/main/resources/documents
Supports PDF, DOCX, and HTML file formats (filtered by extension)
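Putting those points together, here is a minimal sketch of the loader. The package names, the DoclingConverter injection point, and the @Startup annotation (which forces eager instantiation so @PostConstruct runs at boot) are assumptions; extractPages appears later in this tutorial:
package com.ibm.ingestion; // hypothetical package

import java.io.File;
import java.io.IOException;
import java.util.List;

import com.ibm.processing.DoclingConverter; // hypothetical package

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import io.quarkus.logging.Log;
import io.quarkus.runtime.Startup;
import jakarta.annotation.PostConstruct;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@Startup // eager instantiation so @PostConstruct runs at boot
@ApplicationScoped
public class DocumentLoader {

    private static final List<String> SUPPORTED = List.of(".pdf", ".docx", ".html");

    @Inject
    DoclingConverter converter;

    @Inject
    EmbeddingModel embeddingModel;

    @Inject
    EmbeddingStore<TextSegment> embeddingStore;

    @PostConstruct
    void loadDocuments() {
        File[] files = new File("src/main/resources/documents")
                .listFiles(f -> SUPPORTED.stream()
                        .anyMatch(ext -> f.getName().toLowerCase().endsWith(ext)));
        if (files == null) {
            return;
        }
        for (File file : files) {
            try {
                // Page-level segments preserve citation granularity
                List<TextSegment> segments = converter.extractPages(file);
                List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
                embeddingStore.addAll(embeddings, segments);
            } catch (IOException e) {
                Log.errorf(e, "Failed to ingest %s", file.getName());
            }
        }
    }
}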
Now that we understand the code for the demo application, let's run it.
1. Prepare the documents to ingest
You can use virtually any PDF, DOCX, or HTML files. Because the prompt is designed to be a regulatory change bulletin answer machine, it makes sense to use real data. For the demo, we have picked three examples. Download them and place them in src/main/resources/documents.
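2. Run the application
With the documents in place, start dev mode and send a test question. The endpoint path below is a placeholder; use whichever path your RegulatoryChangeResource exposes:
./mvnw quarkus:dev
# In a second terminal (the /regulatory/chat path is a placeholder):
curl -X POST http://localhost:8080/regulatory/chat \
  -H "Content-Type: text/plain" \
  -d "What are the new disclosure requirements and when do they take effect?"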
When you test the application, here's what to look for:
Good responses should:
Directly answer the question with regulatory context
Include citations in the format [Document: filename.pdf, Page: X]
Reference specific requirements, deadlines, or obligations
Acknowledge when information isn't found in the bulletins
Provide actionable compliance guidance
Signs of quality retrieval:
Citations match the question's topic
Multiple relevant documents are cited (if applicable)
Page numbers are accurate and the content is relevant
The LLM synthesizes information rather than just copying text
Common Issues:
No citations: Check whether documents were loaded successfully
Irrelevant citations: The similarity threshold might be too low
Generic answers: The retrieved content might lack specificity
Missing information: The question might require documents that were not yet ingested
Testing Strategy:
Start with broad questions to test general retrieval.
Progress to specific regulatory requirements.
Test edge cases (questions about topics not in documents).
Verify citation accuracy by checking source documents.
Test with different document types (PDF, DOCX, HTML).
Best practices when building RAG-based assistants
Building an effective RAG-based assistant requires careful attention to document quality, query design, and system performance. The following best practices will help you optimize your assistant's accuracy, reliability, and user experience. These guidelines cover the entire lifecycle, from document preparation through query optimization and performance tuning.
Document preparation
Proper document preparation is the foundation of accurate retrieval. Well-organized, high-quality documents with descriptive naming conventions ensure that your RAG system can effectively index and retrieve relevant information.
File naming:
Use descriptive filenames: SEC-2024-01-disclosure-requirements.pdf
Include dates: FINRA-2024-03-15-trading-rules.pdf
Avoid special characters that might cause issues
Document quality:
Use text-based PDFs (not scanned images) when possible
Ensure that documents are complete and not corrupted
Remove password protection before ingestion
For HTML files, ensure they're well formed
Organization:
Keep documents in src/main/resources/documents
Organize by regulator or date if needed
Remove outdated documents to avoid confusion
Query optimization
The quality of responses from your RAG assistant depends heavily on how questions are formulated. Well-crafted queries that are specific, contextual, and appropriately scoped will yield more accurate and actionable results.
Effective questions:
Be specific: "What are the SEC disclosure requirements for Q1 2024?" vs. "Tell me about SEC"
Include context: "What data retention requirements apply to broker-dealers?"
Ask for impact: "How does the new FINRA rule affect existing compliance procedures?"
Avoid:
Overly broad questions that return too many results
Questions about topics not in your document set
Multi-part questions (ask one at a time for better results)
Performance considerations
Understanding the performance characteristics of your RAG system helps you optimize resource usage and response times. Consider these factors when you scale your application or work with large document collections.
Embedding generation:
Document processing happens at startup, so the first load takes time
Consider processing documents in batches for large collections
Monitor memory usage with many large documents
Query performance:
Vector search is fast with proper indexing
Response time depends on LLM generation speed
Consider caching frequent queries in production
Database optimization:
pgvector indexes (HNSW) are created automatically
Monitor index size as document count grows
Consider archiving old documents to maintain performance
Extending the application
The basic RAG assistant provides a solid foundation, but you can enhance it with additional features to better serve specific use cases. The following extensions demonstrate how to add document filtering, improve metadata handling, customize AI behavior, implement streaming responses, optimize performance through caching, and add observability. Each extension builds on the core architecture while maintaining the application's modularity and maintainability. Think of them as advanced exercises that you could do if you want to push the boundaries of the basic application.
Add document type filtering
Allow users to narrow their searches to specific document types (for example, SEC bulletins, FINRA notices, internal policies). This improves retrieval precision when dealing with diverse document collections.
Enhance DocumentRetriever to filter by document type:
@ApplicationScoped
public class DocumentRetriever implements ContentRetriever {

    @Override
    public List<Content> retrieve(Query query) {
        // Extract document type from query if specified
        String docType = extractDocumentType(query.text());

        EmbeddingSearchRequest.Builder requestBuilder = EmbeddingSearchRequest.builder()
                .queryEmbedding(queryEmbedding)
                .maxResults(5)
                .minScore(0.7);

        // Add a metadata filter if a document type was specified.
        // metadataKey is statically imported from
        // dev.langchain4j.store.embedding.filter.MetadataFilterBuilder
        if (docType != null) {
            requestBuilder.filter(metadataKey("document_type").isEqualTo(docType));
        }

        // Perform search...
    }
}
Add regulation type metadata
Enrich document metadata with regulation classifications (e.g., "data privacy", "trading rules", "disclosure requirements"). This enables more sophisticated filtering and helps users find relevant regulatory guidance faster.
Enhance DoclingConverter to extract regulation types:
public List<TextSegment> extractPages(File sourceFile) throws IOException {
    // Extract regulation type from document
    String regulationType = extractRegulationType(sourceFile);

    // Create TextSegments with regulation metadata
    return pageTextMap.entrySet().stream()
            .map(entry -> {
                int pageNumber = entry.getKey();
                String text = entry.getValue();
                Metadata metadata = Metadata.from(Map.of(
                        "doc_id", fileName,
                        "page_number", String.valueOf(pageNumber),
                        "regulation_type", regulationType));
                return TextSegment.from(text, metadata);
            })
            .collect(Collectors.toList());
}
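The extractRegulationType helper is not shown above; a simple keyword-based classification could look like this (the categories and matching rules are illustrative examples):
// Hypothetical helper: classify the bulletin by filename keywords
private String extractRegulationType(File sourceFile) {
    String name = sourceFile.getName().toLowerCase();
    if (name.contains("privacy")) {
        return "data privacy";
    }
    if (name.contains("trading")) {
        return "trading rules";
    }
    if (name.contains("disclosure")) {
        return "disclosure requirements";
    }
    return "general";
}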
Customize the system prompt
Tailor the AI assistant's behavior, tone, and response format to match your organization's specific needs and compliance requirements. A well-crafted system prompt ensures consistent, professional responses aligned with your business context.
Edit RegulatoryChangeImpactAssistant.java:
@SystemMessage("""
You are a specialized Regulatory Change Impact Assistant for [YOUR ORGANIZATION].
Your responsibilities:
- Answer questions about regulatory changes and compliance requirements
- Analyze the impact of regulatory bulletins on business processes
- Provide guidance on compliance obligations
- Refuse questions about topics outside regulatory compliance
Response format:
1. Direct answer with regulatory context
2. Supporting evidence from regulatory bulletins (with citations)
3. Impact assessment and compliance recommendations
""")
Add response streaming
Implement real-time response streaming to provide immediate feedback to users as the AI generates answers. This significantly improves perceived performance and user experience, especially for complex queries that require longer processing times.
Modify RegulatoryChangeResource to support Server-Sent Events:
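A minimal sketch follows, assuming Quarkus REST's Server-Sent Events support and a streaming variant of the AI service method returning Mutiny's Multi (which quarkus-langchain4j supports). The package and the /regulatory path are placeholders:
package com.ibm.rest; // hypothetical package

import org.jboss.resteasy.reactive.RestStreamElementType;

import com.ibm.ai.RegulatoryChangeImpactAssistant;

import io.smallrye.mutiny.Multi;
import jakarta.inject.Inject;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/regulatory") // placeholder path
public class RegulatoryChangeResource {

    @Inject
    RegulatoryChangeImpactAssistant assistant;

    // Assumes the AI service declares a streaming variant, e.g.:
    // Multi<String> chatStreaming(@UserMessage String userQuestion);
    @POST
    @Path("/chat/stream")
    @Produces(MediaType.SERVER_SENT_EVENTS)
    @RestStreamElementType(MediaType.TEXT_PLAIN)
    public Multi<String> chatStream(String question) {
        // Each emitted token is sent to the client as an SSE event
        return assistant.chatStreaming(question);
    }
}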
Add metrics tracking
Gain visibility into your RAG system's performance with metrics tracking. Monitor retrieval times, embedding generation speed, and query patterns to identify bottlenecks and optimize system performance.
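For example, assuming the quarkus-micrometer extension is added, you could time each vector search inside DocumentRetriever. The metric name and the doRetrieve helper (the existing retrieval logic extracted into a method) are illustrative:
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

// Inside DocumentRetriever:
@Inject
MeterRegistry registry;

@Override
public List<Content> retrieve(Query query) {
    Timer.Sample sample = Timer.start(registry);
    try {
        return doRetrieve(query); // the existing retrieval logic, extracted into a helper
    } finally {
        // Records retrieval latency under an illustrative metric name
        sample.stop(registry.timer("rag.retrieval.duration"));
    }
}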
Add query analytics
Build intelligence about user behavior and system effectiveness by logging queries and their results. This data helps identify knowledge gaps, improve document coverage, and refine retrieval strategies over time.
Track which questions are asked and how well the system responds:
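A minimal sketch using the Panache active record pattern already included in the project (the QueryLog entity and its fields are hypothetical):
import java.time.Instant;

import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Column;
import jakarta.persistence.Entity;

@Entity
public class QueryLog extends PanacheEntity {

    @Column(length = 4096)
    public String question;

    public int segmentsRetrieved;     // how many segments the retriever returned
    public double topSimilarityScore; // best match score for this query
    public Instant askedAt;
}
Persist one row per request inside a @Transactional method of your resource, then query the table to spot questions that consistently retrieve few or low-scoring segments.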
Track document lifecycle
Ensure compliance accuracy by tracking document lifecycle events. Automatically identify outdated regulations and prevent the assistant from citing expired or superseded bulletins in its responses.
Track when regulatory bulletins expire or are superseded:
@Entity
public class RegulatoryBulletin extends PanacheEntity {

    public String docId;
    public LocalDate effectiveDate;
    public LocalDate expirationDate;
    public boolean superseded;
    public String supersededBy;

    public static List<RegulatoryBulletin> findActive(LocalDate date) {
        return find("effectiveDate <= ?1 AND (expirationDate IS NULL OR expirationDate >= ?1) AND superseded = false", date).list();
    }
}
Implement query rewriting
Enhance retrieval recall by automatically expanding user queries with domain-specific synonyms and related terminology. This helps capture relevant documents even when users phrase questions differently than the source material.
Improve retrieval by expanding queries with synonyms or related terms:
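LangChain4j's DefaultRetrievalAugmentor accepts a QueryTransformer, which is a natural hook for this. A minimal sketch with a hand-rolled synonym table (the glossary entries are examples); wire it in with DefaultRetrievalAugmentor.builder().queryTransformer(...) in the RetrievalAugmentorSupplier:
package com.ibm.retrieval; // hypothetical package

import java.util.Collection;
import java.util.List;
import java.util.Map;

import dev.langchain4j.rag.query.Query;
import dev.langchain4j.rag.query.transformer.QueryTransformer;

public class SynonymQueryTransformer implements QueryTransformer {

    // Hypothetical domain glossary; extend with your own terminology
    private static final Map<String, String> SYNONYMS = Map.of(
            "broker-dealer", "brokerage firm",
            "disclosure", "reporting obligation");

    @Override
    public Collection<Query> transform(Query query) {
        String expanded = query.text();
        // Append the related term so the embedding also captures it
        for (Map.Entry<String, String> e : SYNONYMS.entrySet()) {
            if (expanded.toLowerCase().contains(e.getKey())) {
                expanded = expanded + " (" + e.getValue() + ")";
            }
        }
        return List.of(Query.from(expanded));
    }
}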