Enhance user experiences with a multiturn conversational chatbot

Archived content

Archive date: 2025-07-16

This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.

IBM watsonx Assistant is a conversational AI platform that is designed to help deliver exceptional customer support experiences. It’s powered by large language models (LLMs), has an intuitive user interface, and helps you build AI-powered voice agents and chatbots.

Currently, one challenge in using watsonx Assistant is that each interaction is independent: the chatbot does not retain any information from previous exchanges, so it behaves as a stateless chatbot. However, you can enhance your watsonx Assistant to work as a multiturn chatbot.

There are two types of chatbots:

Single-turn (stateless) chatbots: Handle one question at a time without remembering anything from previous interactions.

Example:

  User: "What's the weather in New York?"  
  Bot: "The weather in New York is 75°F and sunny."  
  User: "How about tomorrow?"  
  Bot: "I don't understand. Can you please specify a city?"

Multiturn (stateful) chatbots: Remember and use information from earlier parts of a conversation to provide more relevant and coherent responses.

Example:

 User: "What's the weather in New York?"  
 Bot: "The weather in New York is 75°F and sunny."  
 User: "How about tomorrow?"  
 Bot: "Tomorrow, it will be 70°F and partly cloudy in New York."
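
The difference between the two styles comes down to whether anything is carried from one turn to the next. The following toy sketch illustrates the idea with a hypothetical FORECASTS lookup table and two hypothetical handlers. None of these names come from watsonx Assistant; they only mirror the example conversations above.

```python
# Toy illustration only: FORECASTS, stateless_reply, and StatefulBot are
# hypothetical names, not part of watsonx Assistant.
FORECASTS = {
    ("new york", "today"): "75°F and sunny",
    ("new york", "tomorrow"): "70°F and partly cloudy",
}
CITIES = {"new york"}
NO_CITY = "I don't understand. Can you please specify a city?"


def find_city(text):
    """Return the first known city mentioned in the text, or None."""
    lowered = text.lower()
    return next((city for city in CITIES if city in lowered), None)


def find_day(text):
    """Return 'tomorrow' if the text mentions it, otherwise 'today'."""
    return "tomorrow" if "tomorrow" in text.lower() else "today"


def stateless_reply(text):
    """Answer by using only the current message."""
    city = find_city(text)
    if city is None:
        return NO_CITY
    return f"{FORECASTS[(city, find_day(text))]} in {city.title()}"


class StatefulBot:
    """Remember the last city so that follow-up questions can omit it."""

    def __init__(self):
        self.last_city = None

    def reply(self, text):
        city = find_city(text) or self.last_city
        if city is None:
            return NO_CITY
        self.last_city = city  # carried over to the next turn
        return f"{FORECASTS[(city, find_day(text))]} in {city.title()}"
```

The only difference between the two handlers is the last_city attribute, which plays the role that the session variables play in the rest of this article.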

This article covers the implementation of stateful question and answer capabilities by using watsonx Assistant. The article contains ready-to-use code examples, prompts, and JSON import files that contain watsonx Assistant action skills.

Flow diagram

Overview

This implementation is provided fully within this JSON file. The article covers each section of the file. If you would like to try this process on your own, you need:

An IBM Cloud account for using the watsonx platform
A published instance of watsonx Assistant
A published instance of Watson Discovery
watsonx Assistant connected to Watson Discovery and watsonx.ai

The article explains how to retrieve relevant documents from Watson Discovery in a stateful manner and to make the watsonx Assistant stateful just using action skills. This process uses:

watsonx Assistant for its conversational capabilities, integration features, and security protocols
Watson Discovery as a knowledge base and search engine
The watsonx.ai "ibm/granite-13b-chat-v2" model because it follows instructions closely and hallucinates less

Step-by-step implementation and explanation

If you’d like to follow along with the article, you must follow the setup instructions in the Single turn RAG tutorial guide to connect your assistant to Watson Discovery and watsonx through custom extensions.

After you’ve imported the custom extensions, integrated them with watsonx Assistant, and uploaded the Action Skill JSON file, you see the following action skills:

Action skill

The rest of this article explains the following action skills. You can see how these are implemented in the associated JSON file. The article explains the various sections in this file and whether changes are needed if you’d like to implement this on your own.

Search captures the user query in query_text, takes Watson Discovery extension parameters, and makes a call to Watson Discovery to retrieve the documents, which are stored in search_results.
Generate answer concatenates the results, prepares the prompt, and handles the model_response.
Invoke watsonx generation API takes the extension parameters and makes a call to watsonx.ai for generation.

Action 1: Search

This is the first action, where you pass a user query to Watson Discovery and fetch the relevant documents.

Search: Step 1 to Step 3

If you are following on your own, no changes are needed to this section. These steps add the user input value to a variable query_text.

Search: Step 4

In this step, you save the original user query for later use.

Search: Step 5

In this step, you use three variables, some of which are defined in later steps:

Variable 1 - Context_var: This is the title of the document that is retrieved in the previous round of the conversation.
Variable 2 - Res: This is the LLM response from the previous round of the conversation.
Variable 3 - query_text: This is the user query from the current round of the conversation.

You can choose to reset the Context_var only once, in every round, or by condition.
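
One way to picture this step is as a function that folds the previous round's document title and answer into the search query. The sketch below is an assumption about how the three variables might be combined; the actual expression lives in the action skill JSON file. The function name build_search_query and the reset_context and max_answer_chars parameters are hypothetical.

```python
def build_search_query(query_text, context_var="", res="",
                       reset_context=False, max_answer_chars=200):
    """Combine the previous title and answer with the current user query.

    Assumption: a plain space-joined concatenation, with the previous
    answer truncated so that it does not dominate the search query.
    """
    if reset_context:
        # Drop the carried-over context, for example when the topic changes.
        context_var, res = "", ""
    parts = [
        context_var.strip(),
        res.strip()[:max_answer_chars],
        query_text.strip(),
    ]
    return " ".join(part for part in parts if part)
```

For example, with a Context_var of "New York weather" and a follow-up of "How about tomorrow?", the search query sent to Watson Discovery would mention New York even though the user did not repeat it.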

Search: Step 7

In this step, you define the Context_var, as shown in the following image.

Search: Step 8

In this section, no changes are needed. With the Watson Discovery extension, you can set the parameters and pass the query_text to Watson Discovery.

Search: Step 9

Reassign the original user query now that you have the relevant documents from Watson Discovery. The original user query, which was saved in Initial_query, is assigned back to query_text.
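
Taken together, the Search steps follow a save, rewrite, search, restore pattern. The following is a minimal sketch, assuming a plain dictionary that stands in for the assistant's session variables, a hypothetical search_fn callable that stands in for the Watson Discovery extension call, and a hypothetical rewrite_fn callable that adds the conversation context to the query.

```python
def run_search(session, search_fn, rewrite_fn):
    """Search with a context-enriched query, then restore the user's wording.

    session    -- dict standing in for the assistant's session variables
    search_fn  -- hypothetical stand-in for the Watson Discovery call
    rewrite_fn -- hypothetical function that adds context to the query
    """
    # Save the user's original wording for later use.
    session["Initial_query"] = session["query_text"]
    # Rewrite the query so that it carries context from earlier rounds.
    session["query_text"] = rewrite_fn(session["query_text"])
    # Retrieve documents by using the enriched query.
    session["search_results"] = search_fn(session["query_text"])
    # Restore the original query so that the LLM sees what the user asked.
    session["query_text"] = session["Initial_query"]
    return session
```

The restore at the end matters because the enriched query is useful for retrieval but would read oddly if it were shown to the model as the user's question.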

Action 2: Generate answer

Generate answer: Step 1 to Step 4

No changes are needed for these steps. Step 1 to Step 4 concatenate the documents that Watson Discovery has returned.
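
Conceptually, these steps flatten the list of returned documents into one block of text for the prompt. The following is a minimal sketch, assuming each result is a dictionary with hypothetical title and text keys. The real field names depend on your Watson Discovery collection and extension configuration, and max_docs is an illustrative parameter.

```python
def concatenate_results(search_results, max_docs=3, separator="\n\n"):
    """Join the top documents into a single grounding passage."""
    passages = []
    for doc in search_results[:max_docs]:
        title = str(doc.get("title", "")).strip()
        text = str(doc.get("text", "")).strip()
        if not text:
            continue  # skip empty hits rather than pad the prompt
        passages.append(f"{title}\n{text}" if title else text)
    return separator.join(passages)
```

Capping the number of documents keeps the prompt within the model's context window, at the cost of possibly dropping a relevant lower-ranked result.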

Generate answer: Step 5

This is a stateful conversation between the user and the assistant, and it requires appropriate tags for it to function properly.

In this step, you are assigning the tags of "ibm/granite-13b-chat-v2" because that Granite model is used here. The following code shows the tag format for it:

<|user|>: The student
<|assistant|>: The counsellor

This example uses two variables. (The Initial_q_res variable is defined later.)

Variable 1 - Summary: This is the summary of the conversation between the user and the assistant, or
Variable 1 - Initial_q_res: This is the full conversation between the user and the assistant.
Variable 2 - query_text: This is the user query.
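
As an illustration of how the tagged conversation might be assembled, the sketch below prefers the summary when one exists, otherwise falls back to the full history, and then appends the new user query. The function name and the fallback rule are assumptions; only the user tag comes from the format listed above.

```python
USER_TAG = "<|user|>"


def build_conversation(query_text, summary="", initial_q_res=""):
    """Return the prior context followed by the new user turn.

    Assumption: the summary, when present, replaces the full history.
    """
    prior = summary.strip() or initial_q_res.strip()
    turn = f"{USER_TAG}\n{query_text.strip()}"
    return f"{prior}\n{turn}" if prior else turn
```

In the first round both Summary and Initial_q_res are empty, so the result is just the tagged user query.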

Generate answer: Step 6

In this step, you define the model input. Make sure that you use the tags of the model that you are using.

This example uses "ibm/granite-13b-chat-v2". The following code shows the tag format for it:

<|system|>: The standard operating instructions
<|user|>: The instructions for the model to follow
[Document]/[End]: The document grounding
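
To make the layout concrete, the following sketch assembles a model input from these parts. The ordering of the sections and the function and parameter names are assumptions for illustration; compare them with the prompt in the JSON file before you reuse them.

```python
def build_model_input(system_text, instructions, documents, question):
    """Assemble a Granite chat prompt with a grounded document section.

    Assumed layout: system block, then a user block that holds the
    instructions, the [Document]...[End] grounding, and the question,
    followed by an open assistant tag for the model to complete.
    """
    return "\n".join([
        "<|system|>",
        system_text.strip(),
        "<|user|>",
        instructions.strip(),
        "[Document]",
        documents.strip(),
        "[End]",
        question.strip(),
        "<|assistant|>",
        "",  # trailing newline after the assistant tag
    ])
```

Ending the prompt with the assistant tag signals to the model that its answer should start at that point.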

Generate answer: Step 7 to Step 10

In this section, no changes are needed. These steps invoke the watsonx.ai extension for the generation task and handle the errors, if encountered.
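
For orientation, the extension call boils down to sending a payload with the model ID and the prompt, then reading the generated text or falling back to an error message. The sketch below only builds and parses dictionaries and makes no network call. The field names are assumed to mirror the watsonx.ai text generation REST API, so check them against the OpenAPI file of your extension. The parameter values, the FALLBACK message, and the function names are illustrative.

```python
FALLBACK = "Sorry, I could not generate an answer. Please try again."


def build_generation_payload(model_input, project_id,
                             model_id="ibm/granite-13b-chat-v2"):
    """Build the request body for a text generation call (assumed shape)."""
    return {
        "model_id": model_id,
        "project_id": project_id,
        "input": model_input,
        "parameters": {  # illustrative values, not taken from the article
            "decoding_method": "greedy",
            "max_new_tokens": 300,
            "repetition_penalty": 1.05,
        },
    }


def extract_model_response(response, success=True):
    """Return the generated text, or a fallback message on any failure."""
    if not success or not isinstance(response, dict):
        return FALLBACK
    results = response.get("results") or []
    text = results[0].get("generated_text", "") if results else ""
    return text.strip() or FALLBACK
```

Returning a fixed fallback message keeps the conversation going when the call fails, which is the role of the error-handling steps in the action.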

Generate answer: Step 11

In this step, you define the Initial_q_res variable, which contains the conversation history between the user and the assistant from the very beginning.

You also define the Res variable here, which contains the response from the LLM.
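
The bookkeeping in this step can be sketched as appending the latest exchange to a running history. The function name record_turn and the exact string layout are assumptions; the tags follow the Granite format described earlier.

```python
def record_turn(initial_q_res, query_text, model_response):
    """Append the latest exchange to the history and return (history, res)."""
    res = model_response.strip()
    turn = f"<|user|>\n{query_text.strip()}\n<|assistant|>\n{res}"
    history = f"{initial_q_res}\n{turn}" if initial_q_res else turn
    return history, res
```

Because the history grows with every round, it can eventually exceed the model's context window, which is one reason to summarize it.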

Generate answer: Step 12 and Step 13

In these steps, you make another LLM call to summarize the conversation between the user and the assistant.
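
A summarization prompt for this second call might look like the following sketch. The instruction wording, the max_sentences parameter, and the choice to replace the chat tags in the transcript with plain labels (so that they are not confused with the tags of the summarization prompt itself) are all assumptions.

```python
def build_summary_prompt(initial_q_res, max_sentences=3):
    """Build a prompt that asks the model to summarize the conversation."""
    instructions = (
        f"Summarize the conversation below in at most {max_sentences} "
        "sentences. Keep names, topics, and any facts the user may "
        "refer to later."
    )
    # Use plain labels inside the transcript to avoid nested chat tags.
    transcript = (
        initial_q_res.replace("<|user|>", "User:")
        .replace("<|assistant|>", "Assistant:")
        .strip()
    )
    return "\n".join([
        "<|system|>",
        instructions,
        "<|user|>",
        transcript,
        "<|assistant|>",
        "",  # trailing newline after the assistant tag
    ])
```

The model's reply to this prompt would be stored in the Summary variable and used in place of the full history in the next round.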

Final result

final result

Summary

This article walked through, step by step, a process for turning a single-turn chatbot into a multiturn chatbot by using watsonx Assistant and Watson Discovery. It explored how to retrieve relevant documents from Watson Discovery in a stateful manner and how to make watsonx Assistant stateful by using only action skills.