This is a cache of https://developer.ibm.com/tutorials/awb-governing-private-llm-deployments-on-ibm-cloud/. It is a snapshot of the page as it appeared on 2026-02-02T13:57:25.592+0000.
Governing private LLM deployments hosted on IBM Cloud Virtual Servers using watsonx.governance - IBM Developer
This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.
The need for organizations to maintain privacy and run large language models (LLMs) in controlled environments is increasing. One way to achieve this is to host models on cloud servers and manage compute, network, and storage resources independently, rather than relying on Platform as a Service (PaaS) offerings. IBM Cloud offers several GPU profiles that can be used for training and inferencing LLMs. Once these models are hosted privately, effective governance can be achieved through watsonx.governance, allowing organizations to evaluate prompts while maintaining the privacy of their hosted models.
watsonx.governance overview
IBM watsonx.governance is designed to help organizations direct, manage, and monitor their artificial intelligence (AI) activities effectively. With watsonx.governance, you can:
Govern generative AI and machine learning (ML) models from any vendor.
Evaluate and monitor model health, accuracy, drift, bias, and generative AI quality.
Access governance, risk, and compliance capabilities, including workflows with approvals, customizable dashboards, risk scorecards, and reports.
Use factsheet capabilities to automatically collect and document model metadata across the AI model lifecycle.
watsonx.governance's compliance features enable you to manage AI in alignment with compliance policies. Its risk management capabilities allow you to proactively detect and mitigate risks, such as fairness, bias, and drift. The lifecycle governance features help you manage, monitor, and govern AI models from IBM, open-source communities, and other model providers.
In this tutorial, you'll learn how to monitor private endpoints using watsonx.governance. We will deploy the LLaMA 7B and Mistral 7B models from Hugging Face on a Virtual Server Instance (VSI) on IBM Cloud and demonstrate how to configure their private endpoints for monitoring with watsonx.governance, with a focus on real-time monitoring of privately deployed LLMs.
Prerequisites
To follow this tutorial, you will need an IBM Cloud account.
Estimated time
This tutorial should take approximately 20-30 minutes to complete.
Steps
There are three main steps to governing private LLM deployments:
Set up services on IBM Cloud to run watsonx.
Deploy an LLM on a Virtual Server Instance (VSI) on IBM Cloud.
Monitor detached prompt templates using watsonx.governance.
Set up services on IBM Cloud to run watsonx
Step 1: Set up your IBM Tech Zone environment
To begin, you can reserve an IBM Cloud instance from any collection on TechZone.
If you are enrolled in a bootcamp, you will likely be working in a shared instance. You should have received an email from noreply@techzone.ibm.com with the subject line “[EXTERNAL] A reservation has been shared with you on IBM Technology Zone.”
Accept the invitation to the instance.
Join the instance by clicking the HERE link in the email where it says, “Please go HERE to accept your invitation” or by clicking Join now as shown in the image below.
Once the page displays, check the box as indicated in the image, then click Join Account.
Step 2: Provision a Watson Studio instance
Navigate to the IBM Cloud catalog to provision an instance of IBM Watson Studio.
Enter Watson Studio in the search bar.
Select the region Dallas and choose the Lite plan.
Specify a unique instance name, add your name to the tags, and click Create.
Once the service is successfully provisioned, launch the service with watsonx.
Step 3: Create a new Project
To create a new project:
In the Projects section at the bottom of the page, click the + symbol to create a new project.
Enter a unique name for your project, including both your first and last name, along with any other relevant information.
Associate Cloud Object Storage:
If you do not already have a Cloud Object Storage (COS) instance, click on the link and create a new instance.
If a Cloud Object Storage (COS) instance is already selected for you (with a name starting with itzcos-...), you do not need to make any changes.
If prompted to select from multiple instances, please consult with your bootcamp lead to choose the correct COS instance.
Click Create. It may take a few seconds for the project to be officially created.
Step 4: Associate the correct Watson Machine Learning (WML) instance
After your project is created, you will be directed to the project home page.
Select the Manage tab.
In the left sidebar, click on Services and Integrations, then click on Associate service.
Select the service with Type = Watson Machine Learning and click Associate.
Note: If you cannot find the service, try removing all filters from the Locations dropdown list.
Step 5: Create an instance of watsonx.governance
Click on the Navigation menu and select Services catalog under Administration.
In the service catalog, select watsonx.governance.
Specify the required details and click Create.
Step 6: Verify that an inventory is created
Now that you have created a project, you’ll need to create an inventory for the next part of the tutorial.
Navigate to the watsonx home page, and in the left sidebar, expand the AI Governance dropdown, then select AI use cases.
If you are directed to a page that says No inventories available yet, click Manage inventories and create a new one.
If you see some use cases listed on the page, an inventory has already been created, and you are all set. You can proceed with the next part of the tutorial.
Note: If you want to create a fresh inventory for the bootcamp, click the gear icon at the top, navigate to Inventories, and follow the same instructions as above.
When creating your inventory, be sure to give it a unique name and select the correct Cloud Object Storage (COS) instance.
With the project and inventory set up, you are now ready to deploy an LLM on a virtual server instance on IBM Cloud.
Deploy an LLM on a Virtual Server Instance on IBM Cloud
Set up the environment
In this tutorial, you will use IBM Cloud to provision and set up a Virtual Server Instance (VSI). When you provision the VSI, the following resources are created automatically:
A Virtual Private Cloud (VPC) to which the instance is attached
A security group governing the VPC
A Floating IP in the same region to expose your application to the internet
Step 1: Provision a Virtual Server Instance for VPC
Go to the IBM Cloud Catalog and search for Virtual Server for VPC.
Choose the following configuration options and click Create:
Image: Select ibm-ubuntu-22-04-4-minimal-amd64-1.
Profile: Select a balanced profile with 8 vCPU and 32 GB RAM.
Specify a name for your instance.
Generate an SSH key specific to the system you will use to log in. Click Create SSH Key, name the key, and then click Create. The key will be automatically generated and downloaded for you.
For Networking, choose Virtual Network Interface and let one be created for you. Click Create VPC to create a VPC network.
After setting up the configurations, click on Create. Wait for the provisioning process to complete.
Step 2: Set up the networking
Before you can access the cluster, you need to configure the networking settings.
Step 2a: Obtain a Floating IP
Your instance does not come with a Floating IP by default; you'll need to reserve one.
From the IBM Cloud navigation menu, select Floating IP.
Click Reserve. Ensure that you select the same zone as the one you chose for your instance so that the instance appears in the resource-to-bind drop-down list.
Select your instance and click Reserve to bind the Floating IP to it.
Step 2b: Allow inbound traffic on Port 80
Your application is deployed on port 80, allowing it to be accessed directly without redirection. Although the firewall settings are largely managed by the deployment script, you will need to manually add an inbound rule to the security group.
Go to the Security Groups section for your VPC in the IBM Cloud dashboard.
Choose the security group that is linked to your instance. You can identify it by checking the tags in the instance details.
Within the security group, navigate to Rules and add a new TCP Inbound Rule to permit traffic on port 80.
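After the inbound rule is in place, you can confirm from your local machine that port 80 on the floating IP is reachable. A minimal sketch using only Python's standard library (the IP address shown is a placeholder, not a value from this tutorial):

```python
import socket

def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace with the Floating IP you reserved in Step 2a:
# print(port_open("192.0.2.10", 80))
```

If this returns False after the application is running, re-check the security group rule and the zone of the floating IP.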
Step 3: Access the cluster
Open the terminal and navigate to the directory where your SSH key was downloaded.
Run the following command to set the appropriate permissions, ensuring that the key can be used for SSH access:
chmod 400 <path to your pem file>
Use the command below to connect to your cluster. Replace <path-to-your-pem-file> with the actual path to your SSH key and <Floating IP> with the IP address of your instance:
ssh -i <path-to-your-pem-file> root@<Floating IP>
When prompted, type yes to add the server to your list of known hosts.
Step 4: Run the script to install libraries
To streamline the setup process, the repository includes a script that installs all necessary libraries and tools, along with a Flask application that provides UI and API access to the model. Follow these steps:
Open a terminal window and run the following command to clone the repository:
Execute the script to begin the installation process:
./install.sh
Wait for the script to finish; this may take a few minutes. A completion message indicates that the setup is done.
Note: If prompted for manual input, especially during the update-initramfs step, press Enter until you return to the console. The process will then continue automatically.
Step 5: Verify the endpoint
After the script completes, the Flask application should be running on your VM.
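You can verify the endpoint from any machine with network access to the floating IP. The sketch below uses Python's standard library to POST a prompt to the Flask app; the route name (`/generate`) and JSON payload shape are assumptions for illustration, so check the Flask application installed by install.sh for the actual API contract:

```python
import json
from urllib import request

def query_llm(base_url: str, prompt: str, route: str = "/generate") -> str:
    """POST a prompt to the Flask app and return the raw response body.

    The route and payload shape are assumptions; the actual contract is
    defined by the Flask app that install.sh sets up on the VSI.
    """
    req = request.Request(
        base_url.rstrip("/") + route,
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=60) as resp:
        return resp.read().decode("utf-8")

# Example (floating IP is a placeholder):
# print(query_llm("http://192.0.2.10", "What is IBM Cloud?"))
```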
Congratulations! You have successfully deployed an LLM on a VSI.
Monitor detached prompt templates using watsonx.governance
Follow these steps to set up governance for the prompts for the model deployed on the Virtual Server Instance.
Begin by downloading the required Git repository to your local computer.
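Conceptually, a detached prompt template records metadata about a prompt whose model runs outside of watsonx.ai, so that watsonx.governance can evaluate the prompt without hosting the model. The field names in the sketch below are illustrative assumptions, not the SDK's actual schema; the downloaded repository defines the exact structure watsonx.governance expects:

```python
# Illustrative metadata for a detached prompt template. Field names here
# are assumptions for demonstration; the repository's code defines the
# actual structure expected by watsonx.governance.
detached_prompt = {
    "name": "llama-7b-private-vsi-prompt",
    "model_id": "llama-7b",                        # model served on the VSI
    "model_provider": "Hugging Face",              # source of the weights
    "model_url": "http://<Floating IP>/generate",  # private endpoint (placeholder)
    "prompt_text": "Answer the question: {question}",
    "prompt_variables": ["question"],
    "task": "question_answering",
}

def validate(template: dict) -> bool:
    """Check that every declared prompt variable appears in the prompt text."""
    return all("{%s}" % v in template["prompt_text"]
               for v in template["prompt_variables"])
```

The key point is that only metadata and prompt text leave your environment; the model itself stays on the VSI behind the private endpoint.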