IBM open-sources its Granite AI models - and they mean business | ZDNET


Many companies claim to have open-sourced their LLMs, but IBM actually did it. Here's how.
Written by Steven Vaughan-Nichols, Senior Contributing Editor
BlackJack3D/Getty Images

Open-sourcing large language models (LLMs) isn't easy. Just ask the Open Source Initiative (OSI), which has been working on an AI-compatible open-source definition for nearly two years. Some companies -- Meta, for example -- claim to have open-sourced their LLMs. (They haven't.) But now IBM has gone ahead and done it.

IBM managed the open sourcing of Granite code by using pretraining data from publicly available datasets, such as GitHub Code Clean, StarCoderData, public code repositories, and GitHub issues. In short, IBM has gone to great lengths to avoid copyright or legal issues. The Granite Code Base models are trained on 3 to 4 trillion tokens of code data and natural-language code-related datasets.

Also: Why open-source generative AI models are still a step behind GPT-4

All these models are licensed under the Apache 2.0 license for research and commercial use. It's that last word -- commercial -- that stopped the other major LLMs from being open-sourced. No one else wanted to share their LLM goodies. 

But, as IBM Research chief scientist Ruchir Puri said, "We are transforming the generative AI landscape for software by releasing the highest performing, cost-efficient code LLMs, empowering the open community to innovate without restrictions."

Without restrictions, perhaps, but not without specific applications in mind. 

The Granite models, as IBM ecosystem general manager Kate Woolley said last year, are not "about trying to be everything to everybody. This is not about writing poems about your dog. This is about curated models that can be tuned and are very targeted for the business use cases we want the enterprise to use. Specifically, they're for programming."

These decoder-only models, trained on code from 116 programming languages, range from 3 to 34 billion parameters. They support many developer uses, from complex application modernization to on-device memory-constrained tasks.

IBM has already used these LLMs internally in IBM Watsonx Code Assistant (WCA) products, such as WCA for Ansible Lightspeed for IT automation and WCA for IBM Z for modernizing COBOL applications. Not everyone can afford Watsonx, but now anyone can work with the Granite LLMs using IBM and Red Hat's InstructLab.

Also: The best AI chatbots: ChatGPT and alternatives

As Red Hat SVP and chief product officer Ashesh Badani said, InstructLab will "lower many of the barriers facing GenAI across the hybrid cloud, from limited data science skills to the sheer resources required." The point is to lower the entry level for developers who want to use LLMs. 

How low? As Matt Hicks said at the Red Hat Summit, "Capabilities that, just a year ago, were coupled to high-end, fairly exotic hardware can now run on a laptop. Training techniques that once ran in the hundreds of millions of dollars are now being replicated for a few thousand." 

For example, besides InstructLab, you can use Ollama to run LLMs locally. As Bala Priya C explains in KDnuggets, "With Ollama, everything you need to run an LLM -- model weights and all of the config -- is packaged into a single Modelfile. Think Docker for LLMs." The models are available on platforms like Hugging Face, GitHub, Watsonx.ai, and Red Hat Enterprise Linux (RHEL) AI.
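To illustrate the "Docker for LLMs" idea, here is a minimal Modelfile sketch. It assumes Ollama is installed and that a Granite code model tag (shown here as `granite-code:8b`) is available in Ollama's model library; the temperature setting and system prompt are illustrative choices, not IBM recommendations.

```
# Modelfile (sketch): package a base model plus its config in one file
FROM granite-code:8b
# Lower temperature for more deterministic code suggestions
PARAMETER temperature 0.2
SYSTEM "You are a concise coding assistant."
```

You would then build and run it with Ollama's standard commands, for example `ollama create my-granite -f Modelfile` followed by `ollama run my-granite`.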

IBM anticipates that programmers, in addition to writing code with the Granite LLMs, will save time and energy by using these LLMs to create tests and find and fix bugs. "Many of the quotidian but essential tasks that are part of a developer's day -- from generating unit tests to writing documentation or running vulnerability tests -- could be automated with these models."

Also: AI21 and Databricks show open source can radically slim down AI

Besides helping developers, IBM sees business benefits in the Granite models because, unlike with many other models, their licensing is clear, as is how they were trained. In addition, the training data has been cleaned and filtered for hate, abuse, and profane language.

So, if your company has hesitated to explore using AI to build programs for legal reasons, IBM has just provided you with the open-source tools you'll need to improve your software development work. Give them a try. Some of you will build great things from these Granite blocks. 
