Nvidia Wants to Rewrite the Software Development Stack
Nvidia CEO Jensen Huang has caused consternation with a recent proclamation that, as AI advances, people will no longer need to learn how to program.
AI can generate code to solve specific problems; that is already proven. But at a fundamental level, Nvidia is rethinking the underlying software stack that helps AI generate code that humans need.
Huang’s idea: For decades, the world has been held hostage by conventional CPU-centric computing, in which humans write applications to retrieve prepared information from databases.
“The way that we do computing today, the information was written by someone, created by someone, it’s basically pre-recorded,” Huang said in a sit-down session last week at Stanford University.
Nvidia’s GPUs opened a path for accelerated computing to a more algorithmic style of computing, in which creative reasoning — not logic — helps determine outcomes.
“Why program in Python? In the future, you will tell the computer what you want,” Huang said.
Programming in the Future
Pundits are predicting that, five years from now, information in the form of text, images, video, and voice will all be fed in real time to large language models (LLMs). The computer will continuously improve itself from those information feeds and multimodal interactions.
“In the future, we’ll have continuous learning. We could decide whether that continuous learning result will be deployed,” Huang said. “The way you interact with the computer is not going to be C++.”
That is where AI comes in — people will reason and ask computers to generate code to meet specific objectives. That will require people to speak to computers in plain language, not in C++ or Python.
“My point is that programming has changed in a way that is probably less valuable,” Huang said, adding that AI is closing humanity’s technology divide.
“Today, about 10 million people are gainfully employed because we know how to program computers, which leaves the other 8 billion behind. That is not true in the future,” Huang said.
English Is the New Programming Language
Huang said the English language will be the most powerful programming language, and human-scale interaction is a key ingredient in closing the tech gap.
Generative AI will be more of an operating system, and humans can tell computers in plain language to create applications. LLMs will help humans run their ideas through computers, Huang said.
For example, humans can already tell LLMs to generate Python code for specific domain applications, all in plain English.
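For instance, asking a model in plain English to “write a function that averages a list of numbers, skipping missing values” might yield Python along these lines (an illustrative sketch of typical LLM output, not taken from any particular model):

```python
def average(values):
    """Return the mean of values, skipping None entries."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no values to average")
    return sum(present) / len(present)
```

The person supplies only the intent; the model fills in the syntax, which is exactly the division of labor Huang describes.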
“How do you make a computer do what you want it to do? How do you fine-tune the instructions with that computer? That’s called prompt engineering. There is an artistry to that,” Huang said.
People can focus on knowledge and domain expertise, and generative AI will fill in the programming gap. That will impact the software development landscape, Huang said.
Huang previously likened LLMs to college grads who were pre-trained to be super smart. Nvidia is surrounding large models with specialized knowledge in areas such as health care and finance, which could support enterprises.
There are about $1 trillion worth of data centers today, a figure Huang said will grow to $4 trillion to $5 trillion over the next four to five years. Nvidia’s GPUs touch almost every AI installation and application.
Don’t Dismiss Nvidia’s CEO
Huang’s prognostications in the past have paid dividends. He is credited with being an AI pioneer — he steered the engineering of Nvidia GPUs so that decades-old AI theories could be put to work.
Nvidia’s stranglehold over the AI market has pushed the company’s valuation to around $2 trillion, and the company is poised for a historic year after a groundbreaking 2024.
GPU sales catapulted company revenue to $22.1 billion for the fourth quarter, a staggering 265% increase from the same quarter a year earlier. Revenue for 2024 was up 126% over 2023, to $60.9 billion.
In the early 2000s, Nvidia was hawking GPUs for gaming. Huang realized that the GPU’s parallel processing power could be used for the large-scale modeling and simulation needed in scientific computing. He created the CUDA software stack in 2007 for accelerated computing, and it is now a central ingredient in Nvidia’s AI dominance.
Nvidia App Dev Approach
AI makes it possible for users to talk with different types of data in the form of text, images, and voice. The different data types need a new software stack and accelerators like GPUs to work reasonably well.
Nvidia’s CUDA software provides the core foundation of tools to communicate with the GPU. It includes a programming model, development tools, and a large array of libraries. AI developers use CUDA primitives to exploit Nvidia GPU capabilities.
CUDA also has tools that automate coding for people to run applications on GPUs. Nvidia is creating universal translators that can take in a query, run a few lines of Python code, and pass it through the selected AI model.
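Nvidia has not published the internals of these translators, but the idea — examine a plain-language query and decide which model should handle it — can be sketched in a few lines of Python (all names here are hypothetical, chosen purely for illustration):

```python
# Toy "universal translator": inspect a plain-language query and route it
# to an appropriate downstream model. Model names are hypothetical.
def route_query(query):
    q = query.lower()
    if any(word in q for word in ("image", "picture", "photo")):
        return "vision-model"
    if any(word in q for word in ("table", "csv", "database")):
        return "sql-model"
    return "text-model"
```

A production system would use a classifier model rather than keyword matching, but the glue-code role — a few lines of Python between the query and the model — is the same.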
Nvidia’s CUDA is dismantling the traditional software development model, in which applications were written for CPUs. The AI landscape has new types of data, algorithms, and compute engines, and the GPU replaces the CPU, which is ill-equipped to handle these highly parallel workloads.
But there are similarities between Nvidia’s AI stack and the so-called x86 Wintel platform: if an AI model was trained on Nvidia GPUs, it will mostly require Nvidia hardware for inference as well. That could change as AI companies like Microsoft and Meta start deploying their own AI hardware.
Nvidia’s Structure Lines Up
Nvidia’s business structure reflects the way it expects AI to supplement human interaction with computers: it is organized by data types and domain knowledge.
The company has pre-built CUDA tools to work with all types of models. For example, it has an auto business that includes all the hardware and software components needed for companies to build autonomous cars. Its health business helps doctors use AI to interact with medical data by fusing images, patient reports, and voice inputs.
Nvidia calls its AI Enterprise suite the “AI operating system.” The software includes the NeMo framework for building LLMs, along with compilers, libraries, and development stacks. But to run it, companies will need Nvidia’s GPUs.
The stack is populated with additional intermediate steps that address some of AI’s thorny issues. For example, a tool called NeMo Guardrails can analyze LLM output to prevent hate speech and keep conversations on track, based on rules set out by the deployer. These applications can be developed using the LangChain framework.
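Guardrails defines its rules in its own configuration language; as a toy stand-in for what such an output filter does — screening model output against deployer-set rules before it reaches the user — consider this sketch (the blocklist approach and refusal wording are hypothetical simplifications, not Guardrails’ actual mechanism):

```python
def apply_guardrail(model_output, blocked_terms):
    """Return the model output, or a refusal if it touches a blocked topic."""
    lowered = model_output.lower()
    for term in blocked_terms:
        if term.lower() in lowered:
            return "Sorry, I can't help with that topic."
    return model_output
```

Real guardrail layers classify intent with a model rather than matching strings, but the architectural point stands: a checkpoint sits between the LLM and the user, enforcing the owner’s rules.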
Nvidia’s larger goal with its stack is to get rid of the command line altogether and let users interact with databases through interactive prompting. That has less to do with the software stack itself than with how search is changing to provide more relevant information — the what, how, when, and why — to users.
Nvidia is selling subscription packages for its AI software as the company switches gears to a software-sells-hardware strategy, a complete flip of its past hardware-sells-software approach. Nvidia hopes to sell more software that runs only on its GPUs.
Developer Impact
Huang said programmers will still be needed for Nvidia’s CUDA framework, and for general-purpose computing applications that do not need GPUs.
But his message was clear: the future is AI, and developers need to quickly adapt their skillset to the changing landscape.
Nvidia has come up with the concept of an AI factory, which ingests data as raw material and turns out processed data as the final product. Nvidia has established solid partnerships with major cloud providers and software vendors such as Google, Snowflake, Salesforce, Oracle, and VMware.
Nvidia is a lone wolf trying to change the software stack with its proprietary hardware and software platform. But rivals are catching up fast — AMD’s ROCm and Intel’s oneAPI are open source options that are gaining traction. Google is developing its own software and hardware stack to power its AI infrastructure.
Nvidia’s next developer conference, GTC, will be coming later this month. There are basic seminars on how to write CUDA programs, sessions about AI implementations from companies like X (formerly known as Twitter), and talks about opportunities for developers in AI.
Huang’s keynote will lead off the show.