At 11 pm on March 21, Nvidia CEO Jensen Huang’s keynote kicked off GTC 2023.
After ChatGPT and GPT-4 set off this wave of generative AI, Nvidia, which supplies the heart of AI, became the big winner behind it, making this year’s GTC the most closely watched ever.
Jensen Huang did not disappoint his followers.
“AI’s iPhone moment has arrived.” Huang repeated the line four or five times over the 70-minute keynote.
Each time before saying it, he would share a new development in generative AI: a revolution in creative work, medicine, industry, and other fields; cloud services that let ordinary people train large models from a browser; a superchip that cuts the processing cost of large models by 10x…
“The development of AI will exceed anyone’s imagination.” That sentence is the best summary of this keynote.
01 Cutting the processing cost of large language models by an order of magnitude
In 2012, Alex Krizhevsky, Ilya Sutskever, and their mentor Geoff Hinton trained AlexNet on 14 million images using two GeForce GTX 580s. This is considered the beginning of this round of the AI revolution, because it was the first to prove that GPUs could be used to train artificial intelligence.
Four years later, Jensen Huang personally delivered the first NVIDIA DGX supercomputer to OpenAI. In the years that followed, OpenAI’s breakthroughs in large language models brought AIGC into public view, and the field went fully mainstream after the launch of ChatGPT at the end of last year. Within months, the conversational AI product attracted more than 100 million users, making it the fastest-growing app in history.
Originally built as a research instrument for AI, the NVIDIA DGX is now widely used by enterprises to refine data and process AI workloads. According to Jensen Huang, half of the Fortune 100 have installed DGX systems.
Among these workloads, deploying LLMs like ChatGPT is becoming increasingly important for DGX. In response, Jensen Huang announced a new GPU: the H100 NVL, a dual-GPU configuration connected by NVLink.
Based on NVIDIA’s Hopper architecture, the H100 includes a Transformer Engine designed specifically to handle models like GPT. A standard server with four pairs of H100s linked by NVLink processes GPT-3 workloads ten times faster than an HGX A100. According to NVIDIA, the H100’s combined technical innovations can speed up large language models by a factor of 30.
“H100 can reduce the processing cost of large language models by an order of magnitude,” Huang said.
Additionally, cloud computing has grown 20% annually over the past decade to become a $1 trillion industry. For AI and cloud computing, Nvidia designed the Grace CPU. Under the new architecture, the GPU handles AI workloads while the Grace CPU handles data sampling, and the two are connected by a 900 GB/s high-speed chip-to-chip interconnect.
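To get a feel for what 900 GB/s means in practice, here is a back-of-the-envelope sketch (our own arithmetic, not a figure from the keynote). It estimates how long it would take to stream the weights of a hypothetical 175-billion-parameter model in FP16 over the 900 GB/s link versus a 64 GB/s PCIe Gen5 x16 connection; the model size and the comparison link are illustrative assumptions.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_s: float) -> float:
    """Idealized transfer time, ignoring protocol and setup overhead."""
    return size_gb / bandwidth_gb_s

# Hypothetical 175B-parameter model stored in FP16 (2 bytes per parameter).
weights_gb = 175e9 * 2 / 1e9  # 350 GB

nvlink_c2c = transfer_seconds(weights_gb, 900)  # ~0.39 s over a 900 GB/s link
pcie_gen5 = transfer_seconds(weights_gb, 64)    # ~5.47 s over 64 GB/s PCIe Gen5 x16

print(f"900 GB/s link: {nvlink_c2c:.2f} s, PCIe Gen5 x16: {pcie_gen5:.2f} s")
```

The order-of-magnitude gap in transfer time is why a tightly coupled CPU-GPU pair matters for workloads that constantly shuttle large datasets between the two.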
“Grace-Hopper is the best choice for processing large-scale datasets,” Jensen Huang said. “Our customers want to build large AI models with training data several orders of magnitude larger, and Grace-Hopper is the ideal engine.”
In a sense, computing cost has become the core issue holding back generative AI today. OpenAI has burned through billions, even tens of billions, of dollars on it, and cost considerations have kept Microsoft from opening its services fully to the public: the new Bing even limits the number of conversations a user can have per day.
Nvidia’s launch of a more efficient computing solution at this moment addresses a major problem for the industry.
02 DGX Cloud: giving any business the ability to build AI
Another generative AI focus at this year’s GTC is DGX Cloud.
In fact, this is not the first time Nvidia has announced DGX Cloud. When Nvidia released its fourth-quarter earnings, Jensen Huang revealed that Nvidia would work with cloud service providers so that customers could use a web browser to access DGX systems through NVIDIA DGX Cloud, training and deploying large language models or running other AI workloads.
Nvidia has already partnered with Oracle; Microsoft Azure is expected to begin hosting DGX Cloud next quarter, and Google Cloud will join soon after, providing managed DGX Cloud services to enterprises that want to build new products and develop AI strategies.
Jensen Huang said this partnership puts Nvidia’s ecosystem in the hands of cloud service providers while expanding Nvidia’s market size and reach. Enterprises will be able to rent DGX Cloud clusters on a monthly basis, letting them quickly and easily scale multi-node AI training.
03 ChatGPT is just the beginning
“Accelerated computing is the warp engine, and AI is its energy source,” said Jensen Huang. “The ever-expanding capabilities of generative AI have given companies a sense of urgency to reimagine their products and business models.”
The large language models represented by ChatGPT and GPT-4 have swept the world in recent months, but for Nvidia, ChatGPT and large models are not all there is to AI. At the conference, Jensen Huang shared more of Nvidia’s explorations in the field, along with his own observations.
The first is the hottest area: generative AI.
A hand-drawn sketch is enough to generate a 3D-modeled floor plan.
Writing code is also a breeze.
And making music.
To accelerate the work of those seeking to leverage generative AI, NVIDIA announced NVIDIA AI Foundations, a cloud service and foundry for customers who need to build, refine, and customize LLMs and generative AI trained on their proprietary, domain-specific data.
AI Foundations services include NVIDIA NeMo, for building text-to-text generative models; Picasso, a visual-language modeling service for users who want to build models trained on licensed content; and BioNeMo, for biomedical researchers.
As a productivity tool, AI is also delivering enormous value. Jensen Huang shared several very interesting cases in his speech.
The first involves US telecommunications giant AT&T. AT&T regularly deploys 30,000 technicians to serve 13 million customers across 700 regions. Scheduling at this scale is a pain point: running on CPUs, a single scheduling optimization takes an entire night to complete.
With NVIDIA’s cuOpt, AT&T can accelerate schedule optimization 100-fold and update its schedules in real time.
In a sense, with Nvidia’s help, AT&T achieved in one step what real-time-matching internet companies such as Meituan and Didi have spent years building.
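To make the dispatch problem concrete, here is a toy sketch of the kind of assignment cuOpt solves at scale. This is not the cuOpt API; it is a simple greedy nearest-technician heuristic on made-up coordinates, just to illustrate the problem class (real solvers handle time windows, skills, and vehicle constraints, and search far beyond greedy choices).

```python
import math

# Hypothetical technician and job locations on a flat 2D plane.
technicians = {"t1": (0.0, 0.0), "t2": (5.0, 5.0), "t3": (10.0, 0.0)}
jobs = {"j1": (1.0, 1.0), "j2": (6.0, 4.0), "j3": (9.0, 1.0), "j4": (0.0, 2.0)}

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_assign(techs, jobs):
    """Assign each job in order to the currently nearest technician.
    The technician then moves to the job site, so later assignments
    see the updated positions."""
    positions = dict(techs)
    plan = []
    for job, loc in jobs.items():
        best = min(positions, key=lambda t: dist(positions[t], loc))
        plan.append((job, best))
        positions[best] = loc  # technician travels to the job
    return plan

print(greedy_assign(technicians, jobs))
```

A greedy pass like this is fast but far from optimal; the combinatorial search over thousands of technicians and jobs is what makes the overnight CPU runs, and the GPU speedup, matter.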
Another example is cooperation with chip companies. Amid the technology rivalry between China and the United States, most people have heard of the lithography machine, the key equipment of the semiconductor industry. Less well known is that as process nodes advance, the computing power demanded by chip design has become another major pain point for the industry.
Computational lithography is the largest computational workload in chip design and manufacturing today, consuming tens of billions of CPU hours per year and increasing in cost as algorithms become more complex.
In response, Nvidia announced cuLitho, a computational lithography library, and is working with ASML, TSMC, and other giants to dramatically reduce the compute consumed in chip design, saving energy and cutting emissions.
In fact, in Jensen Huang’s view, reducing energy consumption and improving computing efficiency is another great value that AI will bring to human society. At a moment when Moore’s Law has run out of steam, accelerated computing and AI have arrived just in time.
“Sustainability, generative AI, and digitalization are challenges across industries. Industrial companies are racing to digitize and reinvent themselves as software-driven technology companies, to be the disruptors and not the disrupted.” Accelerated computing, Jensen Huang said, lets these companies meet those challenges: “Accelerated computing is the best way to reduce power consumption, achieve sustainability, and be carbon neutral.”
Finally, something of an easter egg: judging from his performance in this keynote, it is not hard to guess that the Jensen Huang appearing in the video was a digital avatar throughout, and very likely itself a product of generative AI. It showcased the progress of AI while doubling as a demonstration of Nvidia’s own “nuclear bomb” computing power.