Jensen Huang says the 3 elements of AI scaling are all advancing. Nvidia's Blackwell demand will prove it.

21 November 2024 at 02:00
Jensen Huang and Sam Altman
Scaling has been a key concern for AI leaders.

I-HWA CHENG/AFP via Getty Images; Justin Sullivan/Getty Images

  • Reports on an AI progress slowdown raised concerns about model scaling on Nvidia's earnings call.
  • An analyst questioned if models are plateauing and if Nvidia's Blackwell chips could help.
  • Huang said there are three elements in scaling and that each continues to advance.

If the foundation models driving the panicked rush toward generative AI stop improving, Nvidia will have a problem. Silicon Valley's whole value proposition is the continued demand for more and more computing power.

Concerns about scaling laws started recently with reports that OpenAI's progress in improving its models was slowing. But Jensen Huang isn't worried.

The Nvidia CEO got the question Wednesday, on the company's third-quarter earnings call. Has progress stalled? And could the power of Nvidia's Blackwell chips start it up again?

"Foundation model pre-training scaling is intact and it's continuing," Huang said.

He added that scaling isn't as narrow as many think.

In the past, it may have been true that models only improved with more data and more pre-training. Now, AI can generate synthetic data and check its own answers to, in a way, train itself. But we're running out of data that hasn't already been ingested by these models, and the impact of synthetic data for pre-training is debatable.

As the AI ecosystem matures, tools for improving models are gaining importance. The first generation of post-training improvement for models came from armies of humans checking AI's responses one by one.

Huang shouted out OpenAI's Strawberry or o1 model, which uses more modern strategies like "chain of thought reasoning" and "multi-path planning." These are both tactics that encourage the models to think longer and in a more step-by-step fashion so that the responses are more considered.

"The longer it thinks, the better and higher quality answer it produces," Huang said.

Pre-training, post-training improvements, and new reasoning strategies all improve models, Huang said. Of course, if the model is doing more computing to answer the same fundamental question, that's where higher-powered compute is necessary, especially since users want their responses just as fast, if not faster.

The demand for Blackwell is the result, he said.

After all, the first generation of foundation models took about 100,000 Hopper chips to build. "You know, the next generation starts at 100,000 Blackwells," Huang said. The company said commercial shipments of Blackwell chips are just beginning.

Read the original article on Business Insider
