Jensen Huang shot down comparisons to Elon Musk and yelled at his biographer. The author told BI what Huang is like.

8 April 2025 at 10:32
Nvidia CEO Jensen Huang.

Artur Widak/NurPhoto

  • Nvidia CEO Jensen Huang likes to conduct intense, public examinations of his team's work.
  • Stephen Witt's book about Huang and Nvidia debuted in the US on Tuesday.
  • Witt experienced Huang's ire when he brought up the more sci-fi-adjacent potential for AI.

At Nvidia, getting a dressing down from CEO Jensen Huang is a rite of passage.

The CEO has confirmed this in interviews before, but the writer Stephen Witt can now speak from experience. Witt is the author of "The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip," which chronicles the CEO's life and career and Nvidia's historic rise from a background player to a star of the AI revolution.

Witt describes a fair bit of yelling throughout Nvidia's history.

The company's culture is demanding. Huang prefers to pick apart the team's work in large meetings so that the whole group can learn. Witt's book delves into not just what Nvidians have done but how they think — or don't think — about what their inventions will bring in the grander scheme of history.

Stephen Witt's "The Thinking Machine" in a bookstore in Taipei, Taiwan. The book was released first in Asia and debuted in the US on Tuesday.

Robert Smith

In the final scene of the book, which was already available in Asia and was released in the US on Tuesday, Witt interviewed Huang last year in a room covered in whiteboards detailing Nvidia's past and future. Huang was visibly tired, having just wrapped up the company's nonstop annual conference. After a series of short, curt responses, Witt played a clip from 1964 of the science-fiction writer Arthur C. Clarke musing that machines will one day think faster than humans, and Huang's demeanor changed entirely.

Witt wrote that he felt like he had hit a "trip wire." Huang didn't want to talk about AI destroying jobs, continue the interview, or cooperate with the book.

Witt told Business Insider about that day and why Huang sees himself differently than other tech titans like Tesla's Elon Musk and OpenAI's Sam Altman. Nvidia declined to comment.

This Q&A has been edited for clarity and length.

At the end of the book, Huang mentions Elon Musk and the difference between them. You asked him to grapple with the future that he's building. And he said, "I feel like you're interviewing Elon right now, and not me." What does that mean?

I think what Jensen is saying is that Elon is a science-fiction guy. Almost everything he does kind of starts with some science-fiction vision or concept of the future, and then he works backward to the technology that he'll need to put in the air.

In the most concrete example, Elon wants to stand on the surface of Mars. That's kind of a science-fiction vision. Working backward, what does he have to do today to make that happen?

Jensen is exactly the opposite. His only ambition, honestly, is that Nvidia stays in business. And so he's going to take whatever is in front of him right now and build forward into the farthest that he can see from first principles and logic. But he does not have science-fiction visions, and he hates science fiction. That is actually why he yelled at me. He's never read a single Arthur C. Clarke book — he said so.

He's meeting Elon Musk, Sam Altman, and other entrepreneurs in the middle. They're coming from this beautiful AGI future. Jensen's like, "I'm going to just build the hardware these guys need and see where it's going." Look at Sam Altman's blog posts about the next five stages of AI. It's really compelling stuff. Jensen does not produce documents like that, and he refuses to.

So, for instance, last month, Musk had a livestreamed Tesla all-hands where he talked about the theory of abundance that could be achieved through AI.

Exactly. Jensen's not going to do that. He just doesn't speculate about the future in that way. Now, he does like to reason forward about what the future is going to look like, but he doesn't embrace science-fiction visions. Jensen's a complicated guy, and I'm still not completely sure why he yelled at me.

This is hard to believe, but I guarantee you it is true. He hates public speaking, he hates being interviewed, and he hates presenting onstage. He's not just saying that. He actually — which is weird, because he's super good at it — hates it, and he gets nervous when he has to do it. And so now that GTC has become this kind of atmosphere, it really stresses him out.

Stephen Witt is the author of "The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip."

Stephen Witt

Earlier in the book, Huang flippantly told you that he hopes he dies before the book comes out. The comment made me think about who might succeed 62-year-old Huang. Did you run into any concrete conversations about a succession plan?

He can't do it forever, but he's in great shape. He's a bundle of energy. He's just bouncing around. For the next 10 years, at least, it's going to be all Jensen.

I asked them, and they said they have no succession plan. Jensen said: "I don't have a successor."

Jensen's org chart is him and then 60 people directly below him. I say this in the book — he doesn't have a second in command. I know the board has asked this question. They didn't give me any names.

You describe in the book how you were a gamer and used Nvidia graphics cards until you very consciously stopped playing out of worry you were addicted. Did Nvidia just fall off your radar for 10 to 15 years after that? How did you end up writing this book?

This is an interesting story. I should have put this in the book. I bought Nvidia stock in the early 2000s and then sold it out of frustration when it went up.

I basically mirrored [Nvidia cofounder] Curtis Priem's experience and sold it in 2005 or 2006 — which looked like a great trade for seven years because it went all the way back down. I was like, "Oh, thank God I sold that," because it went down another 90% after that.

I probably broke even or lost a small amount of money. I have worked in finance, and one of the counterintuitive things people don't understand about finance is that the best thing you can do for your portfolio is sell your worst-performing stock, because you get tax advantages.

So I was aware of kind of the sunk-cost fallacy, and it looked like a great trade. Then I paid no attention to the company for 17 years. It wasn't until ChatGPT came along that I even paid attention to them coming back. And I was like, wait — what's going on with Nvidia? Why is this gaming company up so much? I started researching, and I realized these guys had a monopoly on the back end of AI.

I was like, "Oh, I'll just take Jensen and pitch him to The New Yorker." I honestly thought the story would be relatively boring. I was shocked at what an interesting person Jensen is. I thought for sure when I first saw the stock go up, they must have some new CEO doing something interesting.

To my great surprise, I learned that Jensen was still in charge of the company and in fact, at that point, was the single longest-serving tech CEO in the S&P 500.

I was like, it's the same guy? Doing the same thing? And then Jensen was so much more compelling of a character than I ever could hope for.

Read the original article on Business Insider

AI models get stuck 'overthinking.' Nvidia, Google, and Foundry could have a fix.

8 April 2025 at 10:01
OpenAI's ChatGPT o1 and DeepSeek's R1 models could benefit from answering the same question repeatedly and picking the best answer.

picture alliance/dpa/Getty Images

  • Large language models like DeepSeek's R1 are "overthinking," affecting their accuracy.
  • Models are trained to question logic, but overthinking can lead to incorrect answers.
  • A new open-source framework aims to fix this and could give a glimpse of where AI is headed.

Large language models — they're just like us. Or at least they're trained to respond like us. And now they're even displaying one of the more inconvenient traits that come along with reasoning capabilities: "overthinking."

Reasoning models like OpenAI's o1 or DeepSeek's R1 have been trained to question their logic and check their own answers. But if they do that for too long, the quality of the responses they generate starts to degrade.

"The longer it thinks, the more likely it is to get the answer wrong because it's getting stuck," Jared Quincy Davis, the founder and CEO of Foundry, told Business Insider. Relatable, no?

"It's like if a student is taking an exam and they're taking three hours on the first question. It's overthinking β€” it's stuck in a loop," Davis continued.

Davis, along with researchers from Nvidia, Google, IBM, MIT, Stanford, Databricks, and more, launched an open-source framework Tuesday called Ember that could portend the next phase of large language models.

Overthinking and diminishing returns

The concept of "overthinking" might seem to contradict another big breakthrough in model improvement: inference-time scaling. Just a few months ago, models that took a little more time to come up with a more considered response were touted by AI luminaries like Jensen Huang as the future of model improvement.

Reasoning models and inference-time scaling are still huge steps, Davis said, but future developers will likely think about using them differently.

Davis and the Ember team are formalizing a structure around a concept that he and other AI researchers have been playing with for months.

Nine months ago — an eon in the machine learning world — Davis described his hack of asking ChatGPT-4 the same question many times (each request is referred to as a "call") and taking the best of the responses.
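That best-of-N pattern is easy to sketch in code. Below is a minimal illustration assuming a generic LLM client, not Ember's actual API; call_model and score are hypothetical placeholders for whatever SDK and ranking method a developer would plug in:

```python
import concurrent.futures

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for one LLM API call."""
    raise NotImplementedError

def score(answer: str) -> float:
    """Hypothetical ranker: a verifier model, majority vote, or task-specific check."""
    raise NotImplementedError

def best_of_n(prompt: str, n: int = 10) -> str:
    # Send the same prompt n times in parallel; sampling randomness means
    # each call can return a different answer.
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(lambda _: call_model(prompt), range(n)))
    # Keep whichever answer the ranker likes best.
    return max(answers, key=score)
```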

Now, Ember's researchers are taking that method and supercharging it, envisioning compound systems wherein each question or task would call a patchwork of models, all for different amounts of thinking time, based on what's optimal for each model and each question.

"Our system is a framework for building these networks of networks where you want to, for example, compose many, many calls into some broader system that has its own properties. So this is like a new discipline that I think jumped from research to practice very quickly," Davis said.

In the future, the model will choose you

When humans overthink, therapists might tell us to break problems down into smaller pieces and address them one at a time. Ember starts with that theory, but the similarity ends pretty quickly.

Right now, when you log into Perplexity or ChatGPT, you choose your model with a dropdown menu or toggle switch. Davis thinks that won't be the case for much longer as AI companies seek better results with these more complex strategies of routing questions through different models with different numbers and lengths of calls.

"You can imagine, instead of being a million calls, it might be a trillion calls or quadrillion calls. You have to sort the calls," Davis said. "You have to choose models for each call. Should each call be GPT 4? Or should some calls be GPT 3? Should some calls be Anthropic or Gemini, and others call DeepSeek? What should the prompts be for each call?"

It's thinking in more dimensions than the binary question-and-answer we've known. And it's going to be particularly important as we move into the era of AI agents where models perform tasks without human intervention.

Davis likened these compound AI systems to chemical engineering.

"This is a new science," he said.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

Building AI is about to get even more expensive — even with the semiconductor tariff exemption

7 April 2025 at 02:25
While the wafers that power AI chips are exempt from tariffs, other components are not.

Michael Buholzer/Reuters

  • Most semiconductors are tariff-exempt, but full products with chips inside may face tariffs.
  • Without supply chain shifts, Nvidia's non-exempt AI products could see cost impacts from tariffs.
  • On-shoring assembly of chips may mitigate tariff effects but increase costs.

Most semiconductors, the silicon microchips that run everything from TV remote controls to humanoid robots, are exempt from the slew of tariffs rolled out by the Trump administration last week. But that's not the end of the story for the industry, which also powers the immense shift in computing toward artificial intelligence that's already underway, led by the US.

Roughly $45 billion worth of semiconductors (based on 2024 totals gathered by Bernstein) remain tariff-free — $12 billion of which comes from Taiwan, where AI chip leader Nvidia manufactures. But the AI ecosystem requires much more than chips alone.

Data centers and the myriad materials and components required to generate depictions of everyone as an anime character are not exempt. For instance, an imported remote-controlled toy car with chips in both the car and its controller would need an exception for toys to avoid fees.

"We still have more questions than answers about this," wrote Morgan Stanley analysts in a note sent to investors Thursday morning. "Semiconductors are exempt. But what about modules? Cards?"

As of Friday morning, analysts were still scratching their heads as to the impact, despite the exemption.

"We're not exactly sure what to do with all this," wrote Bernstein's analysts. "Most semiconductors enter the US inside other things for which tariffs are likely to have a much bigger influence, hence secondary effects are likely to be far more material."

AI needs lots of things that aren't exempt

Nvidia designs chips and software, but what it mainly sells are boards, according to Dylan Patel, chief analyst at Semianalysis. Boards contain multiple chips, along with power-delivery controls and other components that make them work.

"On the surface, the exemption does not exempt Nvidia shipments as they ship GPU board assemblies," Patel told Business Insider. "If accelerator boards are excluded in addition to semiconductors, then the cost would not go up much," he continued.

These boards are just the beginning of the bumper crop of AI data centers in the works right now. Server racks, steel cabinets, and all the cabling, cooling gear, and switches to manage data flow and power are mostly imported.

A spokesperson for AMD, which, like Nvidia, produces its AI chips in Taiwan, told BI the company is closely monitoring the regulations.

"Although semiconductors are exempt from the reciprocal tariffs, we are assessing the details and any impacts on our broader customer and partner ecosystem," the spokesperson said in an email statement.

Nvidia declined to comment on the implications of the tariffs. But CEO Jensen Huang got the question from financial analysts last month at the company's annual GTC conference.

"We're preparing and we have been preparing to manufacture onshore," he said. Taiwanese manufacturer TSMC has invested $100 billion in a manufacturing facility in Arizona.

"We are now running production silicon in Arizona. And so, we will manufacture onshore. The rest of the systems, we'll manufacture as much onshore as we need to," Huang said. "We have a super agile supply chain, we're manufacturing in so many different places, we could shift things," he continued.

In addition to potentially producing chips in the US, it's plausible that companies, including Nvidia, could do more of their final assembly in the US, Chris Miller, the author of "Chip War" and a leading expert on the semiconductor industry, told BI. Moving the later steps of the manufacturing process to un-tariffed shores, which right now include Canada and Mexico as well as the US, could theoretically allow these companies to import bare silicon chips and avoid levies. But that change would come with a cost as well, Miller said.

With retaliatory tariffs rolling in, US manufacturers could find tariffs weighing down demand in international markets too.

Supply chain shifts and knock-on effects

Semiconductor industry veteran Sachin Gandhi just brought his startup Retym out of stealth mode last week, with a semiconductor that helps data move between data centers faster. Technology like his has been most relevant to the telecom industry for decades and is now finding new markets in AI data centers.

Retym's finished product is exempt from tariffs when it enters the US, but the semiconductor supply chain is complex. Products often cross borders while being manufactured in multiple steps, packaged, tested, and validated, and then shipped to the final destination.

A global tariff-rich environment will probably bring up his costs in one way or another, Gandhi told BI. End customers like hyperscalers, and the ecosystem of middlemen who bundle all these elements together and sell them, will figure out how to cover these costs without too much consternation, up to a point, he said.

"Demand is not particularly price sensitive," wrote Morgan Stanley analyst Joe Moore Thursday.

AI is already an area where companies appear willing to spend with abandon. But it's also maturing. Now, as companies work to put together realistic estimates for normal business metrics like return on investment, unit economics, and profitability, tariffs risk pushing those answers down the road, potentially by years.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

AI has ushered in a new kind of hacker

1 April 2025 at 02:00
AI is offering hackers new openings.

Getty Images/Hero Images

  • Hackers are using new AI models to infiltrate companies with old tricks.
  • Open-source models are gaining popularity, but raise the bar for cybersecurity.
  • Researchers scoured Hugging Face for malicious models and found hundreds.

AI doomsayers continue to worry about the technology's potential to bring about societal collapse. But the most likely scenario for now is that small-time hackers will have a field day.

Hackers usually have three main objectives, Yuval Fernbach, the chief technology officer of machine learning operations at software supply chain company JFrog, told Business Insider. They shut things down, they steal information, or they change the output of a website or tool.

Scammers and hackers, like employees of any business, are already using AI to jump-start their productivity. Yet, it's the AI models themselves that present a new way for bad actors to get inside companies since malicious code is easily hidden inside open-source large language models, according to Fernbach.

"We are seeing many, many attacks," he said. Overloading a model so that it can no longer respond is particularly on the rise, according to JFrog.

"It's quite easy to get to that place of a model not responding anymore," Fernbach said. Industry leaders are starting to organize to cut down on malicious models. JFrog has a scanner product to check models before they go into production. But to some extent the responsibility will always be on each company.

Malicious models attack

When businesses want to start using AI, they have to pick a model from a company like OpenAI, Anthropic, or Meta, as most don't go to the immense expense of building one in-house from scratch. The former two firms offer proprietary models, so the security is somewhat more assured. But proprietary models cost more to use, and many companies are wary of sharing their data.

Going with an open-source model from Meta or any one of the thousands available is increasingly popular. Companies can use APIs or download models and run them locally. Roughly half of the companies in a recent survey of 1,400 businesses by JFrog and InformationWeek were running downloaded models on their own servers.

As AI matures, companies are more likely to stitch together multiple models with different skills and expertise. Thoroughly checking them all for every update runs counter to the fast, free-wheeling AI experimentation phase of companies, Fernbach said.

Each new model, and any updates of data or functionality down the road, could contain malicious code or simply a change to the model that impacts the outcome, Fernbach said.

The consequences of complacency can be meaningful.

In 2024, a Canadian court ordered Air Canada to give a bereavement discount to a traveler who had been given incorrect information on how to obtain the benefit from the company's chatbot, even after human representatives of the airline denied it. The airline had to refund hundreds of dollars and cover legal fees. At scale, this kind of mistake could be costly, Fernbach said. For example, banks are already concerned that generative AI is advancing faster than they can respond.

JFrog CTO Yuval Fernbach speaks at a company event in New York City in March 2025.

JFrog

To find out the scale of the problem, JFrog partnered with Hugging Face, the online repository for AI models, last year. Four hundred of the more than 1 million models contained malicious code — about 0.04%, roughly the same odds as being dealt four of a kind in a five-card poker hand.

Since then, JFrog estimates that while the number of new models has increased three-fold, attacks increased seven-fold.

Adding insult to injury, many popular models often have malicious imposters whose names are slight misspellings of authentic models that tempt hurried engineers.

Fifty-eight percent of companies polled in the same survey either had no company policy around open-source AI models or didn't know if they had one. And 68% of responding companies had no way to review developers' model usage other than a manual review.

With agentic AI on the rise, models will not only provide information and analysis but also perform tasks, and the risks could grow.

Read the original article on Business Insider

Don't overthink CoreWeave's IPO. It is a bellwether — just not for all of AI.

28 March 2025 at 14:57
Mike Intrator, Chief Executive Officer and founder of CoreWeave, (C) rings the opening bell surrounded by Executive Leadership and family during the company's Initial Public Offering (IPO) at the Nasdaq headquarters on March 28, 2025 in New York City.

Michael M. Santiago/Getty Images

  • CoreWeave's Nasdaq debut saw shares fall below their IPO price, raising market concerns.
  • CoreWeave is the first US pure-play AI public offering, relying heavily on Nvidia GPUs.
  • The IPO tests the neocloud concept, with implications for AI's future and Nvidia's role.

CoreWeave listed on the Nasdaq Friday amid a shifting narrative and much anticipation. The company priced its IPO at $40 per share. The stock flailed, opening at $39 per share, then falling as much as 6% and ending the day back up at $41.59.

The cloud firm, founded in 2017, is the first pure-play AI public offering in the US. CoreWeave buys graphics processing units from Nvidia and then pairs them with software to give companies an easy way to access the GPUs and achieve the top performance of their AI products and services.

The company's financial future is dependent on two unknowns — that the use and usefulness of AI will grow immensely, and that those workloads will continue to run on Nvidia GPUs.

It's no wonder that the listing has often been described as a bellwether for the entire AI industry.

But CoreWeave's specific business has some contours that could be responsible for Friday's ambivalent debut without passing judgment on AI as a whole.

CoreWeave's customers are highly concentrated, and its suppliers are even more so. The company is highly leveraged, with billions in debt collateralized by GPUs. The future obsolescence of those GPUs is looming.

Umesh Padval, managing director of Thomvest, expects the pricing for the GPU computing CoreWeave offers to go down in the next 12 to 18 months as GPU supply continues to improve, which could challenge the company's future profitability.

"In general, it's not a bellwether in my opinion," Padval told Business Insider.

Beyond opening day

So what does it mean that CoreWeave's debut didn't rise to meet hopes and expectations?

Karl Mozurkewich, principal architect at cloud firm Valdi, told BI the Friday IPO is more of a test for the neocloud concept than for AI. "Neocloud" is a term for young public cloud providers that focus solely on accelerated computing. They often use Nvidia's preferred reference architecture and, in theory, demonstrate the best possible performance for Nvidia's hardware.

Nvidia CEO Jensen Huang gave the bunch a shoutout at the company's tentpole GTC conference last week.

"What they do is just one thing. They host GPUs," Huang said to an audience of nearly 18,000. "They call themselves GPU clouds, and one of our great partners, CoreWeave, is in the process of going public and we're super proud of them."

CoreWeave's public market performance will signal what shape the future could take for these companies, according to Mozurkewich. Will more companies try to replicate the GPU-cloud model? Will Nvidia seed more similar businesses? Will it continue to give neoclouds early access to new hardware?

"I think the industry is very interested to see if the shape of CoreWeave is long-term successful," Mozurkewich said.

Daniel Newman, CEO of the Futurum Group, told BI that CoreWeave is "one measuring point of the AI trade; it isn't entirely indicative of the overall AI market or AI demand." He added the company has the opportunity to improve its fate as AI scales and the customer base grows and diversifies.

Lucas Keh, semiconductors analyst at Third Bridge, agreed.

"Currently, more than 70% of CoreWeave's revenue comes from hyperscalers, but our experts expect this concentration to decrease 1β€”2 years after an IPO as the company diversifies its customer base beyond public cloud customers," Keh said via email.

Having a handful of large, dominant enterprise customers is not uncommon for a young provider like CoreWeave, Mozurkewich said. But it's also no surprise that it could concern investors.

"This is where CoreWeave has a chance to shine as AI and the demand for AI spans beyond the big 7 to 10 names. The caveat will be how stable GPU prices are as availability increases and competition increases," Newman said.

Other issues, like obsolescence, depreciation, and leverage, will be harder to shake.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

Meet a decades-old software company hitching a ride on the Nvidia rocket ship

26 March 2025 at 02:00
Nvidia CEO Jensen Huang during his keynote address at the GTC AI Conference in San Jose, California, on March 18, 2025.

JOSH EDELSON/AFP via Getty Images

  • The AI boom is changing the trajectory of decades-old computing companies.
  • DDN, known in the supercomputing field, now offers crucial data storage for AI data centers.
  • The company recently raised $300 million from Blackstone at a $5 billion valuation.

As Nvidia CEO Jensen Huang gave his keynote address at the company's annual GTC conference to an audience of roughly 17,000 people in San Jose, California, last week, dozens of companies sat on the edge of their seats to find out if they'd get a mention. Some had a glimpse of the slides ahead of time, but no one knew for certain until the words came out of Huang's mouth.

DataDirect Networks, or DDN, was one of the lucky ones to leave the arena happy. Huang named the company alongside Dell, HP Enterprise, Hitachi, and IBM. But while those firms have enjoyed broader name recognition in the tech world for decades, DDN was invited on the rocket ship that is Nvidia just a few years ago — and everything changed.

DDN makes hardware but, more importantly, software that allows GPU users to access their stored data fast. It cuts down on latency, the industry term for lag in AI systems that can delay, for example, the answer to a question asked of a chatbot like ChatGPT.

When Nvidia sells $100 worth of GPUs, DDN's opportunity is $5 to $10.

"You have to have the right amount of performance, reliability, and stability to extract your data at full speed, real-time, to feed the GPUs," Paul Bloch, DDN's 64-year-old President, told Business Insider.

The company has been working with Nvidia for about eight years. Before that, it served the stable but much smaller supercomputing industry for decades.

Bloch and CEO Alex Bouzari are fixtures of the supercomputing world. Before the AI boom, top research universities, government labs, and the oil and gas industry knew them well. But mainstream data centers didn't need them. They do now.

"For the very first time, your storage system will be GPU-accelerated," Huang said at GTC.

DDN President and Cofounder Paul Bloch.

DDN

As data sets grow larger and AI is used at scale for data-heavy applications like real-time video, Bloch can see a day when the company scales independently of Nvidia.

"All of a sudden it's much easier to deliver the value in AI because we've already done it in the past with similar market,' Bloch said. "That's why the success of the company has gone exponential."

DDN has found a new gear in the age of AI. Conversations, fixes, new features, and deals that used to take weeks and months now take days. And Bloch and his team are learning to work at Jensen Huang's pace — what he calls the "speed of light."

"Jensen's emails are fantastic. They are 10 words or less," Bloch said.

Moving into the era of AI has taken some adjustments, but today, DDN and its competitors are essential to the parallel computing that enables AI. When a company seeks to set up a GPU cloud, it has to have a version of DDN's tech. And DDN is part of Nvidia's reference architecture, which is the recommended setup for maximal GPU performance.

Not all of Nvidia's customers choose to use these instructions, but the mere association has brought DDN new customers. Hyperscalers have traditionally preferred to use their own recipes of parts and players in their data centers, but Bloch said that in the era of Nvidia's Blackwell, they are starting to come around.

Now it's preparing for a blockbuster third decade. In January, the company raised $300 million in growth funding from Blackstone at a $5 billion valuation, and Bloch got a congratulatory email from Huang at 6:30 a.m. the morning of the announcement, he said.

"DDN does not need the money, per se," Bloch said. "We're profitable. We're EBITDA positive, we're growing very quickly."

What he wants, though, is access to the C-suites that come with a Blackstone affiliation. DDN used to sell through researchers and developers β€” after all, the technology was so niche that it was far below top executives' notice. Now, AI infrastructure is one of the most important expenditures a company can make, and decisions around it are scrutinized at the highest levels.

DDN has received acquisition offers over the years, but Bloch said the company is his and Bouzari's "life's work" and they stay independent by choice.

"We are control freaks," Bloch said, adding that he sees two to five years more of break-neck AI infrastructure buildout.

"I'm not even sure it's going to stop," he said.

Read the original article on Business Insider

5 big takeaways from Nvidia's GTC conference

Nvidia's GTC conference has become a central point in the calendar for the ever-expanding AI industry.

Emma Cosgrove

  • Nvidia's annual GTC conference wrapped up on Friday after a week of panels and exclusive events.
  • The weeklong tech showcase highlighted advancements in AI, robotics, and quantum computing.
  • Here are five key points to take away if you missed the GTC festivities this week.

The Super Bowl of tech was held this week in San Jose, drawing audiences from around the world to marvel at advancements across the industry that were showcased during Nvidia's annual GTC conference.

While Nvidia has hosted GTC — that's short for GPU Technology Conference — since 2009, the event has expanded over the last decade and a half to highlight not just advancements in semiconductors, accelerated computing, and artificial intelligence but also robotics, autonomous and electric vehicles, and future-looking technology like quantum computing.

"People have noticed that the feeling of GTC has really changed," CEO Jensen Huang said in a Thursday press conference.

And he's right: the event has changed.

If you weren't able to attend GTC this year, here's what you missed.

AI is big business now, and the stakes are higher

In last year's GTC keynote, Huang mentioned the word "revenue" once. This year it was mentioned 10 times. The message was that it's time for everyone in the AI ecosystem — not just Nvidia — to make money.

In his keynote, Huang emphasized Nvidia's evolution from a chip company to an AI infrastructure company that sells "AI factories."

As he described, AI factories produce intelligence, and the most productive factory should be the most successful. Huang said the quality of each company's AI factory will have a big impact on its overall revenue. And that applies to any industry, whether it's traditionally in "tech" or not.

"Every industry is here. Every country is here, and every company is here because we have become a foundational company by which other companies are built," Huang told reporters Thursday.

Nvidia is at the center of the AI ecosystem

Since Nvidia's technology enables most of the innovation in AI today, it also sets the bar for what's possible β€” how fantastically engineers can apply their imaginations. The next round of chip architecture that Huang described in his keynote address drew a small gasp from the audience in the room due to the power and performance claims.

Cloud companies and AI developers at the conference told BI the ecosystem is still wrapping its head around what massive advances in chip performance will mean for what developers can build and what other players in the infrastructure ecosystem need to prepare for. That's why, Huang explained, the company announced several generations at a time.

"Now everybody else can plan," he said in a Thursday press conference. "In the good old days, you're building chips. Somebody buys a chip, they put it in the computer, they sell a computer. What we do now is we build AI infrastructure that are deployed hundreds of billions of dollars at a time, and so you better do a good job planning."

The cadence of AI innovation has picked up

There was an urgency in the halls of the San Jose Convention Center, likely due to packed schedules and long lines. But the pace of AI model innovation has quickened too, and in side conversations, press conferences, and panels, the refrain was that each new paradigm lasts about six months.

Dion Harris, Nvidia's senior director of high-performance computing and AI factory solutions, told Business Insider that one of the most important stakeholders he spends time with in his day-to-day work is cutting-edge researchers. They are closer to what's coming next and could ultimately drive the most value from things like the AI factory model.

Robots are coming, but they're not here yet

The exhibition floor featured a variety of robots and self-driving cars to underscore what Huang said in his keynote.

Huang described robots as a substitute for human labor that is in short supply, as the key that could make every car self-driving, and as an essential technology for taking physical action on much of the intelligence we can create today.

"Physical AI and robotics are moving so fast, everybody, pay attention to this space. This could very well likely be the largest industry of all," Huang said in his keynote address.

A robotic future is important to Nvidia because leveraging AI to understand the physical universe and take action in three dimensions will require an immense amount of compute. If Nvidia remains the titan provider of computing power it is today, robotics is expected to raise its profile even further as it becomes the foundation of a new industry. But many of the robots on display in the convention center are nascent in their capabilities or still in development.

On Tuesday, Nvidia announced a new open-source reasoning model for robotics aimed at accelerating the field out of its development phase. The new model could make robots able to adjust to different tasks and environments on the spot.

"This is gonna make it become viable," Rev Lebaredian, Nvidia's vice president of Omniverse and Simulation, told BI. "We can finally build an industry around it."

Quantum computing is officially on the scene

Once considered a far-away technology (even by Huang himself, who as recently as January suggested we're 20 years from the technology being "very useful," sending stocks tumbling), quantum computing was vindicated at this year's GTC.

Nvidia hosted its first Quantum Day, dedicated to showcasing the burgeoning technology that researchers say could help discover new drugs, develop new chemical compounds, or break encryption methods, among other outcomes.

Huang started Quantum Day by declaring how wrong he'd been just months earlier before welcoming three panels of executives from various quantum computing companies to explain why quantum processing is the next frontier for the tech sector.

While advancement in the field has historically been slow-going due to deeply technical problems with error correction and scalability, this year's GTC displayed the momentum that has been building behind the scenes from giants like Microsoft, Amazon Web Services, and Google, as well as lesser-known players like D-Wave, IonQ, and Rigetti.

Huang said onstage that this was just the first Quantum Day of many to come at GTC, demonstrating that Nvidia's conference will make space for new developments in the field.

"It seems like next year we're going to have some quantum demos at GTC," Huang said Thursday.

Read the original article on Business Insider

Inside Nvidia's annual developer festival, where AI meets Denny's pancakes and Taiwan-style night markets

19 March 2025 at 20:29
Nvidia's GTC conference has become a central point in the calendar for the ever-expanding AI industry.

Emma Cosgrove

  • Nvidia's GTC conference in San Jose, California, featured CEO Jensen Huang's keynote on AI advances.
  • Huang's speech highlighted Nvidia's new AI partnerships, software tools, and chip architectures.
  • With crowded sessions and a bustling exhibition floor, Nvidia's immense growth was on display.

The party started as so many do — with pancakes in a parking lot.

I attended Nvidia's GTC conference, which has taken over downtown San Jose, California, this week. Tuesday was the biggest day for the AI juggernaut. At 10 a.m. Nvidia CEO Jensen Huang began his keynote address, which lasted more than two and a half hours.

But first, breakfast.

The legendary Denny's breakfast

It was a chilly early morning in San Jose. The "pregame" started at 6:30 a.m. with breakfast from Denny's, the restaurant where Huang came up with the idea for Nvidia. I needed to know who would show up more than three hours early for a speech about computer chips.

When I arrived just before 7 a.m., the line was already substantial. A massive red mobile Denny's kitchen was cooking up "Nvidia bytes" — essentially sausages and pancakes. Diners were encouraged to wrap up their bytes like a taco and add syrup on top, like Huang does.

Conference-goers line up outside a Denny's pop-up restaurant outside Nvidia's GTC AI event.

Emma Cosgrove/Business Insider

I chatted with some of the early birds. Some were die-hard Nvidia fans. Some were jet-lagged, having flown in the day before from London or Toronto, so they were up anyway. Some wanted to get into the SAP Center as soon as the stadium doors opened to avoid the massive lines that would form the hour before the speech. Some heard a rumor that Huang himself might stop by the tailgate.

And sure enough, by 7:25 a.m., muscled men in suits with earpieces started multiplying. With no fanfare, Huang walked out from behind the registration tent wearing his signature uniform: all black and a leather moto jacket. The bleary-eyed crowds sprang into action — phones up for photos.

Nvidia CEO Jensen Huang made an appearance at the Denny's breakfast pop-up at the company's GTC AI conference.

Emma Cosgrove/Business Insider

Huang donned an apron and went inside the food truck to make some pancakes, as he had as a 15-year-old Denny's employee.

"At this pace, I'd run the company out of business. I used to be a lot faster," he said of his chef skills after emerging from the kitchen and immediately meeting CNBC reporter Kristina Partsinevelos and a camera crew.

Partsinevelos tried to ground the conversation, but Huang was all jokes.

"You're talking about the stock? I'm talking about Denny's!" he said.

By 8:15 a.m., Huang disappeared into the SAP Center, where he turned up on the pre-show panel airing live inside the stadium.

Nvidia CEO Jensen Huang served breakfast to the panel on Nvidia's pregame show before his GTC keynote speech.

Emma Cosgrove/Business Insider

As I reached my floor seat, the panel was giving a reverent retrospective of the company — including its many brushes with failure before AI changed everything.

Huang 'without a net'

Leading up to the speech, Nvidia's partner companies were eager to find out if they would garner a mention on one of tech's brightest stages. One Nvidia employee told me that up to the last minute, a local war room of Nvidia employees was tweaking the company's dozens of announcements.

Once the speech started, it was all in Huang's hands.

He kicked off by firing T-shirts into the crowd from an Nvidia-green T-shirt cannon.

"I just want you to know that I'm up here without a net. There are no scripts, there's no teleprompter, and I've got a lot of things to cover. So let's get started," Huang said.

Nvidia CEO Jensen Huang started his 2025 GTC keynote address by firing T-shirts into the crowd.

Emma Cosgrove/Business Insider

The 62-year-old CEO proceeded to blow through his scheduled two hours.

He focused on Nvidia's advancements, a flurry of new partnerships and software tools for AI developers, and coming chip architectures that could underpin the computation speed and efficiency needed to create new industries. These advances are already creating what Huang calls "AI factories."

He also spoke on the challenges of realizing the AI-enabled future he has promised for years. He addressed shortages of fresh data to feed AI models, limitations in the ever-important training phase, and energy constraints.

In his keynote address, Nvidia CEO Jensen Huang took the audience on a virtual tour of Nvidia HQ as he moved from subject to subject.

Emma Cosgrove/Business Insider

The world of computing has reached a "tipping point," and the "platform shift" to accelerated computing is well underway, he said.

The crowd stayed rapt, although a little antsy at the two-hour mark. But the final video clip reenergized the room. A Disney-designed robot named Blue, which looked like part of the Star Wars universe, toddled through a desert and then ascended — for real — from below the stage.

Then the crowd jumped to their feet and raised their phones.

"Have a great GTC! Thank you! Hey, Blue, let's go home. Good job," said Jensen.

Nvidia CEO Jensen Huang talks to the Disney robot "Blue," which was controlled by Disney Imagineers off-stage at his 2025 GTC keynote.

Emma Cosgrove/Business Insider

'We're going to have to grow San Jose'

After the speech, thousands of attendees streamed into the downtown San Jose streets. The SAP Center, which had only a few empty seats, holds 17,500, and 25,000 people were expected at this year's event.

The crowds made their way back to Plaza de Cesar Chavez, temporarily renamed GTC Park, to find lunch at the procession of food trucks on-site daily. Attendees again had to wait in long lines.

Nvidia's GTC took over downtown San Jose this week.

Emma Cosgrove/Business Insider

The lunch lines were just one of many signs that GTC has outgrown its traditional home. Lines to get into the San Jose Convention Center's conference sessions snaked through the hallways.

Nvidia still calls GTC a developer conference, though the evolution from technical developer confab to serious dealmaking destination was on display at a swanky building next to the convention center dedicated only to business meetings. The elevators couldn't handle the volume of people constantly coming in and out.

Massive queues formed outside the building designated for business meetings at GTC as attendees waited for elevators.

Emma Cosgrove/Business Insider

Even Nvidia team members arriving just behind me balked and abandoned ship to relocate when they saw the lines. Getting from the sidewalk to the meeting room inside the building took 35 minutes.

"The only way to hold more people at GTC is we're going to have to grow San Jose, and we're working on it," said Huang during the keynote.

Nvidia's robotic future

Logistics aside, I soon met with Kimberly Powell, Nvidia's vice president of healthcare, who detailed the many ways Nvidia's accelerated computing is changing how doctors and hospitals work.

She said it could be decades before robots can actually perform surgeries without human assistance. But companies like Moon Surgical are already creating surgical assistance robots to hold cameras and tools with arms that never tire. Nvidia also works with da Vinci robots, which can suture wounds, among other tasks.

Robotics assistants for surgery are on the way, according to Nvidia.

Emma Cosgrove/Business Insider

I then headed back to the convention center to walk the exhibition floor during happy hour, where I saw some of the technology Powell championed on display. Because Nvidia's impact spans many industries, the floor showcased cars, vacuuming robots, simulated human bodies ready for surgery, and all the biggest names in cloud computing.

Robots from JotBot, Agility Robotics, Unitree, and more were on display at Nvidia GTC.

Emma Cosgrove/Business Insider

I also passed the Nvidia gear store, which was booming. A worker there told me the 2025 GTC T-shirt and puffer vests were the biggest sellers.

Nvidia's store was busy throughout the GTC conference.

Emma Cosgrove/Business Insider

My 12-hour Tuesday at the conference ended at the GTC Night Market back in the park. The setup was an homage to Huang's love of Taiwan's night markets, with live music, drinks, local food like bao buns, yakitori, and cupcakes, and a punnily named "juice" bar sponsored by GPU cloud provider CoreWeave.

Nvidia's Night Market is inspired by CEO Jensen Huang's childhood in Taiwan.

Emma Cosgrove/Business Insider

If Nvidia has its way, AI is going to continue to do a lot of hard work for us going forward. But 12-hour days are here to stay, at least for a while. On my way back to my hotel — via San Jose bike share, past a now-silent SAP Center — I thought of these two attendees I had spotted inside the convention center:

Two people sleep in a Nvidia GTC branded conference pod at Nvidia's GTC conference.
Nvidia's GTC conference is a marathon, not a sprint.

Emma Cosgrove/Business Insider

Have a tip? Contact this reporter via email at ecosgrove@businessinsider.com or Signal at 443-333-9088. Use a personal email address and a nonwork device; here's our guide to sharing information securely.

Read the original article on Business Insider

Disney is getting into the next-gen robot game and it's kind of cute

Star Wars BDX droids at Disney's "Season of the Force" event.

Paul Bersebach/MediaNews Group/Orange County Register via Getty Images

  • Disney is partnering with Nvidia and Google DeepMind to create an open-source physics engine.
  • The engine will help robots learn to navigate complex tasks.
  • Disney hopes to feature entertainment robots at its theme parks.

On stage during a keynote speech this week, Nvidia CEO Jensen Huang spoke to an adorable robot named Blue, which responded in classic "bee-boop" robot language.

It was a modern meet cute.

Huang introduced the robot at Nvidia's GTC AI Conference — which some are calling the Super Bowl of AI — on Tuesday in San Jose, California.

Huang said Disney Research is partnering with Nvidia and Google DeepMind to develop Newton, an open-source physics engine to help robots like Blue learn to navigate complex tasks more accurately. Newton is built on the Nvidia Warp framework. Disney Research plans to use the advanced technology to upgrade its robotic characters to be more lifelike and expressive.

"Disney Research will be the first to use Newton to advance its robotic character platform that powers next-generation entertainment robots," a press release said.

An Nvidia spokesperson told BI that the first conversation between the company, Google DeepMind, and Disney Research took place in December.

Nvidia CEO Jensen Huang interacts with a robot at the company's AI conference.

JOSH EDELSON / AFP

The audience attending Huang's keynote address clapped and cheered when Blue, a droid inspired by "Star Wars," walked onstage. The droids, which have not used Newton yet, will be coming to Walt Disney World, Tokyo Disneyland, and Disneyland Paris this year. A squad of the droids appeared with Jon Favreau and Disney Imagineers at the 2025 SXSW Conference & Festivals.

Disney Imagineers first revealed the robots at a Disney event in 2023. During the demonstration, three robots roamed around the "Star Wars: Galaxy's Edge" attraction at Disney's Hollywood Studios.

While Disney has used audio-animatronic figures in its park attractions for decades, the new droids would elevate the company's robot game to a new level.

Huang and Blue had a brief conversation during the presentation. Blue responded to Huang's questions with beeps, head nods, and body wiggles. The robots were remote-controlled by a human at the event.

Disney BDX Droids at Nvidia's conference on Tuesday.

Emma Cosgrove/Business Insider

"This is how we're going to train robots in the future," Huang said, adding that two Nvidia computers were operating inside Blue.

"The BDX droids are just the beginning. We're committed to bringing more characters to life in ways the world hasn't seen before, and this collaboration with Disney Research, Nvidia, and Google DeepMind is a key part of that vision," Kyle Laughlin, senior vice president at Disney Imagineering Research & Development, said in a press release.

During the keynote speech, Huang also announced an open-source humanoid robot foundation model, Isaac GR00T N1. A press release said Isaac GR00T N1 is "the first of a family of fully customizable models that Nvidia will pre-train and release to worldwide robotics developers."

Robots were once the realm of science fiction, but hype is building around them as AI accelerates their advancement. Nvidia is one of the leaders in the industry, which is largely powered by its AI computing chips.

Huang told reporters at the GTC conference that he expects humanoids to replace factory workers in just a few years rather than decades. "This is not a five-years-away problem," Huang said.

Huang ended his keynote address with a nod to Disney's droid.

"Okay, Blue. Let's go home," Huang said. "Good job."

Representatives for Google did not respond to a request for comment from Business Insider.

Read the original article on Business Insider

Nvidia's partnership with Taco Bell means AI could soon do more than take your drive-thru order

18 March 2025 at 13:00
Yum! has been testing AI at some Taco Bell and Pizza Hut locations.

Ethan Miller/Getty Images

  • Taco Bell is using AI for drive-thru orders through a partnership with Nvidia.
  • Voice AI adoption is growing in fast food, with mixed results from McDonald's and Wendy's.
  • Yum! Brands plans to roll out voice ordering to about 500 restaurants starting next quarter.

AI is already taking orders at hundreds of Taco Bell restaurants, and there's more to come from the tech.

Soon, for example, it might be able to see how many cars are waiting in the drive-thru and suggest items to shorten the wait time, Joe Park, chief digital and technology officer at Yum! Brands, Taco Bell's parent company, told Business Insider.

Right now, "we just know there's a car at the speaker," Park said. Using AI could allow the store to see that there are five cars in line, which could be useful.

"You might want to suggest selling some quicker-turnover items versus big complex things that might take longer so we can make sure our customers have great speed of service," Park said.

Yum! Brands on Tuesday announced a strategic partnership with Nvidia, the chip firm that has cornered most of the market in AI computing.

While Nvidia has become a top tech player by designing the graphics processing units that power most AI, it has also become important in building software that allows less technical companies to harness it.

Nvidia and Taco Bell have already been testing voice AI for ordering at drive-thrus at some locations. The companies say they're looking at whether AI will be useful elsewhere at Yum! restaurants.

Other fast-food chains have also experimented with AI, especially at their drive-thrus. McDonald's rolled out voice AI to drive-thrus at about 100 restaurants with IBM before ending the pilot last year.

However, in videos posted on social media, some McDonald's customers said that the AI tool got their orders wrong. The company then pulled the tech, saying that voice AI "will be part of our restaurants' future."

Wendy's is adding voice AI to hundreds of restaurants this year. The company says that its system's accuracy is improving, and CEO Kirk Tanner said last month that he personally tests it a few times a week.

Yum! plans to roll out voice ordering to about 500 restaurants, including Pizza Hut, Taco Bell, KFC, and Habit Burger and Grill, starting in the second quarter of 2025, Park said.

Still, Andrew Sun, Nvidia's director of global business development for retail, said several things have to go right for AI to equal the experience of ordering from a human. The quality and speed of the AI models can make the difference between a successful and a frustrating conversation with an AI order-taker.

"We've all seen what a poor drive-thru experience could look like when it goes viral in a bad way, and we want to make sure our team members have a great experience," Park said.

Park said Nvidia's approach of providing a speech-recognition model, a ready-made package of software and tools for specific tasks called microservices, and ongoing support should make the difference.

Yet, "you're going to have to build relationships with your customers and your team members to have trust in the system and not shut it off," he said.

He said an off-the-shelf large language model wouldn't be good enough because many menu item names are specific to Taco Bell. For example, Park explained that the AI needs to recognize "Limonada" rather than "Lemonade."
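
To make that concrete, here is a minimal sketch of one way a developer might map a generic transcription back onto a brand-specific menu. Every name and mapping below is hypothetical, not Yum!'s or Nvidia's actual system.

```python
# Illustrative sketch: post-processing a raw speech-to-text transcript with a
# brand-specific vocabulary so a generic "lemonade" maps back to "Limonada".
# All names and mappings here are hypothetical examples.

MENU_VOCABULARY = {
    "lemonade": "Limonada",
    "crunch rap": "Crunchwrap",
    "quesa dilla": "quesadilla",
}

def normalize_transcript(raw: str) -> str:
    """Replace generic transcriptions with canonical menu-item names."""
    text = raw.lower()
    for heard, canonical in MENU_VOCABULARY.items():
        text = text.replace(heard, canonical)
    return text

print(normalize_transcript("One large lemonade and a crunch rap, please"))
# -> "one large Limonada and a Crunchwrap, please"
```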

Improving voice AI's accuracy will involve "trial and error," Park continued.

"That's such a big unsaid part about AI and getting it to work," he added.

Read the original article on Business Insider

Why Nvidia CEO Jensen Huang paid homage to Denny's at the AI giant's big conference

Jensen Huang at Nvidia's GTC AI conference
Nvidia CEO Jensen Huang at Nvidia's GTC AI conference on March 18.

Emma Cosgrove/Business Insider

  • Nvidia's GTC AI event featured a pop-up Denny's, drawing lines of attendees.
  • Nvidia CEO Jensen Huang even wore a Denny's apron.
  • It's all about Silicon Valley's origin stories.

There's nothing better than a Lumberjack Slam with your GPUs in the morning.

Denny's offerings greeted conference attendees outside Nvidia's big GTC AI event on Tuesday in San Jose, California.

There was a pop-up Denny's outside, with a line of people waiting to be served.

Conference goers line up outside a Denny's pop-up restaurant outside Nvidia's GTC AI event
Conference-goers line up outside a Denny's pop-up restaurant outside Nvidia's GTC AI event.

Emma Cosgrove/Business Insider

There were also "Nvidia Breakfast Bytes," a play on the word for a unit of digital information.

A Denny's sign outside Nvidia's GTC AI conference
A Denny's sign outside Nvidia's GTC AI conference

Emma Cosgrove/Business Insider

When he arrived, Nvidia CEO Jensen Huang even wore a kitchen apron emblazoned with Denny's corporate colors and logo.

The guy likes to cook and has broadcast announcements from his palatial kitchen in Silicon Valley before, complete with an overly large collection of colored spatulas.

But why Denny's? As a billionaire, he could dine at Nobu rather than a breakfast diner.

It's all about Silicon Valley's love of a gritty origin story. Founders of massive, wealthy tech giants love to hearken back to the good old days when their companies were scrappy startups. It keeps the troops humble, hungry, and focused on inventing new things rather than sitting back on their digital laurels.

Amazon founder Jeff Bezos has "Day 1." Google founders Larry Page and Sergey Brin like to recall starting the search giant in former YouTube CEO Susan Wojcicki's garage. Meta CEO Mark Zuckerberg started Facebook in a college dorm room.

Jensen has Denny's. He and the other Nvidia cofounders, Chris Malachowsky and Curtis Priem, came up with the idea for the company over Grand Slam breakfasts and too many cups of coffee in a Denny's in San Jose.

It's the 2484 Berryessa Road location in San Jose if you want to go sometime.

Read the original article on Business Insider

She took down Intel. Now AMD's CEO has a new miracle to perform.

13 March 2025 at 02:00
Lisa Su
 AMD CEO Lisa Su.

I-HWA CHENG/AFP via Getty Images; BI

  • When a top analyst skewered AMD's software, CEO Lisa Su called him personally to chat.
  • AMD's AI chips have struggled to compete against Nvidia's dominance, and software is its weakness.
  • Those who know Su told Business Insider she will never settle for second place.

What should a CEO do when their company is publicly called out for an inferior product? Many would stay silent. Not AMD CEO Lisa Su.

In early February, AMD released new data showing how well its AI chips performed at training large language models, using benchmarks developed by a company called SemiAnalysis. Just weeks earlier, the same group had published a searing review of AMD's tentpole graphics processing unit.

The analysts wrote that while the chip looks good on paper, reaching its potential in reality was almost impossible with AMD's existing software. Chief analyst Dylan Patel and the SemiAnalysis team spent five months assessing AMD's GPU, which has struggled to gain market share and mindshare against the dominant player, Nvidia.

"We were hopeful that AMD could emerge as a strong competitor to Nvidia in training workloads, but, as of today, this is unfortunately not the case," SemiAnalysis published in December.

The next day, Patel got a call from Su. The call was scheduled for 30 minutes, but it lasted 90.

"Feedback is a gift even when it's critical," Su tweeted after the call. The new performance data released in February were a punch back in a fight that's far from over.

2024 was the year of Lisa Su. She was named CEO of the year by both Time and Chief Executive magazine.

Last year, AMD outsold Intel in the data center business, overtaking its old rival on its home turf. It was the triumphant apex of Su's first decade as CEO of AMD. Revenue for the whole of 2024 was up 14% year over year, and gross profit was up 22%. Yet when Su reported the results in February, the stock went down.

As Su achieved what many thought impossible and conquered her old foe, Intel, a new one had already presented itself: Nvidia, led by CEO Jensen Huang, Su's distant cousin, who was born in the same region of Taiwan. Shareholders wouldn't let her forget her biggest rival.

Whether AMD can meet the seemingly insurmountable challenge of Nvidia's estimated 90% market share may come down to the approaches of two Taiwanese-born, US-educated, distantly related CEOs.

Su has already stated the company's goals. She's leaning into open-source software and beefing up support for large language model training and inference customers. Most importantly, she's raising the bar for AMD's software so that it can better stand up to Nvidia's β€” since Huang has long professed that software is Nvidia's secret sauce.

"We are still in the very early stages with AI, and we believe there's no one-size-fits-all approach to AI compute," an AMD spokesperson told Business Insider. AMD declined to make Su available for an interview.

BI spoke with nine people for this story, five of whom have at one point had a personal relationship with Su and three of whom worked under her at AMD. They said that whether in 2007, 2017, or 2027, the stoic, thoughtful, quietly confident executive walking the brightest stages in the tech world is exactly who she seems. Though she may never conquer Nvidia, she won't rest while she's in the number two spot. Her play involves intently listening to partners as well as critics, and it's worked before.

AMD CEO Lisa Su
AMD CEO Lisa Su's public presentation style has changed since she took over the company in 2014.

AMD

Lisa Su says, 'Why not?'

Some early indications suggested that starting at the bottom motivated Su to get to the top.

Born in Taiwan and raised in New York, Su intentionally picked the most challenging STEM field she could think of: electrical engineering. After earning her doctorate, she received multiple offers to stay in academia but decided to join Texas Instruments instead, Dimitri Antoniadis, her MIT thesis advisor, told BI. She wanted to manage people and projects, she recently told Stanford business students. After leaving TI for IBM, she was tapped to serve as a technical assistant for Lou Gerstner, IBM's chairman and CEO.

Antoniadis recalled late-night phone conversations with Su when she was at these "juncture points" in her career. She left IBM in 2007, spent four and a half years at Freescale Semiconductor, and then came to AMD in 2012.

The professor got one such call in late 2014. Su had managed AMD's various business units and operations for nearly three years, deep in the weeds of the entire company yet without the authority to set the overall direction. She called Antoniadis when she was asked to take the CEO job, which meant going after a market dominated by Intel. The original Silicon Valley icon had a market capitalization of over $150 billion and a reputation for ruthlessness. AMD's market cap was just $2 billion.

"At the time, I said, 'Lisa, are you serious? Taking on Intel?' She said, 'Why not?'" Antoniadis told BI.

Su sought multiple opinions on the big decision to head AMD. Lip-Bu Tan, a legendary semiconductor CEO turned investor who will step into the Intel CEO role on March 18, was also on the call list. Tan and Su met years earlier when she was at Freescale Semiconductor, and he was impressed from the start, he told BI.

Tan was fully aware of AMD's sad state at the time. The firm had completed two rounds of layoffs since 2011 and pulled out of the processor market. The company needed focus.

"Only the gaming business was doing well. The rest were struggling," Tan said. Despite this, he didn't hesitate to recommend the job to Su. He had just orchestrated a revival of a similar magnitude at Cadence Design Systems and knew the opportunity such a turnaround could be.

"The market value was less than $3 billion β€” you can't go wrong with that. It is so undervalued," Tan said of AMD.

Su took over AMD on October 8, 2014. A 7% staff cut followed, and Su set out to make long-term bets and win back customers. Tan said Su soon had top tech execs like Microsoft CEO Satya Nadella and Dell COO Jeff Clarke trusting her, mainly due to her hands-on style. Even for routine annual business reviews, Su knows all the numbers and listens intently to concerns, sources said.

"They love her. She is very engaged β€” very involved," Tan said.

Su has evolved her style over her 10 years as CEO of AMD. She's somewhat less stoic, makes jokes onstage, wears brighter colors, has more perfectly coiffed hair, and has come to appreciate Christian Louboutin heels. AMD declined to comment on these details.

"I am not surprised at all where she is right now. I truly expected it," Antoniadis told BI.

AMD's market cap has grown to about $160 billion as of Wednesday, much higher than the $2 billion when Su first started. This month, all AI stocks have taken a dive amid uncertainty surrounding the Trump administration's policy shifts.

Now, Su has a new miracle to perform.

Lisa Su holding a chip.
Lisa Su, CEO of AMD.

AMD

Su v. Huang

In 2018, Su sat with a handful of Wall Street analysts in a private meeting space near the Las Vegas Convention Center. The Consumer Electronics Show, one of the largest tech conferences of the year, bustled in the massive building's halls.

Su was just over three years into her job as CEO of AMD, and the company's stock hovered just above $10 per share.

The analysts in that Las Vegas conference room had a lot of advice for the then 48-year-old CEO, according to a person present, who asked not to be named since the session was private. The room was full of men eager to tell Su how to seize on her progress and take AMD to the next level. There was chatter in the nerdier accelerated computing circles that machine learning was ready to scale, and the analysts weren't sure AMD was seeing the signs.

Su took it all in and politely thanked everyone. She knew accelerated computing would change the world as early as 2017, she has since said in interviews. The graphics processing unit made that possible.

At the time, the entertainment industry used GPUs for gaming and graphics rendering. While AMD has designed this kind of hardware for two decades, Nvidia's Huang beat the entire tech industry to the punch when he identified the AI opportunity for GPUs and started building software to help it spread. Since ChatGPT's birth, Nvidia and AMD have been in an epic race, except that Nvidia had a massive head start.

Both Huang and Su are notoriously hardworking: late nights and weekends are a given.

But Huang is a showman. He dominates a stage whether it be at the front of a boardroom or a concert arena. Su is less flashy. She rarely, if ever, raises her voice, and her business strategy echoes that quiet, inexhaustible, confident consistency, sources said.

"You know she's in charge, but she's also a very quiet leader," said Jodi Shelton, cofounder and CEO of the Global Semiconductor Alliance. Shelton recalled an intimate dinner at Su's Texas home with just the CEO and her husband, Daniel Lin, where Su asked most of the questions.

"She doesn't need to interject when someone's speaking. She doesn't have to be the loudest person in the room," Shelton continued.

Onstage, Su often paces, making measured announcements. At team meetings, she drills for answers about what needs to be done next and delegates tasks, personally reviewing AMD's GPU distribution on spreadsheets, the AMD employees said.

In a world where CEOs like former Intel leader Pat Gelsinger have announced plans such as five nodes in four years and fallen short in execution, AMD has slowly marched forward. Even Nvidia's yearly cadence of new GPU generations has hit production and installation snags. Su is wary of overpromising and underexecuting, several sources said. Execution is non-negotiable.

"That's not very easy for people to do for such a long time," Lamini founder Sharon Zhou, who has committed to AMD hardware over Nvidia, told BI. "Which is why I think she presents the main threat to Nvidia. It forces Nvidia into a place where they can't make mistakes."

Huang is motivated by being so early that he can form new markets around new technologies, Nvidia executive Rev Lebaredian told BI. Su wants to meet existing demand with an unfailingly great product.

"She knows AMD products technically in and out and can hold her own discussing any product with its respective engineers," one AMD employee said. "She's pretty quiet in person, but you could tell by the way she was looking at the lab and talking to the engineers that she was proud and happy to be there."

Her relentless consistency and focus on strong, reciprocal customer relationships make her a competitor who can't be dismissed, even by Nvidia, sources said.

"She's one of the most responsive people," Zhou said. When Zhou was trying to close Series A funding for Lamini, Su offered up the entire afternoon to chat with potential investors, a day after an earnings call.

An Asian woman in a black top and light blue suit, Lisa Su, listens to a man speak next to a server while onlookers raise their phones to capture the moment.
AMD CEO Lisa Su's star has risen even as the company struggles to take market share from Nvidia. Here, she is pictured at Computex Taipei, one of the world's largest computer and technology expos, in June 2024.

AP Photo/Chiang Ying-ying

The only way is software

Many chip industry analysts agree that while AMD's hardware has caught up, it can't truly compete without better software. Nvidia's CUDA software has become the industry standard and allows engineers to program GPUs with flexibility and relative ease. AMD's software is still a work in progress, as SemiAnalysis's report detailed.

For the full 2024 fiscal year, Nvidia reported $115.2 billion in revenue in its data center segment, where most AI computing happens. AMD reported $12.6 billion for data centers in the same time period (though the reporting periods are slightly different). It's an enormous gulf that even the best of execution may never close.

Those who know Su say she will never settle for second.

"She does want to win, which doesn't mean second. It actually means first. First, you have to be second, and then you get to be first," Zhou said.

If Su has a winning strategy in mind, it's still a mystery to some AMD watchers.

In a February 5 note to investors, Bank of America analyst Vivek Arya wrote that AMD had not yet "managed to articulate" how or from where it would wrest market share from Nvidia.

"It could take much more in software, scale deployment, and system-level integration to break AMD's current less than 5% market share," Arya wrote.

Winning for Su will be about picking her fights, said Tan, the incoming Intel CEO, who is also friends with Huang. In 2024, Nvidia's R&D budget was about twice AMD's. Su still has to be discerning.

"You have to pick your best field," Tan said. "You can't do everything, like Jensen," he continued. Huang makes the menu, he said. Su can only choose a few dishes to battle over.

On the company's February earnings call, Su moved up the company's next chip launch by a few months.

AMD's fourth-quarter earnings beat expectations, yet investors balked. The Wall Street analyst consensus was that revenue was growing, though not enough came from AI.

"This is a 10-year arc. This is not a 2-year arc. So let's not think about this as what's going to happen next quarter," Su told Salesforce CEO Marc Benioff at its conference in September.

Read the original article on Business Insider

Rolling job cuts by Broadcom have slashed VMware's workforce roughly in half

Broadcom CEO Hock E. Tan speaks at White House event in 2017.
Broadcom CEO Hock E. Tan speaks at an event held by President Donald Trump.

AP

  • Broadcom has cut VMware's workforce by roughly half since acquiring the company in 2023.
  • VMware had more than 38,000 employees in early 2023. That's down to about 16,000, two sources said.
  • Wall Street loves CEO Hock Tan's M&A playbook.

Broadcom's aggressive cost-cutting strategy has at least halved VMware's workforce while pleasing Wall Street analysts.

The chip giant closed one of the biggest tech deals ever when it acquired cloud software provider VMware in late 2023. Since closing the acquisition, Broadcom CEO Hock Tan has focused on making the subsidiary more profitable.

Broadcom has been cutting jobs in the past year, including cuts to VMware's sales force in October and to its professional services team last week, according to current and former employees and LinkedIn posts. In addition, Broadcom has reduced the workforce in other business units or offices over the past few months.

VMware stated that it had more than 38,000 employees as of February 2023, according to a regulatory filing, though many employees and executives departed before the deal closed. Attrition has also contributed to the shrinking workforce.

By this January, VMware had around 16,000 employees, according to two sources. Employees in marketing, partnerships, and VMware Cloud Foundation have been let go over the past few months, according to former staff and LinkedIn posts.

Broadcom has implemented other workplace changes at VMware, such as requiring employees to return to the office. In cases where not enough employees badged into certain offices, Broadcom has shut down those locations and cut staff there, one former employee said. Broadcom did not respond to a request for comment last week.

Tan has concentrated on VMware's largest customers and switched to a subscription-based business model from a perpetual license approach. He said on Thursday's earnings call that 60% of customers had so far been converted.

The company also raised prices. VMware customers have said they've faced massive hikes due to product bundling, which means multiple products are packaged together and customers have to pay for them all.

"The VMware kicker"

Broadcom's stock has climbed about 40% in the past year. In December, Broadcom's market cap hit $1 trillion.

Analysts have largely been pleased by Broadcom's handling of the merger integration.

"The VMware kicker continues to execute, which shouldn't be a surprise, given Hock Tan's expertise on his wash, rinse, and repeat M&A playbook," said Dave Wagner, portfolio manager at Aptus Capital Advisors, ahead of Thursday's blockbuster earnings release.

William Blair analysts called VMware the "star in software," saying it was an "opportunity for Broadcom to drive sustained software growth and potentially reduce the impact of heightened customer churn going into 2027."

Following the acquisition, Broadcom made steep job cuts and consolidated teams at VMware, as BI previously reported. Broadcom had cut at least 2,000 employees around that time.

Broadcom also divested some business units, such as End-User Computing.

Have a tip? Contact Rosalie Chan via email at rmchan@businessinsider.com or Signal at rosal.13.

Contact Emma Cosgrove via email at ecosgrove@businessinsider.com or Signal at 443-333-9088.

Contact Hugh Langley via email at hlangley@businessinsider.com or Signal at 628-228-1836. Use a personal email address and a nonwork device; here's our guide to sharing information securely.

Read the original article on Business Insider

AI companies are copying each other's homework to make cheap models

Sam Altman illustration looking to the left
Sam Altman

Andrew Caballero-Reynolds/Getty Images; Jenny Chang-Rodriguez/BI

  • The price of building AI is falling to new lows.
  • New, cheaper AI development techniques have developers rejoicing, but it's not all upside.
  • As costs hit rock bottom, Big Tech foundation model builders must justify expensive offerings.

How much does it cost to start an AI company?

The answer is less and less each day as large language models are being created for smaller and smaller sums.

The cost of AI computing is falling. Plus, a technique called distillation to make decent LLMs at discount prices is spreading. This has sent a spark through parts of the AI ecosystem and a chill through others.

Distillation is an old concept gaining new significance. For most, that's good news. For a select few, it's complicated. And for the future of AI, it's important.

Distillation defined

AI developers and experts say distillation is, at its core, using one model to improve another. A larger "teacher" model is prompted to generate responses and paths of reasoning, and a smaller "student" model is trained to mimic its behavior.
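
The mechanics are straightforward to sketch. Below is a minimal, self-contained example of the classic logit-matching recipe, using toy networks and random data; the model sizes, temperature, and training loop are illustrative stand-ins, not any lab's actual setup.

```python
# Minimal sketch of logit distillation with toy MLPs. The "teacher" produces
# softened probability distributions; the "student" is trained to match them.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution

for step in range(100):
    x = torch.randn(64, 32)              # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)      # teacher supplies the soft targets
    student_logits = student(x)

    # KL divergence between softened teacher and student distributions,
    # scaled by temperature^2 as in the standard recipe
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```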

Chinese firm DeepSeek caused a stir with OpenAI-competitive models reportedly trained for around $5 million. The news sent the stock market into a panic, wiping $600 billion off Nvidia's market capitalization on fears of a downshift in chip demand. (Such a decline has yet to materialize.)

A UC Berkeley team of researchers, flying further under the radar, trained two new models for under $1,000 in computing costs each, according to research released in January.

In early February, researchers from Stanford University, The University of Washington, and the Allen Institute for AI were able to train a serviceable reasoning model for a fraction of that.

Distillation was an unlock for all of these developments.

It is a tool in developers' toolboxes, alongside fine-tuning, to improve models in the training phase, but at a much lower cost than other methods. Both techniques are used by developers to give models specific expertise or skills.

This could mean taking a generic foundation model like Meta's Llama and using another model to distill it into an expert on US tax law, for example.

It could also look like using DeepSeek's R1 reasoning model to distill Llama into a model with more reasoning capability, meaning the AI takes longer to generate an answer so that it can question its own logic and lay out the process of reaching an answer step by step.

"Perhaps the most interesting part of the R1 paper was being able to turn non-reasoning smaller models into reasoning ones via fine-tuning them with outputs from a reasoning model," wrote the analysts at Semianalysis in January.

In addition to the bargain price tag (at least for AI), DeepSeek released distilled versions of other open-source models, using the R1 reasoning model as the teacher. DeepSeek's full-sized models, along with the largest versions of Llama, are so large that only certain hardware can run them. Distillation helps with that too.

"That distilled model has a smaller footprint, fewer parameters, less memory," explained Samir Kumar, a general partner at Touring Capital. "You can run it on your phone. You could run it on edge devices," he said.

DeepSeek's breakthrough was that its distilled models didn't degrade as they got smaller, which was the expectation. In fact, they got better.

Distillation isn't new, but it has changed

The distillation technique first surfaced in a 2015 paper authored by prominent Google AI chiefs Jeff Dean and Geoffrey Hinton, and current Google DeepMind research VP Oriol Vinyals.

Vinyals recently said the paper was rejected from the prestigious NeurIPS conference because it was not deemed to have much impact on the field. A decade later, distillation is suddenly at the forefront of AI discussion.

What makes distillation so powerful now, as opposed to back then, is the number and quality of open-source models available to serve as teachers.

"I think by releasing a very capable model β€” the most capable model to date β€” in the open source with a permissible MIT license, DeepSeek is essentially eroding that competitive moat that all the big model providers have had to date, keeping their biggest models behind closed doors," Kate Soule, director of technical management for IBM's LLM Granite, said on the company's Mixture of Experts podcast in January.

How far distillation can go

Soule said Hugging Face, the internet repository for LLMs, is full of distilled versions of Meta's Llama and Alibaba's Qwen, both traditional (non-reasoning) open-source models.

Indeed, of the 1.5 million models available on Hugging Face, 30,000 of them contain the word "distill" in the name, which conventionally indicates a distilled model. But none of the distilled models have made the site's leaderboard.
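
For readers who want to reproduce that rough census, the huggingface_hub client exposes a public model search. The counts change daily, and this sketch assumes only the library's documented list_models call.

```python
# Quick way to approximate the count of "distill"-named models on the Hub.
# Results change daily; this only uses the documented public search API.
from huggingface_hub import HfApi

api = HfApi()
distilled = list(api.list_models(search="distill", limit=1000))
print(f"First {len(distilled)} models matching 'distill':")
for model in distilled[:5]:
    print(model.id)
```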

Just like shopping at the dollar store in the physical world, distillation presents some of the lowest cost-to-performance ratios on the market, but the selection is somewhat limited and there are drawbacks.

Making a model particularly good at one type of task through distillation can erode its performance in other areas.

Apple researchers attempted to create a "distillation scaling law" that can predict the performance of a distilled AI model based on factors including the size of the model being built, the size of the "teacher" model, and the amount of computing power used.

They concluded that distillation can work better than traditional supervised learning in some cases, but only when a high-quality "teacher" model is used. The teacher also needs to be larger than the model being trained, but not beyond a certain threshold: improvement stops as teacher models grow too big.

Still, the technique can, for instance, close the distance between idea and prototype for founders and generally lower the barrier to entry for building AI.

Finding a shortcut to smarter, smaller models doesn't necessarily negate the need for big, expensive foundation models, said multiple AI experts. But it does call into question the financial prospects of the companies that build those big models.

Are foundation models doomed?

"Just about every AI developer in the world today," is using DeepSeek's R-1 to distill new models, Nvidia CEO Jensen Huang said on CNBC following the company's latest quarterly earnings.

Distillation has brought opportunity, but it is poised to meet opposition due to the threat it poses to massive, expensive, proprietary models like those made by OpenAI and Anthropic.

"I think the foundation models will become more and more commoditized. There's a limit that pre-trained models can achieve, and we're getting closer and closer to that wall," Jasper Zhang, cofounder of cloud platform Hyperbolic said.

Zhang said the answer for the big names of LLMs is to create beloved products, rather than beloved models β€” perhaps lending credence to Meta's decision to make its Llama models somewhat open.

There are also more aggressive tactics foundation model companies can take, according to a Google DeepMind researcher who asked to remain anonymous to discuss other companies.

Companies with reasoning models could remove or reduce the reasoning steps or "traces" shown to the user so that they can't be used for distillation. OpenAI hides the full reasoning path in its large o1 reasoning model but has since released a smaller version, o3-mini, which does show this information.

"One of the things you're going to see over the next few months is our leading AI companies trying to prevent distillation," David Sacks, President Donald Trump's adviser for cryptocurrency and artificial intelligence policy told Fox News in January.

Still, it may be difficult to put the genie back in the bottle by tamping down distillation in the Wild West of open-source AI.

"Anyone can go to Hugging Face and find tons of data sets that were generated from GPT models, that are formatted and designed for training and likely taken without the rights to do so. This is like a secret that's not a secret that's been going on forever," Soule said on the same podcast.

Anthropic and OpenAI did not respond to requests for comment.

Read the original article on Business Insider

Nvidia investors' call gives the chip giant a chance to tell backers why they're wrong about DeepSeek's impact

24 February 2025 at 12:55
NVIDIA's CEO Jensen Huang in his signature black leather jacket
Nvidia CEO Jensen Huang has said the market's reaction to DeepSeek was a mistake.

EDGAR SU / Reuters

  • Nvidia finally has a chance to tell investors why their violent reaction to DeepSeek was a mistake.
  • The chip giant's Wednesday earnings are the first since DeepSeek's AI sparked market panic.
  • Key areas to watch are data center revenue, Blackwell's ramp-up, inference demand, and policy.

When Nvidia reports earnings on Wednesday, the chip giant will have the chance to tell investors why it thinks their intense reaction to the rise of DeepSeek was a mistake, or change the subject entirely.

Heading into 2025, Nvidia's rule looked unassailable as Elon Musk, Mark Zuckerberg, and others lined up for its chips. That is until Chinese startup DeepSeek released R1, an open-source reasoning model with benchmark results to rival OpenAI's o1 model.

Critically, R1 was reportedly produced with fewer and less powerful chips than o1.

"DeepSeek's remarkable feat has shaken the industry's assumptions about how much capital or GPU chips a company needs to stay ahead of the competition," Barclays analysts wrote last month.

Nvidia has largely recovered from the market's violent reaction to DeepSeek, in which the chip firm lost $600 billion in market capitalization in one day, the biggest single-day drop in US market history. Still, CEO Jensen Huang will need to show investors the party is nowhere near over and that the promise of AI isn't overhyped. Huang previewed his argument at a virtual event broadcast Thursday, where he said investors had misinterpreted the signals of DeepSeek.

As Nvidia prepares to address investors officially for the first time since the DeepSeek saga, here's what to look out for in its earnings.

Data center revenue

Sam Altman, the co-founder and CEO of OpenAI.
OpenAI CEO Sam Altman is working with Nvidia on Stargate.

Sean Gallup/Getty Images

Analysts predict Nvidia's revenue, especially in its all-important data center business, will keep rising β€” bolstered by already-announced forthcoming data center buildouts.

From Stargate, with $500 billion in expected spending, to Meta forecasting an additional $65 billion in data center investment this year, to Amazon earlier this month forecasting $100 billion in spending, largely on computing capacity, Nvidia's customers are still lining up.

It will offer Nvidia fresh evidence to present to investors concerned that DeepSeek's claim to use chips more efficiently, a key driver in lowering costs, would hurt demand.

"Despite DeepSeek's supposed 'revolutionary' optimizations, there is no change thus far to spending intentions at NVDA large customers including Microsoft and Meta," Bank of America analyst Vivek Arya wrote in a note to investors in early February.

Model improvements, paired with big data center buildouts, are another favorable evidence point for Nvidia.

Grok 3, Musk's latest model from xAI, is receiving praise for its performance. Musk's firms also recently collaborated on their second data center, with roughly 12,000 Nvidia GPUs, BI exclusively reported.

Musk has been aggressively adding to the fleet of GPU-packed data centers supporting Grok, suggesting a link between progress and infrastructure.

Model builders and hyperscalers still have their eyes on artificial general intelligence, and cheaper, highly functional models like DeepSeek won't impact that pursuit, Morgan Stanley analysts told investors in a note from last week.

Blackwell ramp-up

Jensen Huang onstage showing Nvidia hardware.
Nvidia CEO Jensen Huang.

Justin Sullivan/Getty

Nvidia's latest and most powerful chip series, Blackwell, has struggled with a slow rollout due to manufacturing and overheating issues. Analysts, however, are expecting the company to report a strong ramp-up.

"Demand for Blackwell is very strong and will outstrip supply for several quarters," Synovus senior portfolio manager Daniel Morgan said in an investor note last week.

UBS's Timothy Arcuri wrote that, after much consternation, investor fears of a botched rollout are easing, and strong sales numbers could put them fully at ease.

UBS analysts also said the fourth quarter was the last in which Blackwell chips won't make up the majority of Nvidia's GPU sales. Investors will likely favor that shift because Blackwell brings with it higher profit margins.

Nvidia's buzzy GTC conference will take place in San Jose, CA, next month, marking the first anniversary of Blackwell's debut.

Inference and applications

Further growth in inference demand would also be a proof point for Huang's theory that investors got the DeepSeek rout wrong. Demand for inference, the process of putting trained models to work to produce answers, increases when consumers and businesses find value in AI tools.

Investors will likely want to see the share of AI workloads continue to shift to inference, which also requires GPUs to run. On the company's last call in November, Huang repeatedly said that inference across Nvidia's platforms was growing.

Growth in the software layers of Nvidia's tech stack would be a good sign, too. This would suggest maturity in AI products and lend strength to a part of its business that's potentially even more difficult to compete with than the chips themselves.

"What Nvidia talks about on its long-term moats and its possible deployment on the AI application side probably matters more this time," Morgan Stanley analysts wrote in a note to investors Friday.

Worldwide wild card

Donald Trump
President Donald Trump.

Chip Somodevilla/Getty Images

Investors will also be looking for any signals from Nvidia about the company's approach to China as President Donald Trump threatens to upend business relations with the country.

Just before the end of his term, former President Joe Biden initiated new regulations on the export of high-powered chips like Nvidia's GPUs, which are in the midst of a 120-day comment period. Many policy analysts expect Trump to allow the rules to take effect as they align with his "America First" agenda, though Trump has yet to directly address them.

As Trump said last month, AI leadership is critical to ensuring "economic and national security."

Last month, Trump also threatened to impose tariffs on Taiwan, home of Nvidia's chip manufacturing partner, TSMC. Tariffs could lead to increased costs for Nvidia. Huang met with the president at the White House last month, but neither party provided details of the discussion.

Although Nvidia's share price has recovered much of its DeepSeek-induced losses, the $3 trillion juggernaut faces various potential headwinds. Huang's job Wednesday will be to reassure investors that those headwinds will be mild and reaffirm that Nvidia remains fundamental to the AI story.

Read the original article on Business Insider

Nvidia CEO Jensen Huang directly addresses the DeepSeek stock sell-off, saying investors got it wrong

20 February 2025 at 16:53
Jensen Huang, wearing a black shirt and leather jacket, sits in a gray chair and gestures while talking.
Nvidia CEO Jensen Huang discussed the investor reaction to DeepSeek at a virtual vendor event.

Courtesy of DDN

  • Nvidia CEO Jensen Huang said investors misinterpreted DeepSeek's AI advancements.
  • DeepSeek's large language models were built with weaker chips, rattling markets in January.
  • In a pre-taped interview released Thursday, Huang emphasized the importance of AI post-training.

Investors took away the wrong message from DeepSeek's advancements in AI, Nvidia CEO Jensen Huang said at a virtual event aired Thursday.

DeepSeek, a Chinese AI firm owned by the hedge fund High-Flyer, released a competitive, open-source reasoning model named R1 in January. The firm said the large language model underpinning R1 was built with weaker chips and a fraction of the funding of the predominant, Western-made AI models.

Investors reacted to this news by selling off Nvidia stock, resulting in a $600 billion loss in market capitalization. Huang himself temporarily lost nearly 20% of his net worth in the rout. The stock has since recovered much of its lost value.

Huang said in Thursday's pre-recorded interview, which was produced by Nvidia's partner DDN and part of an event debuting DDN's new software platform, Infinia, that the dramatic market response stemmed from investors' misinterpretation.

Investors have raised questions as to whether trillions in spending on AI infrastructure by Big Tech firms is needed if less computing power is required to train models. Huang said the industry still needed computing power for post-training methods, which allow AI models to draw conclusions or make predictions after training.

As post-training methods grow and diversify, the need for the computing power Nvidia chips provide will also grow, he continued.

"From an investor perspective, there was a mental model that the world was pre-training and then inference. And inference was, you ask an AI a question, and it instantly gives you an answer," he said at Thursday's event, adding, "I don't know whose fault it is, but obviously that paradigm is wrong."

Pre-training is still important, Huang said, but post-training is the "most important part of intelligence" and "where you learn to solve problems."

DeepSeek's innovations energize the AI world, he said.

"It is so incredibly exciting. The energy around the world as a result of R1 becoming open-sourced β€” incredible," Huang said.

Nvidia spokespeople have addressed the market reaction with written statements to a similar effect, though Huang had yet to make public comments on the topic until Thursday's event.

Huang has spent months defending against the growing concern that model scaling is in trouble. Even before DeepSeek burst into the public consciousness in January, reports that model improvements at OpenAI were slowing down roused suspicions that the AI boom might not deliver on its promise, and that Nvidia, therefore, wouldn't continue to cash in at the same rate.

In November, Huang stressed that scaling was alive and well and that it had simply shifted from training to inference. Huang also said Thursday that post-training methods were "really quite intense" and that models would keep improving with new reasoning methods.

Huang's DeepSeek comments may serve as a preview for Nvidia's first earnings call of 2025, scheduled for February 26. DeepSeek has become a popular topic of discussion on earnings calls for companies across the tech spectrum, from Airbnb to Palantir.

Nvidia's rival AMD was asked the question earlier this month, and its CEO, Lisa Su, said DeepSeek was driving innovation that's "good for AI adoption."

Read the original article on Business Insider

Elon Musk quietly built a second mega-data center for xAI in Atlanta with $700 million worth of chips and cables

20 February 2025 at 09:41
The xAI and Grok logos are seen in an illustration photo taken on November 5, 2023, in Warsaw, Poland.
Elon Musk's xAI introduced Grok, its conversational AI, which it says can match GPT-3.5 in performance.

Getty Images

  • xAI plans to operate a large data center to support X in Atlanta.
  • The data center will have about 12,000 GPUs and $700 million worth of equipment.
  • xAI set up a massive data center in Memphis last year.

xAI has been quietly setting up a data center in Atlanta, expanding its footprint beyond the massive data center in Memphis.

Elon Musk's AI startup plans to operate a large data center with X. The two companies are combining hardware to total roughly 12,000 graphics processing units, the Nvidia-designed chips used for most AI computation, according to a Business Insider tally of the equipment listed in the companies' agreement with Develop Fulton, one of Atlanta's economic development agencies.

In December, X and xAI signed similar agreements with Develop Fulton. Under the agreements, Develop Fulton orchestrated a municipal bond process to finance the $700 million in chips, cables, and other equipment going into the single facility. Fortune initially reported on the Atlanta data center. The size and scale of the data center have not been previously reported.

Representatives for X and xAI did not respond to a request for comment.

Inside the Fulton County data center

The Atlanta data center has sizable computing power, according to a data center solutions architect and AI hardware expert, who asked to remain anonymous in order to comment on the documents. It's comparable to a data center a hyperscaler like Google or Amazon might set up.

X representatives described it as an "exascale" data center capable of computing "trillion-parameter AI." But the Georgia facility pales in comparison to the reported capacity of xAI's Memphis project, nicknamed Colossus, which Musk has called the largest data center in the world.

The Georgia facility will house an estimated 12,448 Nvidia GPUs. The vast majority of these are Hopper generation H100 GPUs, which cost between $277,000 and $500,000 for each rack of eight chips, according to the documents.

Roughly 3% of the chips are Nvidia's less-powerful A100 GPUs, which cost $147,000 for an equivalent configuration of eight chips. X is contributing all of the A100s, along with 11,000 H100s.
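
A back-of-the-envelope check on the filings' numbers is shown below, using only the figures cited above; the midpoint rack pricing is our own assumption.

```python
# Rough cost check on the Develop Fulton filings. Chip counts and the price
# ranges come from the documents; the midpoint pricing is our assumption.
total_gpus = 12_448
a100_share = 0.03
a100s = round(total_gpus * a100_share)        # roughly 373 chips
h100s = total_gpus - a100s

h100_rack_cost = (277_000 + 500_000) / 2      # midpoint of the filed range
a100_rack_cost = 147_000                      # per rack of eight chips

estimated = (h100s / 8) * h100_rack_cost + (a100s / 8) * a100_rack_cost
print(f"Estimated GPU hardware cost: ${estimated / 1e6:,.0f}M")
# ~$593M: in the ballpark of the $700M total once cables and networking
# equipment are added.
```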

Neither of these chip designs requires liquid cooling, which has been a point of tension for Musk's companies in Memphis. When operating at full capacity, Colossus is expected to become among the largest consumers of water in the city.

In addition to H100 chips, xAI is contributing Mellanox switches and optics, high-bandwidth networking equipment, also purchased from Nvidia, that helps the chips work together faster. Documents submitted to Develop Fulton by X indicate that the facility will be used to "develop and train artificial intelligence products."

Of the $700 million in accelerated computing hardware going into the facility, $442 million is allocated to X, and $258 million is allocated to xAI. The two companies will receive tax abatements that are estimated to be worth about $10 million over the course of ten years, to be split in proportion with their hardware investments, according to a representative at Develop Fulton. Kwanza Hall, chairman of the board at Develop Fulton, said the organization estimates the project will have an overall economic impact of more than $241 million.

The data center architect estimated the Atlanta facility would require 20 megawatts of total power, which it could realistically get from the power grid. xAI has requested 150 megawatts from the Tennessee Valley Authority for the Memphis facility, a representative for Memphis Light, Gas and Water said during a city council meeting in January.
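
The 20-megawatt figure can be sanity-checked with rough arithmetic; the per-GPU wattage and the overhead multiplier below are our assumptions, not numbers from the filings.

```python
# Rough sanity check on the architect's 20 MW estimate. The ~700 W per-GPU
# draw and the overhead multiplier are assumptions, not filed figures.
gpus = 12_448
watts_per_gpu = 700           # typical H100 SXM board power
overhead = 2.0                # CPUs, networking, cooling (PUE-style multiplier)

total_mw = gpus * watts_per_gpu * overhead / 1e6
print(f"~{total_mw:.0f} MW")  # ~17 MW, close to the 20 MW cited above
```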

X and xAI's partnership

The Atlanta facility is an example of Musk ostensibly pooling his resources to benefit both X and xAI. According to the records, X contributed 90% of the hardware for the data center, and xAI 10%.

xAI launched its chatbot, Grok, on X's platform in November 2023, but has since spun it off as an additional stand-alone app. On Monday, xAI released the latest version of the chatbot, and Musk said it has "more than 10 times" the compute power of its predecessor.

The equipment will be used to train large language models and semantic search products for the X platform, according to the documents. X has about 16 employees in the area, based on a review of LinkedIn profiles. xAI has one worker stationed at the Georgia facility and two additional employees listed as "X Corp Partner," the company's internal org chart shows. The deal with Develop Fulton states that 24 jobs will be maintained at the facility and none will be added.

Musk is trying to position xAI as a major competitor to Big Tech giants like OpenAI and Google, even drawing some talent from Tesla. The company built its Memphis data center in just 122 days, according to Nvidia, record time for a data center of its size. xAI has also brought in hundreds of data annotators to train its chatbot over the past year with an eye to hiring thousands in the months to come, BI previously reported.

In February, Musk and a group of investors submitted a $97.4 billion bid to buy the nonprofit that controls OpenAI, but the billionaire later said he would withdraw the bid if OpenAI remained a nonprofit entity. In response to Musk's initial offer, OpenAI CEO Sam Altman told reporters that the company is "not for sale."

"He obviously is a competitor," Altman said of the bid. "He's working hard, and he's raised a lot of money for xAI and they're trying to compete with us from a technological perspective, from getting the product into the market, and I wish he would just compete by building a better product."

Do you work for xAI or one of Musk's companies? Reach out to the reporter via a nonwork email and device at gkay@businessinsider.com or through the encrypted messaging platform Signal at 248-894-6012.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

The big switch to DeepSeek is hitting a snag

6 February 2025 at 02:00
A backlit figure wearing glasses looks at a phone in front of the whale logo of DeepSeek
A woman holds a cell phone in front of a computer screen displaying the DeepSeek logo, on January 28, 2025, in Edmonton, Canada.

Artur Widak/NurPhoto

  • AI startups are clamoring for consistent, secure access to DeepSeek's large language models.
  • Cloud providers are having trouble offering it at usable speeds, and DeepSeek's own API is hampered.
  • The troubles are delaying the switch to the low-cost AI that rocked markets last week.

DeepSeek may have burst into the mainstream with a bang last week, but US-based AI businesses trying to use the Chinese company's AI models are having a host of troubles.

"We're on our seventh provider," Neal Shah, CEO of Counterforce Health, told Business Insider.

Counterforce, like many startups, accesses AI models through APIs provided by cloud companies. These APIs charge by the token, the unit of measure for inputs and outputs of large language models. This allows costs to scale with usage when companies are young and can't afford to pay for expensive, dedicated computing capacity they might not fully use.
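
To illustrate why that pricing model suits young startups, here is a trivial cost sketch; the per-token prices are hypothetical placeholders, not any provider's actual rates.

```python
# Illustrative sketch of per-token API pricing. The prices are hypothetical
# placeholders, not any provider's actual rates.
PRICE_PER_MILLION_INPUT = 0.50    # dollars, hypothetical
PRICE_PER_MILLION_OUTPUT = 2.00   # dollars, hypothetical

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1e6) * PRICE_PER_MILLION_INPUT + \
           (output_tokens / 1e6) * PRICE_PER_MILLION_OUTPUT

# A pilot with 2,000 insurance-appeal letters, ~4k tokens in / 1k out each:
print(f"${monthly_cost(2_000 * 4_000, 2_000 * 1_000):.2f}")  # $8.00
```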

Right now, the company's service, which uses AI to generate responses to insurance claim denials, is free for individuals and in pilot tests with healthcare providers, so getting costs down as far as possible is paramount. DeepSeek's open model was a game-changer.

Since late January, Shah's team has tried and struggled with six different API providers. The seventh, Fireworks AI, has been just consistent enough, Shah said. The others were too slow or unreliable.

Artificial Analysis, a website that tracks the availability and performance of AI models across cloud providers, showed seven clouds were running DeepSeek models on Wednesday. Most were running at one-third of the speed of DeepSeek's own API, except for Fireworks AI, which is about half the speed of the Chinese service.

Many businesses are concerned about sharing data with a Chinese API and prefer to use it through a US provider. But many API providers are struggling to offer consistent access to the full DeepSeek models at fast enough speeds for them to be useful.

The providers measured by Artificial Analysis batch many customers' requests together for AI inference, which improves prices and uses computing resources more efficiently. Companies with dedicated computing capacity, especially Nvidia's H200 chips, likely won't struggle. And those willing to pay hyperscaler cloud prices may find DeepSeek reliable and easier to get.

DeepSeek's technology, which rocked markets precisely because it was cheaper to build and much cheaper to run than Western alternatives, was touted as a booster pack and a leveler for the entire AI startup ecosystem. A few weeks into what was anticipated as a mass conversion, that shift isn't as easy as it may have seemed.

DeepSeek did not respond to BI's request for comment.

DeepSeek at speed is hard to find

Theo Browne would like to use DeepSeek, but he can't find a good source. Through his company Ping, Browne makes AI tools for software developers.

He started testing DeepSeek's models in December, when the company released V3, and found that he could get comparable or better results for one-fifteenth the price of proprietary models like Anthropic's Claude.

When the rest of the world caught wind in mid-January, options for accessing DeepSeek became inconsistent.

"Most companies are offering a really bad experience right now. Browne told BI. "It's taking 100 times longer to generate a response than any traditional model provider," he said.

Browne went straight to DeepSeek's API instead of using a US-based cloud, which wouldn't be an option for a more security-concerned company.
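
DeepSeek documents its API as OpenAI-compatible, so switching between it and a US-hosted provider is often little more than a base-URL change. A minimal sketch, assuming that interface:

```python
# Minimal sketch assuming DeepSeek's documented OpenAI-compatible API.
# The API key is a placeholder; swapping base_url changes the provider.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",   # swap to a US host to change provider
    api_key="YOUR_KEY",                    # placeholder
)
response = client.chat.completions.create(
    model="deepseek-reasoner",             # R1, per DeepSeek's docs
    messages=[{"role": "user", "content": "Summarize this denial letter..."}],
)
print(response.choices[0].message.content)
```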

But then DeepSeek's China-hosted API went down on January 26 and has yet to be restored to full function. The company blamed a malicious attack and has been working to resolve it.

Attack aside, the slow and spotty service could also stem from clouds not having powerful enough hardware to run the large model; using more, weaker hardware further increases the complexity and slows speed. The immense uptick in demand could impact speed and reliability too.

Baseten, a company that provides mostly dedicated AI computing capacity to clients, has been working with DeepSeek and an outside research lab for months to get the model running well. CEO Tuhin Srivastava told BI that Baseten had the model running faster than DeepSeek's API before the attack.

Several platforms are also taking advantage of DeepSeek's technical prowess by running smaller versions or using DeepSeek's R1 reasoning model to "distill" other open-source models like Meta's Llama. That's what Groq, an aspiring Nvidia competitor and inference provider, is doing. The company signed up 15,000 new users within the first 24 hours of offering the hybrid model and more than 37,000 organizations have used the model so far, Chief Technology Evangelist Mark Heaps told BI.

Unseen risks

For businesses that can get access to high-speed DeepSeek models, there are other reasons to hesitate.

Pukar Hamal, CEO of the software security startup Security Pal, is apprehensive about the security of Chinese AI models and said he's concerned about using DeepSeek's models for business, even if they're run locally on-premises or via a US-based API.

"I run a security company so I have to be super paranoid," Hamal told BI. A cheap Chinese model may be an attractive option for startups looking to get through the early years and scale. But if they want to sell whatever they're building to a large enterprise customer a Chinese model is going to be a liability, he said.

"The moment a startup wants to sell to an enterprise, an enterprise wants to know what your exact data architecture system looks like. If they see you're heavily relying on a Chinese-made LLM, ain't no way you're gonna be able to sell it," Hamal said.

He's convinced the DeepSeek moment was hype.

"I think we'll effectively stop talking about it in a couple of weeks," he said.

But for a lot of companies, the low cost is irresistible and the security concern is minimal, at least in the early stages of operation.

Shah, for one, is anonymizing user information before his software calls any model so that patients' identities remain secure.

"Frankly, we don't even fully trust Anthropic and other models. You don't really know where the data is going," Shah said.

DeepSeek's price is irresistible

Counterforce is a somewhat lucky fit for DeepSeek while it is in its awkward toddler phase. The startup can put a relatively large amount of data into the model and isn't too worried about output speed since patients are happy to wait a few minutes for a letter that could save them hundreds of dollars.

Shah is also developing an AI-enabled tool that will call insurance companies on behalf of patients. That means integrating language, voice, and listening models at the speed of conversation. For that to work and be cost-effective, DeepSeek's availability and speed need to improve.

Several cloud providers told BI they are actively working on it and developers have not stopped clamoring, said Jasper Zhang, cofounder and CEO of cloud service Hyperbolic.

"After we launched the new DeepSeek model, we saw inference users increase by 150%," Zhang said.

Fireworks, one of the few cloud services to consistently provide decent performance, said January's new users increased 400% month over month.

Together AI cofounder and CEO Vipul Ved Prakash told BI the company is working on a fix that may improve speed this week.

Zhang is on the case too. His goal is to democratize access to AI so that any startup or individual can build with it. He said open-source models are quickly catching up to proprietary ones.

"R1 is a real killer," Zhang said. Still, DeepSeek's teething troubles leave a window for others to enter and the longer DeepSeek is hard to access, the higher the chance the next big open model could come to take its place.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

DeepSeek is driving demand for Nvidia's H200 chips, some cloud firms say

31 January 2025 at 08:56
Nvidia CEO Jensen Huang on stage in San Jose, California.
Jensen Huang presenting at a Nvidia event in San Jose in March.

Justin Sullivan/Getty Images

  • Cloud and inference providers see rising demand for Nvidia H200 chips due to DeepSeek's AI models.
  • DeepSeek's open-source models require powerful hardware to run the full model for inference.
  • The trend runs counter to the Nvidia sell-off following growing awareness of DeepSeek.

Some cloud providers are experiencing a notable uptick in demand for Nvidia's H200 chips after Chinese AI company DeepSeek burst into the race for the winning foundation model this month.

Though the stock market caught wind of the powerful yet efficient large language model on Monday, sending Nvidia's stock down 16%, DeepSeek has been on the radar of AI researchers and developers since it released its first model, V2, in May 2024.

But the performance of V3, released in December, is what made AI developers sit up and take notice. When R1, the company's reasoning model, which competes with OpenAI's o1, was released in early January, demand for Nvidia's H200s started climbing.

"The launch of DeepSeek R1 has significantly accelerated demand for H200. We've seen such strong interest that enterprises are pre-purchasing large blocks of Lambda's H200 capacity, even before public availability," said Robert Brooks, founding team member and vice president of revenue at cloud provider Lambda.

DeepSeek's models are open source, which means users pay very little to use them. However, they still need hardware or a cloud computing service to use them at scale.

Business Insider spoke with 10 cloud service and AI inference providers. Five reported a rapid increase in demand for Nvidia's H200 graphics processing units this month.

Amazon Web Services and CoreWeave declined to comment. Oracle, Google, and Microsoft did not respond to requests for comment.

This week, AWS, Microsoft, Google, and Nvidia have made DeepSeek models available on their various cloud and AI-developer platforms, or provided instructions for users to do so themselves.

Nvidia declined to comment, citing a quiet period before its February 26 earnings release.

AI cloud offerings have exploded in the last two years, creating a slew of options beyond mainstays of cloud computing like Microsoft Azure and Amazon Web Services.

The demand has come from a range of customers, from startups and individual researchers to massive multinational firms.

"We've heard from half a dozen of the 50 largest companies in the world. I'm really not exaggerating," Tuhin Srivastava, cofounder of inference provider Baseten, told BI.

On Friday, semiconductor industry analysts at SemiAnalysis reported "tangible effects" on pricing for H100 and H200 capacity in the market stemming from DeepSeek.

Total sales of Nvidia H200 GPUs have reached "double-digit billions," CFO Colette Kress said on the company's November earnings call.

'Exponential demand' for Nvidia H200s

Karl Mozurkewich and his team at cloud provider Valdi saw H200 demand ramp up throughout January, and at first they didn't know why.

The Valdi team doesn't own chips; it acquires capacity from existing data centers and sells it to customers. The company doesn't know every use case for each chip it makes accessible, but it polled several H200 customers, and all of them wanted the chips to run DeepSeek.

"Suddenly, R1 got everybody's attention β€” it caught fire β€” and then it kind of went exponential," Mozurkewich said.

American companies are eager to take advantage of DeepSeek's model performance and reasoning innovations, but most are not keen to share their data with a Chinese firm. That means they can either use an API offered by a US firm or run the model on their own hardware.

Since the model is open source, it can be downloaded and run locally without sharing data with DeepSeek.

For Valdi, the majority of its H200 demand is coming from startups, Mozurkewich said.

"It appears the market is reacting to DeepSeek by grabbing the best GPUs available for testing as quickly as possible," he said. "This makes sense, as most companies' current GPUs are likely to continue to work on ongoing tasks they've been allocated to," Mozurkewich continued.

Though many companies are still testing and experimenting, the Valdi team is seeing longer-term requests for additional hardware, suggesting an uptick in demand that could last beyond DeepSeek's initial hype cycle.

Chip-light, compute-heavy

DeepSeek's models were trained with less powerful hardware than US models, according to the company's research paper. This efficiency has spooked the stock market.

Players like Meta, OpenAI, and Microsoft have invested billions in AI infrastructure, with billions more on the way. Investors are concerned about whether all that capacity will be needed. DeepSeek's models were built with fewer, less powerful chips (though the exact number is hotly debated).

Training chips aside, using the models for inference is a compute-intensive task, cloud providers say.

"It is not light and easy to run," Srivastava said.

The size of a model is measured in "parameters." More parameters require more compute. The most powerful versions of DeepSeek's models have 671 billion parameters. That's less than OpenAI's GPT-4, which reportedly has 1.76 trillion, but more than Meta's largest Llama model, which has 405 billion.

Srivastava said most firms were avoiding the 405-billion-parameter Llama model if they could help it, since the smaller versions were much easier to run. DeepSeek offers smaller versions too, and even its most powerful version is cheaper to run, which has stoked excitement among firms that want to use the full model, the cloud providers said.

H200 chips are the only widely available Nvidia chips that can run DeepSeek's V3 model in its full form on a single node (eight chips designed to work together).

The model can also be spread across a larger number of less powerful GPUs, but that requires more expertise and leaves room for error. Adding that complexity almost inevitably slows performance, Srivastava said.

Nvidia's Blackwell chips will also be able to handle the full V3 model in one node, but they only began shipping this year.
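For a rough sense of why the H200 clears that bar, consider a back-of-the-envelope memory check. This is a minimal sketch, not a deployment guide: the per-GPU memory figures are published specs, while the FP8 weight format (one byte per parameter) and the 20% headroom for caches and activations are illustrative assumptions.

```python
# Back-of-the-envelope: do a model's weights fit in one node's combined GPU memory?
# Per-GPU memory comes from public specs; FP8 weights (1 byte per parameter) and
# 20% headroom for KV cache and activations are illustrative assumptions.

GPU_MEMORY_GB = {"H100": 80, "H200": 141, "B200": 192}

def fits_on_node(params_billions, gpu, gpus_per_node=8,
                 bytes_per_param=1.0, overhead=0.20):
    needed_gb = params_billions * bytes_per_param * (1 + overhead)
    available_gb = GPU_MEMORY_GB[gpu] * gpus_per_node
    return needed_gb, available_gb, needed_gb <= available_gb

for gpu in ("H100", "H200", "B200"):
    needed, available, ok = fits_on_node(671, gpu)
    print(f"{gpu} node: need ~{needed:.0f} GB, have {available} GB -> "
          f"{'fits' if ok else 'does not fit'}")

# H100 node: need ~805 GB, have 640 GB -> does not fit
# H200 node: need ~805 GB, have 1128 GB -> fits
# B200 node: need ~805 GB, have 1536 GB -> fits
```

On this arithmetic, an eight-GPU H100 node falls short, which is why the H200, with roughly 141 GB of memory per chip, is the first widely available option that fits the full model in a single node.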

With demand spiking, finding enough chips to run V3 or R1 at high speed is tough if that capacity hasn't already been allocated.

Baseten doesn't own GPUs; it buys capacity from data centers that do and then tinkers with all the software connections to make models run smoothly. Some of its customers have their own hardware in their own data centers but still hire Baseten to optimize model performance.

Its customers especially value inference speed: the speed that enables, for example, an AI-generated voice to converse in real time. DeepSeek's capability at the open-source price is a game changer for its customers, according to Srivastava.

"It does feel like this is an inflection point," he said.

Have a tip or an insight to share? Contact Emma at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088


The tech industry is in a frenzy over DeepSeek. Here's who could win and lose from China's AI progress.

A computer chip with the DeepSeek logo.
DeepSeek has sent Silicon Valley and the tech industry into a frenzy.

Tyler Le/Business Insider

  • DeepSeek, a Chinese open-source AI firm, is taking over the discussion in tech circles.
  • Tech stocks, especially Nvidia, plunged Monday.
  • Companies leading the AI boom could be in for a reset as DeepSeek upends the status quo.

DeepSeek, a Chinese company with AI models that compete with OpenAI's at a fraction of the cost, is generating almost as many takes as tokens.

Across Silicon Valley, executives, investors, and employees debated the implications of such efficient models. Some called into question the trillions of dollars being spent on AI infrastructure since DeepSeek says its models were trained for a relative pittance.

"This is insane!!!!" Aravind Srinivas, CEO of startup Perplexity AI, wrote in response to a post on X noting that DeepSeek models are cheaper and better than some of OpenAI's latest offerings.

The takes on DeepSeek's implications are coming fast and hot. Here are seven of the most common.

Take 1: Generative AI adoption will explode

"Jevons paradox strikes again!" Microsoft CEO Satya Nadella posted on X Monday morning. "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."

The idea, based on a 19th-century economic principle, is that as a technology gets smarter, cheaper, or both, demand for it grows more than proportionally. In this case, the barrier to entry for companies looking to dip their toe into AI has been high. Cheaper tools could encourage more experimentation and advance the technology faster.

"Similar to Llama, it lowers the barriers to adoption, enabling more businesses to accelerate AI use cases and move them into production." Umesh Padval, managing director at Thomvest Ventures told Business Insider.

That said, even if AI grows faster than ever, that doesn't necessarily mean the trillions of dollars that have flooded the space will pay off.

Take 2: DeepSeek broke the prevailing wisdom about the cost of AI

"DeepSeek seems to have broken the assumption that you need a lot of capital to train cutting-edge models," Debarghya Das, an investor at Menlo Ventures told BI.

The price of DeepSeek's open-source model is competitive: 20 to 40 times cheaper to use than comparable models from OpenAI, according to Bernstein analysts.

The exact cost of building DeepSeek's models is hotly debated. The research paper from DeepSeek explaining its V3 model lists a training cost of $5.6 million, a strikingly low number for other providers of foundation models.

However, the same paper says that the "aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data." So the $5.6 million figure is only part of the equation.
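For context, the headline number is straightforward arithmetic over figures DeepSeek reports in the V3 paper: roughly 2.788 million H800 GPU-hours, priced at an assumed rental rate of $2 per GPU-hour.

```python
# A sketch of the V3 paper's cost arithmetic: GPU-hours times an assumed rental rate.
gpu_hours = 2_788_000  # total H800 GPU-hours the paper reports for official training
rate_usd = 2.0         # the paper's assumed rental price per GPU-hour
print(f"${gpu_hours * rate_usd / 1e6:.2f}M")  # -> $5.58M, the ~$5.6M headline figure
```

Everything outside that tally, including prior research, failed runs, and the company's own hardware, sits outside the $5.6 million.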

The tech ecosystem is also reacting strongly to the implication that DeepSeek's state-of-the-art model architecture will be cheaper to run.

"This breakthrough slashes computational demands, enabling lower fees β€” and putting pressure on industry titans like Microsoft and Google to justify their premium pricing," Kenneth Lamont, principal at Morningstar, wrote in a note on Monday.

He went on to remind investors that with early-stage technology, assuming the winners are set is folly.

"Mega-trends rarely unfold as expected, and today's dominant players might not be tomorrow's winners," Lamont wrote.

Dmitry Shevelenko, the chief business officer at Perplexity, a big consumer of compute and existing models, concurred that Big Tech players would need to rethink their numbers.

"It certainly challenges the margin structure that maybe they were selling to investors," Shevelenko told BI. "But in terms of accelerating the development of these technologies, this is a good thing." Perplexity has added DeepSeek's models to its platform.

Take 3: Companies are weighing a switch to DeepSeek

On Monday, several platforms that provide AI models for businesses, Groq and Liquid.AI to name two, added DeepSeek's models to their offerings.

On Amazon's internal Slack, one person posted a meme suggesting that developers might drop Anthropic's Claude AI model in favor of DeepSeek's offerings. The post included an image of the Claude model crossed out.

"Friendship ended with Claude. Now DeepSeek is my best friend." the person wrote, according to a screenshot of the post seen by BI, which got more than 60 emoji reactions from colleagues.

Amazon has invested billions of dollars in Anthropic. The cloud giant also provides access to Claude models via its Amazon Web Services platform. And some AWS customers are asking for DeepSeek, BI has exclusively reported.

"We are always listening to customers to bring the latest emerging and popular models to AWS," an Amazon spokesperson said, while noting that customers can access some DeepSeek-related products on AWS right now through tools such as Bedrock.

"We expect to see many more models like this β€” both large and small, proprietary and open-source β€” excel at different tasks," the Amazon spokesperson added. "This is why the majority of Amazon Bedrock customers use multiple models to meet their unique needs and why we remain focused on providing our customers with choice β€” so they can easily experiment and integrate the best models for their specific needs into their applications."

Switching costs for companies creating their own products on top of foundation models are relatively low, which is generating a lot of questions about whether DeepSeek will overtake models from Meta, Anthropic, or OpenAI in popularity with enterprises. (DeepSeek's app is already No. 1 in Apple's App Store.)

DeepSeek, however, is owned by Chinese hedge fund High-Flyer, and the same security concerns haunting TikTok may eventually apply to DeepSeek.

"While open-source models like DeepSeek present exciting opportunities, enterprisesβ€”especially in regulated industriesβ€”may hesitate to adopt Chinese-origin models due to concerns about training data transparency, privacy, and security," Padval said.

Security concerns aside, software companies that sell model APIs to businesses kept adding DeepSeek to their lineups throughout Monday.

Take 4: Infrastructure players could take a hit

Infrastructure-as-a-service companies such as Oracle, DigitalOcean, and Microsoft could be in a precarious position should more efficient AI models rule in the future.

"The sheer efficiency of DeepSeek's pre and post training framework (if true) raises the question as to whether or not global hyperscalers and governments, that have and intend to continue to invest significant capex dollars into AI infrastructure, may pause to consider the innovative methodologies that have come to light with DeepSeek's research," wrote Stifel analysts.

If the same quantity of work requires less compute, those selling only compute could suffer, Barclays analysts wrote.

"With the increased uncertainty, we could see share price pressure amongst all three," according to the analysts.

Microsoft and DigitalOcean declined to comment. Oracle did not respond to a request for comment in time for publication.

Take 5: Scaling isn't dead, it's just moved

For months, AI luminaries, including Nvidia CEO Jensen Huang, have been predicting a big shift in AI from a focus on training to a focus on inference. Training is the process by which models are created, while inference is the type of computing that runs AI models and related tools such as ChatGPT.

The shift of computing's total share toward inference has been underway for a while, but now change is coming from two places. First, more AI users mean more inference demand. Second, part of DeepSeek's secret sauce is that improvement takes place in the inference stage. Nvidia put a positive spin on this, via a spokesperson.

"DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling. DeepSeek's work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant," an Nvidia spokesperson told BI.

"Inference requires significant numbers of NVIDIA GPUs and high-performance networking. We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling."

Take 6: Open-source changes model building

The most under-hyped part of DeepSeek's innovations is how easy it will now be to take any AI model and turn it into a more powerful "reasoning" model, Jack Clark, an Anthropic cofounder and former OpenAI employee, wrote about DeepSeek in his newsletter Import AI on Monday.

Clark also explained that some AI companies, such as OpenAI, have been hiding all the reasoning steps that their latest AI models take. DeepSeek's models show all these intermediate "chains of thought" for anyone to see and use. This radically changes how AI models are controlled, Clark wrote.

"Some providers like OpenAI had previously chosen to obscure the chains of thought of their models, making this harder," Clark explained. "There's now an open-weight model floating around the internet which you can use to bootstrap any other sufficiently powerful base model into being an AI reasoner. AI capabilities worldwide just took a one-way ratchet forward."

Take 7: Programmers still matter

DeepSeek made its gains with novel programming methods, which Samir Kumar, cofounder and general partner at VC firm Touring Capital, saw as a reminder that humans are still coding the most exciting innovations in AI.

He told BI that DeepSeek is "a good reminder of the talent and skillset of hardcore human low-level programmers."

Got a tip or an insight to share? Contact BI's senior reporter Emma Cosgrove at ecosgrove@businessinsider.com or use the secure messaging app Signal: 443-333-9088.

Contact Pranav from a nonwork device securely on Signal at +1-408-905-9124 or email him at pranavdixit@protonmail.com.

You can email Jyoti at jmann@businessinsider.com or DM her via X @jyoti_mann1
