
Nvidia CEO Jensen Huang directly addresses the DeepSeek stock sell-off, saying investors got it wrong

20 February 2025 at 16:53
Nvidia CEO Jensen Huang discussed the investor reaction to DeepSeek at a virtual vendor event.

Courtesy of DDN

  • Nvidia CEO Jensen Huang said investors misinterpreted DeepSeek's AI advancements.
  • DeepSeek's large language models were built with weaker chips, rattling markets in January.
  • In a pre-taped interview released Thursday, Huang emphasized the importance of AI post-training.

Investors took away the wrong message from DeepSeek's advancements in AI, Nvidia CEO Jensen Huang said at a virtual event aired Thursday.

DeepSeek, a Chinese AI firm owned by the hedge fund High-Flyer, released a competitive, open-source reasoning model named R1 in January. The firm said the large language model underpinning R1 was built with weaker chips and a fraction of the funding of the predominant, Western-made AI models.

Investors reacted to this news by selling off Nvidia stock, resulting in a $600 billion loss in market capitalization. Huang himself temporarily lost nearly 20% of his net worth in the rout. The stock has since recovered much of its lost value.

In the pre-recorded interview, which was produced by Nvidia's partner DDN as part of an event debuting DDN's new software platform, Infinia, Huang said the dramatic market response stemmed from investors' misinterpretation.

Investors have raised questions about whether the trillions Big Tech firms are spending on AI infrastructure is needed if less computing power is required to train models. Huang said the industry still needed computing power for post-training methods, which allow AI models to draw conclusions or make predictions after training.

As post-training methods grow and diversify, the need for the computing power Nvidia chips provide will also grow, he continued.

"From an investor perspective, there was a mental model that the world was pre-training and then inference. And inference was, you ask an AI a question, and it instantly gives you an answer," he said at Thursday's event, adding, "I don't know whose fault it is, but obviously that paradigm is wrong."

Pre-training is still important, Huang said, but post-training is the "most important part of intelligence" and "where you learn to solve problems."

DeepSeek's innovations have energized the AI world, he said.

"It is so incredibly exciting. The energy around the world as a result of R1 becoming open-sourced β€” incredible," Huang said.

Nvidia spokespeople had addressed the market reaction with written statements to a similar effect, but Huang himself had not commented publicly until Thursday's event.

For months, Huang has been pushing back against the growing concern that model scaling is in trouble. Even before DeepSeek burst into the public consciousness in January, reports that model improvements at OpenAI were slowing down roused suspicions that the AI boom might not deliver on its promise — and that Nvidia, therefore, wouldn't continue to cash in at the same rate.

In November, Huang stressed that scaling was alive and well and that it had simply shifted from training to inference. Huang also said Thursday that post-training methods were "really quite intense" and that models would keep improving with new reasoning methods.

Huang's DeepSeek comments may serve as a preview for Nvidia's first earnings call of 2025, scheduled for February 26. DeepSeek has become a popular topic of discussion on earnings calls for companies across the tech spectrum, from Airbnb to Palantir.

Nvidia's rival AMD was asked about DeepSeek earlier this month, and its CEO, Lisa Su, said DeepSeek was driving innovation that's "good for AI adoption."

Read the original article on Business Insider

Elon Musk quietly built a second mega-data center for xAI in Atlanta with $700 million worth of chips and cables

20 February 2025 at 09:41
Elon Musk's xAI introduced Grok, its conversational AI, which it claims can match GPT-3.5.

Getty Images

  • xAI plans to operate a large data center to support X in Atlanta.
  • The data center will have about 12,000 GPUs and $700 million worth of equipment.
  • xAI set up a massive data center in Memphis last year.

xAI has been quietly setting up a data center in Atlanta, expanding its footprint beyond the massive data center in Memphis.

Elon Musk's AI startup plans to operate a large data center with X. The two companies are combining hardware to total roughly 12,000 graphics processing units, the Nvidia-designed chips used for most AI computation, according to a Business Insider tally of the equipment listed in the companies' agreement with Develop Fulton, one of Atlanta's economic development agencies.

In December, X and xAI signed similar agreements with Develop Fulton. Under the agreements, Develop Fulton orchestrated a municipal bond process to finance the $700 million in chips, cables, and other equipment going into the single facility. Fortune first reported the existence of the Atlanta data center; its size and scale have not been previously reported.

Representatives for X and xAI did not respond to a request for comment.

Inside the Fulton County data center

The Atlanta data center has sizable computing power, according to a data center solutions architect and AI hardware expert who asked to remain nameless in order to comment on the documents. It's comparable to a data center a hyperscaler like Google or Amazon might set up.

X representatives described it as an "exascale" data center capable of computing "trillion-parameter AI." But the Georgia facility pales in comparison to the reported capacity of xAI's Memphis project, nicknamed Colossus, which Musk has called the largest data center in the world.

The Georgia facility will house an estimated 12,448 Nvidia GPUs. The vast majority of these are Hopper generation H100 GPUs, which cost between $277,000 and $500,000 for each rack of eight chips, according to the documents.

Roughly 3% of the chips are Nvidia's less-powerful A100 GPUs, which cost $147,000 for an equivalent configuration of eight chips. X is contributing all of the A100s, along with 11,000 H100s.

Neither of these chip designs requires liquid cooling, which has been a point of tension for Musk's companies in Memphis. When operating at full capacity, Colossus is expected to become among the largest consumers of water in the city.

In addition to H100 chips, xAI is contributing Mellanox switches and optics — high-bandwidth networking equipment, also purchased from Nvidia, that helps chips work together faster. Documents submitted to Develop Fulton by X indicate that the facility will be used to "develop and train artificial intelligence products."

Of the $700 million in accelerated computing hardware going into the facility, $442 million is allocated to X and $258 million to xAI. The two companies will receive tax abatements estimated to be worth about $10 million over ten years, split in proportion to their hardware investments, according to a representative at Develop Fulton. Kwanza Hall, chairman of the board at Develop Fulton, said the organization estimates the project will have an overall economic impact of more than $241 million.

The data center architect estimated the Atlanta facility would require 20 megawatts of total power, which it could realistically get from the power grid. xAI has requested 150 megawatts from the Tennessee Valley Authority for the Memphis facility, a representative for Memphis Light, Gas and Water said during a city council meeting in January.

X and xAI's partnership

The Atlanta facility is an example of Musk ostensibly pooling his resources to benefit both X and xAI. According to the records, X contributed 90% of the hardware for the data center, and xAI 10%.

xAI launched its chatbot, Grok, on X's platform in November 2023, but has since spun it off as an additional stand-alone app. On Monday, xAI released the latest version of the chatbot, and Musk said it has "more than 10 times" the compute power of its predecessor.

The equipment will be used to train large language models and semantic search products for the X platform, according to the documents. X has about 16 employees in the area, based on a review of LinkedIn profiles. xAI has one worker stationed at the Georgia facility and two additional employees listed as "X Corp Partner," the company's internal org chart shows. The deal with Develop Fulton states that 24 jobs will be maintained at the facility and none will be added.

Musk is trying to position xAI as a major competitor to Big Tech giants like OpenAI and Google, even drawing some talent from Tesla. The company built its Memphis data center in just 122 days — record time for a data center of its size, according to Nvidia. xAI has also brought in hundreds of data annotators to train its chatbot over the past year, with an eye to hiring thousands in the months to come, BI previously reported.

In February, Musk and a group of investors submitted a $97.4 billion bid to buy the nonprofit that controls OpenAI, but the billionaire later said he would withdraw the bid if OpenAI remained a nonprofit entity. In response to Musk's initial offer, OpenAI CEO Sam Altman told reporters that the company is "not for sale."

"He obviously is a competitor," Altman said of the bid. "He's working hard, and he's raised a lot of money for xAI and they're trying to compete with us from a technological perspective, from getting the product into the market, and I wish he would just compete by building a better product."

Do you work for xAI or one of Musk's companies? Reach out to the reporter via a nonwork email and device at [email protected] or through the encrypted messaging platform Signal at 248-894-6012.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

The big switch to DeepSeek is hitting a snag

6 February 2025 at 02:00
A woman holds a cell phone in front of a computer screen displaying the DeepSeek logo, on January 28, 2025, in Edmonton, Canada.

Artur Widak/NurPhoto

  • AI startups are clamoring for consistent, secure access to DeepSeek's large language models.
  • Cloud providers are having trouble offering them at usable speeds, and DeepSeek's own API is hampered.
  • The troubles are delaying the switch to the low-cost AI that rocked markets last week.

DeepSeek may have burst into the mainstream with a bang last week, but US-based AI businesses trying to use the Chinese company's AI models are having a host of troubles.

"We're on our seventh provider," Neal Shah, CEO of Counterforce Health, told Business Insider.

Counterforce, like many startups, accesses AI models through APIs provided by cloud companies. These APIs charge by the token — the unit of measure for inputs and outputs of large language models. This allows costs to scale with usage when companies are young and can't afford to pay for expensive, dedicated computing capacity they might not fully use.
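
As a rough illustration of how that per-token billing scales, here is a minimal Python sketch; the rates and token counts are hypothetical examples, not any provider's actual prices:

    # Minimal sketch of per-token API billing. The rates below are
    # hypothetical examples, not any provider's actual prices.
    def api_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
        """Return the dollar cost of a single API call."""
        return ((input_tokens / 1_000_000) * in_price_per_m
                + (output_tokens / 1_000_000) * out_price_per_m)

    # One appeal letter: roughly 3,000 tokens in, 1,000 tokens out.
    print(f"${api_cost(3_000, 1_000, 0.50, 2.00):.4f}")  # prints $0.0035

Because there is no fixed charge, a young company pays only for what it uses, and the bill grows linearly with volume.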

Right now, the company's service, which uses AI to generate responses to insurance claim denials, is free for individuals and in pilot tests with healthcare providers, so getting costs down as far as possible is paramount. DeepSeek's open model was a game-changer.

Since late January, Shah's team has tried and struggled with six different API providers. The seventh, Fireworks AI, has been just consistent enough, Shah said. The others were too slow or unreliable.

Artificial Analysis, a website that tracks the availability and performance of AI models across cloud providers, showed seven clouds running DeepSeek models on Wednesday. Most were running at one-third the speed of DeepSeek's own API; the exception, Fireworks AI, ran at about half the speed of the Chinese service.

Many businesses are concerned about sharing data with a Chinese API and prefer to access the models through a US provider. But many API providers are struggling to offer consistent access to the full DeepSeek models at speeds fast enough to be useful.

The providers measured by Artificial Analysis batch customers together for AI inference to improve prices and use computing resources more efficiently. Companies with dedicated computing capacity — especially Nvidia's H200 chips — likely won't struggle. And those willing to pay hyperscaler cloud prices may find DeepSeek reliable and easier to get.

The Chinese company's model rocked markets so thoroughly because it was cheaper to build and much cheaper to run than Western alternatives, and it was touted as a booster pack and a leveler for the entire AI startup ecosystem. A few weeks into what was anticipated as a mass conversion, the switch isn't as easy as it may have seemed.

DeepSeek did not respond to BI's request for comment.

DeepSeek at speed is hard to find

Theo Browne would like to use DeepSeek, but he can't find a good source. Through his company Ping, Browne makes AI tools for software developers.

He started testing DeepSeek's models in December, when the company released V3, and found that he could get comparable or better results for one-fifteenth the price of proprietary models like Anthropic's Claude.

When the rest of the world caught wind in mid-January, options for accessing DeepSeek became inconsistent.

"Most companies are offering a really bad experience right now. Browne told BI. "It's taking 100 times longer to generate a response than any traditional model provider," he said.

Browne went straight to DeepSeek's API instead of using a US-based cloud, which wouldn't be an option for a more security-concerned company.

But then DeepSeek's China-hosted API went down on January 26 and has yet to be restored to full function. The company blamed a malicious attack and has been working to resolve it.

Attack aside, slow and spotty service could also stem from clouds lacking hardware powerful enough to run the large model — spreading it across more, weaker hardware increases complexity and slows speed. The immense uptick in demand could affect speed and reliability too.

Baseten, a company that provides mostly dedicated AI computing capacity to clients, has been working with DeepSeek and an outside research lab for months to get the model running well. CEO Tuhin Srivastava told BI that Baseten had the model running faster than DeepSeek's API before the attack.

Several platforms are also taking advantage of DeepSeek's technical prowess by running smaller versions or using DeepSeek's R1 reasoning model to "distill" other open-source models like Meta's Llama. That's what Groq, an aspiring Nvidia competitor and inference provider, is doing. The company signed up 15,000 new users within the first 24 hours of offering the hybrid model and more than 37,000 organizations have used the model so far, Chief Technology Evangelist Mark Heaps told BI.

Unseen risks

For businesses that can get access to high-speed DeepSeek models, there are other reasons to hesitate.

Pukar Hamal, CEO of the software security startup Security Pal, is apprehensive about the security of Chinese AI models and said he's concerned about using DeepSeek's models for business, even if they're run locally on-premises or via a US-based API.

"I run a security company so I have to be super paranoid," Hamal told BI. A cheap Chinese model may be an attractive option for startups looking to get through the early years and scale. But if they want to sell whatever they're building to a large enterprise customer a Chinese model is going to be a liability, he said.

"The moment a startup wants to sell to an enterprise, an enterprise wants to know what your exact data architecture system looks like. If they see you're heavily relying on a Chinese-made LLM, ain't no way you're gonna be able to sell it," Hamal said.

He's convinced the DeepSeek moment was hype.

"I think we'll effectively stop talking about it in a couple of weeks," he said.

But for a lot of companies, the low cost is irresistible and the security concern is minimal — at least in the early stages of operation.

Shah, for one, is anonymizing user information before his software calls any model so that patients' identities remain secure.

"Frankly, we don't even fully trust Anthropic and other models. You don't really know where the data is going," Shah said.

DeepSeek's price is irresistible

Counterforce is a somewhat lucky fit for DeepSeek while it is in its awkward toddler phase. The startup can put a relatively large amount of data into the model and isn't too worried about output speed since patients are happy to wait a few minutes for a letter that could save them hundreds of dollars.

Shah is also developing an AI-enabled tool that will call insurance companies on behalf of patients. That means integrating language, voice, and listening models at the speed of conversation. For that to work and be cost-effective, DeepSeek's availability and speed need to improve.

Several cloud providers told BI they are actively working on it. Developers have not stopped clamoring, said Jasper Zhang, cofounder and CEO of cloud service Hyperbolic.

"After we launched the new DeepSeek model, we saw inference users increase by 150%," Zhang said.

Fireworks, one of the few cloud services to consistently provide decent performance, said January's new users increased 400% month over month.

Together AI cofounder and CEO Vipul Ved Prakash told BI the company is working on a fix that may improve speed this week.

Zhang is on the case too. His goal is to democratize access to AI so that any startup or individual can build with it. He said open-source models are quickly catching up to proprietary ones.

"R1 is a real killer," Zhang said. Still, DeepSeek's teething troubles leave a window for others to enter and the longer DeepSeek is hard to access, the higher the chance the next big open model could come to take its place.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

DeepSeek is driving demand for Nvidia's H200 chips, some cloud firms say

31 January 2025 at 08:56
Jensen Huang presenting at an Nvidia event in San Jose in March.

Justin Sullivan/Getty Images

  • Cloud and inference providers see rising demand for Nvidia H200 chips due to DeepSeek's AI models.
  • DeepSeek's open-source models require powerful hardware to run the full model for inference.
  • The trend runs counter to the Nvidia sell-off that followed growing awareness of DeepSeek.

Some cloud providers are experiencing a notable uptick in demand for Nvidia's H200 chips after Chinese AI company DeepSeek burst into the race for the winning foundation model this month.

The stock market caught wind of the powerful yet efficient large language model Monday, sending Nvidia's stock down 16%. But DeepSeek has been on the radar of AI researchers and developers since it released its first model, V2, in May 2024.

But the performance of V3, released in December, is what made AI developers sit up and take notice. When R1, the company's reasoning model, which competes with OpenAI's o1, was released in early January, demand for Nvidia's H200s started climbing.

"The launch of DeepSeek R1 has significantly accelerated demand for H200. We've seen such strong interest that enterprises are pre-purchasing large blocks of Lambda's H200 capacity, even before public availability," said Robert Brooks, founding team member and vice president of revenue at cloud provider Lambda.

DeepSeek's models are open source, which means users pay very little to use them. However, they still need hardware or a cloud computing service to use them at scale.

Business Insider spoke with 10 cloud service and AI inference providers. Five reported a rapid increase in demand for Nvidia's H200 graphics processing units this month.

Amazon Web Services and CoreWeave declined to comment. Oracle, Google, and Microsoft did not respond to requests for comment.

This week, AWS, Microsoft, Google, and Nvidia have made DeepSeek models available on their various cloud and AI-developer platforms, or provided instructions for users to do so themselves.

Nvidia declined to comment, citing a quiet period before its February 26 earnings release.

AI cloud offerings have exploded in the past two years, creating a slew of options beyond mainstays of cloud computing like Microsoft Azure and Amazon Web Services.

The demand has come from a range of customers, from startups and individual researchers to massive multinational firms.

"We've heard from half a dozen of the 50 largest companies in the world. I'm really not exaggerating," Tuhin Srivastava, cofounder of inference provider Baseten, told BI.

On Friday, semiconductor industry analysts at SemiAnalysis reported "tangible effects" on pricing for H100 and H200 capacity in the market stemming from DeepSeek.

Total sales of Nvidia H200 GPUs have reached the "double digits billions," CFO Colette Kress said on the company's November earnings call.

'Exponential demand' for Nvidia H200s

Karl Mozurkewich and his team at cloud provider Valdi saw H200 demand ramp up throughout January, and at first they didn't know why.

The Valdi team doesn't own chips; it acquires capacity from existing data centers and sells that capacity to customers. The company doesn't know every use case for each chip it makes accessible, but it polled several H200 customers, and all of them wanted the chips to run DeepSeek.

"Suddenly, R1 got everybody's attention β€” it caught fire β€” and then it kind of went exponential," Mozurkewich said.

American companies are eager to take advantage of DeepSeek's model performance and reasoning innovations, but most are not keen to share their data with a Chinese firm. That means they can either use an API offered by a US firm or run the model on their own hardware.

Since the model is open source, it can be downloaded and run locally without sharing data with DeepSeek.

For Valdi, the majority of its H200 demand is coming from startups, Mozurkewich said.

"It appears the market is reacting to DeepSeek by grabbing the best GPUs available for testing as quickly as possible," he said. "This makes sense, as most companies' current GPUs are likely to continue to work on ongoing tasks they've been allocated to," Mozurkewich continued.

Though many companies are still testing and experimenting, the Valdi team is seeing longer-term requests for additional hardware, suggesting an uptick in demand that could last beyond DeepSeek's initial hype cycle.

Chip-light, compute-heavy

DeepSeek's models were trained with less powerful hardware than US models, according to the company's research paper. This efficiency has spooked the stock market.

Players like Meta, OpenAI, and Microsoft have invested billions in AI infrastructure, with billions more on the way. Investors are concerned about whether all that capacity will be needed. DeepSeek was created with fewer, relatively weak chips (though the number is hotly debated).

Training chips aside, using the models for inference is a compute-intensive task, cloud providers say.

"It is not light and easy to run," Srivastava said.

The size of a model is measured in "parameters." More parameters require more compute. The most powerful versions of DeepSeek's models have 671 billion parameters. That's less than OpenAI's ChatGPT-4, which has 1.76 trillion, but more than Meta's largest Llama model, which has 405 billion.

Srivastava said most firms avoided the 405-billion-parameter Llama model if they could help it, since the smaller versions were much easier to run. DeepSeek offers smaller versions too, and even its most powerful version is cheaper to run, which has stoked excitement among firms that want to use the full model, the cloud providers said.

H200 chips are the only widely available Nvidia chips that can run DeepSeek's V3 model in its full form on a single node (eight chips designed to work together).

You can also spread it across more lower-power GPUs, but that requires more expertise and leaves room for error. Adding that complexity almost inevitably slows down performance, Srivastava said.
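
A back-of-envelope calculation shows why the single eight-GPU node matters. The sketch below assumes roughly 671 billion parameters stored at 8 bits (1 byte) each, with 141 GB of memory per H200 and 80 GB per H100, Nvidia's published capacities; real deployments also need headroom for activations and the KV cache:

    # Back-of-envelope memory math for serving a ~671B-parameter model
    # on one eight-GPU node, assuming 8-bit (1-byte) weights.
    params = 671e9
    weights_gb = params * 1 / 1e9           # ~671 GB for the weights alone

    h200_node_gb = 8 * 141                  # eight H200s: 1,128 GB
    h100_node_gb = 8 * 80                   # eight H100s: 640 GB

    print(f"weights: {weights_gb:.0f} GB")
    print("fits on one H200 node:", weights_gb < h200_node_gb)  # True
    print("fits on one H100 node:", weights_gb < h100_node_gb)  # False

On this rough math, the full model squeezes onto one H200 node with memory to spare, while an H100 node falls short, forcing the multi-GPU spreading described above.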

Nvidia's Blackwell chips will also be able to handle the full V3 model in one node, but these chips have just begun shipping this year.

With demand spiking, finding enough chips to run V3 or R1 at high speed is tough if the capacity hasn't already been allocated.

Baseten doesn't own GPUs; it buys capacity from data centers that do and then tinkers with all the software connections to make models run smoothly. Some of its customers have their own hardware in their own data centers but still hire Baseten to optimize model performance.

Its customers especially value inference speed — the speed that enables an AI-generated voice to converse in real time, for example. DeepSeek's capability at the open-source price is a game-changer for its customers, according to Srivastava.

"It does feel like this is an inflection point," he said.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

The tech industry is in a frenzy over DeepSeek. Here's who could win and lose from China's AI progress.

DeepSeek has sent Silicon Valley and the tech industry into a frenzy.

Tyler Le/Business Insider

  • DeepSeek, a Chinese open-source AI firm, is taking over the discussion in tech circles.
  • Tech stocks, especially Nvidia, plunged Monday.
  • Companies leading the AI boom could be in for a reset as DeepSeek upends the status quo.

DeepSeek, a Chinese company with AI models that compete with OpenAI's at a fraction of the cost, is generating almost as many takes as tokens.

Across Silicon Valley, executives, investors, and employees debated the implications of such efficient models. Some called into question the trillions of dollars being spent on AI infrastructure since DeepSeek says its models were trained for a relative pittance.

"This is insane!!!!" Aravind Srinivas, CEO of startup Perplexity AI, wrote in response to a post on X noting that DeepSeek models are cheaper and better than some of OpenAI's latest offerings.

The takes on DeepSeek's implications are coming fast and hot. Here are seven of the most common.

Take 1: Generative AI adoption will explode

"Jevons paradox strikes again!" Microsoft CEO Satya Nadella posted on X Monday morning. "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."

The idea that as tech improves, whether by getting smarter, cheaper, or both, it will only bring in exponentially more demand is based on a 19th-century economic principle. In this case, the barrier to entry for companies looking to dip their toe into AI has been high. Cheaper tools could encourage more experimentation and advance the technology faster.

"Similar to Llama, it lowers the barriers to adoption, enabling more businesses to accelerate AI use cases and move them into production." Umesh Padval, managing director at Thomvest Ventures told Business Insider.

That said, even if AI grows faster than ever, that doesn't necessarily mean the trillions of dollars of investment that have flooded the space will pay off.

Take 2: DeepSeek broke the prevailing wisdom about the cost of AI

"DeepSeek seems to have broken the assumption that you need a lot of capital to train cutting-edge models," Debarghya Das, an investor at Menlo Ventures told BI.

The price of DeepSeek's open-source model is competitive — 20 to 40 times cheaper to use than comparable models from OpenAI, according to Bernstein analysts.

The exact cost of building DeepSeek models is hotly debated. The research paper from DeepSeek explaining its V3 model lists a training cost of $5.6 million — an alarmingly low number for other providers of foundation models.

However, the same paper says that the "aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data." So the $5 million figure is only part of the equation.

The tech ecosystem is also reacting strongly to the implication that DeepSeek's state-of-the-art model architecture will be cheaper to run.

"This breakthrough slashes computational demands, enabling lower fees β€” and putting pressure on industry titans like Microsoft and Google to justify their premium pricing," Kenneth Lamont, principal at Morningstar, wrote in a note on Monday.

He went on to remind investors that with early-stage technology, assuming the winners are set is folly.

"Mega-trends rarely unfold as expected, and today's dominant players might not be tomorrow's winners," Lamont wrote.

Dmitry Shevelenko, the chief business officer at Perplexity, a big consumer of compute and existing models, concurred that Big Tech players would need to rethink their numbers.

"It certainly challenges the margin structure that maybe they were selling to investors," Shevelenko told BI. "But in terms of accelerating the development of these technologies, this is a good thing." Perplexity has added DeepSeek's models to its platform.

Take 3: Companies are considering a switch to DeepSeek

On Monday, several platforms that provide AI models for businesses — Groq and Liquid.AI, to name two — added DeepSeek's models to their offerings.

On Amazon's internal Slack, one person posted a meme suggesting that developers might drop Anthropic's Claude AI model in favor of DeepSeek's offerings. The post included an image of the Claude model crossed out.

"Friendship ended with Claude. Now DeepSeek is my best friend." the person wrote, according to a screenshot of the post seen by BI, which got more than 60 emoji reactions from colleagues.

Amazon has invested billions of dollars in Anthropic. The cloud giant also provides access to Claude models via its Amazon Web Services platform. And some AWS customers are asking for DeepSeek, BI has exclusively reported.

"We are always listening to customers to bring the latest emerging and popular models to AWS," an Amazon spokesperson said, while noting that customers can access some DeepSeek-related products on AWS right now through tools such as Bedrock.

"We expect to see many more models like this β€” both large and small, proprietary and open-source β€” excel at different tasks," the Amazon spokesperson added. "This is why the majority of Amazon Bedrock customers use multiple models to meet their unique needs and why we remain focused on providing our customers with choice β€” so they can easily experiment and integrate the best models for their specific needs into their applications."

Switching costs for companies building their own products on top of foundation models are relatively low, which is generating a lot of questions about whether DeepSeek will overtake models from Meta, Anthropic, or OpenAI in popularity with enterprises. (DeepSeek's app is already number one in Apple's App Store.)

DeepSeek, however, is owned by the Chinese hedge fund High-Flyer, and the same security concerns haunting TikTok may eventually apply to it.

"While open-source models like DeepSeek present exciting opportunities, enterprisesβ€”especially in regulated industriesβ€”may hesitate to adopt Chinese-origin models due to concerns about training data transparency, privacy, and security," Padval said.

Security concerns aside, software companies that sell APIs to businesses were adding DeepSeek throughout the day Monday.

Take 4: Infrastructure players could take a hit

Infrastructure-as-a-service companies such as Oracle, DigitalOcean, and Microsoft could be in a precarious position should more efficient AI models rule in the future.

"The sheer efficiency of DeepSeek's pre and post training framework (if true) raises the question as to whether or not global hyperscalers and governments, that have and intend to continue to invest significant capex dollars into AI infrastructure, may pause to consider the innovative methodologies that have come to light with DeepSeek's research," wrote Stifel analysts.

If the same quantity of work requires less compute, those selling only compute could suffer, Barclays analysts wrote.

"With the increased uncertainty, we could see share price pressure amongst all three," according to the analysts.

Microsoft and DigitalOcean declined to comment. Oracle did not respond to a request for comment in time for publication.

Take 5: Scaling isn't dead, it's just moved

For months, AI luminaries, including Nvidia CEO Jensen Huang, have been predicting a big shift in AI from a focus on training to a focus on inference. Training is the process by which models are created, while inference is the type of computing that runs AI models and related tools such as ChatGPT.

The shift of computing's total share toward inference has been underway for a while, but now change is coming from two places. First, more AI users mean more inference demand. Second, part of DeepSeek's secret sauce is how improvement takes place in the inference stage. Nvidia put a positive spin on it, via a spokesperson.

"DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling. DeepSeek's work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant," an Nvidia spokesperson told BI.

"Inference requires significant numbers of NVIDIA GPUs and high-performance networking. We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling."

Take 6: Open-source changes model building

The most under-hyped part of DeepSeek's innovations is how easy it will now be to take any AI model and turn it into a more powerful "reasoning" model, Jack Clark, an Anthropic cofounder and former OpenAI employee, wrote about DeepSeek in his newsletter Import AI on Monday.

Clark also explained that some AI companies, such as OpenAI, have been hiding all the reasoning steps that their latest AI models take. DeepSeek's models show all these intermediate "chains of thought" for anyone to see and use. This radically changes how AI models are controlled, Clark wrote.

"Some providers like OpenAI had previously chosen to obscure the chains of thought of their models, making this harder," Clark explained. "There's now an open-weight model floating around the internet which you can use to bootstrap any other sufficiently powerful base model into being an AI reasoner. AI capabilities worldwide just took a one-way ratchet forward."

Take 7: Programmers still matter

DeepSeek improved its models by using novel programming methods, which Samir Kumar, cofounder and general partner at VC firm Touring Capital, saw as a reminder that humans are still coding the most exciting innovations in AI.

He told BI that DeepSeek is "a good reminder of the talent and skillset of hardcore human low-level programmers."

Got a tip or an insight to share? Contact BI's senior reporter Emma Cosgrove at [email protected] or use the secure messaging app Signal: 443-333-9088.

Contact Pranav from a nonwork device securely on Signal at +1-408-905-9124 or email him at [email protected].

You can email Jyoti at [email protected] or DM her via X @jyoti_mann1

Read the original article on Business Insider

DeepSeek's cheaper models and weaker chips call into question trillions in AI infrastructure spending

27 January 2025 at 09:53
A worker inside a QTS Data Center.

Blackstone

  • China's DeepSeek model challenges US AI firms with cost-effective, efficient performance.
  • DeepSeek's model, using modest hardware, is 20 to 40 times cheaper than OpenAI's.
  • DeepSeek's efficiency raises questions about US investments in AI infrastructure.

The bombshell that is China's DeepSeek model has set the AI ecosystem alight.

The models are high-performing, relatively cheap, and compute-efficient, which has led many to posit that they pose an existential threat to American companies like OpenAI and Meta — and the trillions of dollars going into building, improving, and scaling US AI infrastructure.

The price of DeepSeek's open-source model is competitive — 20 to 40 times cheaper to run than comparable models from OpenAI, Bernstein analysts said.

But the potentially more nerve-racking element in the DeepSeek equation for US-built models is the relatively modest hardware stack used to build them.

The DeepSeek-V3 model, which is most comparable to OpenAI's ChatGPT, was trained on a cluster of 2,048 Nvidia H800 GPUs, according to the technical report published by the company.

H800s were the first version of the company's defeatured chip for the Chinese market. After US export regulations were amended, the company made another defeatured chip, the H20, to comply with the changes.

Though this may not always be the case, chips are the most substantial cost in the large language model training equation. Being forced to use less powerful, cheaper chips created a constraint that the DeepSeek team has ostensibly overcome.

"Innovation under constraints takes genius," Sri Ambati, the CEO of the open-source AI platform H2O.ai, told Business Insider.

Even on subpar hardware, training DeepSeek-V3 took less than two months, the company's report said.

The efficiency advantage

DeepSeek-V3 is small relative to its capabilities, with 671 billion parameters to ChatGPT-4's 1.76 trillion, which makes it easier to run. But DeepSeek-V3 still hits impressive benchmarks of understanding.

Its smaller size comes in part from using a different architecture than ChatGPT, called a "mixture of experts." The model has pockets of expertise built in, which go into action when called upon and sit dormant when irrelevant to the query. This type of model is growing in popularity, and DeepSeek's advantage is that it built an extremely efficient version of an inherently efficient architecture.
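
To make the "pockets of expertise" idea concrete, here is a toy Python sketch of mixture-of-experts routing. It illustrates the general technique only; the dimensions, router, and expert count are invented for the example and are not DeepSeek's actual design:

    import numpy as np

    # Toy mixture-of-experts layer: a router scores the experts for each
    # input, and only the top-k experts run; the rest sit dormant.
    rng = np.random.default_rng(0)
    DIM, N_EXPERTS, TOP_K = 16, 8, 2

    experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]
    router = rng.standard_normal((DIM, N_EXPERTS))

    def moe_forward(x):
        scores = x @ router                  # one score per expert
        top = np.argsort(scores)[-TOP_K:]    # indices of the k best-scoring experts
        w = np.exp(scores[top])
        w /= w.sum()                         # softmax over the winners only
        # Only TOP_K of N_EXPERTS weight matrices are multiplied per input,
        # so the compute per token is a fraction of the full parameter count.
        return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

    print(moe_forward(rng.standard_normal(DIM)).shape)  # (16,)

Per input, only two of the eight expert matrices are used here, which is the source of the efficiency the analysts describe.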

"Someone made this analogy: It's almost as if someone released a $20 iPhone," Jared Quincy Davis, the CEO of Foundry, told BI.

The Chinese model used a fraction of the time, a fraction of the number of chips, and a less capable, less expensive chip cluster. Essentially, it's a drastically cheaper, competitively capable model that the firm is virtually giving away for free.

Bernstein analysts said that DeepSeek-R1, a reasoning model more comparable to OpenAI's o1 or o3, is even more concerning from a competitive standpoint. This model uses reasoning techniques to interrogate its own responses and thinking, similar to OpenAI's latest reasoning models.

R1 was built on top of V3, but the research paper released with the more advanced model doesn't include information about the hardware stack behind it. DeepSeek used strategies like generating its own training data to train R1, which requires more compute than using data scraped from the internet or generated by humans.

This technique is often referred to as "distillation" and is becoming standard practice, Ambati said.
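
As a loose illustration of the distillation idea, the sketch below fits a small "student" to outputs generated by a larger "teacher." Real LLM distillation trains a smaller network on text produced by a bigger model, so this numeric stand-in shows only the shape of the technique:

    import numpy as np

    # Toy distillation: train a small "student" on data generated by a
    # larger "teacher" rather than on human-labeled data.
    rng = np.random.default_rng(1)

    def teacher(x):                          # stand-in for a big, expensive model
        return np.sin(3 * x) + 0.5 * x

    x = rng.uniform(-1, 1, 10_000)
    y = teacher(x)                           # teacher-generated training data

    # Student: a cubic polynomial fit to the teacher's outputs.
    student = np.poly1d(np.polyfit(x, y, deg=3))

    print(f"mean gap: {np.abs(student(x) - y).mean():.3f}")  # small gap = behavior transferred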

Distillation brings with it another layer of controversy, though. A company using its own models to distill a smarter, smaller model is one thing. But the legality of using other companies' models to distill new ones depends on licensing.

Still, DeepSeek's techniques are iterative improvements that are likely to be taken up by the AI industry immediately.

For years, model developers and startups have focused on smaller models since their size makes them cheaper to build and operate. The thinking was that small models would serve specific tasks. But what DeepSeek and potentially OpenAI's o3 mini demonstrate is that small models can also be generalists.

It's not game over

A coalition of players including Oracle and OpenAI, with cooperation from the White House, announced Stargate, a $500 billion data center project in Texas — the latest in a quick procession of developments in the large-scale conversion to accelerated computing. DeepSeek's surprise release has called that investment into question, and Nvidia, the largest beneficiary of the investment, is on a roller coaster as a result. The company's stock plummeted more than 13% Monday.

But Bernstein said the response is out of step with the reality.

"DeepSeek DID NOT 'build OpenAI for $5M'," Bernstein analysts wrote in a Monday investor note. The panic, especially on X, is blown out of proportion, the analysts said.

DeepSeek's own research paper on V3 says: "The aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data." So the $5 million figure is only part of the equation.

"The models look fantastic but we don't think they are miracles," Bernstein continued. Last week China also announced a roughly $140 billion investment in data centers, in a sign that infrastructure is still needed despite DeepSeek's achievements.

The competition for model supremacy is fierce, and OpenAI's moat may indeed be in question. But demand for chips shows no signs of slowing, Bernstein said. Tech leaders are circling back to a centuries-old economic adage to explain the moment.

The Jevons paradox is the idea that innovation begets demand. As technology gets cheaper or more efficient, demand increases much faster than prices drop. That's what providers of computing power, such as Foundry's Jared Quincy Davis, have been espousing for years. This week, Bernstein and Microsoft CEO Satya Nadella picked up the mantle, too.

"Jevons paradox strikes again!" Nadella posted on X Monday morning. "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."

Read the original article on Business Insider

Semiconductor legend Lip-Bu Tan told BI his strategy for investing in Nvidia competitors

23 January 2025 at 02:00
Lip-Bu Tan, the former Cadence Design CEO, is reportedly one of the people being considered to become Intel's CEO.

Dibyanshu Sarkar for AFP via Getty Images

  • Lip-Bu Tan is a renowned chip industry veteran with experience as CEO, board member, and investor.
  • Tan has held roles at Intel and SoftBank and brought Cadence back from the brink of crisis.
  • Tan invests in startups challenging Nvidia's power-hungry GPUs with efficient alternatives.

When Lip-Bu Tan speaks, the chip industry listens.

The semiconductor executive and investor has wielded his influence over most of the major names in the chip business and served on the boards of Annapurna Labs — which now fuels Amazon's chip ambitions — SoftBank, and Intel, among others.

Tan was even reported to be a candidate for Intel's open CEO spot, though when Business Insider spoke to him, he smiled and said "no comment" on that issue.

His legendary status was solidified long ago when he orchestrated an intervention at Cadence Design Systems β€” a semiconductor design firm in crisis.

He stepped in as CEO in 2009 with the stock below $3. He revamped the company's approach to its tech road map and customer relations and overhauled the business model from perpetual licenses to subscriptions. Today, the stock is above $320 per share. He stayed in the CEO seat until 2021.

Tan's reputation is such that the day he resigned from Intel's board in August, the firm's stock dropped 6%.

He is supporting many players in the AI boom with funds and advice, he said. His investment firm, Walden International, has over $1.5 billion in assets under management, and funds startups via its $500 million venture fund Walden Catalyst, which he cofounded in 2021 with fellow chip industry veteran Young Sohn.

Nvidia investment alternatives

Tan is also a personal friend of both AMD CEO Lisa Su and Nvidia CEO Jensen Huang.

In fact, Tan told BI that when Huang told him many years ago that Nvidia would be a full-platform company with software just as valuable as its hardware, they argued.

"I said, 'No, you're semiconductor company.' And he said, 'No, you're completely wrong. I'm a software and system company,'" Tan recounted.

Today, Tan admits Huang was right. Nvidia's CUDA software is envied by many who try to compete with the $3 trillion juggernaut. Its 70% profit margins also make it look more like a software company than a chip firm.

But graphics processing units do have one problem. They are "power hungry," Tan said. That's where his attention is going when it comes to new companies looking to challenge Nvidia.

Tan personally invested in SambaNova Systems in 2018 for this reason.

"We can deliver the same performance at one-tenth of the power," he said of SambaNova.

Tan is also an investor in startup Rivos, which was founded in 2021. Both companies offer an alternative computing architecture to Nvidia and AMD's GPUs and claim to offer dramatic gains in power, speed, and cost-efficiency.

"Everybody is looking for an alternative. They are still going big time with Nvidia, but they are looking for 10%, or 15% of the different workloads that can use a better solution," he said.

That minority of the market, driven by concerns about cost and diversification, is where the opportunity lies for Nvidia alternatives, he said.

On top of startups, AMD and companies like Broadcom, Marvell, and Micron Technology also stand to grow as the need for compute expands, he said.

He is especially looking for opportunities in industries where AI may have an outsized impact. Backing smart investments in applied AI means knowing where the data is, according to Tan.

"AI is already a 60-year-old technology. But really, the big difference is the data, the massive, massive data that is starting to become available," he said.

"Whatever business you want to be in, you have to be close to the data. If you are close to the data, you have a very strong chance of success," he continued. Healthcare AI β€” identifying important biomarkers and medical drug discovery β€” is where he sees the most opportunity today.

"I'm heavily investing in medical," he said.

Read the original article on Business Insider

Trump announces an up to $500 billion AI infrastructure investment involving OpenAI, Oracle, and SoftBank

21 January 2025 at 15:31
OpenAI, Oracle, and SoftBank will work together to create a venture named Stargate.

Jim WATSON / AFP

  • President Trump announced a private sector AI infrastructure investment of up to $500 billion.
  • OpenAI, Oracle, and SoftBank will work together to create a venture named Stargate.
  • The US aims to maintain AI leadership against China as geopolitical stakes remain high.

President Donald Trump on Tuesday announced a private sector investment of up to $500 billion for artificial intelligence infrastructure across the country.

OpenAI, Oracle, and SoftBank will work together to create a venture named Stargate, with the president calling it "the largest AI infrastructure project in history" while touting the role of the US in leading the effort.

"Together these world-leading technology giants are announcing the formation of Stargate … a new American company that will invest $500 billion at least in AI infrastructure in the United States," he said.

"Put that name down in your books, because I think you're going to hear a lot about it," he added.

Trump said the venture would create more than 100,000 American jobs, with the investment set to be made over the next four years.

Oracle's stock rose close to 5% in after-hours trading, while SoftBank rose over 9% in Japan following the announcement.

OpenAI chief executive Sam Altman, Oracle cofounder Larry Ellison, and SoftBank chief executive Masayoshi Son attended the White House announcement.

Altman, in his remarks, said he was "thrilled" by the venture, calling it the "most important project of this era."

"The fact that we get to do this in the United States is just wonderful," he said. "I believe that as this technology progresses, we will see diseases get cured at an unprecedented rate."

And Ellison, at the White House, described Stargate as a "very exciting program for Oracle to be a part of."

AI is energy-hungry

For foundation models to keep improving, the companies that build them — like OpenAI and Anthropic — need lots of fuel.

That fuel comes in the form of chips, energy, and talent, and the Trump administration's policies on each stand to shape the future of computing, potentially creating winners and losers along the way.

The stakes are high. Growing AI in America isn't just a moneymaker; it's a geopolitical card played against the other big global tech player: China.

Trump on AI

Trump on Monday revoked the Biden administration's executive order on AI from October 2023. The order pushed for greater transparency from large companies developing and using AI.

After winning the 2024 presidential election, Trump appointed a new AI and cryptocurrency czar in venture capitalist David Sacks.

In his first term as president, Trump issued an executive order focused on AI leadership and non-regulatory approaches to expanding and maintaining it.

Chips

In his final weeks in office, President Joe Biden's Department of Commerce published new, sweeping export restrictions on the kind of semiconductors required for AI.

The regulations, set to go into effect early in Trump's second term, would restrict and cap the amount of computing power companies can amass in most countries outside a list of 18 allies.

If maintained by the Trump administration, the impact of these changes will likely be to concentrate AI data centers in the US.

Last week, Biden signed an executive order granting federal sites for the construction of data centers and "clean energy" projects.

Talent

The two ways AI companies acquire game-changing talent are in question as the US transitions from Biden to Trump.

The first is mergers and acquisitions. Nvidia, along with Microsoft and OpenAI, is the subject of Department of Justice and Federal Trade Commission investigations into possible antitrust violations.

The scrutiny created a cooling effect across the AI ecosystem. Trump's anti-regulation tendencies suggest a more open field for mergers and acquisitions in his term.

The second factor impacting the availability of AI talent in the US is the H-1B visa program. In recent weeks, there has been some disagreement within Trump's party regarding the program, which is designed to facilitate the immigration of skilled workers to the US.

Energy and data centers

AI also needs chips, but ensuring an ample supply of them is for naught without enough energy to run the data centers. Since AI chips are hungrier for power than traditional computer chips, energy is the most important limiting factor on the industry's growth.

"They have to produce a lot of electricity," the president said on Tuesday. "And we'll make it possible for them to get this production done easily, at their own plants if they want."

Trump's views on energy production and the climate crisis differ greatly from Biden's. The new president has long focused on continuing American leadership in fossil fuel production.

Shubhangi Goel contributed reporting.

Read the original article on Business Insider

Nvidia's autonomous car business is rising. Here's how it could make every car self-driving.

21 January 2025 at 01:00
Nvidia CEO Jensen Huang delivers a keynote address at the Consumer Electronics Show (CES) 2025 in Las Vegas on January 6, 2025.

Anadolu/Anadolu via Getty Images

  • Nvidia CEO Jensen Huang used CES 2025 as an opportunity to highlight autonomous vehicle tech.
  • Nvidia's Orin chips will power Toyota's driver-assist features in a new partnership.
  • The chip designer offers a "shot in the arm" for a floundering industry.

Nvidia CEO Jensen Huang says true self-driving cars require three advanced computers.

There's a computer to train the models to understand the world, a computer to run simulations that let those models practice encountering important but unlikely scenarios, and a computer inside the car itself.

Nvidia has strategically embedded itself in all three key steps that could make every car a self-driving car.

While segment leaders like Waymo and Tesla launch robotaxi fleets and enable drivers to scroll on X while their cars drive them to work, Nvidia approaches the market as an enabler — not a consumer brand.

The chip-design company is building upon a suite of functionalities that already power your car's advanced driver-assist systems, such as automatic lane keeping and adaptive cruise control.

With its long list of heavy-hitting automotive clients, including Toyota, Uber, and Hyundai, Nvidia is positioning itself to be the self-driving tech supplier to the automotive industry. To Huang, all these players are headed in the same direction.

"Every single car company will have to be autonomous, or you're not going to be a car company," Huang said at a fireside chat for financial analysts at this year's Consumer Electronic Show.

Huang made autonomous vehicle technology a centerpiece of his CES keynote speech, announcing confidently that self-driving cars aren't coming — they're already here.

"With Waymo's success and Tesla's success, it is very, very clear autonomous vehicles have finally arrived," said Huang onstage.

Self-driving stops and starts

Despite the recent preponderance of driverless Waymo rides, self-driving technology has been in limbo across the auto industry as carmakers cut costs and focus investments on more near-term technology like electric vehicles.

Many traditional carmakers are rethinking expensive autonomous technology development after decades of piecemeal progress and no clear path to profitability — ceding ground to tech-first players.

Who will win the chip war that lives inside the dashboards of most cars remains an open question. But after CES, analysts are much closer to calling the race.

Today's cars are chock-full of chips. Most are far less complicated than the kind needed to offload driving tasks to the computer. Nvidia's competition includes other automotive chip giants like Qualcomm and Israel's Mobileye, which develops microchips and other technologies for the automotive industry.

As AI converges with increased adoption of self-driving technology, Nvidia now appears to be taking the lead, according to Martin French, managing director at automotive consultancy Berylls.

Toyota, the world's largest automaker, will use Nvidia's Orin chips and automotive operating system to power its next generation of driver-assist features, Huang announced.

Orin is Nvidia's solution for putting the computing power and intelligence of AI inside a car. The system debuted in 2019 and has developed into a more all-encompassing solution over time.

Mercedes-Benz, China's BYD, and many luxury EV makers have also adopted Orin.

Most of these are not fully self-driving, but the long road between cruise control and realizing the dream of sleeping in the back seat while a car drives itself will have many stops along the way.

Winning Toyota's business is a big deal. McKinsey estimates that the assisted and autonomous driving market could be worth $400 billion by 2035. Nvidia forecasts a $5 billion run rate for its automotive business in fiscal year 2025, a fivefold increase from 2023.

Nvidia is also joining forces with trucking startup Aurora Innovation and automotive supplier Continental to deploy self-driving trucks — an announcement that sent Aurora's stock soaring 35% last week.

Beyond the data center

Tesla's self-driving technology, which Huang frequently lauds, is trained on Nvidia GPUs. However, the chips that run Tesla's Full Self-Driving software in its cars are designed in-house and manufactured by Samsung.

As far back as 2019, Tesla and Nvidia shared an understanding of the importance of accelerated computing. But they've been on-and-off partners. Today the partnership is very much on, with Huang and Musk regularly trading praise.

Cementing relationships with Toyota, Tesla, and Aurora puts Nvidia in a good position to be the primary supplier of self-driving technology to the automotive industry.

Few companies can provide chips for cars and also the chips to train the AI needed for self-driving capabilities.

Despite a two-hour CES keynote presentation spanning humanoid robots to AI laptops, Philips Capital analysts called Nvidia's automotive offering the "most significant" revelation at the tentpole event. Averaging less than 2% of total revenue in the first three quarters of 2024, Nvidia's automotive business still pales in comparison to its data center business.

On the company's February 2024 earnings call, CFO Colette Kress said $1 billion of the firm's data center revenue, which is reported separately from the automotive chip business, was attributable to automotive customers.

"They are absolutely positioning themselves as the leader for autonomous technologies, period," French said.

'A shot in the arm' for self-driving

In recent years, major car companies have abandoned their expensive self-driving car projects to focus on electric vehicles.

Ford and Volkswagen pulled funding from the now-defunct self-driving startup Argo AI in 2022, while GM said at the end of 2024 it would end its Cruise division's robotaxi development.

"We've had a lot of bad news around self-driving tech in the past few years β€” it's been quite downbeat," said French. "Nvidia has reversed that and just gave autonomous driving an absolute shot in the arm."

Investors were growing impatient with the drag on car companies' profits and lacked faith in legacy automakers' ability to develop software, French said. What it took to get investors back on board with self-driving tech was to hear it from a tech company.

"For Jensen β€” one of the leading people in tech β€” to get up onstage and tell everyone autonomous driving is here and robotics are just around the corner holds a lot of weight with investors," French said.

Huang is well-known for having an appetite for market-making β€” frequently saying he looks for "zero-billion-dollar markets" to simultaneously create and conquer. The AV market could yet be won by one company, according to French.

In a complex regulatory environment, the automotive industry often strives to find a single standard to follow on new tech. That usually creates a period of stiff competition as companies vie to develop the winning technology.

Take electric vehicle charging, for example.

For years, the industry couldn't agree on a single charger type, leading to mismatched plugs and ports for EV drivers searching for juice. But in 2023, the industry finally coalesced around the North American Charging Standard developed by Tesla.

Since AI has closed the technological gap between self-driving cars and robotics, the entire auto industry is about to find out what it's like to be part of Nvidia's next zero-billion-dollar market.

Read the original article on Business Insider

What President Joe Biden's last-minute chip export restrictions mean for Nvidia

13 January 2025 at 08:38
Jensen Huang in a leather jacket in front of a large window.
Nvidia CEO Jensen Huang.

Jeff Chiu/ AP Images

  • Biden's Commerce Department is issuing new semiconductor export rules affecting Nvidia.
  • The rules categorize countries for GPU export controls, impacting Nvidia's market.
  • Critics argue the rules may stifle AI innovation. Supporters say they will keep the US on top.

The Biden administration's Commerce Department released 168 pages of fresh regulations for the US semiconductor industry Monday that could drastically change Nvidia's year.

The new rules target exports of graphics processing units, the type of highly powerful chip made by Nvidia and challenger AMD. Global data centers are filling up with GPUs, and Nvidia has so far claimed an estimated 90% of that market.

Highly complex chips like GPUs are largely manufactured in Taiwan, but most of the companies that design them are based in the US, so their products fall within the Department of Commerce's jurisdiction.

"To enhance U.S. national security and economic strength, it is essential that we do not offshore this critical technology and that the world's AI runs on American rails," the White House's announcement reads, adding that advanced computing in the wrong hands can lead to "development of weapons of mass destruction, supporting powerful offensive cyber operations, and aiding human rights abuses, such as mass surveillance."

In response to previous export restrictions, Nvidia created a less powerful chip model just for the Chinese market to keep doing business there after the Biden administration changed the rules in 2022.

The new regulations go further β€” grouping countries into three categories and placing different export controls on each.

The first is a group of 18 allies to which GPUs can ship freely. These are Australia, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, the Republic of Korea, Spain, Sweden, Taiwan, and the United Kingdom.

The second group is listed as "countries of concern" where exports of the most advanced GPUs will be banned entirely. These are China, Hong Kong and Macau, Russia, Iran, North Korea, Venezuela, Nicaragua, and Syria.

All other countries would be subject to a cap of 100,000 GPUs. The rules lay out a verification process for larger orders, in which the businesses looking to set up larger clusters in these countries would need US government approval to do so.

The administration said the regulations had provisions that would keep small orders of chips flowing to research institutions and universities.

Nvidia has opposed the regulation along with The Semiconductor Industry Association.

"While cloaked in the guise of an "anti-China" measure, these rules would do nothing to enhance U.S. security," Ned Finkle, Nvidia's VP of government affairs wrote in a statement on the company's website.

Impact on Nvidia

Any restriction on the sale of GPUs anywhere is bound to hit Nvidia's sales.

"The Biden Administration now seeks to restrict access to mainstream computing applications with its unprecedented and misguided "AI Diffusion" rule, which threatens to derail innovation and economic growth worldwide," Finkle wrote.

But will the regulations dampen sales or shift them?

Chris Miller, the author of "Chip War" and a leading expert on the semiconductor industry, told Business Insider he was uncertain whether the overall volume of GPUs sold would be substantially affected, since demand for Nvidia's products is so high.

"I suspect that these rules will generally have the impact of shifting data center construction toward US firms," Miller said.

If demand does go down, "it would change due to a reduction of GPU demand from countries or companies that are unwilling to rely on US cloud providers," Miller said.

The drafted rules had been circulating ahead of the Monday announcement and reactions from tech leaders have been fierce.

Oracle EVP Ken Glueck blogged about them for the first time in mid-December and again in early January.

Both Finkle and Glueck zeroed in on the country caps as the most consequential element introduced.

"The extreme 'country cap' policy will affect mainstream computers in countries around the world, doing nothing to promote national security but rather pushing the world to alternative technologies," Finkle said in an emailed statement Friday.

It is particularly notable that Singapore, Mexico, Malaysia, the UAE, Israel, Saudi Arabia, and India are not in the unrestricted tier of countries, Glueck noted.

The exclusion of several Middle East countries could seriously change the course of the global AI infrastructure buildout, Miller said.

"The primary impact of these controls is that they make it much more likely that the most advanced AI systems are trained in the US as opposed to the Middle East," Miller said.

"Without these controls, wealthy Middle Eastern governments would have succeeded to some degree in convincing U.S. firms to train high-end AI systems in the Middle East by offering subsidized data centers. Now this won't be possible, so US firms will train their systems in the US," Miller said.

Glueck wrote that country quotas were the worst concept within the draft regulations, which will be formally published Wednesday, according to the Federal Register.

"Controlling GPUs makes no sense when you can achieve parity by simply adding more, if less-powerful, GPUs to solve the problem," Oracle's Glueck wrote in December. "The problem with this proposal is it assumes there are no other non-U.S. suppliers from which to procure GPU technology," he continued.

Republican support

The fate of the Biden administration's unprecedented export-control rules is uncertain given their timing.

The Monday statement from Nvidia's Finkle referenced the Trump administration, stating that in his first term, Trump "laid the foundation for America's current strength and success in AI."

The new rules are subject to a 120-day comment period before they are enforceable. President Biden will have left office when they are set to take effect.

Though they stemmed from an outgoing Democratic administration, the rules do have some support on the President-elect's side of the aisle.

Republican Congressman John Moolenaar and Democratic Congressman Raja Krishnamoorthi, the chair and ranking member of the House Select Committee on the Chinese Communist Party, are in favor of the framework.

"GPUs, or any country that hosts Huawei cloud computing infrastructure should be restricted from accessing the model weights of closed-weight dual-use AI models," the two legislators published in a written statement.

Matt Pottinger, who served on the National Security Council in Trump's first term and is now chairman of the China program at the Foundation for Defense of Democracies, penned an op-ed with Anthropic CEO Dario Amodei, published in The Wall Street Journal on January 6. They suggested that the existing export restrictions have been successful but still leave room for China to set up data centers in friendly third-party countries, so more restrictions are needed.

"Skeptics of these restrictions argue that the countries and companies to which the rules apply will simply switch to Chinese AI chips. This argument overlooks that U.S. chips are superior, giving countries an incentive to follow U.S. rules," Pottinger and Amodei wrote.

"Countries that want to reap the massive economic benefits will have an incentive to follow the U.S. model rather than use China's inferior chips," they continued.

Miller said the fact that China is still purchasing Nvidia's "defeatured" GPUs is sign enough that locally designed chips are not yet competitive.

"So long as China's importing US GPUs, it won't be able to export, in which case these controls will be effective because there is no alternative source of high end GPUs," Miller said.

But Huawei is catching up, said Alvin Nguyen, senior analyst at Forrester. Additional US export controls could speed that work up in his view.

"They've caught up to one generation behind Nvidia," said Nguyen.

Another concern is that restricting the flow of advanced chips could keep the economic opportunity of AI from spreading equally around the globe.

"If you're not working with the best infrastructure, the best models, you may not be able to leverage the data that you do have β€” creating the haves and have nots," Nguyen said.

Read the original article on Business Insider

The pros and cons of making advanced chips in America

8 January 2025 at 02:00
An Asian man presses his face against a clear box holding a computer chip
As AI chip designs diversify beyond Nvidia's GPU, US semiconductor fabs press their noses up against the window of the AI boom.

AP Photo/Ng Han Guan

  • Most AI chips are made in Taiwan by Taiwan Semiconductor Manufacturing Company.
  • Startups focused on lowering the cost of AI are working with US manufacturers.
  • AI chips are being made at fabrication facilities in New York and Arizona.

Attempting to compete with Nvidia is daunting, especially when it comes to manufacturing.

Nvidia and most of its competitors don't produce their own chips. They vie for capacity from the world's most advanced chip fabricator: Taiwan Semiconductor Manufacturing Company. Nvidia may largely control which companies get the latest and most powerful computing machines, but TSMC decides how many chips Nvidia can sell. The relationship between the two companies fascinates the industry.

But the bottom line is that there's no better manufacturer, and there's no getting ahead of Nvidia in line for the types of manufacturing capacity relevant to AI.

Still, a few startups think they can find an advantage amid Nvidia's dominance and the ever-fluctuating dynamics surrounding the island nation of Taiwan by tapping chip fabs in the United States.

Positron AI, founded by Thomas Sohmers in 2023, has designed a chip architecture optimized for transformer models β€” the kind on which OpenAI's GPT models are built. With faster access to more memory, Sohmers claims Positron's architecture can compete on performance and price for AI inference, which is the computation needed to produce an answer to a query after a model has been trained.

Positron's system has "woefully less FLOPS" than an Nvidia GPU, Sohmers joked. However, his architecture is intended to compensate for this with efficiency for Positron and its customers.

Smaller fabs are 'hungrier'

Positron's chips are made in Chandler, Arizona, by the Intel-owned firm Altera.

Intel acquired Altera, which specializes in a specific type of programmable chip, in 2015. In 2023, some early Positron employees and advisors came from Altera β€” bringing relationships and trust. The early partnership has given Positron some small influence over Altera's path and a cheaper, more flexible manufacturing partner.

The cost of AI comes from the chip itself and the power needed to make it work. Cutting costs on the chip means looking away from TSMC, which currently holds seemingly infinite bargaining power, Sohmers says.

"Fundamentally, Positron is trying to provide the best performance per dollar and performance per watt," Sohmers said.

Compared to other industries, AI offers a rare proposition: US production is often cheaper.

"In most other industries, made in the USA actually means that it's going to be more expensive. That's not the case for semiconductors β€” at least for now," Sohmers said.

Many fabs are eager to enter the AI game, but they don't have the same technical prowess, prestige, or track record, which can make finding customers challenging.

Startups, which often lack the high order volumes that carry market power, are a good fit for these fabs, Sohmers said. These less in-demand fabs offer more favorable terms, too, which Sohmers hopes will keep Positron competitive on price.

"If I have some optionality going with someone that is behind but has the ambition to get ahead, it's always good from a customer or partner perspective," he said, adding, "It gives both leverage."

Taking advantage of US fabs has kept the amount of funding Positron needs within reason and made it easier to scale, Sohmers said.

Positron isn't alone. Fellow Nvidia challenger Groq partners with GlobalFoundries in upstate New York and seeks to make a similar dent in the AI computing market by offering competitive performance at a lower price.

Less inherent trust

It's not all upside though. Some investors have been skeptical, Sohmers said. And as an engineer, not going with the best fab in the world can feel strange.

"You have a lot more faith that TSMC is going to get to a good yield number on a new design pretty quickly and that they have a good level of consistency while, at other fabs, it can be kind of a dice roll," he said.

With a global supply chain, no semiconductor is immune from geopolitical turmoil or the shifting winds of trade policy. So, the advantages of exiting the constantly simmering tension between Taiwan, China, and the US serve as a counterweight to any skepticism.

Positron is also working on sourcing more components and materials in North America, or at least outside China and Taiwan.

Sourcing from Mexico, for example, offers greater safety from geopolitical turmoil. The simpler benefit is that shipping is faster so prototyping can happen quickly.

It's taken a while, but Sohmers said the industry is waking up to the need for more players across the AI space.

"People are finally getting uncomfortable with Nvidia having 90-plus percent market share," he said.

Got a tip or an insight to share? Contact BI's senior reporter Emma Cosgrove at [email protected] or use the secure messaging app Signal: 443-333-9088.

Read the original article on Business Insider

Chip startups are making these New Year's resolutions to take on Nvidia in 2025

27 December 2024 at 02:00
Jensen Huang speaking on stage
Nvidia CEO Jensen Huang.

Chip Somodevilla/Getty Images

  • The AI computing market may shift in 2025, opening opportunities for smaller companies.
  • Nvidia dominates AI computing. Evolving workloads could benefit competitors.
  • Companies like Groq, Positron, and SambaNova focus on inference to challenge Nvidia's market hold.

In 2025, the tides may turn for companies hoping to compete with the $3 trillion gorilla in AI computing.

Nvidia holds an estimated 90% of the market share for AI computing. Still, as the use of AI grows, workloads are expected to change, and this evolution may give companies with competitive hardware an opening.

In 2024, the majority of AI compute spend shifted to inference, Thomas Sohmers, CEO of chip startup Positron AI, told BI. This will "continue to grow on what looks like an exponential curve," he added.

In AI, inference is the computation needed to produce the response to a user's query or request. The computing required to teach the model the knowledge needed to answer is called "training." Creating OpenAI's image generation platform Sora, for example, represents training. Each user who instructs it to create an image represents an inference workload.
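
To make the distinction concrete, here is a minimal sketch in PyTorch. The tiny model below is purely illustrative and stands in for something far larger: training adjusts weights over many passes, while inference simply runs the frozen model once per request.

```python
# A toy illustration of training vs. inference, using PyTorch.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # stand-in for a large model
opt = torch.optim.SGD(model.parameters(), lr=0.01)

# Training: repeatedly adjust the model's weights (compute-heavy, done up front).
for _ in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()  # gradient computation is what makes training expensive
    opt.step()

# Inference: run the frozen model on each new request (done once per query).
model.eval()
with torch.no_grad():
    answer = model(torch.randn(1, 10))
```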

OpenAI's other models have Sohmers and others excited about the growth in computing needs in 2025.

OpenAI's o1 and o3, Google's Gemini 2.0 Flash Thinking, and a handful of other AI models use more compute-intensive strategies to improve results after training. These strategies are often called inference-time computing, chain-of-thought, chain-of-reasoning, or reasoning models.

Simply put, if the models think more before they answer, the responses are better. That thinking comes at a cost of time and money.
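
One common version of "thinking more" is to sample several reasoning chains and keep the majority answer, a technique often called self-consistency. The sketch below is illustrative only: generate() is a hypothetical stand-in for a real model call, stubbed with canned answers so the example runs, and each extra sample multiplies the inference bill.

```python
# Illustrative sketch of inference-time scaling via majority voting.
import random
from collections import Counter

def generate(prompt: str) -> str:
    """Hypothetical model call; stubbed so the sketch runs without an API."""
    return random.choice(["42", "42", "41"])  # imagine sampled chains of thought

def answer_with_more_thinking(prompt: str, n_samples: int = 16) -> str:
    # Sampling n reasoning chains costs roughly n times the compute...
    answers = [generate(prompt) for _ in range(n_samples)]
    # ...and a majority vote over the final answers tends to improve accuracy.
    return Counter(answers).most_common(1)[0][0]

print(answer_with_more_thinking("What is 6 x 7?"))
```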

The startups vying for some of Nvidia's market share are attempting to optimize one or both.

Nvidia already benefits from these innovations, CEO Jensen Huang said on the company's November earnings call. Huang's wannabe competitors are betting that in 2025, new post-training strategies for AI will benefit all purveyors of inference chips.

Business Insider spoke to three challengers about their hopes and expectations for 2025. Here are their New Year's resolutions.

What's one thing within your control that could make 2025 a big year for alternative chips?

A tattooed man in a black shirt and jeans stands on a stage with a pink and black background that read Groq: what's next?
Mark Heaps is the chief technology evangelist for Nvidia challenger Groq.

Groq

Mark Heaps, chief technology evangelist, Groq:

"Execution, execution, execution. Right now, everybody at Groq has decided not to take a holiday break this year. Everyone is executing and building the systems. We are all making sure that we deliver to the opportunity that we've got because that is in our control.

I tell everyone our funnel right now is carbonated and bubbling over. It's unbelievable, the amount of customer interest. We have to build more systems, and we have to stand up those systems so we can serve the demand that we've got. We want to serve all those customers. We want to increase rate limits for everybody."

Rodrigo Liang, CEO, SambaNova Systems:

"For SambaNova, the most critical factor is executing on the shift from training to inference. The industry is moving rapidly toward real-time applications, and inference workloads are becoming the lion's share of AI demand. Our focus is on ensuring our technology enables enterprises to scale efficiently and sustainably."

Thomas Sohmers, CEO, Positron:

"My belief is if we can actually deploy enough compute β€” which thankfully I think we can from a supply chain perspective β€” by deploying significantly more inference-specific compute, we're going to be able to grow the adoption rate of 'chain of thoughts' and other inference-additional compute."

What's one thing you're hoping for that's not in your control for 2025?

Rodrigo Liang SambaNova Systems
Rodrigo Liang, CEO and cofounder of SambaNova Systems.

SambaNova Systems

Heaps:

"It's about customers recognizing that there are novel advancements against incumbent technologies. There's a lot of folks that have told us, 'We like what you have, but to use the old adage and rephrase it: No one ever got fired for buying from β€” insert incumbent.'

But we know that it's starting to boil up. People are realizing it's hard for them to get chips from the incumbent, and it's also not as performant as Groq is. So my wish would be that people are willing to take that chance and actually look to some of these new technologies."

Liang:

"If I had a magic wand, I'd address the power challenges around deploying AI. Today, most of the market is stuck using power-hungry hardware that wasn't designed for inference at scale. The result is an unsustainable approach β€” economically and environmentally.

At SambaNova, we've proven there's a better way. Our architecture consumes 10 times less power, making it possible for enterprises to deploy AI systems that meet their goals without blowing past their power budgets or carbon targets. I'd like to see the market move faster toward adopting technologies that prioritize efficiency and sustainability β€” because that's how we ensure AI can scale globally without overwhelming the infrastructure that supports it."

Sohmers:

"I would like people to actually adopt these chain of thought capabilities at the fastest rate possible. I think that is a huge shift β€” from a capabilities perspective. You have 8 billion parameter models surpassing 70 billion parameter models. So I'm trying to do everything I can to make that happen."

What's your New Year's resolution?

Positron AI executives stand near the startup's products
Positron AI executives. From left to right: Edward Kmett, Thomas Sohmers, Adam Huson, and Greg Davis.

Positron AI

Heaps:

"In the last six months, I've gone to a number of hackathons, and I've met developers. It's deeply inspiring. So my New Year's resolution is to try to amplify the signal of the good that people are doing with AI."

Liang:

"Making time for music. Playing guitar is something I've always loved, and I would love to get back into it. Music has this incredible way of clearing the mind and sparking creativity, which I find invaluable as we work to bring SambaNova's AI to new corners of the globe."

Sohmers:

"I want to do as much to encourage the usage of these new tools to help, you know, my mom. Part of the reason I got into technology was because I wanted to see these tools lift up people to be able to do more with their time β€” to learn everything that they want beyond whatever job they're in. I think that bringing the cost down of these things will enable that proliferation.

I also personally want to see and try to use more of these things outside of just my work context because I've been obsessively using the o1 Pro model for the past few weeks, and it's been amazing for my personal work. But when I gave access to my mom, what she would do with it was pretty interesting β€” those sort of normal, everyday-person tasks where it truly is being an assistant."

Read the original article on Business Insider

Groq is 'unleashing the beast' to chip away at Nvidia's CUDA advantage

18 December 2024 at 02:00
A tattooed man in a black shirt and jeans stands on a stage with a pink and black background that read Groq: what's next?
Mark Heaps is the chief technology evangelist for Nvidia challenger Groq

Groq

  • Groq is taking a novel approach to competing with Nvidia's much-lauded CUDA software.
  • The chip startup is using a free inference tier to attract hundreds of thousands of AI developers.
  • Groq aims to capture market share with faster inference and global joint ventures.

There is an active debate about Nvidia's competitive moat. Some say it's the prevailing perception of Nvidia as the 'safe' choice when investing billions in a technology whose return is still uncertain.

Many say it's Nvidia's software, particularly CUDA, which the company began developing decades before the AI boom. CUDA allows users to get the most out of graphics processing units.

Competitors have attempted to make comparable systems, but without Nvidia's headstart, it has been tough to get developers to learn, try, and ultimately improve their systems.

Groq, however, is an Nvidia competitor that focused early on the segment of AI computing that requires less direct programming of chips, and investors are intrigued. The 8-year-old AI chip startup was valued at $2.8 billion at its $640 million Series D round in August.

Though at least one investor has called companies like Groq 'insane' for attempting to dent Nvidia's estimated 90% market share, the startup has been building its technology exactly for the opportunity coming in 2025, said Mark Heaps, Groq's "chief tech evangelist."

'Unleashing the beast'

"What we decided to do was take all of our compute, make it available via a cloud instance, and we gave it away to the world for free," Heaps said. Internally, the team called the strategy, "unleashing the beast". Groq's free tier caps users at a ceiling marked by requests per day or tokens per minute.

Heaps, CEO and ex-Googler Jonathan Ross, and a relatively lean team have spent 2023 and 2024 recruiting developers to try Groq's tech. Through hackathons and contests, the company makes a promise β€” try the hardware via Groq's cloud platform for free, and break through walls you've hit with others.

Groq offers some of the fastest inference out there, according to rankings on Artificialanalysis.ai, which measures cost and latency for companies that allow users to buy access to specific models by the token β€” or output.

Inference is a type of computing that produces the answers to queries asked of large language models. Training, the more energy-intensive type of computing, is what gives the models the ability to answer. So far, the hardware used for those two tasks has been different.

Heaps and several of his Nvidia-challenging cohorts at companies like Cerebras and SambaNova Systems said that speed is a competitive advantage.

After the inference service was available for free, developers came out of the woodwork, he said, with projects that couldn't be successful on slower chips. With more speed, developers can send one request through multiple models and use another model to choose the best response β€” all in the time it would usually take to fulfill just one request.
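
As a rough illustration of that pattern, the sketch below fans one prompt out to several models and lets a judge pick a winner. Every function here is a hypothetical stand-in rather than Groq's actual API; in practice, each call would hit a hosted inference endpoint, and fast chips are what let all the round trips fit inside one request's latency budget.

```python
# Hypothetical sketch of the fan-out-and-judge pattern Heaps describes.
def query_model(model: str, prompt: str) -> str:
    """Stand-in for a completion call to a hosted model."""
    return f"[{model}] answer to: {prompt}"

def judge(candidates: list[str]) -> str:
    """Stand-in for a judge; a real one would itself be a model call."""
    return max(candidates, key=len)  # placeholder selection rule

def best_of_models(prompt: str, models: list[str]) -> str:
    candidates = [query_model(m, prompt) for m in models]  # fan-out step
    return judge(candidates)                               # selection step

print(best_of_models("Summarize this contract.", ["model-a", "model-b", "model-c"]))
```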

Roughly 652,000 developers are now using Groq API keys, Heaps said.

Heaps expects speed to hook developers on Groq. But its novel plan for programming its chips gives the company a unique approach to the most crucial element within Nvidia's "moat."

No need for CUDA libraries

"Everybody, once they deployed models, was gonna need faster inference at a lower cost, and so that's what we focused on," Heaps said.

So where's the CUDA equivalent? It's all in-house.

"We actually have more than 1800 models built into our compiler. We use no kernels, and we don't need people to use CUDA libraries. So because of that, people can just start working with a model that's built-in," Heaps said.

Training, he said, requires more customization at the chip level. In inference, Groq's task is to choose the right models to offer customers and ensure they run as fast as possible.

"What you're seeing with this massive swell of developers who are building AI applications β€” they don't want to program at the chip level," he added.

The strategy comes with some level of risk. Groq is unlikely to accumulate a stable of developers who continuously troubleshoot and improve its base software like CUDA has. Its offering may be more like a restaurant menu than a grocery store. But this also means the barrier to entry for Groq users is the same as any other cloud provider and potentially lower than that of other chips.

Though Groq started out as a company with a novel chip design, today, of the company's roughly 300 employees, 60% are software engineers, Heaps said.

"For us right now, there is a billions and billions of dollars industry emerging, that we can go capture a big share of market in, while at the same time, we continue to mature the compiler," he said.

Despite being realistic about the near term, Groq has lofty ambitions, which CEO Jonathan Ross has described as "providing half the world's inference." Ross also says the goal is to cast a net over the globe β€” to be achieved via joint ventures. Saudi Arabia is on the way. Canada and Latin America are in the works.

Earlier this year, Ross told BI the company also has a goal to ship 108,000 of its language processing units, or LPUs, by the first quarter of next year β€” and 2 million chips by the end of 2025, most of which will be made available through its cloud.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

A chip company you probably never heard of is suddenly worth $1 trillion. Here's why, and what it means for Nvidia.

18 December 2024 at 01:00
Broadcom CEO Hock Tan speaking at a conference
Broadcom CEO Hock Tan

Ying Tang/NurPhoto via Getty Images

  • Broadcom's stock surged in recent weeks, pushing the company's market value over $1 trillion.
  • Broadcom is crucial for companies seeking alternatives to Nvidia's AI chip dominance.
  • Custom AI chips are gaining traction, enhancing tech firms' bargaining power, analysts say.

The rise of AI, and the computing power it requires, is bringing all kinds of previously under-the-radar companies into the limelight. This week it's Broadcom.

Broadcom's stock has soared since late last week, catapulting the company into the $1 trillion market cap club. The boost came from a blockbuster earnings report in which custom AI chip revenue grew 220% compared to last year.

In addition to selling lots of parts and components for data centers, Broadcom designs and sells ASICs, or application-specific integrated circuits β€” an industry acronym meaning custom chips.

Designers of custom AI chips, chief among them Broadcom and Marvell, are headed into a growth phase, according to Morgan Stanley.

Custom chips are picking up speed

The biggest players in AI buy a lot of chips from Nvidia, the $3 trillion giant with an estimated 90% of market share of advanced AI chips.

Heavily relying on one supplier isn't a comfortable position for any company, though, and many large Nvidia customers are also developing their own chips. Most tech companies don't have large teams of silicon and hardware experts in-house. Of the companies they might turn to for designing a custom chip, Broadcom is the leader.

Though multi-purpose chips like Nvidia's and AMD's graphics processing units are likely to maintain the largest share of the AI chip market in the long term, custom chips are growing fast.

Morgan Stanley analysts this week forecast the market for ASICs to nearly double to $22 billion next year.

Much of that growth is attributable to Amazon Web Services' Trainium AI chip, according to Morgan Stanley analysts. Then there are Google's in-house AI chips, known as TPUs, which Broadcom helps make.

In terms of actual value of chips in use, Amazon and Google dominate. But OpenAI, Apple, and TikTok parent company ByteDance are all reportedly developing chips with Broadcom, too.

ASICs bring bargaining power

Custom chips can offer more value, in terms of the performance you get for the cost, according to Morgan Stanley's research.

ASICs can also be designed to perfectly match unique internal workloads for tech companies, according to the bank's analysts. The better these custom chips get, the more bargaining power they may provide when tech companies are negotiating with Nvidia over buying GPUs. But this will take time, the analysts wrote.

In addition to Broadcom, Silicon Valley neighbor Marvell is making gains in the ASICs market, along with Asia-based players Alchip Technologies and Mediatek, they added in a note to investors.

Analysts don't expect custom chips to ever fully replace Nvidia GPUs, but without them, cloud service providers like AWS, Microsoft, and Google would have much less bargaining power against Nvidia.

"Over the long term, if they execute well, cloud service providers may enjoy greater bargaining power in AI semi procurement with their own custom silicon," the Morgan Stanley analysts explained.

Nvidia's big R&D budget

This may not be all bad news for Nvidia. A $22 billion ASICs market is smaller than Nvidia's revenue for just one quarter.

Nvidia's R&D budget is massive, and many analysts are confident in its ability to stay at the bleeding edge of AI computing.

And as Nvidia rolls out new, more advanced GPUs, its older offerings get cheaper and potentially more competitive with ASICs.

"We believe the cadence of ASICs needs to accelerate to stay competitive to GPUs," the Morgan Stanley analysts wrote.

Still, Broadcom and chip manufacturers on the supply chain rung beneath, such as TSMC, are likely to get a boost every time a giant cloud company orders up another custom AI chip.

Read the original article on Business Insider

Intel co-CEOs discuss splitting product and manufacturing businesses

12 December 2024 at 14:56
Intel in an eye
Intel.

Intel; Getty Images; Chelsea Jia Feng/BI

  • Intel's co-CEOs discussed splitting the firm's manufacturing and products businesses Thursday.
  • A separation could address Intel's poor financial performance. It also has political implications.
  • Intel Foundry is forming a separate operational board in the meantime, executives said.

Intel's new co-CEOs said the company is creating more separation between its manufacturing and products businesses and the possibility of a formal split is still in play.

When asked if separating the two units was a possibility and if the success of the company's crucial, new "18A" process could influence the decision, CFO David Zinsner and CEO of Intel Products Michelle Johnston Holthaus, now interim co-CEOs, said preliminary moves are in progress.

"We really do already run the businesses fairly independently," Holthaus said at a Barclays tech conference Thursday. She added that severing the connection entirely does not make sense in her view, "but, you know, someone will decide that," she said.

Ousted CEO Pat Gelsinger prioritized keeping the fabs as part of Intel proper. The fabs hold important geopolitical significance to both Intel and the US. The manufacturing part of the business also weighs on the company's financial results.

"As far as does it ever fully separate? I think that's an open question for another day," Zinsner said.

Already in motion

Though the co-CEOs made it clear a final decision on a potential break-up has not been made, Zinsner outlined a series of moves already in progress that could make a split easier.

"We already run the businesses separately, but we are going down the path of creating a subsidiary for Intel Foundry as part of the overall Intel company," Zinsner said.

In addition, the company is forming a separate operational board for Intel Foundry and separating the operations and inventory management software for the two sides of the business.

Until a permanent CEO is appointed by the board, the co-CEOs will manage most areas of the company together, but Zinsner alone will manage the Foundry business. The foundry aims to build a contract manufacturing business for other chip designers. Due to the sensitive, competitive intellectual property coming from clients into that business, separation is key.

"Obviously, they want firewalls. They want to protect their IPs, their product road maps, and so forth. So I will deal with that part of the foundry to separate that from the Intel Products business." Zinsner said.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088.

Read the original article on Business Insider

Will the world's fastest supercomputer please stand up?

11 December 2024 at 06:57
TRITON Supercomputer_13
TRITON Supercomputer at the University of Miami

T.J. Lievonen

  • Oracle and xAI love to flex the size of their GPU clusters.
  • It's getting hard to tell who has the most supercomputing power as more firms claim the top spot.
  • The real numbers are competitive intel and cluster size isn't everything, experts told BI.

In high school, as in tech, superlatives are important. Or maybe they just feel important in the moment. With the breakneck pace of the AI computing infrastructure buildout, it's becoming increasingly difficult to keep track of who has the biggest, fastest, or most powerful supercomputer β€” especially when multiple companies claim the title at once.

"We delivered the world's largest and fastest AI supercomputer, scaling up to 65,000 Nvidia H200 GPUs," Oracle CEO Safra Catz and Chairman, CTO, echoed by Founder Larry Ellison on the company's Monday earnings call.

In late October, Nvidia proclaimed xAI's Colossus as the "World's Largest AI Supercomputer," after Elon Musk's firm reportedly built a computing cluster with 100,000 Nvidia graphics processing units in a matter of weeks. The plan is to expand to 1 million GPUs next, according to the Greater Memphis Chamber of Commerce (where the supercomputer is located).

The good ole days of supercomputing are gone

It used to be simpler. "Supercomputers" were most commonly found in research settings. Naturally, there's an official list ranking supercomputers. As of the most recent ranking, the world's most powerful supercomputer is El Capitan. Housed at the Lawrence Livermore National Laboratory in California, its 11 million CPUs and GPUs from Nvidia rival AMD add up to 1.742 exaflops of computing capacity. (One exaflop is equal to one quintillion, or a billion billion, operations per second.)

"The biggest computers don't get put on the list," Dylan Patel, chief analyst at Semianalysis, told BI. "Your competitor shouldn't know exactly what you have," he continued. The 65,000-GPU supercluster Oracle executives were praising can reach up to 65 exaflops, according to the company.

It's safe to assume, Patel said, that Nvidia's largest customers (Meta, Microsoft, and xAI) also have the largest, most powerful clusters. On Nvidia's May earnings call, CFO Colette Kress said 200 fresh exaflops of Nvidia computing would be online by the end of this year, across nine different supercomputers.

Going forward, it's going to be harder to determine whose clusters are the biggest at any given moment β€” and even harder to tell whose are the most powerful β€” no matter how much CEOs may brag.

It's not the size of the cluster β€” it's how you use it

On Monday's call, Ellison was asked if the size of these gigantic clusters actually generates better model performance.

He said larger clusters and faster GPUs are elements that speed up model training. Another is networking it all together. "So the GPU clusters aren't sitting there waiting for the data," Ellison said Monday.

Thus, the number of GPUs in a cluster isn't the only factor in the computing-power calculation. Networking and programming are important too. "Exaflops" are a result of the whole package, so unless companies provide them, experts can only estimate.

What's certain is that more advanced models β€” the kind that consider their own thinking and check their work before answering queries β€” require more compute than their relatives of earlier generations. So training increasingly impressive models may indeed require an arms race of sorts.

But an enormous AI arsenal doesn't automatically lead to better or more useful tools.

Sri Ambati, CEO of the open-source AI platform H2O.ai, said cloud providers may want to flex their cluster size for sales reasons, but given some (albeit slow) diversification of AI hardware and the rise of smaller, more efficient models, cluster size isn't the be-all and end-all.

Power efficiency, too, is a hugely important indicator for AI computing, since energy is an enormous operational expense in AI. But it gets lost in the measuring contest.

Nvidia declined to comment. Oracle did not respond to a request for comment in time for publication.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088.

Read the original article on Business Insider

Intel's next CEO needs to decide the fate of its chip fabs

3 December 2024 at 14:54
Intel CEO Pat Gelsinger holding up a chip at a US Senate hearing
Former Intel CEO Pat Gelsinger holding up a chip at a US Senate hearing.

Getty

  • Intel's CEO departure reignited debate on splitting its factories from the company.
  • Intel's fabs are costly, but they're also considered vital for US national security.
  • CHIPS Act funding requires Intel to maintain majority control of its foundry.

One central question has been hanging over Intel for months: Should the 56-year-old Silicon Valley legend separate its chip factories, or fabs, from the rest of the company?

Intel's departing CEO, Pat Gelsinger, has opposed that strategy. As a longtime champion of the company's chip manufacturing efforts, he was reluctant to split it.

The company has taken some steps to look into this strategy. Bloomberg reported in August that Intel had hired bankers to help consider several options, including splitting off the fabs from the rest of Intel. The company also announced in September that it would establish its Foundry business as a separate subsidiary within the company.

Gelsinger's departure from the company, announced Monday, has reopened the question, although the calculus is more complicated than simple dollars and cents.

Splitting the fabs from the rest of its business could help Intel improve its balance sheet. It likely won't be easy since Intel was awarded $7.9 billion in CHIPS and Science Act funding, and it's required to maintain majority control of its foundries.

Intel declined to comment for this story.

A breakup could make Intel more competitive

Politically, fabs are important to Intel's place in the American economy and allow the US to reduce dependence on foreign manufacturers. At the same time, they drag down the company's balance sheet. Intel's foundry, the line of business that manufactures chips, has posted losses for years.

Fabs are immensely hard work. They're expensive to build and operate, and they require a level of precision beyond most other types of manufacturing.

Intel could benefit from a split, and the company maintains meaningful market share in its computing and traditional (not AI) data center businesses. Amid the broader CEO search, Intel also elevated executive Michelle Johnston Holthaus to CEO of Intel Products and interim co-CEO of the company. Analysts said this could better set up a split.

Regardless, analysts said finding new leadership for the fabs will be challenging.

"The choice for any new CEO would seem to center on what to do with the fabs," Bernstein analysts wrote in a note to investors after the announcement of Gelsinger's departure.

On one hand, the fabs are "deadweight" for Intel, the Bernstein analysts wrote. On the other hand, "scrapping them would also be fraught with difficulties around the product road map, outsourcing strategy, CHIPS Act and political navigation, etc. There don't seem to be any easy answers here, so whoever winds up filling the slot looks in for a tough ride," the analysts continued.

Intel's competitors and contemporaries are avoiding the hassle of owning and operating a fab. The world's leading chip design firm, Nvidia, outsources all its manufacturing. Its runner-up, AMD, experienced similar woes when it owned fabs, eventually spinning them out in 2009.

Intel has also outsourced some chip manufacturing to rival TSMC in recent years β€” which sends a negative signal to the market about its own fabs.

Intel is getting CHIPS Act funding

Ownership of the fabs and CHIPS Act funding are highly intertwined. Intel must retain majority control of the foundry to continue receiving CHIPS Act funding and benefits, a November regulatory filing said.

Intel could separate its foundry business while maintaining majority control, said Dan Newman, CEO of The Futurum Group. Still, the CHIPS Act remains key to Intel's future.

"If you add it all up, it equates to roughly $40 billion in loans, tax exemptions, and grants β€” so quite significant," said Logan Purk, a senior research analyst at Edward Jones.

"Only a small slice of the commitment has come, though," he continued.

Intel's fabs need more customers

Intel is attempting to move beyond manufacturing its own chips to becoming a contract manufacturer. Amazon has already signed on as a customer. Though bringing in more manufacturing customers could mean more revenue, it first requires more investment.

There's a less tangible reason Intel might want separation between its Foundry and its chip design businesses, too. Foundries regularly deal with many competing clients.

"One of the big concerns for the fabless designers is any sort of information leakage," Newman said.

"The products department competes with many potential clients of the foundry. You want separation," he added.

It was once rumored that a third party might buy Intel. Analysts have balked at the prospect for political and financial reasons, particularly since running the fabs is a major challenge.

Read the original article on Business Insider

In an all-hands meeting, Intel's new leaders emphasized outgoing CEO Pat Gelsinger's 'personal decision'

2 December 2024 at 15:27
Pat Gelsinger gestures in front of a large screen that reads, " It starts with Intel."
Intel CEO Pat Gelsinger delivers a speech at Taipei Nangang Exhibition Center during Computex 2024, in Taipei on June 4, 2024.

I-Hwa CHENG / AFP

  • Intel CEO Pat Gelsinger is out of the top spot after a challenging 4-year tenure.
  • The company's interim co-CEOs addressed the workforce Monday morning in an all-hands meeting.
  • One Intel employee described the responses to questions as "vague" and the tone of the meeting as "damage control".

On Monday morning, Intel employees joined an all-hands meeting after receiving an email invite at 5 a.m. PT.

Accompanying the invite was the news that the company's CEO Pat Gelsinger had stepped down as of Sunday, and would be temporarily replaced by co-CEOs David Zinsner, Intel's chief financial officer for nearly three years, and Michelle Johnston Holthaus, the new CEO of product.

Gelsinger's move came without warning. He isn't staying on to transition out slowly or help with the search for his replacement. Come 9 a.m., the pair of fresh co-CEOs were bombarded with questions.

Employees asked: Why did Gelsinger leave so suddenly? What kind of CEO is Intel trying to get now? How can they trust leadership after repeated missteps?

The man at the center of the conversation was not there. Being CEO of Intel was Pat Gelsinger's dream since he joined the company as a teenager in 1979. He achieved it improbably after being ousted once already.

"He was the prodigal son returning," described Alvin Nguyen, senior analyst at Forrester. Gelsinger returned a savior, but now he's retiring at 63 and Intel is far from saved. Multiple outlets reported Monday that Gelsinger's departure is the result of board rancor, with Bloomberg reporting that the CEO was given the choice to retire or be removed from the job.

Gelsinger's departure was a "personal decision", executives repeated in the all-hands, according to a current employee in attendance.

Intel's interim leadership brings deep knowledge of the company's finances, products, and customers.

Zinsner has overseen the recent cost-cutting effort, and Holthaus has been steeped in Intel for nearly 20 years. But no one at the top has Gelsinger's technical expertise, as Intel employees pointed out in their questions. Yet despite Gelsinger's prowess as Intel's first chief technology officer, the company remains in critical condition.

The leaders emphasized that the company's goals would not change: employees would improve efficiency and reduce costs, and the company would need to execute better with products and with the crucial 18A process.

Holthaus told employees on the call that her leadership style is direct and transparent, according to the employee in attendance. She reminded them that she has worked at Intel for many years.

Intel declined to comment, but a spokesperson pointed to Gelsinger's departure press release.

Contending with Intel's many misses

Intel has more than 65% of the market for traditional PCs and 85% of the server market, according to Edward Jones. Yet critical missteps plague the company. Zinsner and Holthaus likely can't wait for an executive search to conclude to address them.

Supporting the passage of the CHIPS Act and obtaining its promised funding has been a major focus of Gelsinger's nearly 4-year term as CEO. However, the funding is contingent upon hitting execution benchmarks, with which the company has struggled.

Last week, the Department of Commerce finalized its direct funding for Intel under the CHIPS Act, totaling $7.865 billion. That fell short of the $8.5 billion originally announced.

"While we have made significant progress in regaining manufacturing competitiveness and building the capabilities to be a world-class foundry, we know that we have much more work to do at the company and are committed to restoring investor confidence," said Frank Yeary, now Intel's interim board executive chair, said in a statement.

Intel's overall fall from grace is most apparent in the context of the rise in the importance of accelerated computing and AI.

In 2021, when Gelsinger took over as CEO, shares of Nvidia were trading below $30. The GPU designer's recent rise to become one of the most valuable companies in the world has put a spotlight on Intel's relative absence from the accelerated-computing race Nvidia has come to dominate. Median pay at Intel has stagnated over the last five years relative to competitors, even as employee cuts continue.

Gelsinger said last month that the company would miss its target of $500 million in sales this year of its AI chip, Gaudi 3. But analysts told Business Insider that 18A, the company's most advanced manufacturing node, is actually more important to Intel's resurgence than making a splash in AI.

"Intel has ostensibly 'bet' the company on 18A for salvation," Bernstein analysts wrote.

The costs of bringing this node online are likely to increase further, and it has "still to get any external validation from large fabless customers," according to Bank of America analyst Vivek Arya. But this expensive work is essential to bring Intel back to the cutting edge and make it an attractive partner for bleeding-edge chip designers like Nvidia.

"The importance of bringing manufacturing back in-house can't be overstated," Futurum Group CEO Daniel Newman told BI. The fate of the company, and the legacy of Gelsinger rides on it.

"The cornerstone of Pat's tenure as CEO was built upon Intel achieving process leadership or at least parity and if they cannot execute with 18A, then it was all for naught," Logan Purk, senior research analyst at Edward Jones, told BI. Given slow-moving technological progress and cost-cutting, and fast-moving competitors, Intel's next CEO may be inheriting a harder job than Gelsinger did.

"It was a tough situation when Pat showed up, and things look much worse now," Bernstein analysts wrote in a note to investors.

No one has been a closer witness to this roller coaster than Intel employees, who have seen multiple waves of layoffs and buyouts.

Monday's meeting had the distinct flavor of "damage control", according to the employee.

Intel shares were down 60% Monday, compared to the day Gelsinger took the CEO job. However, shares jumped slightly upon Monday's announcement of Gelsinger's retirement.

Got a tip? Contact this reporter at [email protected] or use the secure messaging app Signal with the username hliwrites.99.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

Nvidia hopes lightning will strike twice as it aims to corner the burgeoning robotics market

30 November 2024 at 06:10
Jensen Huang in front of two humanoid robotic heads

ktsimage/Getty, Justin Sullivan/Getty, Tyler Le/BI

  • Nvidia's gaming past and mastering of the GPU made it well-positioned for the AI boom.
  • Its next market to corner is advanced robotics that could give way to humanoids.
  • Technical hurdles could be a reality check for Jensen Huang's robotics future.

Wearing his signature black leather jacket, Jensen Huang outstretched both arms, gesturing at the humanoid robots flanking him, and the audience applauded. "About my size," he joked from the stage at Computex 2024 in Taipei, Taiwan, in June.

"Robotics is here. Physical AI is here. This is not science fiction," he said. The robots, though, were flat, generated on a massive screen. What came onto the stage were wheeled machines resembling delivery robots.

Robots are a big part of Huang's vision of the future, which is shared by other tech luminaries, including Elon Musk. In addition to the Computex display, humanoid robots have come up on Nvidia's latest two earnings calls.

Most analysts agree that Nvidia's fate is all but sealed for a few years. Demand for graphics processing units has fueled it to a $3 trillion market capitalization β€” some days. But the semiconductor industry is cruel. Investment in data centers, which make up 87% of Nvidia's revenue, comes in booms and busts. Nvidia needs another big market.

At Computex, Huang said there would be two "high-volume" robotic products in the future. The first is self-driving cars, and the second is likely to be humanoid robots. Thanks to machine learning, the technologies are converging.

Both machines require humanlike perception of fast-changing surroundings and instantaneous reactions with little room for error. They also both require immense amounts of what Huang sells: AI computing power. But robotics is a tiny portion of Nvidia's revenue today. And growing it isn't just a matter of time.

If Nvidia's place in the tech stratosphere is to be permanent, Huang needs the market for robotics to be big. While the story of Nvidia's past few years has been one of incredible engineering, foresight, and timing, the challenge to make robots real may be even tougher.

How can Nvidia bring on the robots?

Artificial intelligence presents a massive unlock for robotics. But scaling the field means making the engineering and building more accessible.

"Robotic AI is the most complicated because a large language model is software, but robots are a mechanical-engineering problem, a software problem, and a physics problem. It's much more complicated," Raul Martynek, the CEO of the data-center landlord DataBank, said.

Most of the people working on robotics are experts with doctoral degrees in robotics because they have to be. The same was true of language-based AI 10 years ago. Now that foundation models and the computing to support them are widely available, it doesn't take a doctorate to build AI applications.

Layers of software and vast language and image libraries are intended to make the platforms stickier for users and to lower the barrier to entry so that almost anyone can build with AI.

Nvidia's robotics stack needs to do the same, but since using AI in physical spaces is harder, making it work for laypeople is also harder.

The Nvidia robotics stack takes some navigating. It's a sea of platforms, libraries, and names.

Omniverse is a simulation platform. It offers a virtual world that developers can customize and use to test simulations of robots. Isaac is what Nvidia calls a "gym" built on top of Omniverse. It's how you put your robot into an environment and have it practice tasks.

Jetson Thor is Nvidia's chip for powering robots. Project Groot, which the company refers to as a "moonshot" initiative, is a foundation model for humanoid robots. In July, the company launched a synthetic-data-generation service and Osmo, a software layer that ties it all together.
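To see how these pieces are meant to fit together, here is a schematic Python sketch of the simulate-then-train loop the stack implies: a virtual world standing in for Omniverse, a training loop standing in for Isaac, and a policy standing in for a foundation model like Groot. Every name is a hypothetical stand-in, not Nvidia's actual API; read it as a diagram in code form.

```python
# Schematic sketch of a simulate-then-train robotics loop.
# All class and function names are hypothetical stand-ins, not Nvidia's APIs.

class SimWorld:
    """Plays the role of a simulation platform like Omniverse: a customizable
    virtual environment where a robot can act without real-world risk."""

    def reset(self):
        return {"joint_angles": [0.0] * 12}  # initial observation

    def step(self, action):
        # Apply the action, advance the physics, and report back.
        observation = {"joint_angles": action}
        reward, done = 0.0, False
        return observation, reward, done


class Policy:
    """Plays the role of a robot foundation model like Project Groot:
    it maps observations to motor commands and learns from experience."""

    def act(self, observation):
        return [0.0] * 12  # one target per joint

    def update(self, transitions):
        pass  # in a real system: a gradient step on the collected experience


def train(policy, world, episodes=3, max_steps=100):
    # The "gym" layer (the role Isaac plays on top of Omniverse): run the
    # policy in simulation, collect experience, and improve the policy.
    for _ in range(episodes):
        obs, transitions = world.reset(), []
        for _ in range(max_steps):
            action = policy.act(obs)
            next_obs, reward, done = world.step(action)
            transitions.append((obs, action, reward, next_obs))
            obs = next_obs
            if done:
                break
        policy.update(transitions)


# After enough simulated practice, the trained policy would be deployed to an
# onboard computer (the role Jetson Thor plays) to drive the physical robot.
train(Policy(), SimWorld())
```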

Huang often says that humanoids are easier to build because the world is already made for humans.

"The easiest robot to adapt in the world are humanoid robots because we built the world for us," he said at Computex, adding: "There's more data to train these robots because we have the same physique."

Gathering data on how we move still takes time, effort, and money. Tesla, for example, is paying people $48 an hour to perform tasks in a special suit to train its humanoid, Optimus.

"That's been the biggest problem in robotics β€” how much data is needed to give those foundational models an understanding of the world and adjust for it," Sophia Velastegui, an AI expert who's worked for Apple, Google, and Microsoft, said.

But analysts see the potential. The research firm William Blair's analysts recently wrote, "Nvidia's capabilities in robotics and digital twins (with Omniverse) have the potential to scale into massive businesses themselves." The analysts said they expected Nvidia's automotive business to grow 20% annually through 2027.

Nvidia has announced that BMW uses Isaac and Omniverse to train factory robots. Boston Dynamics, BYD Electronics, Figure, Intrinsic, Siemens, and Teradyne Robotics use Nvidia's stack to build robot arms, humanoids, and other robots.

But three robotics experts told Business Insider that so far, Nvidia has failed to lower the barrier to entry for wannabe robot builders like it has in language- and image-based AI. Competitors are coming in to try to open up the ideal stack for robotics before Nvidia can dominate that, too.

"We recognize that developing AI that can interact with the physical world is extremely challenging," an Nvidia spokesperson told BI via email. "That's why we developed an entire platform to help companies train and deploy robots."

In July, the company launched a humanoid-robot developer program. Developers whose applications are accepted get access to all of these tools.

Nvidia can't do it alone

Ashish Kapoor is acutely aware of all the progress the field has yet to make. For 17 years, he was a leader in Microsoft's robotics-research department. There, he helped to develop AirSim, a computer-vision simulation platform launched in 2017 that was sunsetted last year.

When AirSim was shut down, Kapoor left to build his own platform. Last year, he founded Scaled Foundations and launched Grid, a robotics-development platform designed for aspiring robot builders.

No one company can solve the tough problems of robotics alone, he said.

"The way I've seen it happen in AI, the actual solution came from the community when they worked on something together," Kapoor said. "That's when the magic started to happen, and this needs to happen in robotics right now."

It feels like every player aiming for humanoid robots is in it for themselves, Kapoor said. But there's a robotics-startup graveyard for a reason. The robots get into real-world scenarios, and they're simply not good enough. Customers give up on them before they can get better.

"The running joke is that every robot has a team of 10 people trying to run it," Kapoor said.

Grid comes in a free tier or as a managed service with more hands-on support. Scaled Foundations is building its own foundation model for robotics but encourages users to develop their own, too.

Some elements of Nvidia's robotics stack are open source. And Huang often says that Nvidia is working with every robotics and AI company on the planet, but some developers fear the juggernaut will protect its own success first and support the ecosystem second.

"They're doing the Apple effect. To me, they're trying to lock you in as much as they can into their ecosystem," said Jonathan Stephens, the chief developer advocate at the computer-vision firm EveryPoint.

An Nvidia spokesperson told BI that this perception was inaccurate. The company "collaborates with the majority of the leading players in the robotics and humanoid developer ecosystem" to help them deploy robots faster. "Our success comes from the ecosystem," they said.

Scaled Foundations and Nvidia aren't the only ones working on a foundation model for robotics. Skild AI raised $300 million in July to build its version.

What makes a humanoid?

Simulators are an essential stop on the path to humanoid robots, but they don't necessarily lead to humanlike perception.

When describing a robotic arm at Computex, Huang said that Nvidia supplied "the computer, the acceleration layers, and the pretrained AI models" needed to put an AI robot into an AI factory. The goal of using robotic arms in factories at scale has been around for decades. Robotic arms have been building cars since 1961. But Huang was talking about an AI robot — an intelligent robot.

The arms that build cars are largely unintelligent. They're programmed to perform repetitive tasks and often "see" with sensors instead of cameras.

An AI-enabled robotic arm would be able to handle varied tasks — picking up diverse items and putting them down in diverse places without breaking them, maybe while on the move. It would need to perceive objects and guardrails and then make moves in a coherent order. But a humanoid robot is a world away from even the most useful nonhumanoid. Some roboticists doubt that it's the right target to aim for.

"I'm very skeptical," said a former Nvidia robotics expert with more than 15 years in the field who was granted anonymity to protect industry relationships. "The cost to make a humanoid robot and to make it versatile is going to be higher than if you make a robot that doesn't look like a human and can only do a single task but does the task well and faster."

But Huang is all in.

"I think Jensen has an obsession with robots because, ultimately, what he's trying to do is create the future," Martynek said.

Autonomous cars and robotics are a big part of Nvidia's future. The company told BI it expected everything to be autonomous eventually, starting with robotic arms and vehicles and leading to buildings and even cities.

"I was at Apple when we developed iPad inspired by 'Star Trek' and other future worlds in movies," Velastegui said, adding that Robotics taps into our imagination.

Read the original article on Business Insider

Nvidia workforce data explains its meteoric rise

28 November 2024 at 01:00
NVIDIA photo collage
Nvidia's workforce has grown nearly 20-fold over the past two decades.

Anna Kim/Getty, Tyler Le/BI

  • Nvidia's workforce has grown nearly 20-fold since 2003.
  • The company's stock price surge and low turnover have enriched many long-term employees.
  • Nvidia's median salary now surpasses Microsoft's and other Silicon Valley peers.

Nvidia was largely unknown just a few years ago.

In 2022, Google searches for Jensen Huang, the company's charismatic CEO, were almost nonexistent. And Nvidia employees were not nearly the source of fascination and interest they are today.

Nvidia recruiters are now swamped at conferences, and platforms like Reddit and Blind are full of eager posters wondering how to land a job or at least get an interview at the company, which has around 30,000 employees.

They want to know how many Nvidians are millionaires — likely quite a few.

The skyrocketing stock price has made that the case, but so has the longevity of its employees. Tenures of 20-plus years are not uncommon, and even now, when AI talent has never been more prized, staff turnover has been falling. In January, the company reported a turnover rate of 2.7%. Turnover below 20% is notable for the tech industry, an HR firm told Business Insider earlier this year.

The data behind the evolution of Nvidia's workforce tells the story of the company's meteoric rise just as well as, if not better than, the revenue or stock price. Until the early 2000s, the chip-design company, which was founded in 1993, was relatively under the radar. Here is Nvidia's story in four charts.

Nvidia's workforce has grown nearly 20-fold since 2003

Beyond Nvidia's historic rise in market value, the company has a lot to offer employees. It maintains a permissive remote-work policy even as tech giants like Amazon mandate a return to the office. It has also built an appropriately futuristic new Santa Clara, California, headquarters, which robotics leader Rev Lebaredian described to Business Insider as so tech-infused it is a "type of robot."

But the culture isn't for everyone.

Public feedback, for example, is a very intentional part of the workplace culture. Huang famously has dozens of direct reports and eschews one-on-one meetings, preferring to call out mistakes in public rather than saving harsh feedback for private conversation, so that everyone can learn.

Nvidia has become one of the best-paying firms in Silicon Valley

Four years ago, Nvidians' median salary wasn't at the top of the market. In 2019, Microsoft's median employee salary was nearly $20,000 higher than Nvidia's. But as of January 2024, Nvidia's median salary (excluding the CEO) surpassed Microsoft's and has left other tech giants in the dust.

Yet this chart reflects only base compensation.

Years of stock-based compensation and "special Jensen grants," along with four-digit percentage growth in the stock price over the last decade, have created wealthy employees and, at times, internal tension over rich Nvidia employees not pulling their weight.

Certainly, not all Nvidians are millionaires, and the compensation the company is required to report to shareholders every spring isn't quite the full picture. Still, Huang has repeatedly said that despite Nvidia's AI dominance, he wakes up worrying about staying on top.

Nvidia's revenue per employee has recovered after years of investment

Divide the company's revenue by its employee headcount, and its financial strategy shows through.

Beginning in 2006, long before using graphics processing units to run AI models was commonplace, Nvidia invested in building a programming software layer called Compute Unified Device Architecture (CUDA).

Nvidia's GPUs are capable of immense computing capacity at nearly unprecedented speed because they perform calculations simultaneously rather than one at a time. Instructing these powerful chips required a new software paradigm.

CUDA is that paradigm, and building it took years and cost Nvidia dearly. In hindsight, the benefit of this investment period is undeniable. CUDA is the main element that keeps AI builders from easily or willingly switching to competing hardware like AMD's MI325X and Amazon's Trainium chips.
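For a sense of what that paradigm looks like in practice, here is a minimal sketch of CUDA-style data parallelism, written with Numba's Python bindings rather than Nvidia's C++ toolkit. The kernel and array sizes are illustrative; it needs an Nvidia GPU and the numba and numpy packages to run.

```python
import numpy as np
from numba import cuda


@cuda.jit
def vector_add(a, b, out):
    # Each GPU thread handles exactly one element, so the whole array is
    # processed in parallel rather than one element at a time.
    i = cuda.grid(1)  # this thread's global index
    if i < out.size:
        out[i] = a[i] + b[i]


n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

# Launch enough blocks of 256 threads to cover all n elements at once.
threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)
```

The point of the paradigm is that the same mental model, one thread per element, scales from this toy kernel to the matrix multiplications at the heart of AI training.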

The ratio isn't a literal measure of every employee's contribution, but tracking revenue per head can reveal trends in efficiency, investment, and return.

Nvidia's revenue-to-headcount ratio showed a downward trend from 2003 until 2014, and then steady upward progress until the AI boom in 2023. During that year, this ratio doubled.
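As a toy illustration of how that metric is computed and read (the figures below are made up, not Nvidia's reported numbers), the calculation is a single division tracked year over year:

```python
# Hypothetical revenue and headcount figures, for illustration only.
history = {
    2021: {"revenue": 16_000_000_000, "employees": 20_000},
    2022: {"revenue": 27_000_000_000, "employees": 24_000},
    2023: {"revenue": 61_000_000_000, "employees": 30_000},
}

for year in sorted(history):
    row = history[year]
    per_head = row["revenue"] / row["employees"]  # revenue per employee
    print(f"{year}: ${per_head:,.0f} per employee")
```

A flat or falling line suggests a period of heavy investment; a sharp rise, like the doubling in 2023, suggests the investment paying off.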

CUDA is likely not the only factor affecting this data point, but it may help explain why investors questioned CUDA expenditures for years — and why they no longer do.

But the company isn't as far ahead in other areas.

Nvidia has fewer than one in five women employees — but it has pay parity

Despite Nvidia's dizzying technological progress, gender representation in the company's workforce and the semiconductor industry as a whole has remained relatively unchanged in the last decade. As of January 2024, Nvidia's global workforce was 19.7% female.

Nvidia's stats are in line with the industry totals for female representation, but ahead of the pack when it comes to women in technical and management positions.

According to a 2023 Accenture analysis, the median representation of women in the semiconductor industry is between 20% and 29%, up from between 20% and 25% in 2022. Over half of the companies in the sample reported less than 10% representation of women in technical director roles and less than 5% in technical executive leadership roles.

In January, Nvidia reported that women at the company make 99.5% of what men make in terms of baseline compensation. For the last two years, the turnover rate for women at the company has been slightly lower than that for men.

Nvidia declined to comment on this dynamic when BI reported on it in September.

Do you work at Nvidia? Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider