
Groq is 'unleashing the beast' to chip away at Nvidia's CUDA advantage

Mark Heaps is the chief technology evangelist for Nvidia challenger Groq

Groq

  • Groq is taking a novel approach to competing with Nvidia's much-lauded CUDA software.
  • The chip startup is using a free inference tier to attract hundreds of thousands of AI developers.
  • Groq aims to capture market share with faster inference and global joint ventures.

There is an active debate about the source of Nvidia's competitive moat. Some say it's the prevailing perception that Nvidia is the 'safe' choice when investing billions in a technology whose return is still uncertain.

Many say it's Nvidia's software, particularly CUDA, which the company began developing long before the AI boom. CUDA allows users to get the most out of graphics processing units.

Competitors have attempted to build comparable systems, but without Nvidia's head start, it has been tough to get developers to try, learn, and ultimately improve them.

Groq, however, is an Nvidia competitor that focused early on the segment of AI computing that requires less direct programming of the chips, and investors are intrigued. The 8-year-old AI chip startup was valued at $2.8 billion at its $640 million Series D round in August.

Though at least one investor has called companies like Groq 'insane' for attempting to dent Nvidia's estimated 90% market share, the startup has been building its technology exactly for the opportunity that is coming in 2025, said Mark Heaps, Groq's "chief tech evangelist."

'Unleashing the beast'

"What we decided to do was take all of our compute, make it available via a cloud instance, and we gave it away to the world for free," Heaps said. Internally, the team called the strategy, "unleashing the beast". Groq's free tier caps users at a ceiling marked by requests per day or tokens per minute.

Heaps, CEO and ex-Googler Jonathan Ross, and a relatively lean team have spent 2023 and 2024 recruiting developers to try Groq's tech. Through hackathons and contests, the company makes a promise — try the hardware via Groq's cloud platform for free, and break through walls you've hit with others.

Groq offers some of the fastest inference out there, according to rankings on Artificialanalysis.ai, which measures cost and latency for companies that allow users to buy access to specific models by the token — or output.

Inference is a type of computing that produces the answers to queries asked of large language models. Training, the more energy-intensive type of computing, is what gives the models the ability to answer. So far, the hardware used for those two tasks has been different.

Heaps and several of his Nvidia-challenging cohorts at companies like Cerebras and SambaNova Systems said that speed is a competitive advantage.

After the inference service became available for free, developers came out of the woodwork, he said, with projects that couldn't succeed on slower chips. With more speed, developers can send one request through multiple models and use another model to choose the best response — all in the time it would usually take to fulfill just one request.
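
That fan-out-and-judge pattern is straightforward to express in code. A minimal sketch, assuming an OpenAI-compatible endpoint; the model names are placeholders chosen for illustration, not details confirmed by the article:

```python
from openai import OpenAI  # pip install openai

client = OpenAI(base_url="https://api.groq.com/openai/v1", api_key="YOUR_KEY")

CANDIDATES = ["llama3-70b-8192", "mixtral-8x7b-32768", "gemma-7b-it"]  # placeholders
JUDGE = "llama3-70b-8192"

def best_answer(question):
    # Fan out: send the same question to every candidate model.
    answers = []
    for model in CANDIDATES:
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": question}]
        )
        answers.append(resp.choices[0].message.content)

    # Judge: ask one model to pick the strongest response by index.
    numbered = "\n\n".join(f"[{i}] {a}" for i, a in enumerate(answers))
    verdict = client.chat.completions.create(
        model=JUDGE,
        messages=[{
            "role": "user",
            "content": f"Question: {question}\n\nCandidates:\n{numbered}\n\n"
                       "Reply with only the number of the best answer.",
        }],
    )
    return answers[int(verdict.choices[0].message.content.strip())]
```

On slow hardware the candidate calls dominate the latency budget; the faster the inference, the closer this whole routine gets to the cost of a single request.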

Roughly 652,000 developers are now using Groq API keys, Heaps said.

Heaps expects speed to hook developers on Groq. But the company's novel plan for programming its chips offers a unique answer to the most crucial element of Nvidia's "moat."

No need for CUDA libraries

"Everybody, once they deployed models, was gonna need faster inference at a lower cost, and so that's what we focused on," Heaps said.

So where's the CUDA equivalent? It's all in-house.

"We actually have more than 1800 models built into our compiler. We use no kernels, and we don't need people to use CUDA libraries. So because of that, people can just start working with a model that's built-in," Heaps said.

Training, he said, requires more customization at the chip level. In inference, Groq's task is to choose the right models to offer customers and ensure they run as fast as possible.

"What you're seeing with this massive swell of developers who are building AI applications β€” they don't want to program at the chip level," he added.

The strategy comes with some level of risk. Groq is unlikely to accumulate a stable of developers who continuously troubleshoot and improve its base software like CUDA has. Its offering may be more like a restaurant menu than a grocery store. But this also means the barrier to entry for Groq users is the same as any other cloud provider and potentially lower than that of other chips.

Though Groq started out as a company with a novel chip design, today, of the company's roughly 300 employees, 60% are software engineers, Heaps said.

"For us right now, there is a billions and billions of dollars industry emerging, that we can go capture a big share of market in, while at the same time, we continue to mature the compiler," he said.

Despite being realistic about the near term, Groq has lofty ambitions, which CEO Jonathan Ross has described as "providing half the world's inference." Ross also says the goal is to cast a net over the globe via joint ventures: one in Saudi Arabia is on the way, and others in Canada and Latin America are in the works.

Earlier this year, Ross told BI the company also has a goal to ship 108,000 of its language processing units, or LPUs, by the first quarter of next year — and 2 million chips by the end of 2025, most of which will be made available through its cloud.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

A chip company you've probably never heard of is suddenly worth $1 trillion. Here's why, and what it means for Nvidia.

Broadcom CEO Hock Tan

Ying Tang/NurPhoto via Getty Images

  • Broadcom's stock surged in recent weeks, pushing the company's market value over $1 trillion.
  • Broadcom is crucial for companies seeking alternatives to Nvidia's AI chip dominance.
  • Custom AI chips are gaining traction, enhancing tech firms' bargaining power, analysts say.

The rise of AI, and the computing power it requires, is bringing all kinds of previously under-the-radar companies into the limelight. This week it's Broadcom.

Broadcom's stock has soared since late last week, catapulting the company into the $1 trillion market cap club. The boost came from a blockbuster earnings report in which custom AI chip revenue grew 220% compared to last year.

In addition to selling lots of parts and components for data centers, Broadcom designs and sells ASICs, or application-specific integrated circuits β€” an industry acronym meaning custom chips.

Designers of custom AI chips, chief among them Broadcom and Marvell, are headed into a growth phase, according to Morgan Stanley.

Custom chips are picking up speed

The biggest players in AI buy a lot of chips from Nvidia, the $3 trillion giant with an estimated 90% share of the advanced AI chip market.

Heavily relying on one supplier isn't a comfortable position for any company, though, and many large Nvidia customers are also developing their own chips. Most tech companies don't have large teams of silicon and hardware experts in-house. Of the companies they might turn to for a custom chip design, Broadcom is the leader.

Though multi-purpose chips like Nvidia's and AMD's graphics processing units are likely to maintain the largest share of the AI chip market in the long term, custom chips are growing fast.

Morgan Stanley analysts this week forecast the market for ASICs to nearly double to $22 billion next year.

Much of that growth is attributable to Amazon Web Services' Trainium AI chip, according to Morgan Stanley analysts. Then there are Google's in-house AI chips, known as TPUs, which Broadcom helps make.

In terms of actual value of chips in use, Amazon and Google dominate. But OpenAI, Apple, and TikTok parent company ByteDance are all reportedly developing chips with Broadcom, too.

ASICs bring bargaining power

Custom chips can offer more value, in terms of the performance you get for the cost, according to Morgan Stanley's research.

ASICs can also be designed to perfectly match tech companies' unique internal workloads, according to the bank's analysts. The better these custom chips get, the more bargaining power they may provide when tech companies negotiate with Nvidia over buying GPUs. But this will take time, the analysts wrote.

In addition to Broadcom, Silicon Valley neighbor Marvell is making gains in the ASICs market, along with Asia-based players Alchip Technologies and MediaTek, they added in a note to investors.

Analysts don't expect custom chips to ever fully replace Nvidia GPUs, but without them, cloud service providers like AWS, Microsoft, and Google would have much less bargaining power against Nvidia.

"Over the long term, if they execute well, cloud service providers may enjoy greater bargaining power in AI semi procurement with their own custom silicon," the Morgan Stanley analysts explained.

Nvidia's big R&D budget

This may not be all bad news for Nvidia. A $22 billion ASICs market is smaller than Nvidia's revenue for just one quarter.

Nvidia's R&D budget is massive, and many analysts are confident in its ability to stay at the bleeding edge of AI computing.

And as Nvidia rolls out new, more advanced GPUs, its older offerings get cheaper and potentially more competitive with ASICs.

"We believe the cadence of ASICs needs to accelerate to stay competitive to GPUs," the Morgan Stanley analysts wrote.

Still, Broadcom and chip manufacturers on the supply chain rung beneath, such as TSMC, are likely to get a boost every time a giant cloud company orders up another custom AI chip.

Read the original article on Business Insider

Intel co-CEOs discuss splitting product and manufacturing businesses

Intel.

Intel; Getty Images; Chelsea Jia Feng/BI

  • Intel's co-CEOs discussed splitting the firm's manufacturing and products businesses Thursday.
  • A separation could address Intel's poor financial performance. It also has political implications.
  • Intel Foundry is forming a separate operational board in the meantime, executives said.

Intel's new co-CEOs said the company is creating more separation between its manufacturing and products businesses and the possibility of a formal split is still in play.

When asked if separating the two units was a possibility and if the success of the company's crucial, new "18A" process could influence the decision, CFO David Zinsner and CEO of Intel Products Michelle Johnston Holthaus, now interim co-CEOs, said preliminary moves are in progress.

"We really do already run the businesses fairly independently," Holthaus said at a Barclays tech conference Thursday. She added that severing the connection entirely does not make sense in her view, "but, you know, someone will decide that," she said.

Ousted CEO Pat Gelsinger prioritized keeping the fabs as part of Intel proper. The fabs hold important geopolitical significance to both Intel and the US. The manufacturing part of the business also weighs on the company's financial results.

"As far as does it ever fully separate? I think that's an open question for another day," Zinsner said.

Already in motion

Though the co-CEOs made it clear a final decision on a potential break-up has not been made, Zinsner outlined a series of moves already in progress that could make a split easier.

"We already run the businesses separately, but we are going down the path of creating a subsidiary for Intel Foundry as part of the overall Intel company," Zinsner said.

In addition, the company is forming a separate operational board for Intel Foundry and separating the operations and inventory management software for the two sides of the business.

Until a permanent CEO is appointed by the board, the co-CEOs will manage most areas of the company together, but Zinsner alone will manage the Foundry business. The foundry aims to build a contract manufacturing business for other chip designers. Due to the sensitive, competitive intellectual property coming from clients into that business, separation is key.

"Obviously, they want firewalls. They want to protect their IPs, their product road maps, and so forth. So I will deal with that part of the foundry to separate that from the Intel Products business." Zinsner said.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088.

Read the original article on Business Insider

Will the world's fastest supercomputer please stand up?

TRITON Supercomputer at the University of Miami

T.J. Lievonen

  • Oracle and xAI love to flex the size of their GPU clusters.
  • It's getting hard to tell who has the most supercomputing power as more firms claim the top spot.
  • The real numbers are competitive intel, and cluster size isn't everything, experts told BI.

In high school, as in tech, superlatives are important. Or maybe they just feel important in the moment. With the breakneck pace of the AI computing infrastructure buildout, it's becoming increasingly difficult to keep track of who has the biggest, fastest, or most powerful supercomputer — especially when multiple companies claim the title at once.

"We delivered the world's largest and fastest AI supercomputer, scaling up to 65,000 Nvidia H200 GPUs," Oracle CEO Safra Catz and Chairman, CTO, echoed by Founder Larry Ellison on the company's Monday earnings call.

In late October, Nvidia proclaimed xAI's Colossus the "World's Largest AI Supercomputer" after Elon Musk's firm reportedly built a computing cluster with 100,000 Nvidia graphics processing units in a matter of weeks. The plan is to expand to 1 million GPUs next, according to the Greater Memphis Chamber of Commerce. (The supercomputer is located in Memphis.)

The good ole days of supercomputing are gone

It used to be simpler. "Supercomputers" were most commonly found in research settings. Naturally, there's an official list ranking supercomputers. Until recently, the world's most powerful supercomputer was El Capitan. Housed at the Lawrence Livermore National Laboratory in California, its 11 million CPUs and GPUs from Nvidia rival AMD add up to 1.742 exaflops of computing capacity. (One exaflop is equal to one quintillion, or a billion billion, operations per second.)

"The biggest computers don't get put on the list," Dylan Patel, chief analyst at Semianalysis, told BI. "Your competitor shouldn't know exactly what you have," he continued. The 65,000-GPU supercluster Oracle executives were praising can reach up to 65 exaflops, according to the company.

It's safe to assume, Patel said, that Nvidia's largest customers — Meta, Microsoft, and xAI — also have the largest, most powerful clusters. On Nvidia's May earnings call, CFO Colette Kress said 200 fresh exaflops of Nvidia computing would be online by the end of this year, across nine different supercomputers.

Going forward, it's going to be harder to determine whose clusters are the biggest at any given moment — and even harder to tell whose are the most powerful — no matter how much CEOs may brag.

It's not the size of the cluster — it's how you use it

On Monday's call, Ellison was asked if the size of these gigantic clusters is actually generating better model performance.

He said larger clusters and faster GPUs are elements that speed up model training. Another is networking it all together. "So the GPU clusters aren't sitting there waiting for the data," Ellison said Monday.

Thus, the number of GPUs in a cluster isn't the only factor in the computing-power calculation. Networking and programming matter too. "Exaflops" are a result of the whole package, so unless companies disclose them, experts can only estimate.

What's certain is that more advanced models — the kind that consider their own thinking and check their work before answering queries — require more compute than models of earlier generations. So training increasingly impressive models may indeed require an arms race of sorts.

But an enormous AI arsenal doesn't automatically lead to better or more useful tools.

Sri Ambati, CEO of the open-source AI platform H2O.ai, said cloud providers may want to flex their cluster size for sales reasons, but given some (albeit slow) diversification of AI hardware and the rise of smaller, more efficient models, cluster size isn't the be-all and end-all.

Power efficiency, too, is a hugely important indicator for AI computing, since energy is an enormous operational expense. But it gets lost in the measuring contest.

Nvidia declined to comment. Oracle did not respond to a request for comment in time for publication.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088.

Read the original article on Business Insider

Intel's next CEO needs to decide the fate of its chip fabs

Former Intel CEO Pat Gelsinger holding up a chip at a US Senate hearing.

Getty

  • Intel's CEO departure reignited debate on splitting its factories from the company.
  • Intel's fabs are costly, but they're also considered vital for US national security.
  • CHIPS Act funding requires Intel to maintain majority control of its foundry.

One central question has been hanging over Intel for months: Should the 56-year-old Silicon Valley legend separate its chip factories, or fabs, from the rest of the company?

Intel's departing CEO, Pat Gelsinger, opposed that strategy. As a longtime champion of the company's chip manufacturing efforts, he was reluctant to split them off.

The company has taken some steps to look into this strategy. Bloomberg reported in August that Intel had hired bankers to help consider several options, including splitting off the fabs from the rest of Intel. The company also announced in September that it would establish its Foundry business as a separate subsidiary within the company.

Gelsinger's departure from the company, announced Monday, has reopened the question, although the calculus is more complicated than simple dollars and cents.

Splitting the fabs from the rest of its business could help Intel improve its balance sheet. It likely won't be easy since Intel was awarded $7.9 billion in CHIPS and Science Act funding, and it's required to maintain majority control of its foundries.

Intel declined to comment for this story.

A breakup could make Intel more competitive

Politically, fabs are important to Intel's place in the American economy and allow the US to reduce dependence on foreign manufacturers. At the same time, they drag down the company's balance sheet. Intel's foundry, the line of business that manufactures chips, has posted losses for years.

Fabs are immensely hard work: they're expensive to build and operate, and they require a level of precision beyond most other types of manufacturing.

Intel could benefit from a split, and the company maintains meaningful market share in its computing and traditional (non-AI) data center businesses. Amid the broader CEO search, Intel also elevated the executive Michelle Johnston Holthaus to co-CEO and CEO of Intel Products. Analysts said this could better set up a split.

Regardless, analysts said finding new leadership for the fabs will be challenging.

"The choice for any new CEO would seem to center on what to do with the fabs," Bernstein analysts wrote in a note to investors after the announcement of Gelsinger's departure.

On one hand, the fabs are "deadweight" for Intel, the Bernstein analysts wrote. On the other hand, "scrapping them would also be fraught with difficulties around the product road map, outsourcing strategy, CHIPS Act and political navigation, etc. There don't seem to be any easy answers here, so whoever winds up filling the slot looks in for a tough ride," the analysts continued.

Intel's competitors and contemporaries are avoiding the hassle of owning and operating a fab. The world's leading chip design firm, Nvidia, outsources all its manufacturing. Its runner-up, AMD, experienced similar woes when it owned fabs, eventually spinning them out in 2009.

Intel has also outsourced some chip manufacturing to rival TSMC in recent years — which sends a negative signal to the market about its own fabs.

Intel is getting CHIPS Act funding

Ownership of the fabs and CHIPS Act funding are highly intertwined. Intel must retain majority control of the foundry to continue receiving CHIPS Act funding and benefits, a November regulatory filing said.

Intel could separate its foundry business while maintaining majority control, said Dan Newman, CEO of The Futurum Group. Still, the CHIPS Act remains key to Intel's future.

"If you add it all up, it equates to roughly $40 billion in loans, tax exemptions, and grants β€” so quite significant," said Logan Purk, a senior research analyst at Edward Jones.

"Only a small slice of the commitment has come, though," he continued.

Intel's fabs need more customers

Intel is attempting to move beyond manufacturing its own chips to becoming a contract manufacturer. Amazon has already signed on as a customer. Though bringing in more manufacturing customers could mean more revenue, it first requires more investment.

There's a less tangible reason Intel might want separation between its foundry and its chip design businesses, too. Foundries regularly deal with many competing clients.

"One of the big concerns for the fabless designers is any sort of information leakage," Newman said.

"The products department competes with many potential clients of the foundry. You want separation," he added.

It was once rumored that a third party might buy Intel. Analysts have balked at the prospect for political and financial reasons, particularly since running the fabs is a major challenge.

Read the original article on Business Insider

In an all-hands meeting, Intel's new leaders emphasized outgoing CEO Pat Gelsinger's 'personal decision'

Intel CEO Pat Gelsinger delivers a speech at Taipei Nangang Exhibition Center during Computex 2024, in Taipei on June 4, 2024.

I-Hwa CHENG / AFP

  • Intel CEO Pat Gelsinger is out of the top spot after a challenging 4-year tenure.
  • The company's interim co-CEOs addressed the workforce Monday morning in an all-hands meeting.
  • One Intel employee described the responses to questions as "vague" and the tone of the meeting as "damage control."

On Monday morning, Intel employees joined an all-hands meeting after receiving an email invite at 5 a.m. PT.

Accompanying the invite was the news that the company's CEO Pat Gelsinger had stepped down as of Sunday, and would be temporarily replaced by co-CEOs David Zinsner, Intel's chief financial officer for nearly three years, and Michelle Johnston Holthaus, the new CEO of product.

Gelsinger's move came without warning. He isn't staying on to transition out slowly or to help search for his replacement. Come 9 a.m., the pair of fresh co-CEOs were bombarded with questions.

Employees asked: Why did Gelsinger leave so suddenly? What kind of CEO is Intel trying to get now? How can they trust leadership after repeated missteps?

The man at the center of the conversation was not there. Becoming CEO of Intel had been Pat Gelsinger's dream since he joined the company as a teenager in 1979. He achieved it, improbably, after having been pushed out once already.

"He was the prodigal son returning," described Alvin Nguyen, senior analyst at Forrester. Gelsinger returned a savior, but now he's retiring at 63 and Intel is far from saved. Multiple outlets reported Monday that Gelsinger's departure is the result of board rancor, with Bloomberg reporting that the CEO was given the choice to retire or be removed from the job.

Gelsinger's departure was a "personal decision," executives repeated in the all-hands, according to a current employee in attendance.

Intel's interim leadership brings deep knowledge of the company's finances, products, and customers.

Zinsner has overseen the recent cost-cutting effort, and Holthaus has been steeped in Intel's business for nearly 20 years. But no one at the top has Gelsinger's technical expertise, which Intel employees pointed out in their questions. Yet despite Gelsinger's technical prowess as Intel's first chief technology officer, the company remains in critical condition.

The leaders emphasized that the company's goals would not change: employees would improve efficiency and reduce costs, and the company would need to execute better with products and with the crucial 18A process.

Holthaus told employees on the call that her leadership style is direct and transparent, according to the employee in attendance. She reminded them that she has worked at Intel for many years.

Intel declined to comment, but a spokesperson pointed to Gelsinger's departure press release.

Contending with Intel's many misses

Intel has more than 65% of the market for traditional PCs and 85% of the server market, according to Edward Jones. Yet critical missteps plague the company. Zinsner and Holthaus likely can't wait for an executive search to conclude before addressing them.

Supporting the passage of the CHIPS Act and obtaining its promised funding has been a major focus of Gelsinger's nearly 4-year term as CEO. However, the funding is contingent upon hitting execution benchmarks, with which the company has struggled.

Last week, the Department of Commerce finalized its direct funding for Intel under the CHIPS Act, totaling $7.865 billion, short of the $8.5 billion originally announced.

"While we have made significant progress in regaining manufacturing competitiveness and building the capabilities to be a world-class foundry, we know that we have much more work to do at the company and are committed to restoring investor confidence," said Frank Yeary, now Intel's interim board executive chair, said in a statement.

Intel's fall from grace is most apparent in the context of the rising importance of accelerated computing and AI.

In 2021, when Gelsinger took over as CEO, shares of Nvidia were trading below $30. The GPU designer's recent rise to become one of the most valuable companies in the world has put a spotlight on Intel's relative absence from the accelerated computing race that Nvidia has come to dominate. Median pay at Intel has remained stagnant over the last five years, lagging competitors, even as employee cuts continue.

Gelsinger said last month that the company would miss its target of $500 million in sales this year of its AI chip, Gaudi 3. But analysts told Business Insider that 18A, the company's most advanced manufacturing node, is actually more important to Intel's resurgence than making a splash in AI.

"Intel has ostensibly 'bet' the company on 18A for salvation," Bernstein analysts wrote.

The costs of bringing this node online are likely to increase further, and it has "still to get any external validation from large fabless customers," according to Bank of America analyst Vivek Arya. But this expensive work is essential to bring Intel back to the cutting edge and make it an attractive partner for bleeding-edge chip designers like Nvidia.

"The importance of bringing manufacturing back in-house can't be overstated," Futurum Group CEO Daniel Newman told BI. The fate of the company, and the legacy of Gelsinger rides on it.

"The cornerstone of Pat's tenure as CEO was built upon Intel achieving process leadership or at least parity and if they cannot execute with 18A, then it was all for naught," Logan Purk, senior research analyst at Edward Jones, told BI. Given slow-moving technological progress and cost-cutting, and fast-moving competitors, Intel's next CEO may be inheriting a harder job than Gelsinger did.

"It was a tough situation when Pat showed up, and things look much worse now," Bernstein analysts wrote in a note to investors.

No one has been a closer witness to this roller coaster than Intel employees, who have seen multiple waves of layoffs and buyouts.

Monday's meeting had the distinct flavor of "damage control," according to the employee.

As of Monday, Intel shares were down 60% from the day Gelsinger took the CEO job. Still, shares jumped slightly upon the announcement of his retirement.

Got a tip? Contact this reporter at [email protected] or use the secure messaging app Signal with the username hliwrites.99.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

Nvidia hopes lightning will strike twice as it aims to corner the burgeoning robotics market

Jensen Huang in front of two humanoid robotic heads

ktsimage/Getty, Justin Sullivan/Getty, Tyler Le/BI

  • Nvidia's gaming past and mastering of the GPU made it well-positioned for the AI boom.
  • Its next market to corner is advanced robotics, which could give way to humanoids.
  • Technical hurdles could be a reality check for Jensen Huang's robotics future.

Wearing his signature black leather jacket, Jensen Huang stretched out both arms, gesturing at the humanoid robots flanking him, and the audience applauded. "About my size," he joked from the stage at Computex 2024 in Taipei, Taiwan, in June.

"Robotics is here. Physical AI is here. This is not science fiction," he said. The robots, though, were flat, generated on a massive screen. What came onto the stage were wheeled machines resembling delivery robots.

Robots are a big part of Huang's vision of the future, which is shared by other tech luminaries, including Elon Musk. In addition to the Computex display, humanoid robots have come up on Nvidia's latest two earnings calls.

Most analysts agree that Nvidia's fate is all but sealed for a few years. Demand for graphics processing units has fueled it to a $3 trillion market capitalization — some days. But the semiconductor industry is cruel. Investment in data centers, which make up 87% of Nvidia's revenue, comes in booms and busts. Nvidia needs another big market.

At Computex, Huang said there would be two "high-volume" robotic products in the future. The first is self-driving cars, and the second is likely to be humanoid robots. Thanks to machine learning, the technologies are converging.

Both machines require humanlike perception of fast-changing surroundings and instantaneous reactions with little room for error. They also both require immense amounts of what Huang sells: AI computing power. But robotics is a tiny portion of Nvidia's revenue today. And growing it isn't just a matter of time.

If Nvidia's place in the tech stratosphere is to be permanent, Huang needs the market for robotics to be big. While the story of Nvidia's past few years has been one of incredible engineering, foresight, and timing, the challenge to make robots real may be even tougher.

How can Nvidia bring on the robots?

Artificial intelligence presents a massive unlock for robotics. But scaling the field means making the engineering and building more accessible.

"Robotic AI is the most complicated because a large language model is software, but robots are a mechanical-engineering problem, a software problem, and a physics problem. It's much more complicated," Raul Martynek, the CEO of the data-center landlord DataBank, said.

Most of the people working on robotics are experts with doctoral degrees in robotics because they have to be. The same was true of language-based AI 10 years ago. Now that foundation models and the computing to support them are widely available, it doesn't take a doctorate to build AI applications.

Layers of software and vast language and image libraries are intended to make these platforms stickier and lower the barrier to entry so that almost anyone can build with AI.

Nvidia's robotics stack needs to do the same, but since using AI in physical spaces is harder, making it work for laypeople is also harder.

The Nvidia robotics stack takes some navigating. It's a sea of platforms, libraries, and names.

Omniverse is a simulation platform. It offers a virtual world that developers can customize and use to test simulations of robots. Isaac is what Nvidia calls a "gym" built on top of Omniverse. It's how you put your robot into an environment and practice tasks.
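
The "gym" framing follows the reset-and-step loop popularized by open-source reinforcement-learning toolkits. For readers unfamiliar with the pattern, here is a minimal sketch using the open-source Gymnasium package and a stock environment, rather than Nvidia's Isaac stack:

```python
import gymnasium as gym  # pip install gymnasium

# The loop every "gym"-style platform exposes: reset an environment,
# act in it, observe the result, and collect reward signals.
env = gym.make("Pendulum-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()  # a trained policy would choose here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        obs, info = env.reset()
env.close()
print(f"Episode reward from random actions: {total_reward:.1f}")
```

Platforms like Isaac apply the same loop to physics-accurate robot simulations, often running thousands of environments in parallel on GPUs.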

Jetson Thor is Nvidia's chip for powering robots. Project GR00T, which the company refers to as a "moonshot" initiative, is a foundation model for humanoid robots. In July, the company launched a synthetic-data-generation service and Osmo, a software layer that ties it all together.

Huang often says that humanoids are easier to build because the world is already made for humans.

"The easiest robot to adapt in the world are humanoid robots because we built the world for us," he said at Computex, adding: "There's more data to train these robots because we have the same physique."

Gathering data on how we move still takes time, effort, and money. Tesla, for example, is paying people $48 an hour to perform tasks in a special suit to train its humanoid, Optimus.

"That's been the biggest problem in robotics β€” how much data is needed to give those foundational models an understanding of the world and adjust for it," Sophia Velastegui, an AI expert who's worked for Apple, Google, and Microsoft, said.

But analysts see the potential. The research firm William Blair's analysts recently wrote, "Nvidia's capabilities in robotics and digital twins (with Omniverse) have the potential to scale into massive businesses themselves." The analysts said they expected Nvidia's automotive business to grow 20% annually through 2027.

Nvidia has announced that BMW uses Isaac and Omniverse to train factory robots. Boston Dynamics, BYD Electronics, Figure, Intrinsic, Siemens, and Teradyne Robotics use Nvidia's stack to build robot arms, humanoids, and other robots.

But three robotics experts told Business Insider that so far, Nvidia has failed to lower the barrier to entry for wannabe robot builders like it has in language- and image-based AI. Competitors are coming in to try to open up the ideal stack for robotics before Nvidia can dominate that, too.

"We recognize that developing AI that can interact with the physical world is extremely challenging," an Nvidia spokesperson told BI via email. "That's why we developed an entire platform to help companies train and deploy robots."

In July, the company launched a humanoid-robot developer program. After submitting a successful application, developers can access all these tools.

Nvidia can't do it alone

Ashish Kapoor is acutely aware of all the progress the field has yet to make. For 17 years, he was a leader in Microsoft's robotics-research department. There, he helped to develop AirSim, a computer-vision simulation platform launched in 2017 that was sunsetted last year.

Kapoor left with the shutdown to make his own platform. Last year, he founded Scaled Foundations and launched Grid, a robotics-development platform designed for aspiring robot builders.

No one company can solve the tough problems of robotics alone, he said.

"The way I've seen it happen in AI, the actual solution came from the community when they worked on something together," Kapoor said. "That's when the magic started to happen, and this needs to happen in robotics right now."

It feels like every player aiming for humanoid robots is in it for themselves, Kapoor said. But there's a robotics-startup graveyard for a reason. The robots get into real-world scenarios, and they're simply not good enough. Customers give up on them before they can get better.

"The running joke is that every robot has a team of 10 people trying to run it," Kapoor said.

Grid offers a free tier and a managed service that provides more help. Scaled Foundations is building its own foundation model for robotics but encourages users to develop one, too.

Some elements of Nvidia's robotics stack are open source. And Huang often says that Nvidia is working with every robotics and AI company on the planet, but some developers fear the juggernaut will protect its own success first and support the ecosystem second.

"They're doing the Apple effect. To me, they're trying to lock you in as much as they can into their ecosystem," said Jonathan Stephens, the chief developer advocate at the computer-vision firm EveryPoint.

An Nvidia spokesperson told BI that this perception was inaccurate. The company "collaborates with the majority of the leading players in the robotics and humanoid developer ecosystem" to help them deploy robots faster. "Our success comes from the ecosystem," they said.

Scaled Foundations and Nvidia aren't the only ones working on a foundation model for robotics. Skild AI raised $300 million in July to build its version.

What makes a humanoid?

Simulators are an essential stop on the path to humanoid robots, but they don't necessarily lead to humanlike perception.

When describing a robotic arm at Computex, Huang said that Nvidia supplied "the computer, the acceleration layers, and the pretrained AI models" needed to put an AI robot into an AI factory. The goal of using robotic arms in factories at scale has been around for decades. Robotic arms have been building cars since 1961. But Huang was talking about an AI robot β€” an intelligent robot.

The arms that build cars are largely unintelligent. They're programmed to perform repetitive tasks and often "see" with sensors instead of cameras.

An AI-enabled robotic arm would be able to handle varied tasks — picking up diverse items and putting them down in diverse places without breaking them, maybe while on the move. It would need to perceive objects and guardrails and then make moves in a coherent order. But a humanoid robot is a world away from even the most useful nonhumanoid. Some roboticists doubt that it's the right target to aim for.

"I'm very skeptical," said a former Nvidia robotics expert with more than 15 years in the field who was granted anonymity to protect industry relationships. "The cost to make a humanoid robot and to make it versatile is going to be higher than if you make a robot that doesn't look like a human and can only do a single task but does the task well and faster."

But Huang is all in.

"I think Jensen has an obsession with robots because, ultimately, what he's trying to do is create the future," Martynek said.

Autonomous cars and robotics are a big part of Nvidia's future. The company told BI it expected everything to be autonomous eventually, starting with robotic arms and vehicles and leading to buildings and even cities.

"I was at Apple when we developed iPad inspired by 'Star Trek' and other future worlds in movies," Velastegui said, adding that Robotics taps into our imagination.

Read the original article on Business Insider

Nvidia workforce data explains its meteoric rise

Nvidia's workforce has grown nearly 20-fold since 2003.

Anna Kim/Getty, Tyler Le/BI

  • Nvidia's workforce has grown nearly 20-fold since 2003.
  • The company's stock price surge and low turnover have enriched many long-term employees.
  • Nvidia's median salary now surpasses Microsoft's and other Silicon Valley peers.

Nvidia was largely unknown just a few years ago.

In 2022, Google searches for Jensen Huang, the company's charismatic CEO, were almost nonexistent. And Nvidia employees were not nearly the source of fascination and interest they are today.

Nvidia recruiters are now swamped at conferences, and platforms like Reddit and Blind are full of eager posters wondering how to land a job or at least get an interview at the company, which has around 30,000 employees.

They want to know how many Nvidians are millionaires — likely quite a few.

The skyrocketing stock price has made that the case, but so has the longevity of its employees. Twenty-year-plus tenures are not uncommon, and even now, when AI talent has never been more prized, staff turnover has been falling. In January, the company reported a turnover rate of 2.7%. Tech industry turnover below 20% is notable, an HR firm told Business Insider earlier this year.

The data behind the evolution of Nvidia's workforce tells the story of the company's meteoric rise just as well as, if not better than, the revenue or stock price. Until the early 2000s, the chip design company, which was founded in 1993, was relatively under the radar. Here is Nvidia's story in four charts.

Nvidia's workforce has grown nearly 20-fold since 2003

Beyond Nvidia's historic rise in market value, the company has a lot to offer employees. It maintains a permissive remote work policy even as tech giants like Amazon mandate a return to the office. It has also built an appropriately futuristic new headquarters in Santa Clara, California, which robotics leader Rev Lebaredian described to Business Insider as so tech-infused that it is a "type of robot."

But the culture isn't for everyone.

Public feedback, for example, is a very intentional part of the workplace culture. Huang famously has dozens of direct reports and eschews one-on-one meetings, preferring to call out mistakes in public rather than saving harsh feedback for private conversation, so that everyone can learn.

Nvidia has become one of the best-paying firms in Silicon Valley

Four years ago, Nvidians' median salary wasn't at the top of the market. In 2019, Microsoft's median employee salary was nearly $20,000 higher than an Nvidia worker's. But as of January 2024, Nvidia's median salary (excluding the CEO) surpassed Microsoft's and left other tech giants in the dust.

Yet, this chart only reports on base compensation.

Years of stock-based compensation and "special Jensen grants," along with four-digit growth in the stock price within the last decade, have led to wealthy employees and, at times, internal tension surrounding rich Nvidia employees not pulling their weight.

Certainly, not all Nvidians are millionaires, and the compensation the company is required to report to shareholders every spring isn't quite the full picture. Still, Huang has repeatedly said that despite Nvidia's AI dominance, he wakes up worrying about staying on top.

Nvidia's revenue per employee has recovered after years of investment

Divide the company's revenue by its employee headcount and its financial strategy shows through.

Beginning in 2006, long before using graphics processing units to run AI models was commonplace, Nvidia invested in building a programming software layer called compute unified device architecture (CUDA).

Nvidia's GPUs are capable of immense computing capacity at nearly unprecedented speed because they perform many calculations simultaneously rather than one at a time. Instructing these powerful chips required a new software paradigm.

CUDA is that paradigm, and building it took years and cost Nvidia dearly. In hindsight, the benefit of this investment period is undeniable. CUDA is the main element keeping AI builders from easily or willingly switching to competing hardware like AMD's MI325X and Amazon's Trainium chips.
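
To make the paradigm concrete: instead of looping over data one element at a time, GPU code assigns each element to one of thousands of threads that run simultaneously. Here is a minimal sketch of that idea via Numba's Python bindings for CUDA (an illustration of the programming model, not Nvidia's own code; it needs an Nvidia GPU to run):

```python
import numpy as np
from numba import cuda  # pip install numba

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)          # this thread's global index
    if i < out.size:          # guard threads past the end of the array
        out[i] = a[i] + b[i]  # each element is computed by its own thread

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)  # launch ~1 million parallel threads

assert np.allclose(out, a + b)
```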

It's not a literal translation of every employee's contribution, but looking at the revenue-to-headcount ratio can show trends in efficiency, investment, and return.

Nvidia's revenue-to-headcount ratio showed a downward trend from 2003 until 2014, and then steady upward progress until the AI boom in 2023. During that year, this ratio doubled.
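
A rough worked example shows the scale of that doubling, using Nvidia's publicly reported, approximate figures: about $27 billion in revenue and 26,000 employees in fiscal 2023, versus about $61 billion and 29,600 employees in fiscal 2024. That works out to $27B ÷ 26,000 ≈ $1.0 million per employee, then $61B ÷ 29,600 ≈ $2.1 million per employee one year later.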

CUDA is likely not the only factor affecting this data point, but it may help explain why investors questioned CUDA expenditures for years — and why they no longer do.

But the company isn't as far ahead in other areas.

Nvidia has less than one in five women employees — but it has pay parity

Despite the dizzying progress of Nvidia's technological achievements, gender representation in the company's workforce and the semiconductor industry as a whole has remained relatively unchanged in the last decade. As of January 2024, Nvidia's global workforce was 19.7% female.

Nvidia's stats are in line with the industry totals for female representation, but ahead of the pack when it comes to women in technical and management positions.

According to a 2023 Accenture analysis, the median representation of women in the semiconductor industry is between 20% and 29%, up from between 20% and 25% in 2022. Over half of the companies in the sample reported less than 10% representation of women in technical director roles and less than 5% in technical executive leadership roles.

In January Nvidia reported that women at the company make 99.5% of what men make in terms of baseline compensation. For the last two years, the turnover rate for women at the company has been slightly lower than that for men.

Nvidia declined to comment on this dynamic when BI reported on it in September.

Do you work at Nvidia? Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

Amazon makes massive down payment on dethroning Nvidia

Dario Amodei, an OpenAI employee turned Anthropic CEO, at TechCrunch Disrupt 2023.

Kimberly White/Getty

  • Amazon on Friday announced another $4 billion investment in the AI startup Anthropic.
  • The deal includes an agreement for Anthropic to use Amazon's AI chips more.
  • The cloud giant is trying to challenge Nvidia and get developers to switch away from those GPUs.

Amazon's Trainium chips are about to get a lot busier — at least that's what Amazon hopes will happen after it pumps another $4 billion into the AI startup Anthropic.

The companies announced a huge new deal on Friday that brings Amazon's total investment in Anthropic to $8 billion. The goal of all this money is mainly to get Amazon's AI chips to be used more often to train and run large language models.

Anthropic said that in return for this cash injection, it would use AWS as its "primary cloud and training partner." It said it would also help Amazon design future Trainium chips and contribute to building out an Amazon AI-model-development platform called AWS Neuron.

This is an all-out assault on Nvidia, which dominates the AI chip market with its GPUs, servers, and CUDA platform. Nvidia's stock dropped by more than 3% on Friday after the Amazon-Anthropic news broke.

The challenge will be getting Anthropic to actually use Trainium chips in big ways. Switching away from Nvidia GPUs is complicated, time-consuming, and risky for AI-model developers, and Amazon has struggled with this.

Earlier this week, Anthropic CEO Dario Amodei didn't sound like he was all in on Amazon's Trainium chips, despite another $4 billion coming his way.

"We use Nvidia, but we also use custom chips from both Google and Amazon," he said at the Cerebral Valley tech conference in San Francisco. "Different chips have different trade-offs. I think we're getting value from all of them."

In 2023, Amazon made its first investment in Anthropic, agreeing to put in $4 billion. That deal came with similar strings attached. At the time, Anthropic said that it would use Amazon's Trainium and Inferentia chips to build, train, and deploy future AI models and that the companies would collaborate on the development of chip technology.

It's unclear whether Anthropic followed through. The Information reported recently that Anthropic preferred to use Nvidia GPUs rather than Amazon AI chips. The publication said the talks about this latest investment focused on getting Anthropic more committed to using Amazon's offerings.

There are signs that Anthropic could be more committed now, after getting another $4 billion from Amazon.

In Friday's announcement, Anthropic said it was working with Amazon on its Neuron software, which offers the crucial connective tissue between the chip and the AI models. This competes with Nvidia's CUDA software stack, which is the real enabler of Nvidia's GPUs and makes these components very hard to swap out for other chips. Nvidia has a decadelong head start on CUDA, and competitors have found that difficult to overcome.

Anthropic's "deep technical collaboration" suggests a new level of commitment to using and improving Amazon's Trainium chips.

Though several companies make chips that compete with or even beat Nvidia's in certain elements of computing performance, no other chipmaker has touched the company in terms of market share or mind share.

Amazon's AI chip journey

Amazon is on a short list of cloud providers attempting to stock their data centers with their own AI chips and avoid spending heavily on Nvidia GPUs, which have profit margins that often exceed 70%.

Amazon debuted its Trainium and Inferentia chips — named after the training and inference tasks they're built for — in 2020.

The aim was to become less dependent on Nvidia and find a way to make cloud computing in the AI age cheaper.

"As customers approach higher scale in their implementations, they realize quickly that AI can get costly," Amazon CEO Andy Jassy said on the company's October earnings call. "It's why we've invested in our own custom silicon in Trainium for training and Inferentia for inference."

But like its many competitors, Amazon has found that breaking the industry's preference for Nvidia is difficult. Some say that's because of CUDA, which offers an abundant software stack with libraries, tools, and troubleshooting help galore. Others say it's simple habit or convention.

In May, the Bernstein analyst Stacy Rasgon told Business Insider he wasn't aware of any companies using Amazon AI chips at scale.

With Friday's announcement, that might change.

Jassy said in October that the next-generation Trainium 2 chip was ramping up. "We're seeing significant interest in these chips, and we've gone back to our manufacturing partners multiple times to produce much more than we'd originally planned," Jassy said.

Still, Anthropic's Amodei sounded this week like he was hedging his bets.

"We believe that our mission is best served by being an independent company," he said. "If you look at our position in the market and what we've been able to do, the independent partnerships we have Google, with Amazon, with others, I think this is very viable."

Read the original article on Business Insider

Jensen Huang says the 3 elements of AI scaling are all advancing. Nvidia's Blackwell demand will prove it.

Jensen Huang and Sam Altman
Scaling has been a key concern for AI leaders.

I-HWA CHENG/AFP via Getty Images; Justin Sullivan/Getty Images

  • Reports on an AI progress slowdown raised concerns about model scaling on Nvidia's earnings call.
  • An analyst questioned if models are plateauing and if Nvidia's Blackwell chips could help.
  • Huang said there are three elements in scaling and that each continues to advance.

If the foundation models driving the panicked rush toward generative AI stop improving, Nvidia will have a problem. The company's whole value proposition rests on continued demand for more and more computing power.

Concerns about scaling laws started recently with reports that OpenAI's progress in improving its models was slowing. But Jensen Huang isn't worried.

The Nvidia CEO got the question Wednesday, on the company's third-quarter earnings call. Has progress stalled? And could the power of Nvidia's Blackwell chips start it up again?

"Foundation model pre-training scaling is intact and it's continuing," Huang said.

He added that scaling isn't as narrow as many think.

In the past, it may have been true that models improved only with more data and more pre-training. Now, AI can generate synthetic data and check its own answers to, in a way, train itself. But we're running out of data that hasn't already been ingested by these models, and the value of synthetic data for pre-training is debatable.

As the AI ecosystem matures, tools for improving models are gaining importance. The first generation of post-training improvement for models came from armies of humans checking AI's responses one by one.

Huang shouted out OpenAI's o1 model, formerly code-named Strawberry, which uses more modern strategies like "chain-of-thought reasoning" and "multi-path planning." Both tactics encourage models to think longer and in a more step-by-step fashion so that their responses are more considered.

"The longer it thinks, the better and higher quality answer it produces," Huang said.

Pre-training, post-training improvements, and new reasoning strategies all improve models, Huang said. Of course, if the model is doing more computing to answer the same fundamental question, that's where higher-powered compute is necessary — especially since users want their responses just as fast, if not faster.
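
A common back-of-envelope rule for dense transformer models makes that link concrete (a general approximation, not a figure from the call): generating one token with an N-parameter model costs roughly 2N floating-point operations, so an answer that takes T tokens of visible and hidden reasoning costs about

FLOPs per answer ≈ 2 × N × T.

A model that "thinks" 10 times longer therefore needs roughly 10 times the compute per answer, and serving it at the same latency requires correspondingly faster hardware.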

The demand for Blackwell is the result, he said.

After all, the first generation of foundation models took about 100,000 Hopper chips to build. "You know, the next generation starts at 100,000 Blackwells," Huang said. The company said commercial shipments of Blackwell chips are just beginning.

Read the original article on Business Insider
