Chip company Nvidia gets the green light from the European Union to complete its acquisition of Run:ai. The EU came to a unanimous decision today that Nvidia could go ahead with its acquisition of Israeli GPU orchestration platform Run:ai, according to reporting from Bloomberg. The European Commission determined that if the merger went through, other […]
Broadcom's CEO says he's too busy riding the AI wave to consider a takeover of rival Intel.
In an interview with the Financial Times, Hock Tan said he had no interest in "hostile takeovers."
Broadcom has soared to a $1 trillion market capitalization for the first time thanks to the AI boom.
The chief executive of $1 trillion AI chip giant Broadcom has dismissed the prospect of a takeover bid for struggling rival Intel.
In an interview with the Financial Times, Hock Tan said that he has his "hands very full" from riding the AI boom, responding to rumors that his company could make a move for its Silicon Valley rival.
"That is driving a lot of my resources, a lot of my focus," Tan said, adding that he has "not been asked" to bid on Intel.
The Broadcom boss is also adopting a "no hostile takeovers" policy after Donald Trump blocked his company's offer for Qualcomm in 2018 on national security grounds. Broadcom was incorporated in Singapore at the time.
"I can only make a deal if it's actionable," Tan told the FT. "Actionability means someone comes and asks me. Ever since Qualcomm, I learned one thing: no hostile offers."
Broadcom and Intel have experienced diverging fortunes since the start of the generative AI boom. Broadcom has more than doubled in value since the start of the year to hit a $1 trillion market capitalization for the first time, while Intel has collapsed by more than half to $82 billion.
Broadcom, which designs custom AI chips and components for data centers, hit a record milestone last week after reporting its fourth-quarter earnings. Revenues from its AI business jumped 220% year over year.
Intel, meanwhile, has had a much rougher year. Its CEO Pat Gelsinger, who first joined Intel at 18 and was brought back in 2021 after a stint at VMware, announced his shock retirement earlier this month after struggling to keep pace with rivals like Nvidia in the AI boom.
Gelsinger, who returned to revitalize Intel's manufacturing and design operations, faced repeated setbacks, leading him to announce a headcount reduction of roughly 15,000 in August and a suspension of Intel's dividend.
Its challenges have fueled rumors that it could be acquired by a rival, a deal that would mark a stunning end for the decades-old chipmaker. Buyer interest remains uncertain, however. Bloomberg reported in November that Qualcomm's interest in an Intel takeover has cooled.
Broadcom did not immediately respond to BI's request for comment outside regular working hours.
Microsoft bought more than twice as many Nvidia Hopper chips this year as any of its biggest rivals. The tech giant bought 485,000 Nvidia Hopper chips across 2024, according to reporting from the Financial Times, which cited data from tech consultancy Omdia. For comparison, Meta bought 224,000 of the same flagship Nvidia chip this year. […]
Groq is taking a novel approach to competing with Nvidia's much-lauded CUDA software.
The chip startup is using a free inference tier to attract hundreds of thousands of AI developers.
Groq aims to capture market share with faster inference and global joint ventures.
There is an active debate about the source of Nvidia's competitive moat. Some say it is the perception that Nvidia is the "safe" choice when investing billions in a technology whose return is still uncertain.
Many say it's Nvidia's software, particularly CUDA, which the company began developing long before the AI boom. CUDA allows users to get the most out of graphics processing units.
Competitors have attempted to build comparable systems, but without Nvidia's head start, it has been tough to get developers to try, learn, and ultimately improve them.
Groq, however, is an Nvidia competitor that focused early on the segment of AI computing that requires less direct programming of chips, and investors are intrigued. The 8-year-old AI chip startup was valued at $2.8 billion at its $640 million Series D round in August.
Though at least one investor has called companies like Groq "insane" for attempting to dent Nvidia's estimated 90% market share, the startup has been building its technology exactly for the opportunity coming in 2025, said Mark Heaps, Groq's "chief tech evangelist."
'Unleashing the beast'
"What we decided to do was take all of our compute, make it available via a cloud instance, and we gave it away to the world for free," Heaps said. Internally, the team called the strategy "unleashing the beast." Groq's free tier caps usage at a set number of requests per day or tokens per minute.
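Free-tier caps like the ones Groq describes (requests per day or tokens per minute) are commonly enforced with a sliding-window limiter. A minimal sketch, with made-up limits; the actual thresholds Groq enforces are not stated here:

```python
import time
from collections import deque


class RateLimiter:
    """Illustrative sliding-window cap, e.g. a free tier limited to
    N requests per window. The limits below are hypothetical."""

    def __init__(self, max_requests, window_seconds=60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.calls = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_requests:
            self.calls.append(now)
            return True
        return False


limiter = RateLimiter(max_requests=3, window_seconds=60)
print([limiter.allow(now=t) for t in (0, 1, 2, 3)])  # [True, True, True, False]
print(limiter.allow(now=61))  # True: the t=0 request has aged out
```

A real service would track limits per API key and per resource (requests and tokens separately), but the aging-window logic is the core of it.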
Heaps, CEO and ex-Googler Jonathan Ross, and a relatively lean team have spent 2023 and 2024 recruiting developers to try Groq's tech. Through hackathons and contests, the company makes a promise: try the hardware via Groq's cloud platform for free, and break through walls you've hit with others.
Groq offers some of the fastest inference out there, according to rankings on Artificialanalysis.ai, which measures cost and latency for companies that sell access to specific models by the token, or unit of output.
Inference is a type of computing that produces the answers to queries asked of large language models. Training, the more energy-intensive type of computing, is what gives the models the ability to answer. So far, the hardware used for those two tasks has been different.
After the inference service became available for free, developers came out of the woodwork, Heaps said, with projects that couldn't succeed on slower chips. With more speed, developers can send one request through multiple models and use another model to choose the best response, all in the time it would usually take to fulfill just one request.
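The pattern Heaps describes, fanning one request out to several models and letting another model pick the best answer, can be sketched as follows. `query_model` and the length-based judge are stand-ins, not Groq's actual API; a real client would call a hosted inference endpoint:

```python
from concurrent.futures import ThreadPoolExecutor


def query_model(model, prompt):
    # Hypothetical stand-in for a call to a hosted model. A real
    # implementation would send the prompt to an inference endpoint.
    canned = {
        "model-a": "Paris.",
        "model-b": "The capital of France is Paris.",
        "model-c": "Paris, the capital of France, is also its largest city.",
    }
    return canned[model]


def judge(candidates):
    # Stand-in judge: a real system would ask another model to rank
    # the candidates; here we simply pick the most detailed answer.
    return max(candidates, key=len)


def fan_out(prompt, models):
    # Send the same prompt to several models concurrently...
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = list(pool.map(lambda m: query_model(m, prompt), models))
    # ...then let the judge choose the best response.
    return judge(answers)


best = fan_out("What is the capital of France?", ["model-a", "model-b", "model-c"])
print(best)
```

The point of faster inference is that the whole fan-out round trip, including the judging pass, fits in the latency budget of a single request on slower hardware.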
Roughly 652,000 developers are now using Groq API keys, Heaps said.
Heaps expects speed to hook developers on Groq. But its novel plan for programming its chips gives the company a unique approach to the most crucial element within Nvidia's "moat."
No need for CUDA libraries
"Everybody, once they deployed models, was gonna need faster inference at a lower cost, and so that's what we focused on," Heaps said.
So where's the CUDA equivalent? It's all in-house.
"We actually have more than 1800 models built into our compiler. We use no kernels, and we don't need people to use CUDA libraries. So because of that, people can just start working with a model that's built-in," Heaps said.
Training, he said, requires more customization at the chip level. In inference, Groq's task is to choose the right models to offer customers and ensure they run as fast as possible.
"What you're seeing with this massive swell of developers who are building AI applications β they don't want to program at the chip level," he added.
The strategy comes with some level of risk. Groq is unlikely to accumulate a stable of developers who continuously troubleshoot and improve its base software like CUDA has. Its offering may be more like a restaurant menu than a grocery store. But this also means the barrier to entry for Groq users is the same as any other cloud provider and potentially lower than that of other chips.
Though Groq started out as a company with a novel chip design, 60% of its roughly 300 employees today are software engineers, Heaps said.
"For us right now, there is a billions and billions of dollars industry emerging, that we can go capture a big share of market in, while at the same time, we continue to mature the compiler," he said.
Despite being realistic about the near term, Groq has lofty ambitions, which CEO Jonathan Ross has described as "providing half the world's inference." Ross also says the goal is to cast a net over the globe via joint ventures. A venture in Saudi Arabia is underway; Canada and Latin America are in the works.
Earlier this year, Ross told BI the company also has a goal to ship 108,000 of its language processing units, or LPUs, by the first quarter of next year, and 2 million chips by the end of 2025, most of which will be made available through its cloud.
Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088
Broadcom's stock surged in recent weeks, pushing the company's market value over $1 trillion.
Broadcom is crucial for companies seeking alternatives to Nvidia's AI chip dominance.
Custom AI chips are gaining traction, enhancing tech firms' bargaining power, analysts say.
The rise of AI, and the computing power it requires, is bringing all kinds of previously under-the-radar companies into the limelight. This week it's Broadcom.
Broadcom's stock has soared since late last week, catapulting the company into the $1 trillion market cap club. The boost came from a blockbuster earnings report in which custom AI chip revenue grew 220% compared to last year.
In addition to selling lots of parts and components for data centers, Broadcom designs and sells ASICs, or application-specific integrated circuits β an industry acronym meaning custom chips.
Designers of custom AI chips, chief among them Broadcom and Marvell, are headed into a growth phase, according to Morgan Stanley.
Custom chips are picking up speed
The biggest players in AI buy a lot of chips from Nvidia, the $3 trillion giant with an estimated 90% of market share of advanced AI chips.
Heavily relying on one supplier isn't a comfortable position for any company, though, and many large Nvidia customers are also developing their own chips. Most tech companies don't have large teams of silicon and hardware experts in-house. Of the companies they might turn to for custom chip design, Broadcom is the leader.
Though multi-purpose chips like Nvidia's and AMD's graphics processing units are likely to maintain the largest share of the AI chip market in the long term, custom chips are growing fast.
Morgan Stanley analysts this week forecast the market for ASICs to nearly double to $22 billion next year.
Much of that growth is attributable to Amazon Web Services' Trainium AI chip, according to Morgan Stanley analysts. Then there are Google's in-house AI chips, known as TPUs, which Broadcom helps make.
In terms of actual value of chips in use, Amazon and Google dominate. But OpenAI, Apple, and TikTok parent company ByteDance are all reportedly developing chips with Broadcom, too.
ASICs bring bargaining power
Custom chips can offer more value, in terms of the performance you get for the cost, according to Morgan Stanley's research.
ASICs can also be designed to perfectly match a tech company's unique internal workloads, according to the bank's analysts. The better these custom chips get, the more bargaining power they may provide when tech companies are negotiating with Nvidia over buying GPUs. But this will take time, the analysts wrote.
In addition to Broadcom, Silicon Valley neighbor Marvell is making gains in the ASICs market, along with Asia-based players Alchip Technologies and Mediatek, they added in a note to investors.
Analysts don't expect custom chips to ever fully replace Nvidia GPUs, but without them, cloud service providers like AWS, Microsoft, and Google would have much less bargaining power against Nvidia.
"Over the long term, if they execute well, cloud service providers may enjoy greater bargaining power in AI semi procurement with their own custom silicon," the Morgan Stanley analysts explained.
Nvidia's big R&D budget
This may not be all bad news for Nvidia. A $22 billion ASICs market is smaller than Nvidia's revenue for just one quarter.
Nvidia's R&D budget is massive, and many analysts are confident in its ability to stay at the bleeding edge of AI computing.
And as Nvidia rolls out new, more advanced GPUs, its older offerings get cheaper and potentially more competitive with ASICs.
"We believe the cadence of ASICs needs to accelerate to stay competitive to GPUs," the Morgan Stanley analysts wrote.
Still, Broadcom and chip manufacturers on the supply chain rung beneath, such as TSMC, are likely to get a boost every time a giant cloud company orders up another custom AI chip.
Rumors have suggested that Nvidia will be taking the wraps off of some next-generation RTX 50-series graphics cards at CES in January. And as we get closer to that date, Nvidia's partners and some of the PC makers have begun to inadvertently leak details of the cards.
According to recent leaks from both Zotac and Acer, it looks like Nvidia is planning to announce four new GPUs next month, all at the high end of its lineup: The RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070 were all briefly listed on Zotac's website, as spotted by VideoCardz. There's also an RTX 5090D variant for the Chinese market, which will presumably have its specs tweaked to conform with current US export restrictions on high-performance GPUs.
Though the website leak didn't confirm many specs, it did list the RTX 5090 as including 32GB of GDDR7, an upgrade from the 4090's 24GB of GDDR6X. An Acer spec sheet for new Predator Orion desktops also lists 32GB of GDDR7 for the RTX 5090, as well as 16GB of GDDR7 for the RTX 5080. That's the same amount of RAM included with the RTX 4080 and 4080 Super.
Hiya, folks, welcome to TechCrunch's regular AI newsletter. If you want this in your inbox every Wednesday, sign up here. Longtime readers of the newsletter might've noticed that we skipped a week last week. That wasn't our intent, and we do apologize. The reason was we've reached an inflection point in the AI news cycle. We're […]
Oracle and xAI love to flex the size of their GPU clusters.
It's getting hard to tell who has the most supercomputing power as more firms claim the top spot.
The real numbers are competitive intelligence, and cluster size isn't everything, experts told BI.
In high school, as in tech, superlatives are important. Or maybe they just feel important in the moment. With the breakneck pace of the AI computing infrastructure buildout, it's becoming increasingly difficult to keep track of who has the biggest, fastest, or most powerful supercomputer, especially when multiple companies claim the title at once.
"We delivered the world's largest and fastest AI supercomputer, scaling up to 65,000 Nvidia H200 GPUs," Oracle CEO Safra Catz said on the company's Monday earnings call, a claim echoed by founder, chairman, and CTO Larry Ellison.
In late October, Nvidia proclaimed xAI's Colossus as the "World's Largest AI Supercomputer," after Elon Musk's firm reportedly built a computing cluster with 100,000 Nvidia graphics processing units in a matter of weeks. The plan is to expand to 1 million GPUs next, according to the Greater Memphis Chamber of Commerce (where the supercomputer is located).
The good ole days of supercomputing are gone
It used to be simpler. "Supercomputers" were most commonly found in research settings. Naturally, there's an official list ranking supercomputers. Until recently, the world's most powerful supercomputer was named El Capitan. Housed at Lawrence Livermore National Laboratory in California, its 11 million CPU and GPU cores from Nvidia rival AMD add up to 1.742 exaflops of computing capacity. (One exaflop is one quintillion, or a billion billion, operations per second.)
"The biggest computers don't get put on the list," Dylan Patel, chief analyst at Semianalysis, told BI. "Your competitor shouldn't know exactly what you have," he continued. The 65,000-GPU supercluster Oracle executives were praising can reach up to 65 exaflops, according to the company.
It's safe to assume, Patel said, that Nvidia's largest customers, including Meta, Microsoft, and xAI, also have the largest, most powerful clusters. On Nvidia's May earnings call, CFO Colette Kress said 200 fresh exaflops of Nvidia computing, across nine different supercomputers, would be online by the end of this year.
Going forward, it's going to be harder to determine whose clusters are the biggest at any given moment, and even harder to tell whose are the most powerful, no matter how much CEOs may brag.
It's not the size of the cluster β it's how you use it
On Monday's call, Ellison was asked whether the size of these gigantic clusters actually generates better model performance.
He said larger clusters and faster GPUs are elements that speed up model training. Another is networking it all together. "So the GPU clusters aren't sitting there waiting for the data," Ellison said Monday.
Thus, the number of GPUs in a cluster isn't the only factor in the computing-power calculation. Networking and programming matter too. Exaflop figures reflect the whole package, so unless companies provide them, experts can only estimate.
What's certain is that more advanced models β the kind that consider their own thinking and check their work before answering queries β require more compute than their relatives of earlier generations. So training increasingly impressive models may indeed require an arms race of sorts.
But an enormous AI arsenal doesn't automatically lead to better or more useful tools.
Sri Ambati, CEO of open-source AI platform H2O.ai, said cloud providers may want to flex their cluster size for sales reasons, but given some (albeit slow) diversification of AI hardware and the rise of smaller, more efficient models, cluster size isn't the end all be all.
Power efficiency, too, is a hugely important indicator for AI computing, since energy is an enormous operational expense in AI. But it gets lost in the measuring contest.
Nvidia declined to comment. Oracle did not respond to a request for comment in time for publication.
When it comes to market capitalization, Nvidia is currently the second-biggest public company in the world, behind Apple. That's why all eyes are on Nvidia these days. And now, as Bloomberg spotted, China Central Television, a public TV broadcaster, is reporting that China's market regulator has opened a probe into Nvidia's acquisition of Mellanox. If […]
China's top antimonopoly regulator is investigating Nvidia.
The investigation is related to the company's 2020 acquisition of an Israeli chip firm.
Nvidia's stock fell by 2.2% in premarket trading on Monday.
China's top antimonopoly regulator has launched an investigation into Nvidia, whose shares dropped by 2.2% in premarket trading on Monday following the latest escalation of chip tensions with the US.
The State Administration for Market Regulation said on Monday that it was investigating whether the chipmaker giant violated antimonopoly regulations.
The probe is related to Nvidia's acquisition of Mellanox Technologies, an Israeli chip firm, in 2020. China's competition authority approved the $7 billion takeover in 2020 on the condition that rivals be notified of new products within 90 days of allowing Nvidia access to them.
The US-China chip war has been escalating. Last week, China's commerce ministry said it would halt shipments of key materials needed for chip production to the US. The ministry said the measures were in response to US chip export bans, also announced last week.
Nvidia, which is headquartered in Santa Clara, California, has also faced antitrust scrutiny in the US. The Department of Justice has been examining whether Nvidia might have abused its market dominance to make it difficult for buyers to change suppliers.
Nvidia did not immediately respond to a request for comment from Business Insider made outside normal working hours.
AMD CEO Lisa Su and Nvidia CEO Jensen Huang are first cousins once removed, a researcher said.
Su told Bloomberg they did not grow up together and were "really distant."
"No family dinners," she said. "It is an interesting coincidence."
AMD CEO Lisa Su said in a recent interview that she never met Nvidia CEO Jensen Huang, her competitor and distant relative, until later in their careers.
"We were really distant, so we didn't grow up together," Su said in an interview with Bloomberg's Emily Chang published Thursday. "We actually met at an industry event. So it wasn't until we were well into our careers."
Former journalist and genealogist Jean Wu said last year that Su and Huang, both Taiwanese chief executives of global chip powerhouses, are first cousins once removed. Huang, 61, is the older cousin of Su, 55. Huang's mother is a sister of Su's grandfather, according to a condensed family tree Wu published on her Facebook account.
Su confirmed the familial relationship with her competitor in 2020, saying that the two are "distant relatives, so some complex second cousin type of thing."
An Nvidia spokesperson confirmed to CNN last year that Su is Huang's distant cousin through his mother's side.
An Nvidia spokesperson declined to comment on this story, and an AMD spokesperson did not immediately respond to a request for comment.
Huang and Su have eerily similar career paths but different upbringings.
Su was born in Tainan, whereas Huang was born in Taiwan's capital, Taipei.
The AMD CEO later moved to the US, where she grew up in New York and studied at the Massachusetts Institute of Technology.
Huang lived in Washington and Kentucky before settling in Oregon. He later attended Oregon State University.
Su said in the Bloomberg interview that she has a large family she visits when she travels back to Taiwan.
"My dad had like nine siblings, and my mom had like six, so it was like a big family," she said. "So there are lots and lots of cousins and aunts and uncles."
Despite their familial ties, Su and Huang never crossed paths at those family gatherings.
"No family dinners," she said. "It is an interesting coincidence."
AWS has not committed to offering cloud access to AMD's AI chips in part due to low customer demand.
AWS said it was considering offering AMD's new AI chips last year.
AMD recently increased the sales forecast for its AI chips.
Last year, Amazon Web Services said it was considering offering cloud access to AMD's latest AI chips.
Eighteen months on, the cloud giant still hasn't made any public commitment to AMD's MI300 series.
One reason: low demand.
AWS is not seeing the type of huge customer demand that would lead to selling AMD's AI chips via its cloud service, according to Gadi Hutt, senior director for customer and product engineering at Amazon's chip unit, Annapurna Labs.
"We follow customer demand. If customers have strong indications that those are needed, then there's no reason not to deploy," Hutt told Business Insider at AWS's re:Invent conference this week.
AWS is "not yet" seeing that high demand for AMD's AI chips, he added.
AMD shares dropped roughly 2% after this story first ran.
AMD's line of AI chips has grown since its launch last year. The company recently increased its GPU sales forecast, citing robust demand. However, the chip company is still a long way behind market leader Nvidia.
AWS provides cloud access to other AI chips, such as Nvidia's GPUs. At re:Invent, AWS announced the launch of P6 servers, which come with Nvidia's latest Blackwell GPUs.
AWS and AMD are still close partners, according to Hutt. AWS offers cloud access to AMD's CPU server chips, and AMD's AI chip product line is "always under consideration," he added.
Hutt discussed other topics during the interview, including AWS's relationship with Nvidia, Anthropic, and Intel.
An AMD spokesperson declined to comment.
Do you work at Amazon? Got a tip?
Contact the reporter, Eugene Kim, via the encrypted-messaging apps Signal or Telegram (+1-650-942-3061) or email ([email protected]). Reach out using a nonwork device. Check out Business Insider's source guide for other tips on sharing information securely.
Editor's note: This story was first published on December 6, 2024, and was updated later that day to reflect developments in AMD's stock price.
AWS's new AI chips aren't meant to go after Nvidia's lunch, said Gadi Hutt, a senior director of customer and product engineering at the company's chip-designing subsidiary, Annapurna Labs. The goal is to give customers a lower-cost option, as the market is big enough for multiple vendors, Hutt told Business Insider in an interview at AWS's re:Invent conference.
"It's not about unseating Nvidia," Hutt said, adding, "It's really about giving customers choices."
AWS has spent tens of billions of dollars on generative AI. This week the company unveiled its most advanced AI chip, called Trainium 2, which can cost roughly 40% less than Nvidia's GPUs, and a new supercomputer cluster using the chips, called Project Rainier. Earlier versions of AWS's AI chips had mixed results.
Hutt insists this isn't a competition but a joint effort to grow the overall size of the market. The customer profiles and AI workloads they target are also different. He added that Nvidia's GPUs would remain dominant for the foreseeable future.
In the interview, Hutt discussed AWS's partnership with Anthropic, which is set to be Project Rainier's first customer. The two companies have worked closely over the past year, and Amazon recently invested an additional $4 billion in the AI startup.
He also shared his thoughts on AWS's partnership with Intel, whose CEO, Pat Gelsinger, just retired. He said AWS would continue to work with the struggling chip giant because customer demand for Intel's server chips remained high.
Last year AWS said it was considering selling AMD's new AI chips. But Hutt said those chips still weren't available on AWS because customers hadn't shown strong demand.
This Q&A has been edited for clarity and length.
There have been a lot of headlines saying Amazon is out to get Nvidia with its new AI chips. Can you talk about that?
I usually look at these headlines, and I giggle a bit because, really, it's not about unseating Nvidia. Nvidia is a very important partner for us. It's really about giving customers choices.
We have a lot of work ahead of us to ensure that we continuously give more customers the ability to use these chips. And Nvidia is not going anywhere. They have a good solution and a solid road map. We just announced the P6 instances [AWS servers with Nvidia's latest Blackwell GPUs], so there's a continuous investment in the Nvidia product line as well. It's really to give customers options. Nothing more.
Nvidia is a great supplier of AWS, and our customers love Nvidia. I would not discount Nvidia in any way, shape, or form.
So you want to see Nvidia's use case increase on AWS?
If customers believe that's the way they need to go, then they'll do it. Of course, if it's good for customers, it's good for us.
The market is very big, so there's room for multiple vendors here. We're not forcing anybody to use those chips, but we're working very hard to ensure that our major tenets, which are high performance and lower cost, will materialize to benefit our customers.
Does it mean AWS is OK being in second place?
It's not a competition. There's no machine-learning award ceremony every year.
In the case of a customer like Anthropic, there's very clear scientific evidence that larger compute infrastructure allows you to build larger models with more data. And if you do that, you get higher accuracy and more performance.
Our ability to scale capacity to hundreds of thousands of Trainium 2 chips gives them the opportunity to innovate on something they couldn't have done before. They get a 5x boost in productivity.
Is being No. 1 important?
The market is big enough. No. 2 is a very good position to be in.
I'm not saying I'm No. 2 or No. 1, by the way. But it's really not something I'm even thinking about. We're so early in our journey here in machine learning in general, the industry in general, and also on the chips specifically, we're just heads down serving customers like Anthropic, Apple, and all the others.
We're not even doing competitive analysis with Nvidia. I'm not running benchmarks against Nvidia. I don't need to.
For example, there's MLPerf, an industry performance benchmark. Companies that participate in MLPerf have performance engineers working just to improve MLPerf numbers.
That's completely a distraction for us. We're not participating in that because we don't want to waste time on a benchmark that isn't customer-focused.
On the surface, it seems like helping companies grow on AWS isn't always beneficial for AWS's own products because you're competing with them.
We are the same company that is the best place Netflix is running on, and we also have Prime Video. It's part of our culture.
I will say that there are a lot of customers that are still on GPUs. A lot of customers love GPUs, and they have no intention to move to Trainium anytime soon. And that's fine, because, again, we're giving them the options and they decide what they want to do.
Do you see these AI tools becoming more commoditized in the future?
I really hope so.
When we started this in 2016, the problem was that there was no operating system for machine learning. So we really had to invent all the tools that go around these chips to make them work for our customers as seamlessly as possible.
If machine learning becomes commoditized on the software and hardware sides, it's a good thing for everybody. It means that it's easier to use those solutions. But running machine learning meaningfully is still an art.
What are some of the different types of workloads customers might want to run on GPUs versus Trainium?
GPUs are more of a general-purpose processor of machine learning. All the researchers and data scientists in the world know how to use Nvidia pretty well. If you invent something new, if you do that on GPU, then things will work.
If you invent something new on specialized chips, you'll have to either ensure compiler technology understands what you just built or create your own compute kernel for that workload. We're focused mainly on use cases where our customers tell us, "Hey, this is what we need." Usually the customers we get are the ones that are seeing increased costs as an issue and are trying to look for alternatives.
So the most advanced workloads are usually reserved for Nvidia chips?
Usually. If data-science folks need to continuously run experiments, they'll probably do that on a GPU cluster. When they know what they want to do, that's where they have more options. That's where Trainium really shines, because it gives high performance at a lower cost.
AWS CEO Matt Garman previously said the vast majority of workloads will continue to be on Nvidia.
It makes sense. We give value to customers who have a large spend and are trying to see how they can control the costs a bit better. When Matt says the majority of the workloads, it means medical imaging, speech recognition, weather forecasting, and all sorts of workloads that we're not really focused on right now because we have large customers who ask us to do bigger things. So that statement is 100% correct.
In a nutshell, we want to continue to be the best place for GPUs and, of course, Trainium when customers need it.
What has Anthropic done to help AWS in the AI space?
They have very strong opinions of what they need, and they come back to us and say, "Hey, can we add feature A to your future chip?" It's a dialogue. Some ideas they came up with weren't feasible to even implement in a piece of silicon. We actually implemented some ideas, and for others we came back with a better solution.
Because they're such experts in building foundation models, this really helps us home in on building chips that are really good at what they do.
We just announced Project Rainier together. This is someone who wants to use a lot of those chips as fast as possible. It's not an idea; we're actually building it.
Can you talk about Intel? AWS's Graviton chips are replacing a lot of Intel chips at AWS data centers.
I'll correct you here. Graviton is not replacing x86. It's not like we're yanking out x86 and putting Graviton in place. But again, following customer demand, more than 50% of our recent landings on CPUs were Graviton.
It means that the customer demand for Graviton is growing. But we're still selling a lot of x86 cores too for our customers, and we think we're the best place to do that. We're not competing with these companies, but we're treating them as good suppliers, and we have a lot of business to do together.
How important is Intel going forward?
They will for sure continue to be a great partner for AWS. There are a lot of use cases that run really well on Intel cores. We're still deploying them. There's no intention to stop. It's really following customer demand.
Is AWS still considering selling AMD's AI chips?
AMD is a great partner for AWS. We sell a lot of AMD CPUs to customers as instances.
The machine-learning product line is always under consideration. If customers strongly indicate that they need it, then there's no reason not to deploy it.
And you're not seeing that yet for AMD's AI chips?
Not yet.
How supportive are Amazon CEO Andy Jassy and Garman of the AI chip business?
They're very supportive. We meet them on a regular basis. There's a lot of focus across leadership in the company to make sure that the customers who need ML solutions get them.
There's also a lot of collaboration within the company with science and service teams that are building solutions on those chips. Other Amazon products, like Rufus, the AI assistant available to all Amazon customers, run entirely on Inferentia and Trainium chips.
Elon Musk's xAI plans to make a tenfold increase to the number of GPUs at its Memphis supercomputer.
The expansion aims to help the startup compete with OpenAI and Google in the AI race.
Nvidia, Dell, and Super Micro Computer also plan to establish operations in Memphis.
Elon Musk's xAI is ramping up its Memphis supercomputer to house at least 1 million graphics processing units, the Greater Memphis Chamber said on Wednesday.
The supercomputer, called Colossus, is already considered the largest of its kind in the world. The expansion would increase the number of its GPUs tenfold.
The move is part of xAI's effort to ramp up AI development and outpace rivals like OpenAI and Google. The GPUs are used to train and run xAI's AI-powered chatbot Grok, the company's answer to products like OpenAI's ChatGPT and Google's Gemini.
"In Memphis, we're pioneering development in the heartland of America," Brent Mayo, an xAI engineer, said in a statement. "We're not just leading from the front; we're accelerating progress at an unprecedented pace while ensuring the stability of the grid utilizing megapack technology."
The Greater Memphis Chamber said Nvidia, the leader in the GPU market and a supplier to Colossus, along with Dell and Super Micro Computer, also plan to establish operations in Memphis.
A 'superhuman' task
xAI built its supercomputer, Colossus, at a rapid pace. The supporting facility and supercomputer were built by xAI and Nvidia in just 122 days, according to a press release from Nvidia.
In an interview with Jordan Peterson on X in June, Musk said it took 19 days to get Colossus from hardware installation to beginning training, adding it was "the fastest by far anyone's been able to do that."
The speed of xAI's expansion won praise from Jensen Huang, the CEO of Nvidia, who described the effort as a "superhuman" task and hailed Musk's understanding of engineering.
Huang said a project like Colossus would normally take "three years to plan" and another year to get it up and running.
Musk's AI startup has also been on a fundraising streak. The Wall Street Journal reported that xAI is valued at $50 billion, doubling its valuation since the spring.
Investors in xAI's latest funding round reportedly include Sequoia Capital and Andreessen Horowitz. Earlier this year, xAI raised a $6 billion Series B from A16z and Sequoia Capital at a $24 billion post-money valuation. The new round means the AI company has raised a total of $11 billion this year.
AWS unveiled a new AI chip and a supercomputer at its re:Invent conference on Tuesday.
It's a sign that Amazon is ready to reduce its reliance on Nvidia for AI chips.
Amazon isn't alone: Google, Microsoft, and OpenAI are also designing their own AI chips.
Big Tech's next AI era will be all about controlling silicon and supercomputers of their own. Just ask Amazon.
At its re:Invent conference on Tuesday, the tech giant's cloud computing unit, Amazon Web Services, unveiled the next line of its AI chips, Trainium3, while announcing a new supercomputer that will be built with its own chips to serve its AI ambitions.
It marks a significant shift from the status quo that has defined the generative AI boom since OpenAI's release of ChatGPT, in which the tech world has relied on Nvidia to secure a supply of its industry-leading chips, known as GPUs, for training AI models in huge data centers.
While Nvidia has a formidable moat (experts say its hardware-software combination serves as a powerful vendor lock-in system), AWS' reveal shows companies are finding ways to take ownership of the tech shaping the next era of AI development.
Putting your own chips on the table
On the chip side, Amazon shared that Trainium2, which was first unveiled at last year's Re: Invent, was now generally available. Its big claim was that the chip offers "30-40% better price performance" than the current generation of servers with Nvidia GPUs.
That would mark a big step up from its first series of chips, which analysts at SemiAnalysis described on Tuesday as "underwhelming" for generative AI training and used instead for "training non-complex" workloads within Amazon, such as credit card fraud detection.
"With the release of Trainium2, Amazon has made a significant course correction and is on a path to eventually providing a competitive custom silicon," the SemiAnalysis researchers wrote.
Trainium3, which AWS gave a preview of ahead of a late 2025 release, has been billed as a "next-generation AI training chip." Servers loaded with Trainium3 chips offer four times greater performance than those packed with Trainium2 chips, AWS said.
Matt Garman, the CEO of AWS, told The Wall Street Journal that some of the company's chip push is due to there being "really only one choice on the GPU side" at present, given Nvidia's dominant place in the market. "We think that customers would appreciate having multiple choices," he said.
It's an observation that others in the industry have noted and responded to. Google has been busy designing its own chips that reduce its dependence on Nvidia, while OpenAI is reported to be exploring custom, in-house chip designs of its own.
But having in-house silicon is just one part of this.
The supercomputer advantage
AWS acknowledged that as AI models trained on GPUs continue to get bigger, they are "pushing the limits of compute and networking infrastructure."
With this in mind, AWS shared that it was working with Anthropic to build an "UltraCluster" of servers that form the basis of a supercomputer it has named Project Rainier. According to Amazon, it will scale model training across "hundreds of thousands of Trainium2 chips."
"When completed, it is expected to be the world's largest AI compute cluster reported to date available for Anthropic to build and deploy their future models on," AWS said in a blog, adding that it will be "over five times the size" of the cluster used to build Anthropic's last model.
The supercomputer push follows similar moves elsewhere. The Information first reported earlier this year that OpenAI and Microsoft were working together to build a $100 billion AI supercomputer called Stargate.
Of course, Nvidia is also in the supercomputer business and aims to make them a big part of its allure to companies looking to use its next-generation AI chips, Blackwell.
AWS made no secret that it remains tied to Nvidia for now. In an interview with The Wall Street Journal, Garman acknowledged that Nvidia is responsible for "99% of the workloads" for training AI models today and said he doesn't expect that to change anytime soon.
That said, Garman reckoned "Trainium can carve out a good niche" for itself. He'll be wise to recognize that everyone else is busy carving out a niche for themselves, too.
AWS announced plans for an AI supercomputer, UltraCluster, with Trainium2 chips at re:Invent.
AWS may be able to reduce reliance on Nvidia by developing its own AI infrastructure.
Apple said it's testing Trainium2 chips for Apple Intelligence.
Matt Garman, the CEO of Amazon Web Services, made several significant new AWS announcements at the re:Invent conference on Tuesday.
His two-and-a-half-hour keynote delved into AWS's current software and hardware offerings and updates, with words from clients including Apple and JPMorgan. Graphics processing units (GPUs), supercomputers, and a surprise Apple cameo stood out among the slew of announcements.
AWS, the cloud computing arm of Amazon, has been developing its own semiconductors to train AI. On Tuesday, Garman said it's creating UltraServers, each containing 64 of its Trainium2 chips, so companies can scale up their GenAI workloads.
Moreover, it's also building an AI supercomputer, an UltraCluster made up of UltraServers, in partnership with AI startup Anthropic. Named Project Rainier, it will be "the world's largest AI compute cluster reported to date available for Anthropic to build and deploy its future models on" when completed, according to an Amazon blog post. Amazon has invested $8 billion in Anthropic.
Such strides could push AWS further into competition with other tech firms in the ongoing AI arms race, including AI chip giant Nvidia.
Here are four takeaways from Garman's full keynote on Tuesday.
AWS' Trainium chips could compete with Nvidia.
Nvidia currently dominates the AI chip market with its sought-after and pricey GPUs, but Garman backed AWS's homegrown silicon during his keynote on Tuesday. His company's goal is to reduce the cost of AI, he said.
"Today, there's really only one choice on the GPU side, and it's just Nvidia. We think that customers would appreciate having multiple choices," Garman told the Wall Street Journal.
AI is growing rapidly, and the demand for chips that make the technology possible is poised to grow alongside it. Major tech companies, like Google and Microsoft, are venturing into chip creation as well to find an alternative to Nvidia.
However, Garman told The Journal he doesn't expect Trainium to dethrone Nvidia "for a long time."
"But, hopefully, Trainium can carve out a good niche where I actually think it's going to be a great option for many workloads β not all workloads," he said.
AWS also introduced Trainium3, its next-gen chip.
AWS' new supercomputer could go toe to toe with Elon Musk's xAI.
According to The Journal, the chip cluster known as Project Rainier is expected to be available in 2025. Once it is ready, Anthropic plans to use it to train AI models.
With "hundreds of thousands" of Trainium chips, it would challenge Elon Musk's xAI's Colossus β a supercomputer with 100,000 of Nvidia's Hopper chips.
Apple is considering Trainium2 for Apple Intelligence training.
Garman said that Apple is one of its customers using AWS chips, like Amazon Graviton and Inferentia, for services including Siri.
Benoit Dupin, senior director of AI and machine learning at Apple, then took to the stage at the Las Vegas conference. He said the company worked with AWS for "virtually all phases" of its AI and machine learning life cycle.
"One of the unique elements of Apple business is the scale at which we operate and the speed with which we innovate," Dupin said.
He added, "AWS has been able to keep the pace, and we've been customers for more than a decade."
Now, Dupin said Apple is in the early stages of testing Trainium2 chips to potentially help train Apple Intelligence.
The company introduced a new generation of foundation models, Amazon Nova.
Amazon announced some new kids on the GenAI block.
AWS customers will be able to use Amazon Nova-powered GenAI applications "to understand videos, charts, and documents, or generate videos and other multimedia content," Amazon said. There are a range of models available at different costs, it said.
"Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are at least 75% less expensive than the best-performing models in their respective intelligence classes in Amazon Bedrock," Amazon said.