5 March 2025

AMD Radeon RX 9070 and 9070 XT review: RDNA 4 fixes a lot of AMD’s problems

AMD is a company that knows a thing or two about capitalizing on a competitor's weaknesses. The company got through its early-2010s nadir partially because its Ryzen CPUs struck just as Intel's current manufacturing woes began to set in, first with somewhat-worse CPUs that were great value for the money and later with CPUs that were better than anything Intel could offer.

Nvidia's untrammeled dominance of the consumer graphics card market should also be an opportunity for AMD. Nvidia's GeForce RTX 50-series graphics cards have given buyers very little to get excited about, with an unreachably expensive high-end 5090 refresh and modest-at-best gains from 5080 and 5070-series cards that are also pretty expensive by historical standards, when you can buy them at all. Tech YouTubers—both the people making the videos and the people leaving comments underneath them—have been almost uniformly unkind to the 50 series, hinting at consumer frustrations and pent-up demand for competitive products from other companies.

Enter AMD's Radeon RX 9070 XT and RX 9070 graphics cards. These are aimed right at the middle of the current GPU market at the intersection of high sales volume and decent profit margins. They promise good 1440p and entry-level 4K gaming performance and improved power efficiency compared to previous-generation cards, with fixes for long-time shortcomings (ray-tracing performance, video encoding, and upscaling quality) that should, in theory, make them more tempting for people looking to ditch Nvidia.


© Andrew Cunningham


Nvidia investors' call gives the chip giant a chance to tell backers why they're wrong about DeepSeek's impact

24 February 2025 at 12:55
NVIDIA's CEO Jensen Huang in his signature black leather jacket
Nvidia CEO Jensen Huang has said the market's reaction to DeepSeek was a mistake.

EDGAR SU / Reuters

  • Nvidia finally has a chance to tell investors why their violent reaction to DeepSeek was a mistake.
  • The chip giant's Wednesday earnings are the first since DeepSeek's AI sparked market panic.
  • Key areas to watch are data center revenue, Blackwell's ramp-up, inference demand, and policy.

When Nvidia reports earnings on Wednesday, the chip giant will have the chance to tell investors why it thinks their intense reaction to the rise of DeepSeek was a mistake — or change the subject entirely.

Heading into 2025, Nvidia's rule looked unassailable as Elon Musk, Mark Zuckerberg, and others lined up for its chips. That is until Chinese startup DeepSeek released R1, an open-source reasoning model with benchmark results to rival OpenAI's o1 model.

Critically, R1 was reportedly produced with fewer and less powerful chips than o1.

"DeepSeek's remarkable feat has shaken the industry's assumptions about how much capital or GPU chips a company needs to stay ahead of the competition," Barclays analysts wrote last month.

Although Nvidia has largely recovered from the market's violent reaction to DeepSeek — the chip firm lost $600 billion in market capitalization in one day, the biggest single-day drop in US market history — CEO Jensen Huang will need to show investors the party is nowhere near over and that the promise of AI isn't overhyped. Huang previewed his argument at a virtual event broadcast Thursday, where he said investors had misinterpreted the signals of DeepSeek.

As Nvidia prepares to address investors officially for the first time since the DeepSeek saga, here's what to look out for in its earnings.

Data center revenue

Sam Altman, the co-founder and CEO of OpenAI.
OpenAI CEO Sam Altman is working with Nvidia on Stargate.

Sean Gallup/Getty Images

Analysts predict Nvidia's revenue, especially in its all-important data center business, will keep rising β€” bolstered by already-announced forthcoming data center buildouts.

From Stargate, with $500 billion in expected spending, to Meta forecasting an additional $65 billion in data center spending this year, to Amazon forecasting more than $100 billion in spending, much of it on computing power, earlier this month, Nvidia's customers are still lining up.

That lineup offers Nvidia fresh evidence to present to investors concerned that DeepSeek's claim to use chips more efficiently — a key driver in lowering costs — would hurt demand.

"Despite DeepSeek's supposed 'revolutionary' optimizations, there is no change thus far to spending intentions at NVDA large customers including Microsoft and Meta," Bank of America analyst Vivek Arya wrote in a note to investors in early February.

Model improvements, paired with big data center buildouts, are another favorable evidence point for Nvidia.

Grok 3, Musk's latest model from xAI, is receiving praise for its performance. Musk's firms also recently collaborated on their second data center with roughly 12,000 Nvidia GPUs, BI exclusively reported.

Musk has been aggressively adding to the fleet of GPU-packed data centers supporting Grok, suggesting a link between progress and infrastructure.

Model builders and hyperscalers still have their eyes on artificial general intelligence, and cheaper, highly functional models like DeepSeek won't impact that pursuit, Morgan Stanley analysts told investors in a note from last week.

Blackwell ramp-up

Jensen Huang onstage showing Nvidia hardware.
Nvidia CEO Jensen Huang.

Justin Sullivan/Getty

Nvidia's latest and most powerful chip series, Blackwell, has struggled with a slow rollout due to manufacturing and overheating issues. Analysts, however, are expecting the company to report a strong ramp-up.

"Demand for Blackwell is very strong and will outstrip supply for several quarters," Synovus senior portfolio manager Daniel Morgan said in an investor note last week.

UBS's Timothy Arcuri wrote that after much consternation, investor fears of a botched rollout are easing, and strong sales numbers could put them fully to rest.

UBS analysts also said the fourth quarter was the last in which Blackwell chips won't make up the majority of Nvidia's GPU sales. Investors will likely favor that shift because Blackwell brings with it higher profit margins.

Nvidia's buzzy GTC conference will take place in San Jose, CA, next month, marking the first anniversary of Blackwell's debut.

Inference and applications

Further growth in inference demand would also be a proof point for Huang's theory of investor error surrounding the DeepSeek rout. Demand for inference, the process of putting trained models to work on new data, increases when consumers and businesses find value in AI tools.

Investors will likely want to see the share of AI workloads continue to shift to inference, which also requires GPUs to run. On the company's last call in November, Huang repeatedly said that inference across Nvidia's platforms was growing.

Growth in the software layers of Nvidia's tech stack would be a good sign, too. This would suggest maturity in AI products and lend strength to a part of its business that's potentially even more difficult to compete with than the chips themselves.

"What Nvidia talks about on its long-term moats and its possible deployment on the AI application side probably matters more this time," Morgan Stanley analysts wrote in a note to investors Friday.

Worldwide wild card

Donald Trump
President Donald Trump.

Chip Somodevilla/Getty Images

Investors will also be looking for any signals from Nvidia about the company's approach to China as President Donald Trump threatens to upend business relations with the country.

Just before the end of his term, former President Joe Biden initiated new regulations on the export of high-powered chips like Nvidia's GPUs, which are in the midst of a 120-day comment period. Many policy analysts expect Trump to allow the rules to take effect as they align with his "America First" agenda, though Trump has yet to directly address them.

As Trump said last month, AI leadership is critical to ensuring "economic and national security."

Last month, Trump also threatened to impose tariffs on Taiwan, home of Nvidia's chip manufacturing partner, TSMC. Tariffs could lead to increased costs for Nvidia. Huang met with the president at the White House last month, but neither party provided details of the discussion.

Although Nvidia's share price has recovered much of its DeepSeek-induced losses, the $3 trillion juggernaut faces various potential headwinds. Huang's job Wednesday will be to reassure investors that those headwinds will be mild and reaffirm that Nvidia remains fundamental to the AI story.

Read the original article on Business Insider

OpenAI’s secret weapon against Nvidia dependence takes shape

10 February 2025 at 13:00

OpenAI is entering the final stages of designing its long-rumored AI processor with the aim of decreasing the company's dependence on Nvidia hardware, according to a Reuters report released Monday. The ChatGPT creator plans to send its chip designs to Taiwan Semiconductor Manufacturing Co. (TSMC) for fabrication within the next few months, but the chip has not yet been formally announced.

The OpenAI chip's full capabilities, technical details, and exact timeline are still unknown, but the company reportedly intends to iterate on the design and improve it over time, giving it leverage in negotiations with chip suppliers—and potentially granting the company future independence with a chip design it controls outright.

In the past, we've seen other tech companies, such as Microsoft, Amazon, Google, and Meta, create their own AI acceleration chips for reasons that range from cost reduction to relieving shortages of AI chips supplied by Nvidia, which enjoys a near monopoly on high-powered GPUs (such as the Blackwell series) for data center use.


© OsakaWayne Studios via Getty Images

Amazon doubles down on AI with a massive $100B spending plan for 2025

6 February 2025 at 15:54

Amazon is joining other Big Tech companies by announcing huge AI spending plans for 2025.

© 2024 TechCrunch. All rights reserved. For personal use only.

Lambda Labs' COO has left the AI cloud provider to head Positron, a startup trying to compete with Nvidia

27 January 2025 at 03:11
Mitesh Agrawal
Mitesh Agrawal has moved on from Lambda Labs for Positron, a new player in the AI hardware space.

Kavita Agrawal

  • Lambda Labs COO Mitesh Agrawal has left to head AI hardware startup Positron.
  • Lambda focuses on deploying cloud infrastructure to customers and is valued at over $2 billion.
  • Positron aims to compete with Nvidia by offering faster, energy-efficient AI hardware.

Lambda Labs, an Nvidia partner, has lost its chief operating officer to a little-known company building hardware for the AI industry.

Lambda COO Mitesh Agrawal told Business Insider he stepped into a new role as CEO of Positron earlier this month. Positron builds hardware for transformer model inference, which is how chatbots like ChatGPT respond to user requests.

Agrawal's departure is significant given his role in shaping Lambda into one of Silicon Valley's best-funded and most valuable startups.

During its Series C round last February, the company was valued at about $1.5 billion. Agrawal declined to share the company's exact valuation but said it has grown to over $2 billion since then.

Agrawal told BI that when he joined Lambda in 2017, the company was focused on building machines for image generation models. This was five years after twin brothers Stephen and Michael Balaban founded it as a company developing facial recognition technology. It wasn't long after Agrawal's arrival, however, that the company shifted its focus, designing infrastructure for full-scale data centers and pivoting into cloud services.

He said Lambda's business now focuses on deploying cloud infrastructure to customers, renting out servers powered by Nvidia's graphics processing units. It also offers the requisite software, including APIs for inference and machine learning libraries for customers.

Agrawal said that his move to Positron comes amid a growing appetite for inference — the capacity for AI models to apply their training to new data.

Between chatbots like ChatGPT and xAI's Grok, and new reasoning models like OpenAI's o1 tackling PhD-level problems, "the curve of technology for inference is just going up, which means the computational requirement is really going up," Agrawal said. So, he said he's thinking a lot about "how to solve and how to run these models with as much efficiency as possible."

He believes Positron is well-positioned to take on that challenge.

Positron was founded in 2023 by Thomas Sohmers, whom Agrawal met in 2015. The two also overlapped at Lambda during Sohmers's stint at the company between 2020 and 2021. Sohmers, who will move into the role of chief technology officer, told BI that, in simplest terms, the company is "building hardware competing against Nvidia."

Positron says its hardware outperforms Nvidia's H100 and H200 GPUs — which fueled the AI race before Nvidia released its more powerful Blackwell chips — in performance, power, and affordability.

Going up against a behemoth like Nvidia — which overtook Apple as the world's most valuable company last week — is no easy task for an up-and-coming company. But by focusing more narrowly on providing hardware for transformer model inference, Sohmers said Positron can differentiate itself from the competition.

Transformer models — neural networks that learn the context and meaning of data to generate new data — are behind some of the most popular generative AI applications. Unlike convolutional neural networks, which underpinned previous decades of machine learning advances, transformer models have greater memory demands. Sohmers said he saw an opportunity to capitalize on those demands.
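Much of that memory demand comes from the key/value cache that transformer attention layers keep around during inference; a rough sizing sketch (all model dimensions below are illustrative, not any specific product's):

```python
def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_val: int = 2) -> int:
    """Approximate key/value-cache size: two tensors (K and V) per layer,
    each of shape (batch, n_heads, seq_len, head_dim), in fp16 (2 bytes)."""
    return 2 * n_layers * batch * n_heads * seq_len * head_dim * bytes_per_val

# Hypothetical 70B-class configuration at an 8K-token context window
size = kv_cache_bytes(n_layers=80, n_heads=64, head_dim=128,
                      seq_len=8192, batch=1)
print(f"{size / 2**30:.0f} GiB")  # 20 GiB for a single sequence
```

That cache is on top of the model weights themselves, which is why inference-focused hardware tends to compete on memory capacity and bandwidth rather than raw compute alone.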

"I would say the whole reason we started Positron is we thought that there was a better way to do things," Sohmers said. "Nvidia, as a large company that also has a lot of other product focuses wasn't going to really optimize and focus on the particular niche that we're focused on, which is transformer model inference."

Agrawal, too, is confident in the performance and energy efficiency of Positron's hardware. Its compatibility with a range of transformer models will also help it attract customers from competitors, he said.

"Nvidia has such a strong ecosystem in the world of AI models. You hear about their CUDA moat, and you heard about the software moat," he said, referring to the software network the company has built between its products to retain customers.

"What Positron really did was completely remove this friction of anything," Agrawal said. That means a company can take a model trained on an Nvidia GPU and "run that model's inference on a Positron card just like you would run on an Nvidia GPU," he said.

Agrawal said the jump from an established player like Lambda to a young startup like Positron presents an "exciting challenge."

"You get to compete against an industry veteran as well as in a field that is just so big," he said.

Read the original article on Business Insider

Nvidia starts to wind down support for old GPUs, including the long-lived GTX 1060

Nvidia is launching the first volley of RTX 50-series GPUs based on its new Blackwell architecture, starting with the RTX 5090 and working downward from there. The company also appears to be winding down support for a few of its older GPU architectures, according to these CUDA release notes spotted by Tom's Hardware.

The release notes say that CUDA support for the Maxwell, Pascal, and Volta GPU architectures "is considered feature-complete and will be frozen in an upcoming release." All of these architectures — which collectively cover GeForce GPUs from the old GTX 700 series all the way up through 2016's GTX 1000 series, plus a couple of Quadro and Titan workstation cards — are still supported by Nvidia's December Game Ready driver package, but the end of new CUDA feature support suggests that these GPUs will be dropped from those driver packages before long.

It's common for Nvidia and AMD to drop support for another batch of architectures all at once every few years; Nvidia last dropped support for older cards in 2021, and AMD dropped support for several prominent GPUs in 2023. Both companies maintain a separate driver branch for some of their older cards, but releases usually only happen every few months, and they focus on security updates, not on providing new features or performance optimizations for new games.


© Mark Walton

What President Joe Biden's last-minute chip export restrictions mean for Nvidia

13 January 2025 at 08:38
Jensen Huang in a leather jacket in front of a large window.
Nvidia CEO Jensen Huang.

Jeff Chiu/ AP Images

  • Biden's Commerce Department is issuing new semiconductor export rules affecting Nvidia.
  • The rules categorize countries for GPU export controls, impacting Nvidia's market.
  • Critics argue the rules may stifle AI innovation. Supporters say they will keep the US on top.

The Biden administration's Commerce Department released 168 pages of fresh regulations for the US semiconductor industry Monday that could drastically change Nvidia's year.

The new rules target exports of graphics processing units, the type of highly powerful chip made by Nvidia and challenger AMD. Global data centers are filling up with GPUs, and Nvidia has so far claimed an estimated 90% of that market.

Highly complex chips like GPUs are largely manufactured in Taiwan, but most of the companies that design them are based in the US and so their products are within the Department of Commerce's jurisdiction.

"To enhance U.S. national security and economic strength, it is essential that we do not offshore this critical technology and that the world's AI runs on American rails," the White House's announcement reads, adding that advanced computing in the wrong hands can lead to "development of weapons of mass destruction, supporting powerful offensive cyber operations, and aiding human rights abuses, such as mass surveillance."

In response to previous export restrictions, Nvidia created a less powerful chip model just for the Chinese market to keep doing business there after the Biden administration changed the rules in 2022.

The new regulations go further — grouping countries into three categories and placing different export controls on each.

The first is a group of 18 allies to which GPUs can ship freely. These are Australia, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, the Republic of Korea, Spain, Sweden, Taiwan, and the United Kingdom.

The second group is listed as "countries of concern" where exports of the most advanced GPUs will be banned entirely. These are China, Hong Kong and Macau, Russia, Iran, North Korea, Venezuela, Nicaragua, and Syria.

All other countries would be subject to a cap of 100,000 GPUs. The rules lay out a verification process for larger orders, in which the businesses looking to set up larger clusters in these countries would need US government approval to do so.

The administration said the regulations had provisions that would keep small orders of chips flowing to research institutions and universities.

Nvidia has opposed the regulations, along with the Semiconductor Industry Association.

"While cloaked in the guise of an 'anti-China' measure, these rules would do nothing to enhance U.S. security," Ned Finkle, Nvidia's VP of government affairs, wrote in a statement on the company's website.

Impact on Nvidia

Any restriction on the sale of GPUs anywhere is bound to hit Nvidia's sales.

"The Biden Administration now seeks to restrict access to mainstream computing applications with its unprecedented and misguided 'AI Diffusion' rule, which threatens to derail innovation and economic growth worldwide," Finkle wrote.

But will the regulations dampen sales or shift them?

Chris Miller, the author of "Chip War" and a leading expert on the semiconductor industry, told Business Insider he was uncertain whether the overall volume of GPUs sold would be substantially impacted, since demand for Nvidia's products is so high.

"I suspect that these rules will generally have the impact of shifting data center construction toward US firms," Miller said.

If demand does go down, "it would change due to a reduction of GPU demand from countries or companies that are unwilling to rely on US cloud providers," Miller said.

The drafted rules had been circulating ahead of the Monday announcement and reactions from tech leaders have been fierce.

Oracle EVP Ken Glueck blogged about them for the first time in mid-December and again in early January.

Both Finkle and Glueck zeroed in on the country caps as the most consequential element introduced.

"The extreme 'country cap' policy will affect mainstream computers in countries around the world, doing nothing to promote national security but rather pushing the world to alternative technologies," Finkle said in an emailed statement Friday.

It is particularly notable that Singapore, Mexico, Malaysia, the UAE, Israel, Saudi Arabia, and India are not in the unrestricted tier of countries, Glueck noted.

The exclusion of several Middle East countries could seriously change the course of the global AI infrastructure buildout, Miller said.

"The primary impact of these controls is that they make it much more likely that the most advanced AI systems are trained in the US as opposed to the Middle East," Miller said.

"Without these controls, wealthy Middle Eastern governments would have succeeded to some degree in convincing U.S. firms to train high-end AI systems in the Middle East by offering subsidized data centers. Now this won't be possible, so US firms will train their systems in the US," Miller said.

Glueck wrote that country quotas were the worst concept within the draft regulations, which will be formally published Wednesday, according to the Federal Register.

"Controlling GPUs makes no sense when you can achieve parity by simply adding more, if less-powerful, GPUs to solve the problem," Oracle's Glueck wrote in December. "The problem with this proposal is it assumes there are no other non-U.S. suppliers from which to procure GPU technology," he continued.

Republican support

The fate of Biden's unprecedented export control rules is uncertain given their timing.

The Monday statement from Nvidia's Finkle referenced the Trump administration, stating that in his first term, Trump "laid the foundation for America's current strength and success in AI."

The new rules are subject to a 120-day comment period before they are enforceable. President Biden will have left office when they are set to take effect.

Though they stemmed from an outgoing Democratic administration, the rules do have some support on the President-elect's side of the aisle.

Republican Congressman John Moolenaar and Democratic Congressman Raja Krishnamoorthi, the chair and ranking member of the House Select Committee on the Chinese Communist Party, are in favor of the framework.

"GPUs, or any country that hosts Huawei cloud computing infrastructure should be restricted from accessing the model weights of closed-weight dual-use AI models," the two legislators wrote in a statement.

Matt Pottinger, who served on the National Security Council in Trump's first term and is now chairman of the China program at the Foundation for Defense of Democracies, penned an op-ed with Anthropic CEO Dario Amodei, published in The Wall Street Journal on Jan. 6. They suggest that the existing export restrictions have been successful but still leave room for China to set up data centers in friendly third-party countries, so more restrictions are needed.

"Skeptics of these restrictions argue that the countries and companies to which the rules apply will simply switch to Chinese AI chips. This argument overlooks that U.S. chips are superior, giving countries an incentive to follow U.S. rules," Pottinger and Amodei wrote.

"Countries that want to reap the massive economic benefits will have an incentive to follow the U.S. model rather than use China's inferior chips," they continued.

Miller said the fact that China is still purchasing Nvidia's "defeatured" GPUs is sign enough that locally designed chips are not yet competitive.

"So long as China's importing US GPUs, it won't be able to export, in which case these controls will be effective because there is no alternative source of high end GPUs," Miller said.

But Huawei is catching up, said Alvin Nguyen, senior analyst at Forrester. Additional US export controls could speed that work up in his view.

"They've caught up to one generation behind Nvidia," said Nguyen.

Another concern is that restricting the flow of advanced chips could keep the economic opportunity of AI from spreading equally around the globe.

"If you're not working with the best infrastructure, the best models, you may not be able to leverage the data that you do have β€” creating the haves and have nots," Nguyen said.

Read the original article on Business Insider

US splits world into three tiers for AI chip access

On Monday, the US government announced a new round of regulations on global AI chip exports, dividing the world into roughly three tiers of access. The rules create quotas for about 120 countries and allow unrestricted access for 18 close US allies while maintaining existing bans on China, Russia, Iran, and North Korea.

AI-accelerating GPU chips, like those manufactured by Nvidia, currently serve as the backbone for a wide variety of AI model deployments, such as chatbots like ChatGPT, AI video generators, self-driving cars, weapons targeting systems, and much more. The Biden administration fears that those chips could be used to undermine US national security.

According to the White House, "In the wrong hands, powerful AI systems have the potential to exacerbate significant national security risks, including by enabling the development of weapons of mass destruction, supporting powerful offensive cyber operations, and aiding human rights abuses."


© SEAN GLADWELL via Getty Images

AMD unveils new chips for laptops, desktops, and gaming handhelds at CES 2025

6 January 2025 at 11:45

At CES 2025 in Las Vegas, AMD unveiled a slew of new chips destined for devices ranging from desktops to gaming handhelds. AMD is riding high coming into this year’s CES. The company commanded a 28.7% share of the desktop CPU segment in Q3 2024, up 9.6 percentage points compared to the same quarter the […]


Rumors say next-gen RTX 50 GPUs will come with big jumps in power requirements

Nvidia is reportedly gearing up to launch the first few cards in its RTX 50-series at CES next week, including an RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070. The 5090 will be of particular interest to performance-obsessed, money-is-no-object PC gaming fanatics since it's the first new GPU in over two years that can beat the performance of 2022's RTX 4090.

But boosted performance and slower advancements in chip manufacturing technology mean that the 5090's maximum power draw will far outstrip the 4090's, according to leakers. VideoCardz reports that the 5090's thermal design power (TDP) will be set at 575 W, up from 450 W for the already power-hungry RTX 4090. The RTX 5080's TDP is also increasing to 360 W, up from 320 W for the RTX 4080 Super.

That also puts the RTX 5090 close to the maximum power draw available over a single 12VHPWR connector, which is capable of delivering up to 600 W of power (though once you include the 75 W available via the PCI Express slot on your motherboard, the actual maximum possible power draw for a GPU with a single 12VHPWR connector is a slightly higher 675 W).
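The arithmetic behind that ceiling is simple enough to sketch, using only the figures cited above:

```python
# Power budget for a GPU fed by a single 12VHPWR connector,
# per the numbers reported in the article.
CONNECTOR_MAX_W = 600   # 12VHPWR cable limit
PCIE_SLOT_W = 75        # extra power available through the motherboard slot
RTX_5090_TDP_W = 575    # reported RTX 5090 TDP

board_ceiling = CONNECTOR_MAX_W + PCIE_SLOT_W   # 675 W theoretical maximum
headroom = board_ceiling - RTX_5090_TDP_W       # 100 W left above the TDP
print(board_ceiling, headroom)  # 675 100
```

That 100 W of headroom is why a 575 W card can still live on one connector, though transient spikes above TDP would eat into it quickly.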


© Sam Machkovech

A chip company you've probably never heard of is suddenly worth $1 trillion. Here's why, and what it means for Nvidia.

18 December 2024 at 01:00
Broadcom CEO Hock Tan speaking at a conference
Broadcom CEO Hock Tan

Ying Tang/NurPhoto via Getty Images

  • Broadcom's stock surged in recent weeks, pushing the company's market value over $1 trillion.
  • Broadcom is crucial for companies seeking alternatives to Nvidia's AI chip dominance.
  • Custom AI chips are gaining traction, enhancing tech firms' bargaining power, analysts say.

The rise of AI, and the computing power it requires, is bringing all kinds of previously under-the-radar companies into the limelight. This week it's Broadcom.

Broadcom's stock has soared since late last week, catapulting the company into the $1 trillion market cap club. The boost came from a blockbuster earnings report in which custom AI chip revenue grew 220% compared to last year.

In addition to selling lots of parts and components for data centers, Broadcom designs and sells ASICs, or application-specific integrated circuits — an industry acronym for custom chips.

Designers of custom AI chips, chief among them Broadcom and Marvell, are headed into a growth phase, according to Morgan Stanley.

Custom chips are picking up speed

The biggest players in AI buy a lot of chips from Nvidia, the $3 trillion giant with an estimated 90% of market share of advanced AI chips.

Heavily relying on one supplier isn't a comfortable position for any company, though, and many large Nvidia customers are also developing their own chips. Most tech companies don't have large teams of silicon and hardware experts in-house. Of the companies they might turn to for a custom chip design, Broadcom is the leader.

Though multi-purpose chips like Nvidia's and AMD's graphics processing units are likely to maintain the largest share of the AI chip market in the long term, custom chips are growing fast.

Morgan Stanley analysts this week forecast the market for ASICs to nearly double to $22 billion next year.

Much of that growth is attributable to Amazon Web Services' Trainium AI chip, according to Morgan Stanley analysts. Then there are Google's in-house AI chips, known as TPUs, which Broadcom helps make.

In terms of actual value of chips in use, Amazon and Google dominate. But OpenAI, Apple, and TikTok parent company ByteDance are all reportedly developing chips with Broadcom, too.

ASICs bring bargaining power

Custom chips can offer more value, in terms of the performance you get for the cost, according to Morgan Stanley's research.

ASICs can also be designed to perfectly match tech companies' unique internal workloads, according to the bank's analysts. The better these custom chips get, the more bargaining power they may provide when tech companies negotiate with Nvidia over buying GPUs. But this will take time, the analysts wrote.

In addition to Broadcom, Silicon Valley neighbor Marvell is making gains in the ASICs market, along with Asia-based players Alchip Technologies and Mediatek, they added in a note to investors.

Analysts don't expect custom chips to ever fully replace Nvidia GPUs, but without them, cloud service providers like AWS, Microsoft, and Google would have much less bargaining power against Nvidia.

"Over the long term, if they execute well, cloud service providers may enjoy greater bargaining power in AI semi procurement with their own custom silicon," the Morgan Stanley analysts explained.

Nvidia's big R&D budget

This may not be all bad news for Nvidia. A $22 billion ASICs market is smaller than Nvidia's revenue for just one quarter.

Nvidia's R&D budget is massive, and many analysts are confident in its ability to stay at the bleeding edge of AI computing.

And as Nvidia rolls out new, more advanced GPUs, its older offerings get cheaper and potentially more competitive with ASICs.

"We believe the cadence of ASICs needs to accelerate to stay competitive to GPUs," the Morgan Stanley analysts wrote.

Still, Broadcom and chip manufacturers on the supply chain rung beneath, such as TSMC, are likely to get a boost every time a giant cloud company orders up another custom AI chip.

Read the original article on Business Insider

Will the world's fastest supercomputer please stand up?

11 December 2024 at 06:57
TRITON Supercomputer at the University of Miami

T.J. Lievonen

  • Oracle and xAI love to flex the size of their GPU clusters.
  • It's getting hard to tell who has the most supercomputing power as more firms claim the top spot.
  • The real numbers are competitive intel, and cluster size isn't everything, experts told BI.

In high school, as in tech, superlatives are important. Or maybe they just feel important in the moment. With the breakneck pace of the AI computing infrastructure buildout, it's becoming increasingly difficult to keep track of who has the biggest, fastest, or most powerful supercomputer — especially when multiple companies claim the title at once.

"We delivered the world's largest and fastest AI supercomputer, scaling up to 65,000 Nvidia H200 GPUs," Oracle CEO Safra Catz said on the company's Monday earnings call, echoed by chairman, CTO, and founder Larry Ellison.

In late October, Nvidia proclaimed xAI's Colossus the "World's Largest AI Supercomputer" after Elon Musk's firm reportedly built a computing cluster with 100,000 Nvidia graphics processing units in a matter of weeks. The plan is to expand to 1 million GPUs next, according to the Greater Memphis Chamber of Commerce (where the supercomputer is located).

The good ole days of supercomputing are gone

It used to be simpler. "Supercomputers" were most commonly found in research settings, and naturally, there's an official list ranking them. Until recently, the world's most powerful supercomputer was El Capitan. Housed at the Lawrence Livermore National Laboratory in California, its 11 million CPUs and GPUs from Nvidia rival AMD add up to 1.742 exaflops of computing capacity. (One exaflop is equal to one quintillion, or a billion billion, operations per second.)

"The biggest computers don't get put on the list," Dylan Patel, chief analyst at Semianalysis, told BI. "Your competitor shouldn't know exactly what you have," he continued. The 65,000-GPU supercluster Oracle executives were praising can reach up to 65 exaflops, according to the company.
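The arithmetic behind such headline figures is easy to sketch. A minimal illustration in Python, assuming the vendor-style convention of multiplying GPU count by theoretical peak throughput per chip (real training jobs sustain only a fraction of peak):

```python
EXA = 10**18  # one exaflop = 10^18 (a quintillion) operations per second


def cluster_exaflops(num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Aggregate theoretical peak throughput of a cluster, in exaflops.

    This is the optimistic vendor arithmetic: it ignores networking
    overhead and assumes every GPU runs at full speed simultaneously.
    """
    return num_gpus * peak_flops_per_gpu / EXA


# Oracle's claim of 65 exaflops from 65,000 H200s implies roughly
# 1 petaflop (10^15 operations per second) counted per GPU.
print(cluster_exaflops(65_000, 1e15))  # 65.0
```

The per-GPU figure here is inferred from Oracle's own numbers, not an official spec; which precision (FP8, FP16, sparsity on or off) a vendor counts can swing the result severalfold.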

It's safe to assume, Patel said, that Nvidia's largest customers — Meta, Microsoft, and xAI — also have the largest, most powerful clusters. On Nvidia's May earnings call, CFO Colette Kress said 200 fresh exaflops of Nvidia computing would be online by the end of this year, across nine different supercomputers.

Going forward, it's going to be harder to determine whose clusters are the biggest at any given moment — and even harder to tell whose are the most powerful — no matter how much CEOs may brag.

It's not the size of the cluster β€” it's how you use it

On Monday's call, Ellison was asked if the size of these gigantic clusters actually generates better model performance.

He said larger clusters and faster GPUs are elements that speed up model training. Another is networking it all together. "So the GPU clusters aren't sitting there waiting for the data," Ellison said Monday.

Thus, the number of GPUs in a cluster isn't the only factor in the computing power calculation. Networking and programming are important too. "Exaflops" are a result of the whole package, so unless companies provide them, experts can only estimate.

What's certain is that more advanced models β€” the kind that consider their own thinking and check their work before answering queries β€” require more compute than their relatives of earlier generations. So training increasingly impressive models may indeed require an arms race of sorts.

But an enormous AI arsenal doesn't automatically lead to better or more useful tools.

Sri Ambati, CEO of open-source AI platform H2O.ai, said cloud providers may want to flex their cluster size for sales reasons, but given some (albeit slow) diversification of AI hardware and the rise of smaller, more efficient models, cluster size isn't the be-all and end-all.

Power efficiency, too, is a hugely important indicator for AI computing, since energy is an enormous operational expense in AI. But it gets lost in the measuring contest.

Nvidia declined to comment. Oracle did not respond to a request for comment in time for publication.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088.


China opens antimonopoly probe into Nvidia, escalating the chip war with the US

9 December 2024 at 04:28
Nvidia CEO Jensen Huang.

Sam Yeh/AFP via Getty Images

  • China's top antimonopoly regulator is investigating Nvidia.
  • The investigation is related to the company's 2020 acquisition of an Israeli chip firm.
  • Nvidia's stock fell by 2.2% in premarket trading on Monday.

China's top antimonopoly regulator has launched an investigation into Nvidia, whose shares dropped by 2.2% in premarket trading on Monday following the latest escalation of chip tensions with the US.

The State Administration for Market Regulation said on Monday that it was investigating whether the chipmaker giant violated antimonopoly regulations.

The probe is related to Nvidia's acquisition of Mellanox Technologies, an Israeli chip firm, in 2020. China's competition authority approved the $7 billion takeover in 2020 on the condition that rivals be notified of new products within 90 days of their being made available to Nvidia.

The US-China chip war has been escalating. Last week, China's commerce ministry said it would halt shipments of key materials needed for chip production to the US. The ministry said the measures were in response to US chip export bans, also announced last week.

Nvidia, which is headquartered in Santa Clara, California, has also faced antitrust scrutiny in the US. The Department of Justice has been examining whether Nvidia might have abused its market dominance to make it difficult for buyers to change suppliers.

Nvidia did not immediately respond to a request for comment from Business Insider made outside normal working hours.


Amazon isn't seeing enough demand for AMD's AI chips to offer them via its cloud

6 December 2024 at 13:30
AWS logo at re:Invent 2024

Noah Berger/Getty Images for Amazon Web Services

  • AWS has not committed to offering cloud access to AMD's AI chips in part due to low customer demand.
  • AWS said it was considering offering AMD's new AI chips last year.
  • AMD recently increased the sales forecast for its AI chips.

Last year, Amazon Web Services said it was considering offering cloud access to AMD's latest AI chips.

18 months in, the cloud giant still hasn't made any public commitment to AMD's MI300 series.

One reason: low demand.

AWS is not seeing the type of huge customer demand that would lead to selling AMD's AI chips via its cloud service, according to Gadi Hutt, senior director for customer and product engineering at Amazon's chip unit, Annapurna Labs.

"We follow customer demand. If customers have strong indications that those are needed, then there's no reason not to deploy," Hutt told Business Insider at AWS's re:Invent conference this week.

AWS is "not yet" seeing that high demand for AMD's AI chips, he added.

AMD shares dropped roughly 2% after this story first ran.

AMD's line of AI chips has grown since its launch last year. The company recently increased its GPU sales forecast, citing robust demand. However, the chip company is still a long way behind market leader Nvidia.

AWS provides cloud access to other AI chips, such as Nvidia's GPUs. At re:Invent, AWS announced the launch of P6 servers, which come with Nvidia's latest Blackwell GPUs.

AWS and AMD are still close partners, according to Hutt. AWS offers cloud access to AMD's CPU server chips, and AMD's AI chip product line is "always under consideration," he added.

Hutt discussed other topics during the interview, including AWS's relationship with Nvidia, Anthropic, and Intel.

An AMD spokesperson declined to comment.

Do you work at Amazon? Got a tip?

Contact the reporter, Eugene Kim, via the encrypted-messaging apps Signal or Telegram (+1-650-942-3061) or email ([email protected]). Reach out using a nonwork device. Check out Business Insider's source guide for other tips on sharing information securely.

Editor's note: This story was first published on December 6, 2024, and was updated later that day to reflect developments in AMD's stock price.


Silicon and supercomputers will define the next AI era. AWS just made a big bet on both.

4 December 2024 at 07:07
AWS CEO Matt Garman onstage at Re: Invent 2024.
Amazon is betting on its own chips and supercomputers to forge ahead with its AI ambitions.

Noah Berger/Getty Images for Amazon Web Services

  • AWS unveiled a new AI chip and a supercomputer at its Re: Invent conference on Tuesday.
  • It's a sign that Amazon is ready to reduce its reliance on Nvidia for AI chips.
  • Amazon isn't alone: Google, Microsoft, and OpenAI are also designing their own AI chips.

Big Tech's next AI era will be all about controlling silicon and supercomputers of their own. Just ask Amazon.

At its Re: Invent conference on Tuesday, the tech giant's cloud computing unit, Amazon Web Services, unveiled the next line of its AI chips, Trainium3, while announcing a new supercomputer that will be built with its own chips to serve its AI ambitions.

It marks a significant shift from the status quo that has defined the generative AI boom since OpenAI's release of ChatGPT, in which the tech world has relied on Nvidia to secure a supply of its industry-leading chips, known as GPUs, for training AI models in huge data centers.

While Nvidia has a formidable moat — experts say its hardware-software combination serves as a powerful vendor lock-in system — AWS' reveal shows companies are finding ways to take ownership of the tech shaping the next era of AI development.

Putting your own chips on the table

Amazon CEO Andy Jassy.
Amazon is pushing forward with its own brand of chips called Trainium.

Noah Berger/Getty Images for Amazon Web Services

On the chip side, Amazon shared that Trainium2, which was first unveiled at last year's Re: Invent, was now generally available. Its big claim was that the chip offers "30-40% better price performance" than the current generation of servers with Nvidia GPUs.

That would mark a big step up from its first series of chips, which analysts at SemiAnalysis described on Tuesday as "underwhelming" for generative AI training and used instead for "training non-complex" workloads within Amazon, such as credit card fraud detection.

"With the release of Trainium2, Amazon has made a significant course correction and is on a path to eventually providing a competitive custom silicon," the SemiAnalysis researchers wrote.
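"Price performance" claims like AWS's compress two variables, throughput and cost, into one ratio. A hypothetical illustration of the arithmetic (all numbers invented for the sketch; AWS did not publish the figures behind its 30-40% claim):

```python
def price_performance(tokens_per_sec: float, dollars_per_hour: float) -> float:
    """Work done per dollar of instance time: higher is better."""
    return tokens_per_sec / dollars_per_hour


# Invented numbers: a custom-silicon server that is somewhat slower than a
# GPU server but markedly cheaper can still win on price performance.
gpu_server = price_performance(tokens_per_sec=1000.0, dollars_per_hour=40.0)  # 25.0
trn_server = price_performance(tokens_per_sec=900.0, dollars_per_hour=26.0)   # ~34.6

improvement = trn_server / gpu_server - 1
print(f"{improvement:.0%}")  # 38%
```

This is why a chip that loses a head-to-head benchmark can still be the rational purchase: the denominator matters as much as the numerator.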

Trainium3, which AWS gave a preview of ahead of a late 2025 release, has been billed as a "next-generation AI training chip." Servers loaded with Trainium3 chips offer four times greater performance than those packed with Trainium2 chips, AWS said.

Matt Garman, the CEO of AWS, told The Wall Street Journal that some of the company's chip push is due to there being "really only one choice on the GPU side" at present, given Nvidia's dominant place in the market. "We think that customers would appreciate having multiple choices," he said.

It's an observation that others in the industry have noted and responded to. Google has been busy designing its own chips that reduce its dependence on Nvidia, while OpenAI is reported to be exploring custom, in-house chip designs of its own.

But having in-house silicon is just one part of this.

The supercomputer advantage

AWS acknowledged that as AI models trained on GPUs continue to get bigger, they are "pushing the limits of compute and networking infrastructure."

That means companies serious about building their own AI models β€” like Amazon in its partnership with Anthropic, the OpenAI rival that raised a total of $8 billion from the tech giant β€” will need access to highly specialized computing that can handle a new era of AI.

Adam Selipsky and Dario Amodei sitting onstage at a conference with the logos of Amazon and Anthropic behind them.
Amazon has a close partnership with OpenAI rival Anthropic.

Noah Berger/Getty

With this in mind, AWS shared that it was working with Anthropic to build an "UltraCluster" of servers that form the basis of a supercomputer it has named Project Rainier. According to Amazon, it will scale model training across "hundreds of thousands of Trainium2 chips."

"When completed, it is expected to be the world's largest AI compute cluster reported to date available for Anthropic to build and deploy their future models on," AWS said in a blog, adding that it will be "over five times the size" of the cluster used to build Anthropic's last model.

The supercomputer push follows similar moves elsewhere. The Information first reported earlier this year that OpenAI and Microsoft were working together to build a $100 billion AI supercomputer called Stargate.

Of course, Nvidia is also in the supercomputer business and aims to make them a big part of its allure to companies looking to use its next-generation AI chips, Blackwell.

Last month, for instance, Nvidia announced that SoftBank, the first customer to receive its new Blackwell-based servers, would use them to build a supercomputer for AI development. Elon Musk has also bragged about his company xAI building a supercomputer with 100,000 Nvidia GPUs in Memphis this year.

AWS made no secret that it remains tied to Nvidia for now. In an interview with The Wall Street Journal, Garman acknowledged that Nvidia is responsible for "99% of the workloads" for training AI models today and doesn't expect that to change anytime soon.

That said, Garman reckoned "Trainium can carve out a good niche" for itself. He'll be wise to recognize that everyone else is busy carving out a niche for themselves, too.


4 things we learned from Amazon's AWS conference, including about its planned supercomputer

3 December 2024 at 15:59
AI chips were the star of AWS CEO Matt Garman's re:Invent keynote.

Business Wire/BI

  • AWS announced plans for an AI supercomputer, UltraCluster, with Trainium 2 chips at re:Invent.
  • AWS may be able to reduce reliance on Nvidia by developing its own AI infrastructure.
  • Apple said it's using Trainium 2 chips for Apple Intelligence.

Matt Garman, the CEO of Amazon Web Services, made several significant new AWS announcements at the re:Invent conference on Tuesday.

His two-and-a-half-hour keynote delved into AWS's current software and hardware offerings and updates, with words from clients including Apple and JPMorgan. Graphics processing units (GPUs), supercomputers, and a surprise Apple cameo stuck out among the slew of information.

AWS, the cloud computing arm of Amazon, has been developing its own semiconductors to train AI. On Tuesday, Garman said it's creating UltraServers — containing 64 of its Trainium 2 chips — so companies can scale up their GenAI workloads.

Moreover, it's also building an AI supercomputer, an UltraCluster made up of UltraServers, in partnership with AI startup Anthropic. Named Project Rainier, it will be "the world's largest AI compute cluster reported to date available for Anthropic to build and deploy its future models on" when completed, according to an Amazon blog post. Amazon has invested $8 billion in Anthropic.

Such strides could push AWS further into competition with other tech firms in the ongoing AI arms race, including AI chip giant Nvidia.

Here are four takeaways from Garman's full keynote on Tuesday.

AWS' Trainium chips could compete with Nvidia.

Nvidia currently dominates the AI chip market with its sought-after and pricey GPUs, but Garman backed AWS's homegrown silicon during his keynote on Tuesday. His company's goal is to reduce the cost of AI, he said.

"Today, there's really only one choice on the GPU side, and it's just Nvidia. We think that customers would appreciate having multiple choices," Garman told the Wall Street Journal.

AI is growing rapidly, and the demand for chips that make the technology possible is poised to grow alongside it. Major tech companies, like Google and Microsoft, are venturing into chip creation as well to find an alternative to Nvidia.

However, Garman told The Journal he doesn't expect Trainium to dethrone Nvidia "for a long time."

"But, hopefully, Trainium can carve out a good niche where I actually think it's going to be a great option for many workloads — not all workloads," he said.

AWS also introduced Trainium3, its next-gen chip.

AWS' new supercomputer could go toe to toe with Elon Musk's xAI.

According to The Journal, the chip cluster known as Project Rainier is expected to be available in 2025. Once it is ready, Anthropic plans to use it to train AI models.

With "hundreds of thousands" of Trainium chips, it would challenge Colossus, Elon Musk's xAI supercomputer built with 100,000 of Nvidia's Hopper chips.

Apple is considering Trainium 2 for Apple Intelligence training.

Garman said that Apple is one of the customers using AWS chips, like Amazon Graviton and Inferentia, for services including Siri.

Benoit Dupin, senior director of AI and machine learning at Apple, then took to the stage at the Las Vegas conference. He said the company worked with AWS for "virtually all phases" of its AI and machine learning life cycle.

"One of the unique elements of Apple business is the scale at which we operate and the speed with which we innovate," Dupin said.

He added, "AWS has been able to keep the pace, and we've been customers for more than a decade."

Now, Dupin said Apple is in the early stages of testing Trainium 2 chips to potentially help train Apple Intelligence.

The company introduced a new generation of foundational models, Amazon Nova.

Amazon announced some new kids on the GenAI block.

AWS customers will be able to use Amazon Nova-powered GenAI applications "to understand videos, charts, and documents, or generate videos and other multimedia content," Amazon said. There are a range of models available at different costs, it said.

"Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are at least 75% less expensive than the best-performing models in their respective intelligence classes in Amazon Bedrock," Amazon said.


Amazon makes massive downpayment on dethroning Nvidia

22 November 2024 at 11:09
Dario Amodei, an OpenAI employee turned Anthropic CEO, at TechCrunch Disrupt 2023.

Kimberly White/Getty

  • Amazon on Friday announced another $4 billion investment in the AI startup Anthropic.
  • The deal includes an agreement for Anthropic to use Amazon's AI chips more.
  • The cloud giant is trying to challenge Nvidia and get developers to switch away from those GPUs.

Amazon's Trainium chips are about to get a lot busier — at least that's what Amazon hopes will happen after it pumps another $4 billion into the AI startup Anthropic.

The companies announced a huge new deal on Friday that brings Amazon's total investment in Anthropic to $8 billion. The goal of all this money is mainly to get Amazon's AI chips to be used more often to train and run large language models.

Anthropic said that in return for this cash injection, it would use AWS as its "primary cloud and training partner." It said it would also help Amazon design future Trainium chips and contribute to building out an Amazon AI-model-development platform called AWS Neuron.

This is an all-out assault on Nvidia, which dominates the AI chip market with its GPUs, servers, and CUDA platform. Nvidia's stock dropped by more than 3% on Friday after the Amazon-Anthropic news broke.

The challenge will be getting Anthropic to actually use Trainium chips in big ways. Switching away from Nvidia GPUs is complicated, time-consuming, and risky for AI-model developers, and Amazon has struggled with this.

Earlier this week, Anthropic CEO Dario Amodei didn't sound like he was all in on Amazon's Trainium chips, despite another $4 billion coming his way.

"We use Nvidia, but we also use custom chips from both Google and Amazon," he said at the Cerebral Valley tech conference in San Francisco. "Different chips have different trade-offs. I think we're getting value from all of them."

In 2023, Amazon made its first investment in Anthropic, agreeing to put in $4 billion. That deal came with similar strings attached. At the time, Anthropic said that it would use Amazon's Trainium and Inferentia chips to build, train, and deploy future AI models and that the companies would collaborate on the development of chip technology.

It's unclear whether Anthropic followed through. The Information reported recently that Anthropic preferred to use Nvidia GPUs rather than Amazon AI chips. The publication said the talks about this latest investment focused on getting Anthropic more committed to using Amazon's offerings.

There are signs that Anthropic could be more committed now, after getting another $4 billion from Amazon.

In Friday's announcement, Anthropic said it was working with Amazon on its Neuron software, which offers the crucial connective tissue between the chip and the AI models. This competes with Nvidia's CUDA software stack, which is the real enabler of Nvidia's GPUs and makes these components very hard to swap out for other chips. Nvidia has a decadelong head start on CUDA, and competitors have found that difficult to overcome.

Anthropic's "deep technical collaboration" suggests a new level of commitment to using and improving Amazon's Trainium chips.

Though several companies make chips that compete with or even beat Nvidia's in certain elements of computing performance, no other chip has touched the company in terms of market or mind share.

Amazon's AI chip journey

Amazon is on a short list of cloud providers attempting to stock their data centers with their own AI chips and avoid spending heavily on Nvidia GPUs, which have profit margins that often exceed 70%.

Amazon debuted its Trainium and Inferentia chips — named after the training and inference tasks they're built for — in 2020.

The aim was to become less dependent on Nvidia and find a way to make cloud computing in the AI age cheaper.

"As customers approach higher scale in their implementations, they realize quickly that AI can get costly," Amazon CEO Andy Jassy said on the company's October earnings call. "It's why we've invested in our own custom silicon in Trainium for training and Inferentia for inference."

But like its many competitors, Amazon has found that breaking the industry's preference for Nvidia is difficult. Some say that's because of CUDA, which offers an abundant software stack with libraries, tools, and troubleshooting help galore. Others say it's simple habit or convention.

In May, the Bernstein analyst Stacy Rasgon told Business Insider he wasn't aware of any companies using Amazon AI chips at scale.

With Friday's announcement, that might change.

Jassy said in October that the next-generation Trainium 2 chip was ramping up. "We're seeing significant interest in these chips, and we've gone back to our manufacturing partners multiple times to produce much more than we'd originally planned," Jassy said.

Still, Anthropic's Amodei sounded this week like he was hedging his bets.

"We believe that our mission is best served by being an independent company," he said. "If you look at our position in the market and what we've been able to do, the independent partnerships we have with Google, with Amazon, with others, I think this is very viable."

