
Is the tech industry ready for AI 'super agents'?

2 April 2025 at 06:26
James Doohan as Lt. Commander Montgomery "Scotty" Scott, with a communicator, in the Star Trek: The Original Series episode "Space Seed."

CBS via Getty Images

  • If AI agents catch on, there may not be enough computing capacity.
  • AI agents generate many more tokens than chatbots, increasing computational demands.
  • More AI chips may be needed if AI agents grow, Barclays analysts warned.

In Star Trek, the Starship Enterprise had a chief engineer, Montgomery "Scotty" Scott, who regularly had to explain to Captain Kirk that certain things were impossible to pull off, due to practicalities such as the laws of physics.

"The engines cannae take it, Captain!" is a famous quote that the actor may actually not have said on the TV show. But you get the idea.

We may be approaching such a moment in the tech industry right now, as the AI agent trend gathers momentum.

The field is beginning to shift from relatively simple chatbots to more capable AI agents that can autonomously complete complex tasks. Is there enough computing power to sustain this transformation?

According to a recent Barclays report, the AI industry will have enough capacity to support 1.5 billion to 22 billion AI agents.

This could be enough to revolutionize white-collar work, but additional computing power may be needed to run these agents while also satisfying consumer demand for chatbots, the Barclays analysts explained in a note to investors this week.

It's all about tokens

AI agents generate far more tokens per user query than traditional chatbots, making them more computationally expensive.

Tokens are the language of generative AI and are at the core of emerging pricing models in the industry. AI models break down words and other inputs into numerical tokens to make them easier to process and understand. One token is about three-quarters of a word.
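To make that concrete, here's a minimal sketch of counting tokens with OpenAI's open-source tiktoken library (the article doesn't name a specific tokenizer, so the library choice and exact counts are illustrative):

```python
# Rough tokenization sketch using OpenAI's open-source tiktoken library.
# The tokenizer choice is an assumption; exact counts vary by model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

text = "AI agents generate far more tokens per query than traditional chatbots."
tokens = enc.encode(text)

print(f"{len(text.split())} words -> {len(tokens)} tokens")
# Typical English text lands near 3 words per 4 tokens -- one token is
# about three-quarters of a word -- though the ratio varies with input.
```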

More powerful AI agents may rely on "reasoning" models, such as OpenAI's o1 and o3 and DeepSeek's R1, which break queries and tasks into more manageable chunks. Each step in these chains of thought creates more tokens, which must be processed by AI servers and chips.

"Agent products run on reasoning models for the most part, and generate about 25x more tokens per query compared to chatbot products," the Barclays analysts wrote.

"Super Agents"

OpenAI offers a ChatGPT Pro service that costs $200 monthly and taps into its latest reasoning models. The Barclays analysts estimated that if this service used the startup's o1 model, it would generate about 9.4 million tokens per year per subscriber.

There have been recent media reports that OpenAI could offer even more powerful AI agent services costing $2,000 a month or even $20,000 a month.

The Barclays analysts referred to these as "super agents," and estimated that these services could generate 36 million to 356 million tokens per year, per user.
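Plugging the article's figures into a quick back-of-the-envelope script shows how steep the jump is (the 25x multiplier and the annual token totals come from the Barclays note; the derived comparisons are illustrative):

```python
# Back-of-the-envelope token math using the Barclays figures cited above.
TOKENS_PER_YEAR_PRO = 9_400_000        # ChatGPT Pro on o1, per subscriber
AGENT_MULTIPLIER = 25                  # agent tokens per query vs. a chatbot
SUPER_AGENT_LOW, SUPER_AGENT_HIGH = 36_000_000, 356_000_000

# Implied chatbot-style volume for the same usage (illustrative).
chatbot_equiv = TOKENS_PER_YEAR_PRO / AGENT_MULTIPLIER
print(f"Implied chatbot-equivalent: {chatbot_equiv:,.0f} tokens/year")

# How much heavier a "super agent" user could be than a Pro subscriber.
print(f"Super agents: {SUPER_AGENT_LOW / TOKENS_PER_YEAR_PRO:.0f}x to "
      f"{SUPER_AGENT_HIGH / TOKENS_PER_YEAR_PRO:.0f}x the Pro token volume")
```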

More chips, Captain!

That's a mind-blowing number of tokens, and processing them would consume a mountain of computing power.

The AI industry is expected to have 16 million accelerators, a type of AI chip, online this year. Roughly 20% of that infrastructure may be dedicated to AI inference β€” essentially the computing power needed to run AI applications in real time.

If agentic products take off and are very useful to consumers and enterprise users, we will likely need "many more inference chips," the Barclays analysts warned.

The tech industry may even need to repurpose some chips that were previously used to train AI models and use those for inference, too, the analysts added.

They also predicted that cheaper, smaller, and more efficient models, like those developed by DeepSeek, will have to be used for AI agents, rather than pricier proprietary models.

Read the original article on Business Insider

CoreWeave IPO debut: Pay attention to this potentially expensive hardware problem

28 March 2025 at 09:20
Michael Intrator, the CEO of CoreWeave, is on the cusp of a big initial public offering.

Bruno de Carvalho/SOPA Images/LightRocket via Getty Images

  • Rapid AI advancements may reduce the useful life of CoreWeave's Hopper-based Nvidia GPUs.
  • CoreWeave has a lot of Hopper-based GPUs, which are becoming outdated due to the Blackwell rollout.
  • Amazon recently cut the estimated useful life of its servers, citing AI advancements.

I recently wrote about Nvidia's latest AI chip-and-server package and how this new advancement may dent the value of the previous product.

Nvidia's new offering likely caused Amazon to reduce the useful life of its AI servers, which took a big chunk out of earnings.

Other Big Tech companies, such as Microsoft, Google, and Meta, may have to take similar tough medicine, according to analysts.

This issue might also impact Nvidia-backed CoreWeave, which is doing an IPO. Its shares listed on Friday under the ticker "CRWV." The company is a so-called neocloud, specializing in generative AI workloads that rely mostly on Nvidia GPUs and servers.

Like its bigger rivals, CoreWeave has been buying oodles of Nvidia GPUs and renting them out over the internet. The startup had deployed more than 250,000 GPUs by the end of 2024, per its filing to go public.

These are incredibly valuable assets. Tech companies and startups have been jostling for the right to buy Nvidia GPUs in recent years, so any company that has amassed a quarter of a million of these components has done very well.

There's a problem, though. AI technology is advancing so rapidly that it can make existing gear obsolete, or at least less useful, more quickly. This is happening now as Nvidia rolls out its latest AI chip-and-server package, Blackwell. It's notably better than the previous version, Hopper, which came out in 2022.

Veteran tech analyst Ross Sandler recently shared a chart showing that the cost of renting older Hopper-based GPUs has plummeted as the newer Blackwell GPUs become more available.

A chart showing the cost of renting Nvidia H100 GPUs.

Ross Sandler/Barclays Capital

The majority of CoreWeave's deployed GPUs are based on the older Hopper architecture, according to its latest IPO filing from March 20.

Sometimes, in situations like this, companies have to adjust their financials to reflect the quickly changing product landscape. This is done by reducing the estimated useful life of the assets in question. Then, through depreciation, the value of the assets is written down over a shorter period to reflect things like wear and tear and, ultimately, obsolescence. The faster the depreciation, the bigger the hit to earnings.
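Here's a minimal straight-line depreciation sketch to make the mechanics concrete (the fleet cost and useful lives are hypothetical, not CoreWeave's actual figures):

```python
# Straight-line depreciation: spread an asset's cost evenly over its
# estimated useful life. All figures below are hypothetical.

def annual_depreciation(cost: float, useful_life_years: float) -> float:
    return cost / useful_life_years

gpu_fleet_cost = 1_000_000_000  # hypothetical $1B of GPU servers

for life_years in (6, 5, 4):
    expense = annual_depreciation(gpu_fleet_cost, life_years)
    print(f"{life_years}-year life -> ${expense:,.0f} expense per year")

# Cutting the life from 6 to 5 years adds roughly $33M of annual expense
# per $1B of hardware -- a direct hit to operating income.
```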

Amazon's AI-powered depreciation

Amazon, the world's largest cloud provider, just did this. On a recent conference call with analysts, the company "observed an increased pace of technology development, particularly in the area of artificial intelligence and machine learning."

That caused Amazon Web Services to decrease the useful life of some of its servers and networking gear from six years to five years, beginning in January.

Sandler, the Barclays analyst, thinks other tech companies may have to do the same, which could cut operating income by billions of dollars.

Will CoreWeave have to do the same, just as it's trying to pull off one of the biggest tech IPOs in years?

I asked a CoreWeave spokeswoman about this, but she declined to comment. This is not unusual, because companies in the midst of IPOs have to follow strict rules that limit what they can say publicly.

CoreWeave's IPO risk factor

CoreWeave talks about this issue in its latest IPO filing, writing that the company is always upgrading its platform, which includes replacing old equipment.

"This requires us to make certain estimates with respect to the useful life of the components of our infrastructure and to maximize the value of the components of our infrastructure, including our GPUs, to the fullest extent possible."

The company warned those estimates could be inaccurate. CoreWeave said its calculations involve a host of assumptions that could change and infrastructure upgrades that might not go according to plan β€” all of which could affect the company, now and later.

This caution is normal because companies have to detail everything that could hit their bottom line, from pandemics to cybersecurity attacks.

As recently as January 2023, CoreWeave was taking the opposite approach to this situation, according to its IPO filing. The company increased the estimated useful life of its computing gear from five years to six. That change reduced expenses by $20 million and boosted earnings by 10 cents a share for 2023.

If the company now follows AWS and reduces the useful life of its gear, that might dent earnings. Again, CoreWeave's spokeswoman declined to comment, citing IPO rules.

An important caveat: Just because one giant cloud provider made an adjustment like this doesn't mean others will have to do the same. CoreWeave might design its AI data centers differently, for instance, making its Nvidia GPU systems last longer or become obsolete less quickly.

It's also worth noting that other big cloud companies, including Google, Meta, and Microsoft, have increased the estimated useful life of their data center equipment in recent years.

Google and Microsoft's current estimates are six years, like CoreWeave's, while Meta's is 5.5 years.

However, Sandler, the Barclays analyst, thinks some of these big companies will follow AWS and shorten these estimates.

Read the original article on Business Insider

Nvidia CEO Jensen Huang joked about something that could cost his biggest customers billions of dollars

22 March 2025 at 02:00
Nvidia CEO Jensen Huang.

Chip Somodevilla/Getty Images

  • Nvidia's new Blackwell GPUs mean the older Hopper models are less useful, affecting cloud giants.
  • Rapid tech advancements may force cloud giants to adjust asset depreciation, denting earnings.
  • Amazon leads in adjusting server lifespan. Meta and Google could see profit hits.

Nvidia CEO Jensen Huang made a joke this week that his biggest customers probably won't find funny.

"I said before that when Blackwell starts shipping in volume, you couldn't give Hoppers away," he said at Nvidia's big AI conference Tuesday.

"There are circumstances where Hopper is fine," he added. "Not many."

He was talking about Nvidia's latest AI chip-and-server package, Blackwell. It's notably better than the previous version, Hopper, which came out in 2022.

Big cloud companies, such as Amazon, Microsoft, and Google, buy a ton of these GPU systems to train and run the giant models powering the generative AI revolution. Meta has also gone on a GPU spending spree in recent years.

These companies should be happy about an even more powerful GPU like Blackwell. It's generally great news for the AI community. But there's a problem, too.

AI obsolescence

When new technology like this improves at such a rapid pace, the previous versions become obsolete, or at least less useful, much faster.

This makes these assets less valuable, so the big cloud companies may have to adjust. This is done through depreciation, where the value of assets is reduced over time to reflect things like wear and tear and, ultimately, obsolescence. The faster the depreciation, the bigger the hit to earnings.

Ross Sandler, a top tech analyst at Barclays, warned investors on Friday that the big cloud companies and Meta will probably have to make these adjustments, which could significantly reduce profits.

"Hyperscalers are likely overstating earnings," he wrote.

Google and Meta did not respond to Business Insider's questions about this on Friday. Microsoft declined to comment.

Amazon takes the plunge first

Take the example of Amazon Web Services, the largest cloud provider. In February, it became the first to take the pain.

CFO Brian Olsavsky said on Amazon's earnings call last month that the company "observed an increased pace of technology development, particularly in the area of artificial intelligence and machine learning."

"As a result, we're decreasing the useful life for a subset of our servers and networking equipment from 6 years to 5 years, beginning in January 2025," Olsavsky said, adding that this will cut operating income this year by about $700 million.

Then, more bad news: Amazon "early-retired" some of its servers and network equipment, Olsavsky said, adding that this "accelerated depreciation" cost about $920 million and that the company expects it will decrease operating income in 2025 by about $600 million.

A much larger problem for others

Sandler, the Barclays analyst, included a striking chart in his research note on Friday. It showed the cost of renting H100 GPUs, which use Nvidia's older Hopper architecture. As you can see, the price has plummeted as the company's new, better Blackwell GPUs became more available.

A chart showing the cost of renting Nvidia H100 GPUs.

Ross Sandler/Barclays Capital

"This could be a much larger problem at Meta and Google and other high-margin software companies," Sandler wrote.

For Meta, he estimated that a one-year reduction in the useful life of the company's servers would increase depreciation in 2026 by more than $5 billion and chop operating income by a similar amount.

For Google, a similar change would knock operating profit by $3.5 billion, Sandler estimated.
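The arithmetic behind estimates like these can be sketched in a few lines (the server asset base below is an illustrative assumption, not a figure disclosed by Meta or Google):

```python
# How shortening useful life raises annual depreciation expense.
# The asset base is an illustrative assumption, not a disclosed figure.

def extra_annual_depreciation(asset_base: float, old_life: float,
                              new_life: float) -> float:
    return asset_base / new_life - asset_base / old_life

# Example: a hypothetical $30B server base moved from a 6- to a 5-year life.
delta = extra_annual_depreciation(30e9, old_life=6, new_life=5)
print(f"Extra depreciation: ${delta / 1e9:.1f}B per year")  # ~$1.0B
```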

An important caveat: Just because one giant cloud provider has already made an adjustment like this doesn't mean the others will have to do exactly the same thing. Some companies might design their AI data centers differently, somehow making Nvidia GPU systems last longer or become obsolete less quickly.

The time has come

When the generative AI boom was picking up steam in the summer of 2023, Bernstein analysts were already worrying about this depreciation problem.

"All those Nvidia GPUs have to be going somewhere. And just how quickly do these newer servers depreciate? We've heard some worrying timetables," they wrote in a note to investors at the time.

One Bernstein analyst, Mark Shmulik, discussed this with my colleague Eugene Kim.

"I'd imagine the tech companies are paying close attention to GPU useful life, but I wouldn't expect anyone to change their depreciation timetables just yet," he wrote in an email to BI at the time.

Now, that time has come.

Read the original article on Business Insider

AI's $3 trillion question: Will the Chinchilla live or die?

14 March 2025 at 02:01
A contestant holds a pair of chinchillas at the Fourth Annual Chinchilla Show in New York.

Getty Images

  • Chinchillas are cuddly and cute.
  • Chinchilla is also an established way to build huge AI models using mountains of data.
  • There's at least $3 trillion riding on whether this approach continues or not.

About five years ago, researchers at OpenAI discovered that combining more computing power and more data in ever-larger training runs produces better AI models.

A couple of years later, Google researchers found that adding more data to this mix produces even better results. They showed this by building a new AI model called Chinchilla.

These revelations helped create large language models and other giant models, like GPT-4, that support powerful AI tools such as ChatGPT. Yet in the future, the "Chinchilla" strategy of smashing together oodles of computing and mountains of data into bigger and longer pre-training runs may not work as well.
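For readers who want the math: the Chinchilla paper (Hoffmann et al., 2022) modeled a language model's loss as a function of parameter count and training tokens. The sketch below uses the paper's approximate published fit; treat the constants as rough values:

```python
# Sketch of the Chinchilla scaling law (Hoffmann et al., 2022): loss falls
# as model size N (parameters) and data D (training tokens) both grow.
# Constants are the paper's approximate fit; treat them as rough values.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Compute-optimal training uses roughly 20 tokens per parameter, so a
# 70B-parameter model wants about 1.4T training tokens.
print(f"{chinchilla_loss(70e9, 1.4e12):.2f}")  # ~1.94
```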

So what if this process doesn't end up being how AI is made in the future? To put it another way: What if the Chinchilla dies?

Building these massive AI models has so far required huge upfront investments. Mountains of data are mashed together in an incredibly complex and compute-intensive process known as pre-training.

This has sparked the biggest wave of infrastructure upgrades in technology's history. Tech companies across the US and elsewhere are frantically erecting energy-sucking data centers packed with Nvidia GPUs.

The rise of new "reasoning" models has opened up a new potential future for the AI industry, where the amount of required infrastructure could be much less. We're talking trillions of dollars of capital expenditure that might not happen in coming years.

Recently, Ross Sandler, a top tech analyst at Barclays Capital, and his team estimated the different capex requirements of these two possible outcomes:

  • The "Chinchilla" future is one in which the established paradigm of huge computing and data-heavy pre-training runs continues.
  • The "Stall-Out" alternative is one in which new types of models and techniques require less computing gear to produce more powerful AI.

The difference in how much money will or will not be spent is stunning: $3 trillion or more in capex is on the line here.

The reason is "reasoning"

"Reasoning" AI models are on the rise, such as OpenAI's o1 and o3 offerings, DeepSeek's R1, and Google's Gemini 2.0 Flash Thinking.

These new models use an approach called test-time or inference-time compute, which slices queries into smaller tasks, turning each into a new prompt that the model tackles.

Reasoning models often don't need massive, intense, long pre-training runs to be created. They may take longer to respond, but their outputs can be more accurate, and they can be cheaper to run, too, the Barclays analysts said.

The analysts said that DeepSeek's R1 has shown how open-source reasoning models can drive incredible performance improvements with far less training time, even if this AI lab may have overstated some of its efficiency gains.

"AI model providers are no longer going to need to solely spend 18-24 months pre-training their next expensive model to achieve step-function improvements in performance," the Barclays analysts wrote in a recent note to investors. "With test-time-compute, smaller base models can run repeated loops and get to a far more accurate response (compared to previous techniques)."

Mixture of Experts

Another photo of a chinchilla.

Thomson Reuters

When it comes to running new models, companies are embracing other techniques that will likely reduce the amount of computing infrastructure needed.

AI labs increasingly use an approach called mixture of experts, or MoE, where smaller "expert" models are trained on their tasks and subject areas and work in tandem with an existing huge AI model to answer questions and complete tasks.

In practice, this often means only part of these AI models is used, which reduces the computing required, the Barclays analysts said.
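A minimal mixture-of-experts routing sketch in plain NumPy shows why this saves compute: a router scores the experts for each input, and only the top-k actually run (all sizes here are illustrative):

```python
# Minimal mixture-of-experts routing sketch: a router scores the experts
# for each input and only the top-k actually run, so most of the model's
# parameters sit idle on any single query. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

router_w = rng.normal(size=(d_model, n_experts))            # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router_w                                   # one score per expert
    top = np.argsort(scores)[-top_k:]                       # indices of top-k experts
    gate = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over chosen
    # Only top_k of the n_experts weight matrices are multiplied --
    # the skipped work is the compute saving.
    return sum(g * (x @ experts[i]) for g, i in zip(gate, top))

print(moe_forward(rng.normal(size=d_model)).shape)  # (64,)
```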

Where does this leave the poor Chinchilla?

Yet another photo of a chinchilla.

Shutterstock

The "Chinchilla" approach has worked for the past five years or more, and it's partly why the stock prices of many companies in the AI supply chain have soared.

The Barclays analysts question whether this paradigm can continue because the performance gains from this method may decline as the cost goes up.

"The idea of spending $10 billion on a pre-training run on the next base model, to achieve very little incremental performance, would likely change," they wrote.

Many in the industry also think data for training AI models is running out β€” there may not be enough quality information to keep feeding this ravenous chinchilla.

So, top AI companies might stop expanding this process when models reach a certain size. For instance, OpenAI could build its next huge model, GPT-5, but may not go beyond that, the analysts said.

A "synthetic" solution?

OK, the final picture of a chinchilla, I promise.

Itsuo Inouye/File/AP

The AI industry has started using "synthetic" training data, often generated by existing models. Some researchers think this feedback loop of models helping to create new, better models will take the technology to the next level.

The Chinchillas could, essentially, feed on themselves to survive.

Kinda gross, but it would mean tech companies keep spending massively on AI in the coming years.

"If the AI industry were to see breakthroughs in synthetic data and recursive self-improvement, then we would hop back on the Chinchilla scaling path, and compute needs would continue to go up rapidly," Sandler and his colleagues wrote. "While not entirely clear right now, this is certainly a possibility we need to consider."

Read the original article on Business Insider

Apple commits $500B to US manufacturing, including a new AI server facility in Houston

24 February 2025 at 04:11

The U.S. government is leaning hard on tech companies to make more commitments to building their businesses in the country, and Big Tech is falling in line. On Monday, Apple laid out its own plans in that area: It will spend $500 billion over the next four years in areas like high-end manufacturing, engineering, and […]

© 2024 TechCrunch. All rights reserved. For personal use only.

Dell embodied 2 of the corporate world's biggest themes in 2024: AI and RTO. It's paying off.

30 December 2024 at 02:57
Michael Dell, Chairman and CEO of Dell Technologies, speaking at Mobile World Congress 2024 in Barcelona, Spain.

NurPhoto/Getty

  • In 2024, companies were seizing the AI opportunity and calling workers back to the office.
  • Few big businesses embodied those trends more than PC maker and cloud storage provider Dell.
  • BI spoke to the company and analysts about some of Dell's biggest developments over the year.

Dell made its name in the 1990s as the trusty brand for office PCs.

It has since evolved into a major server vendor and data storage provider, but outside tech circles, the company has mostly retained its original reputation.

In the last year, Dell's 40th as a business, it's become clear that another transformation is underway at the Texas-based company, one that positions it as a key player in the AI game.

The company has also embodied another major business trend of the year β€” the RTO movement.

Business Insider spoke to the company and tech analysts about some of Dell's biggest developments over the year.

Dell's AI transformation

Adopting AI has been at the forefront of most business strategies this year, Dell included.

The company rolled out AI across its internal operating model in the summer. It has also made it its mission to help all enterprises do the same.

"Our purpose really is to accelerate the adoption of AI by our customer," Vivek Mohindra, Dell's senior vice president of corporate strategy, told BI.

Bob O'Donnell, president and chief analyst at Technalysis Research, said Dell has been "aggressive" in bringing all the infrastructure and services needed for AI adoption to market.

Dell's product suite, which it refers to as the Dell AI Factory, now includes AI PCs, GPU-enabled servers, storage offerings, networking solutions, and advisory services.

Mohindra said Dell's lineup of PowerEdge servers has doubled this year from five to 10; six are air-cooled, and four are direct liquid-cooled.

They are exactly the kind of energy-efficient, high-density systems that companies require to run their own models on-premises. If Dell's servers can power heavy AI workloads, then its data and cloud-based offering can help streamline and scale data workflows.

Patrick Moorhead and Michael Dell at the SXSW 2024 Conference and Festivals in March 2024.

Errich Petersen/SXSW Conference & Festivals via Getty Images

The nuanced offering has helped Dell win over very large customers, or "tier-2 CSPs."

Think the likes of Morgan Stanley, Bank of America, Pfizer, or Vultr, explained Patrick Moorhead, CEO and chief analyst at Moor Insights & Strategy.

Moorhead, who has been following Dell for 14 years, said the company had done even better than he expected this year. It is taking advantage of the surge in companies wanting to run their own models and store data on-premises. It has succeeded in optimizing its offering by adding deployment services on top of great engineering, he added.

"It's a clever strategy and it's something that I didn't necessarily expect to see so much success so quickly," said Moorhead. "They're pulling it all together and making it a reality for enterprises."

Dell is also partnering with Silicon Valley's biggest names. It already works closely with Nvidia, Qualcomm, and Intel. In June, it announced that it was providing hardware to power the supercomputer being built by Elon Musk's company, xAI.

In November, Dell and Meta joined forces to provide on-premises AI infrastructure using Llama 2 AI models and Dell hardware.

These partnerships show how much Dell is extending its reach and make a statement that there is opportunity at the company, said O'Donnell.

"The fact that they're able to meet the requirements and demands of somebody like a Meta is a great sign."

Dell and Nvidia-powered quality-control technology monitoring a conveyor belt at Dell's pavilion during Mobile World Congress 2024 in Barcelona, Spain.

Joan Cros/NurPhoto via Getty Images

The success of this AI strategy was evident in Dell's most recent quarterly earnings.

Revenue from the Infrastructure Solutions Group (ISG) β€” which includes AI servers, storage, and other network capabilities β€” jumped to a record $11.4 billion for the third quarter, a 34% increase on the previous year.

Specifically, servers and networking revenue was up 58% year over year. The company's shares have now soared from below $34 in September 2022 to around $117 in late December 2024, giving it a market capitalization of around $82 billion.

Nobody out there is indestructible, said Moorhead, but Dell's broad offering, strong supply chain, and scalability have set it up for continued success.

"They're one of the few companies in the world that sells all of those pieces. So I think they've positioned themselves pretty well," he said.

Mohindra is just as positive: "As I tell my teams, buckle up because next year, the change is going to be even more accelerated than this year."

RTO

As it rolled out products for the AI future, Dell was also making some big internal changes.

In August, the company implemented a major restructuring of business operations, including a round of layoffs. Dell also pushed a steady RTO policy throughout the year, which was connected to AI.

"As we enter a new AI world, in-person human interaction will be more important than ever," an internal memo sent by executives in September stated.

The policy blocked some workers from promotions and saw attendance tracked. For more than a decade, Dell had allowed some staff to work remotely, leaving many of its employees frustrated by the changes.

BI obtained data on the workforce that showed close to 50% of Dell's full-time workers in the US opted to stay remote.

Dell's HQ in Round Rock, Texas, where employees have been asked to return this year.

Brandon Bell/Getty

In September, another RTO policy called sales staff back to the office full time. "It became clear to us that there are huge benefits for sales to be together in terms of learning from each other, training, and mentorship," Mohindra told BI.

Several employees told BI that they had heard unofficially from managers that the five-day order would be extended to other departments in 2025. When asked if that was true, Mohindra said Dell is "a continuously learning organization."

Dell was more vocal than most of its competitors about RTO, said O'Donnell and Moorhead, but neither analyst believed it would have a major impact on the company.

"It doesn't seem like their policies are radically different than what a fair number of tech companies are starting to do," said O'Donnell. "It's not like I think Dell's going to lose a whole bunch of people to HP or Lenovo."

"I think it will be a good thing for growth," said Moorhead, "especially product development."

Read the original article on Business Insider
