
Big Tech is winning the battle of the bulge

A Google, Microsoft, and Intel logo being flattened
Microsoft CEO Satya Nadella

Getty Images; Tyler Le/BI

  • Microsoft is among the latest to cut middle management jobs.
  • Tech giants like Intel, Amazon, and Google are also flattening structures for efficiency.
  • Experts warn that while flattening can speed decisions, it is possible to take it too far.

Companies are shedding bloated layers of management in an attempt to reduce bureaucracy. Some employees are applauding the move, known as flattening the middle, in the hopes of speeding up decisions and boosting efficiency.

Microsoft said Tuesday it's cutting around 6,000 jobs. While the days since have made it clear many of those cut were individual contributor-level engineers, executives previously told BI one motivation behind the recent cuts was to increase managers' "span of control," or the number of reports per manager.

Intel announced a great flattening last month, emphasizing more time in the office, less admin, and leaner teams.

"The best leaders get the most done with the fewest people," said the chip giant's new CEO, Lip-Bu Tan, in a memo to staff.

Amazon has also increased the ratio of individual contributors to managers, which it calls a "builder ratio." Google CEO Sundar Pichai told staff late last year that the company cut vice president and manager roles by 10% as part of an efficiency push. Meta has been at it for years, with CEO Mark Zuckerberg writing in a 2023 memo, "flatter is faster."

The risk is that these companies cut too many managers, leaving the remaining folks with too many direct reports.

But for now, it appears to be a risk companies are willing to take.

Agility and expertise

The logic of cutting from the middle to speed up is sound, management experts say.

"You can't go faster and be more connected to a larger ecosystem if you're having to go up and down a hierarchy for every decision," Deborah Ancona, a professor of management at Massachusetts Institute of Technology, told Business Insider.

While some companies have been trying for decades to zap management layers, there's a new urgency to do so. Businesses exist in "an exponentially changing world," Ancona said.

Dell executives explained this to employees earlier this month, when they began reorganizing managers to have more direct reports. The company, whose head count has dropped by 25,000 in two years, also pointed to the influx of artificial intelligence as a reason it needed to move faster.

Ideally, companies would remove layers and spread decision-making throughout the organization so that those closest to customers or technology, for example, could generate ideas and make decisions, Ancona said.

"You're kind of flipping the organization," she said. "Rather than all the ideas coming from on high, you have entrepreneurial leaders who are lower down in the organization coming up with new ideas."

Bayer CEO Bill Anderson is leery of having to run everything up the chain. After taking over the German biotech company in 2023, he began implementing what he calls a "dynamic shared ownership" setup that has cut thousands of managers. Staffers come together in "mini networks" for 90-day stretches to work on projects.

"We hire highly educated, trained people, and then we put them in these environments with rules and procedures and eight layers of hierarchy," Anderson previously told BI. "Then we wonder why big companies are so lame most of the time."

Fewer managers, more reporting, more meetings?

When middle managers are cut and layers condensed, inevitably, more workers report to fewer managers. The logistics of that vary, and the success in terms of morale has a lot to do with the starting point.

Amazon started flattening last year. In September, CEO Andy Jassy ordered a 15% increase in the ratio of individual contributors to managers by March. BI reported that senior Amazon Web Services managers received a memo in January instructing them to restrict high-level hiring and increase their number of direct reports.

An Amazon spokesperson told BI at the time that the memo may have been intended for one team, but does not apply to the company at large. The Amazon spokesperson also referenced a September memo from Jassy on the importance of reducing management layers.

An AWS manager told BI this month that the flatter structure has since put more burden on employees on her team to report on what they're doing day-to-day, in addition to their actual work, since managers have less time to inspect individuals' work.

Plus, the manager said she is spending more time in meetings since taking on a more diverse group of direct reports. The Amazon spokesperson also emphasized that the individual employee's anecdote does not represent the company as a whole.

Yvonne Lee-Hawkins was assigned 21 direct reports when she worked in Amazon's human resources organization. She told BI that she had to quickly learn new skills to handle the load, like asynchronous work strategies, but her team's performance suffered as her number of reports grew from 11 to 21 employees.

Weekly one-on-ones, the subject of much debate among tech titans, became impossible, and she had to cut them in half.

At Microsoft, a half-dozen employees who spoke to BI about the manager flattening trend generally regarded it as a positive step to eliminate inefficient and unnecessary levels of managers. Some managers have as few as one or two reports.

Microsoft ended up with many management layers, the people said, because it often tried to reward good engineers by promoting them to become managers. Often, those engineers-turned-managers still spent most of their time in the codebase and weren't very effective as managers.

Meanwhile, larger groups of direct reports often work better for senior employees, who need less one-on-one time and can do more things in a group setting.

A Microsoft spokesperson did not comment when asked about these factors.

Gary Hamel, a visiting professor at London Business School who lives in Silicon Valley, told BI that pushing managers to take on more direct reports can reduce micromanaging, a common bane of corporate existence.

When managers have a lot of people to oversee, it pushes them to hire people they trust, mentor rather than manage, and give up a "pretty big dose" of their authority.

"Those are all hugely positive things," he said, even if they require "a fairly dramatic change" in how managers see their role.

How many direct reports is too many?

Nvidia CEO Jensen Huang famously has 60 direct reports. Managers at Dell have been told they should have 15 to 20. An AWS document viewed by BI in January mandated no fewer than eight per manager, up from six. An Amazon spokesperson told BI there are no such requirements companywide.

Gallup research indicates that the quality of a manager matters more than the number of direct reports in terms of how well teams perform. That's because more engaged managers tend to lead to more engaged teams. And small teams, those with fewer than 10 people, show both the highest and lowest levels of engagement, because managers can have an outsize effect, for better or worse.

That may explain why some companies seem to thrive with dozens of direct reports per manager and others fail.

The nature of the work matters, too. The more complex the work, the harder it is for a manager to oversee many people.

Managing dozens of people gets harder when "life intersects with work," Ravin Jesuthasan, the global leader for transformation services at the consulting firm Mercer, told BI.

When employees have an issue, they often need someone to talk to about it.

"As a manager, you are the first port of call," he said.

That's one reason, Jesuthasan said, that having something like 20 direct reports would likely be "really hard." For most managers, the couple of dozen direct reports that many tech companies are aiming for is probably the limit, he said.

Strong managers can powerfully boost a company's ability to develop talent and its bottom line. A 2023 analysis from McKinsey & Company, for example, found that organizations with "top-performing" managers led to significantly better total shareholder returns over five years compared with those entities that had only average or subpar managers.

While flattening schemes may be successful at reducing bulk in the middle and speeding up decision-making, they can hinder future growth if they're not well-managed.

Jane Edison Stevenson, global vice chair for board and CEO services at the organizational consulting firm Korn Ferry, told BI that removing layers from a management pyramid can help elevate those high performers. But flatter companies may fail to develop leaders who can pull together the disparate parts of an organization.

At some point, she said, "You've got to start to make a bet on the leaders that are going to have a chance to build muscle across, not just vertically."

Read the original article on Business Insider

A guide to the Nvidia products driving the AI boom and beyond — from data center GPUs to automotive and consumer tech

17 May 2025 at 03:11
A man wearing all black and a leather jacket holds a consumer GPU and a laptop on a stage
Nvidia products, such as GPUs and software, are driving the AI boom.

Brittany Hosea-Small/REUTERS

  • Nvidia products, such as data center GPUs, are crucial for AI, making it the leader in the industry.
  • Nvidia's CUDA software stack supports GPU programming, enhancing its competitive edge.
  • Nvidia's automotive and consumer tech ventures expand its influence beyond data centers.

Nvidia products are at the heart of the boom in artificial intelligence.

Though Nvidia started in gaming and designs semiconductors that touch many diverse industries, the products it designs to go inside high-powered data centers are the most important to the company today, and to the future of AI.

Graphics processing units, designed to be clustered together in dozens of racks inside massive temperature-controlled warehouses, made Nvidia a household name. They also got Nvidia into the Dow Jones Industrial Average and put it in the position to control the flow of a crucial but finite resource: AI computing power.

Nvidia's first generation of chips for the data center, called Volta, launched in 2017. Along with the Volta chips, Nvidia designed DGX (which stands for Deep GPU Xceleration) systems: the full stack of technologies and equipment necessary to bring GPUs online in a data center and make them work to the best of their ability. DGX was the first of its kind. As AI has become more mainstream, other companies such as Dell and Supermicro have put forth designs for running GPUs at scale in a data center too.

Ampere, Hopper, Blackwell, and beyond

The next GPU generation designed for the data center, Ampere, which launched in 2020, can still be found in data centers today.

Though Ampere-generation GPUs are slowly fading into the background in favor of more powerful models, this generation did support the first iteration of Nvidia's Omniverse, a simulation platform that the company touts as key to a future where robots work alongside humans doing physical tasks.

The Hopper generation of GPUs is the one that has enabled much of the latest innovation in large language models and broader AI.

Nvidia's Hopper generation of chips, which include the H100 and the H200, debuted in 2022 and remain in high demand. The H200 model in particular has added capacity that has proven increasingly important as AI models grow in size, complexity, and capability.

The most powerful chip architecture Nvidia has launched to date is Blackwell. Jensen Huang announced the step change in accelerated computing in 2024 at GTC, Nvidia's developers conference, and though the rollout has been rocky, racks of Blackwells are now available from cloud providers.

Nvidia's Jensen Huang holds up one of the company's Blackwell chips at the 2024 GTC conference.
Nvidia unveiled its Blackwell chip at the GTC conference in 2024.

Andrej Sokolow/picture alliance via Getty Images

Inside the data center, Nvidia does have competitors, even though it has the vast majority of the market for AI computing. Those competitors include AMD, Intel, Huawei, custom AI chips, and a cavalcade of startups.

The company has already teased that the next generation will be called "Blackwell Ultra," followed by "Rubin" in 2026. Nvidia also plans to launch a new CPU, or traditional computer chip, alongside Rubin, which it hasn't done since 2022. CPUs work alongside GPUs to triage tasks and direct the parallel-computing firepower of the GPUs.

Nvidia is a software company, too

None of this high-powered computing is possible without software, and Nvidia recognized this need sooner than any other company.

Development of Nvidia's tentpole software stack, CUDA, or Compute Unified Device Architecture, began as early as 2006. CUDA allows developers to use widely known coding languages to program GPUs. Since these chips require layers of code to work, relatively few developers have the skills to program them directly.

Still, "CUDA developer" is a skill set, and millions of people claim this ability, according to Nvidia.

When GPUs started going into data centers, CUDA was ready, which is why it's often touted as the basis for Nvidia's competitive moat.

Within CUDA are dozens of libraries that help developers use GPUs in specific fields such as medical imaging, data science, or weather analytics.

Nvidia began at home

Just two years after Nvidia's founding, the company released its first graphics card in 1995. For more than a decade, the chips mostly resided in homes and offices, used by gamers and graphics professionals.

The current generation includes the GeForce RTX 5090 and 5080, which were released in early 2025. The RTX 4090, 4080, 4070, and 4060 were released in 2022 and 2023. In gaming, GPUs enabled the more sophisticated shadows, textures, and lighting that make games hyperrealistic.

In addition to consumer workstations, Nvidia partners with device-makers like Apple and ASUS to produce laptops and personal computers. Though gaming now generates a minority of the company's revenue, the business continues to grow.

Nvidia has also made new efforts to enable high-powered computing at home for the machine-learning obsessed. It launched Project DIGITS, a personal-sized supercomputer capable of working with some of the largest large language models.

Nvidia in the car

Nvidia is angling to be a primary player in a future where self-driving cars are the norm, but the company has also been in the automotive semiconductor game for many years.

Nvidia's Jensen Huang holds an Nvidia Drive PX Auto-Pilot Computer while giving a speech.
Nvidia first launched its DRIVE PX, for developing autopilot capabilities for vehicles, in 2015.

Kim Kulish/Corbis via Getty Images

It launched Nvidia DRIVE, a platform for autonomous vehicle development, in 2015, and over time it developed or acquired technologies for mapping, driver assist, and driver monitoring.

The company designs various chips for all of these functions in partnership with MediaTek and Foxconn. Nvidia's automotive customers include Toyota, Uber, and Hyundai.

Read the original article on Business Insider

Meta's Llama has reached a turning point with developers as delays and disappointment mount

Mark Zuckerberg, a white man in a grey polo shirt and dark pants sits in a white chair holding a microphone in front of a dark purple background.
Almost a year passed between the release of Meta's Llama 3 and Llama 4. A lot can happen in a year.

AP Photo/Jeff Chiu

  • Meta's Llama 4 models had a lukewarm start and haven't seen as much adoption as past models.
  • The muted reception of Meta's latest models has some questioning its relevance.
  • Developers told Business Insider that Llama has slipped from the cutting edge but still plays a key role.

At LlamaCon, Meta's first-ever conference focused on its open-source large language models, held last month, developers were left wanting.

Several of them told Business Insider they expected a reasoning model to be announced at the inaugural event and would have even settled for a traditional model that can beat alternatives like DeepSeek's V3 and Qwen, a group of models built by Alibaba's cloud firm.

A month earlier, Meta released the fourth generation of its Llama family of LLMs, including two open-weight models: Llama 4 Scout and Llama 4 Maverick. Scout is designed to run on a single graphics processing unit, but with the performance of a larger model, and Maverick is a larger version meant to compete with other foundation models.

Alongside Scout and Maverick, Meta also previewed Llama 4 Behemoth, a much larger "teacher model" still in training. It is designed for distillation, which enables the creation of smaller, specialized models from a larger one.
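Distillation, in this sense, trains the smaller model to reproduce the teacher's full output distribution rather than just its final answers. The core loss can be sketched in a few lines of plain Python; this is generic, textbook knowledge-distillation math, not Meta's training code, and the logit values below are made up for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature softens the distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from teacher to student over temperature-softened outputs.

    Minimizing this pushes the student to match the teacher's whole
    probability distribution, including which wrong answers it considers
    nearly right, not just the single top prediction.
    """
    p = softmax(teacher_logits, temperature)  # teacher's softened targets
    q = softmax(student_logits, temperature)  # student's softened predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.5]          # hypothetical "teacher" logits for 3 classes
perfect_student = [4.0, 1.0, 0.5]  # identical distribution -> zero loss
weak_student = [1.0, 1.0, 1.0]     # uniform guess -> positive loss

print(distillation_loss(perfect_student, teacher))  # 0.0
print(distillation_loss(weak_student, teacher))     # > 0
```

In a real training loop this loss (often blended with an ordinary hard-label loss) is what gets backpropagated through the student.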

The Wall Street Journal reported on Thursday that Behemoth would be delayed, and that the entire suite of models was struggling to compete. Meta said these models achieve state-of-the-art performance.

Meta's Llama used to be a big deal. But now it's sliding farther down the AI world's leaderboards, and to some, its relevance is fading.

"It would be exciting if they were beating Qwen and DeepSeek," Vineeth Sai Varikuntla, a developer working on medical AI applications, told BI at the conference last month. "Qwen is ahead, way ahead of what they are doing in general use cases and reasoning."

The disappointment reflected a growing sense among some developers and industry observers that Meta's once-exciting open-source models are losing momentum, both in technical performance and developer mindshare.

While Meta continues to tout its commitment to openness, ecosystem-building, and innovation, rivals like DeepSeek, Qwen, and OpenAI are setting a faster pace in areas like reasoning, tool use, and real-world deployment.

Meta aimed to reassert its leadership in open-source AI. Instead, it raised fresh questions about whether Llama is keeping up.

"We're constantly listening to feedback from the developer community and our partners to make our models and features even better and look forward to working with the community to continue iterating and unlocking their value," Meta spokesperson Ryan Daniels told BI.

A promising start

In 2023, Nvidia CEO Jensen Huang called the launch of Llama 2 "probably the biggest event in AI" that year. By July 2024, the release of Llama 3 was held up as a breakthrough: the first open large language model that could compete with OpenAI.

Llama 3 created an immediate surge in demand for computing power, SemiAnalysis Chief Analyst Dylan Patel told BI at the time. "The moment Meta's new model was released, there was a big shift. Prices went up for renting GPUs."

Google searches containing "Meta" and "Llama" similarly peaked in late July 2024.

Llama 3 was an American-made, open, top-of-the-line LLM. Though Llama never consistently topped the leaderboard on industry benchmarks, it has traditionally been influential, and relevant.

But that has started to change.

The Llama 4 models introduced a new-to-Meta architecture called "mixture of experts," which was popularized by China's DeepSeek.

The architecture allows the model to activate only the most relevant expertise for a given task, making a large model function more efficiently, like a smaller one.
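The routing idea can be sketched in a few lines of Python. This is a toy illustration of top-k expert gating, not Meta's or DeepSeek's implementation; the "experts" here are just scaling functions and the gate's scores are fixed for the example:

```python
import math

def softmax(scores):
    """Normalize raw gate scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate, k=2):
    """Toy mixture-of-experts step: only the top-k experts actually run.

    x: input, experts: list of callables, gate: scores each expert for x.
    """
    weights = softmax(gate(x))
    # Pick the k experts the gate rates highest; the rest are skipped entirely,
    # which is why a huge model can run with the cost of a much smaller one.
    top = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)[:k]
    # Renormalize the surviving weights and mix the chosen experts' outputs.
    norm = sum(weights[i] for i in top)
    return sum(weights[i] / norm * experts[i](x) for i in top)

# Hypothetical experts: each is just a multiplier in this toy example.
experts = [lambda x, m=m: m * x for m in (1.0, 2.0, 3.0, 4.0)]
gate = lambda x: [0.1, 3.0, 2.0, 0.05]  # fixed scores: experts 1 and 2 fit best
print(moe_forward(10.0, experts, gate, k=2))  # ≈ 22.69, a mix of experts 1 and 2
```

In a real LLM the experts are large feed-forward sub-networks and the gate is learned, but the top-k selection step works the same way per token.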

Llama 4's debut quickly drew criticism when developers noticed that the version Meta used for public benchmarking was not the same version available for download and deployment. This prompted accusations that Meta was gaming the leaderboard. The company denied this, saying the variant in question was experimental and that evaluating multiple versions of a model is standard practice.

While competing models raced ahead, Meta looked rudderless.

"It did seem like a bit of a marketing push for Llama," Mahesh Sathiamoorthy, cofounder of Bespoke Labs, a Mountain View-based startup that creates AI tools for data curation and training LLMs, previously told BI.

There's no singular resource that can measure which model or family of models is winning with developers. But what data exists shows Llama's latest models aren't among the leaders.

Qwen, in particular, hovers around the top of leaderboards across the internet.

Artificial Analysis is a site that ranks models based on performance. On intelligence, it places Llama 4 Maverick and Scout just above OpenAI's GPT-4 model, released at the end of last year, and below xAI's Grok and Anthropic's Claude.

OpenRouter offers a platform for developers to access various models and publishes leaderboards for model use through its own API. It shows Llama 3.3 among the top 20 models used as of early May, but not Llama 4.

"They wanted to cast a wider net and appeal to enterprises, but I think the technical community was looking for more substantial model improvements," Sathiamoorthy said.

More than a model

The standard evaluations of Llama 4 released to the public were lackluster, according to experts.

But the muted enthusiasm for Llama 4, compared to Llama 3, goes beyond the model itself, AJ Kourabi, an analyst at SemiAnalysis focused on models, told BI.

"Sometimes it's not the evals that necessarily matter. It's the tool-calling and capability for the model to extend beyond just being a chatbot," Kourabi said.

"Tool-calling" is a model's ability to access and instruct other applications on the internet or on a user's computer or device. It's essential for agentic AI, which promises to eventually book our airline tickets and file our work expenses.
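In practice, a tool call is just a structured request the model emits, naming a function and its arguments, which the surrounding application then executes. A minimal dispatcher might look like the sketch below; the tool names, the stubbed return values, and the JSON shape are illustrative, not any vendor's actual API:

```python
import json

# Hypothetical tools the application exposes to the model (stubbed for the example).
def get_flight_price(origin: str, destination: str) -> float:
    return 199.0  # a real tool would query a booking service here

def file_expense(amount: float, memo: str) -> str:
    return f"filed ${amount:.2f}: {memo}"  # a real tool would hit an expense system

TOOLS = {"get_flight_price": get_flight_price, "file_expense": file_expense}

def dispatch(tool_call_json: str):
    """Execute a model-emitted call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"model requested unknown tool: {call['name']}")
    return fn(**call["arguments"])

# A model response asking the app to run a tool on its behalf:
model_output = '{"name": "get_flight_price", "arguments": {"origin": "EWR", "destination": "LIS"}}'
print(dispatch(model_output))  # 199.0
```

The hard part, and what separates models on this axis, is reliably emitting the right tool name with valid arguments; the dispatch plumbing itself is simple.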

Meta told BI that Llama models support tool-calling, including through its API in preview.

Theo Browne, a YouTuber and developer whose company, Ping, builds AI software for other developers, told BI that tool-calling is increasingly important as agentic tools come into focus, and that it is almost a requirement for cutting-edge relevance.

Anthropic was an early leader here, and other proprietary model makers like OpenAI are catching up, Browne said.

"Having a model that will reliably call the right tool to get the right information to generate the right response is incredibly valuable, and OpenAI went from kind of ignoring this to seemingly being all in on tools," Browne said.

Kourabi says the biggest indicator that Meta has fallen behind is the absence of a reasoning model, perhaps an even more fundamental element in the agentic AI equation.

"The reasoning model is the main thing, because when we think about what has unlocked a lot of these agentic capabilities, it's the ability to reason through a specific task, and to decide what to do," he said.

Llama: Who is it good for?

Some see Llama 4 as evidence that Meta is falling behind, but, like Meta's foundational product, Facebook, it's still almost impossible to write off, AI practitioners say.

Nate Jones, the head of product at RockerBox, offers advice to young developers through his Substack, YouTube, and TikTok. He encourages them to put Llama and any other models they're intimately familiar with on their résumés.

"In 2025, people will already have Llama in their stack and they will look for people who have worked with it," Jones said.

Paul Baier, the CEO and principal analyst at GAI Insights, consults with companies on AI projects, with an emphasis on non-tech companies. He said Llama is likely to stay in the mix of many, if not most, of his clients.

"Enterprises continue to see that open source is an important part to have in the mix of their intelligence," Baier told BI. Open models, Llama most prominent among them, can handle less complicated tasks and keep costs down. "They want closed and open," Baier said.

And that's what many developers think too, Baris Gultekin, Head of AI at Snowflake, said.

"When our customers evaluate models, they are rarely looking at these benchmarks," Gultekin said. "Instead, they'll evaluate these models on their own problem statement. Given the very low cost, Llama is sufficient."

At Snowflake, Llama powers workloads like summarizing sales call transcripts and extracting structured information from customer reviews. At data platform company Dremio, Llama generates structured query language code and writes marketing emails.

"For 80% of applications, the model probably doesn't matter," Tomer Shiran, cofounder and chief product officer at Dremio, told BI. "All the models now are good enough. OpenAI, Llama, Claude, Gemini: they all meet a specific need that the user has."

Llama may be slipping away from direct competition with the proprietary models, at least for now. But other analysis suggests that the field is diversifying, and Llama's role in it is solidifying.

Much of the time, benchmarks are not what drives model choice.

"Everybody's just testing it on their own use cases," said Shiran. "It's the customer's data, and it's also going to keep changing."

Gultekin added: "They usually make these decisions not as a one-time thing, but rather per use case."

Llama may be losing developers like Browne, who breathlessly await the next toy from a company on the frontier. But Llama hasn't yet lost the rest of the developer world, the part that's just trying to make AI-powered tools that work. That means Llama's potential could still be intact.

It's also part of an open-source playbook Zuckerberg has used since 2013, when the company launched React, a library for building user interfaces that's still in use.

PyTorch is a machine learning framework created in 2016 that overtook Google's similar effort. Meta transferred PyTorch to the Linux Foundation in 2022 to maintain its neutrality.

"If Meta anchors another successful ecosystem, Meta gets a lot of labor from the open-source community," RockerBox's Jones said. "Zuckerberg gets tailwinds that he wouldn't have had otherwise."

Have a tip? Contact this reporter via email at [email protected] or Signal at 443-333-9088. Use a personal email address and a nonwork device; here's our guide to sharing information securely.


Read the original article on Business Insider

A guide to Nvidia's competitors: AMD, Qualcomm, Broadcom, startups, and more are vying to compete in the AI chip market

11 May 2025 at 01:57
Nvidia CEO Jensen Huang
Nvidia CEO Jensen Huang and his team got an early foothold in the AI chip market, but now they have competition.

I-Hwa Cheng/AFP/Getty Images

  • Nvidia dominates the AI semiconductor market.
  • Nvidia's dominance stems from early investment in GPUs for AI, starting with CUDA software.
  • AMD, Qualcomm, Broadcom, and startups challenge Nvidia with new AI chip designs and strategies.

Nvidia is undoubtedly dominant in the AI semiconductor space. Estimates fluctuate, but by some accounts the company holds more than 80% of the market for the chips that reside inside data centers and make products like ChatGPT and Claude possible.

That enviable dominance goes back almost two decades, to when researchers began to realize that the same kind of intensive computing that made complex, visually stunning video games and graphics possible could enable other types of computing too.

The company started building its famous software stack, named Compute Unified Device Architecture, or CUDA, 16 years before the launch of ChatGPT. For much of that time, it lost money. But CEO Jensen Huang and a team of true believers saw the potential for graphics processing units to enable artificial intelligence. And today, Nvidia and its products are responsible for most of the artificial intelligence at work in the world.

Thanks to the prescience of Nvidia's leadership, the company had a big head start when it came to AI computing, but challengers are running fast to catch up. Some were competitors in the gaming or traditional semiconductor spaces, and others have started up from scratch.

AMD

AMD CEO Lisa Su holds up one of the company's AI chips during a keynote speech.
AMD, led by CEO Lisa Su, lags behind Nvidia but remains its most serious competitor.

I-Hwa Cheng/AFP via Getty Images

AMD is Nvidia's top competitor in the market for AI computing in the data center. Helmed by its formidable CEO Lisa Su, AMD launched its own GPU, called the MI300, for the data center in 2024, more than a full year after Nvidia's second generation of data center GPUs started shipping.

Though experts and analysts have touted the chip's specifications and potential based on its design and architecture, the company's software still lags Nvidia's, making these chips harder to program and to use to their full potential.

Analysts estimate that the company has under 15% market share. But AMD executives insist that they are committed to bringing the company's software up to par, and that the expected evolution of the accelerated computing market, specifically the spread of AI into so-called edge devices like phones and laptops, will benefit AMD.

Qualcomm, Broadcom, and custom chips

Also challenging Nvidia are application-specific integrated circuits, or ASICs. These custom-designed chips are less versatile than GPUs, but they can be built for specific AI computing workloads at a much lower cost, which has made them a popular option for hyperscalers.

Though multipurpose chips like Nvidia's and AMD's graphics processing units are likely to maintain the largest share of the AI-chip market in the long term, custom chips are growing fast. Morgan Stanley analysts expected the market for ASICs to double in size in 2025.

Companies that specialize in ASICs include Broadcom and Marvell, along with the Asia-based players Alchip Technologies and MediaTek.

Marvell is in part responsible for Amazon's Trainium chips, while Broadcom builds Google's tensor processing units, among others. OpenAI, Apple, Microsoft, Meta, and TikTok parent company ByteDance have all entered the race for a competitive ASIC as well.

Amazon and Google

While they are among Nvidia's most prominent customers, the major cloud providers, like Amazon Web Services and Google Cloud Platform, often called hyperscalers, have also made efforts to design their own chips, often with the help of semiconductor companies.

Amazon's Trainium chips and Google's TPUs are the most scaled of these efforts and offer a cheaper alternative to Nvidia chips, mostly for the companies' internal AI workloads. However, the companies have shown some progress in getting customers and partners to use their chips as well. Anthropic has committed to running some workloads on Amazon's chips, and Apple has done the same with Google's.

Intel

A hand holds up Intel's "Gaudi 3" AI chip.
Intel and its Gaudi line of AI chips have struggled to compete with Nvidia and AMD.

Mandel Ngan/AFP via Getty Images

Once the great American name in chipmaking, Intel has fallen far behind its competitors in the age of AI. But the firm does have a line of AI chips, called Gaudi, that some reports have said can stand up to Nvidia's in some respects.

Intel installed a new CEO, semiconductor veteran Lip-Bu Tan, in the first quarter of 2025, and one of his first actions was to flatten the organization so that AI chip operations report directly to him.

Huawei

Though Nvidia's hopeful American challengers are many, China's Huawei is perhaps the most serious competitor of all, both for Nvidia and for anyone concerned with continued US supremacy in AI.

Huang himself has called Huawei the "single most formidable" tech company in China. Reports that Huawei's AI chip innovation is catching up are increasing in frequency. New restrictions from the Biden and Trump administrations on shipping even lower-power GPUs to China have further incentivized the company to catch up and serve the Chinese markets for AI. Analysts say further restrictions being considered by the Trump administration are now unlikely to hamper China's AI progress.

Startups

Also challenging Nvidia are a host of startups offering new chip designs and business models to the AI computing market.

These firms are starting out at a disadvantage, as they lack the full-sized sales and distribution machines that decades of chip sales in other corners of tech bring. But several are holding their own by finding use cases, customers, and distribution methods built around faster processing speeds or lower cost. These new AI players include Cerebras, Etched, Groq, Positron AI, SambaNova Systems, and Tenstorrent, among others.

Read the original article on Business Insider

I'm nervous about running into delays and other issues at Newark airport. BI's aviation reporter made me feel better.

A United Airlines plane lands at Newark Liberty International Airport in front of the New York skyline on September 17, 2023 in Newark, New Jersey.
To make your life easier if you are flying through Newark, only bring a carry-on if possible, says senior aviation reporter Taylor Rains.

Justin Sullivan/Getty Images

  • Newark airport is facing delays due to ATC staffing and runway construction issues.
  • Business Insider reporter Emma Cosgrove feared delays for an upcoming trip.
  • So she hit up BI's aviation expert, Taylor Rains, to find out how much she should really worry.

In a couple of weeks I have a long-planned vacation to an idyllic European destination — and I'm flying out of and then back into Newark airport.

I'm mostly excited for a break and some sun. I'm less excited about flying out of an airport that's been in the headlines for the last several weeks.

The airport is facing ongoing travel chaos amid air traffic control issues, runway construction, and even a couple of instances of equipment outages that have prevented controllers from talking to aircraft. The problems intensified last month when air traffic issues first forced dozens of delays and more than 100 cancellations. Increasingly, travelers are finding themselves stranded in Newark.

I'm not a nervous flyer most of the time, but given the news I'd be lying if I said I wasn't questioning whether I should change my flight as it gets closer.

So I did what journalists do. I consulted an expert. In this case, that's Taylor Rains, Business Insider's intrepid aviation reporter. Spoiler alert: she made me feel a lot better. Here's what she said.

Emma Cosgrove: I'm flying out of Newark airport in the next two weeks and the news is making me nervous. Should I be nervous?

Taylor Rains: Safety-wise, no, even though ATC staffing and equipment issues sound scary. The controllers and pilots are professionals and can maintain extremely high levels of safety. Lower staffing will mean slower arrival and departure rates, specifically so that no controller is overloaded; that's the main issue creating the ATC-related disruptions. I'm not personally nervous about safety in Newark. The concern travelers should have is connecting to or from Newark with a short layover: you could get delayed and miss that onward flight.

Should I think about changing my flight at considerable expense?

It depends on how important your travel is. People have missed out on nonrefundable hotel nights or train tickets because of delays or cancellations. Right now, I recommend avoiding Newark if your travel is flexible. Because most Newark travelers are flying United or its partners, that largely means your options are flying from LaGuardia or switching your Newark layover to another United hub. United doesn't fly from JFK, so you'd have to reroute via DC, Houston, Chicago, or Denver.

Say, for example, you're flying from some regional city to Europe via Newark. Call United and ask to be rerouted via one of the other hubs; the airline has loads of international flying from up and down the East Coast and the middle of the country. Right now, its policy is that passengers on flights scheduled between May 6 and May 23 (and booked before May 4) can change their flight for free, with no change fee or fare difference.

The flight must be between the same two cities (or depart from either LaGuardia or Philadelphia), be in the same cabin, and be within two days before or after the original flight, the policy says. If you're flying Spirit or another non-United airline, you'll need to call to see your options for changing.

If you're looking to book a future flight, I'd recommend going through a different airport altogether, or making sure your Newark layover is long, because the delays are not just out of Newark but into Newark, too.

To make your life easier if you are flying through Newark, only bring a carry-on if possible. It makes last-minute changes more flexible because you don't have a checked bag to worry about.

Also, please do not take your frustrations out on airport employees! They are just as stressed as the customers and are bound by the rules of the airline, and they cannot in any way change the weather or speed up controllers. Give everyone grace and pack your patience.

Is there any difference between flying in and out of Newark in terms of safety?

Nope! A series of different control facilities handles flights going in and out, and they all work together. If the center that has already had two outages goes down again, pilots are trained to stay their course or follow their last known clearance. They won't go rogue or panic because they don't hear back from ATC; they're trained for situations like this and can guide themselves if necessary.

Is the situation at Newark really all that unique?

Yes and no. Newark has had years of ATC staffing issues that have created similar problems; it's just compounded right now by the construction on its main runway. That runway is closed until at least mid-June, so people can expect related delays until then, on top of any other ATC problems and weather. Days of bad summer weather would create a trifecta of issues that would likely leave people sitting at their gates for several hours.

What should I monitor in the days leading up to my flight to have the best info about safety and delays?

People should be checking with their airline for updates, so sign up for email and text alerts about delays, cancellations, and gate changes. Check the boards, and honestly, just go in expecting delays so you don't have surprises or disappointments. You can also check the FAA Advisories website; EWR is the airport code for Newark. If you check today, May 9, you'll see an average delay of 262 minutes for weather and 75 minutes for runway construction. If you're flying on a sunny day, look for the runway construction note and a volume note. The one that says "volume" at the end refers to overcapacity or staffing issues causing air traffic delays.

Read the original article on Business Insider

AT&T's switch from ChatGPT to open-source AI helped it hang on to thousands of customers

7 May 2025 at 07:30
A graphically treated image with a man walking in front of an At&t store.
AT&T changed AI tools and reaped the benefits.

Getty Images; Alyssa Powell/ BI

  • AT&T uses AI to categorize 40 million customer service calls annually.
  • ChatGPT initially helped but was costly, so AT&T developed a cheaper, faster open-source AI system.
  • This article is part of "How AI Is Changing Everything," a series on AI adoption across industries.

AT&T gets 40 million customer service calls annually. Some callers want to add phone lines, register new addresses, or reschedule appointments, but many have problems to report. Those calls contain valuable information, but extracting it isn't easy.

A person listening to each of them would get a good idea of what new issues are arising and could catch small problems before they grow into big ones. But with thousands of calls coming in each day, that would be an arduous, virtually impossible task.

Transcription has been automated for a while, but AT&T used to do the sorting by hand: employees had to read millions of summaries and put each call into one of 80 categories to be analyzed for any follow-up actions that could be taken. The ultimate goal is to prevent what consumer-oriented companies call "churn." Essentially, they want to keep customers from leaving.

Hien Lam, a senior data scientist at AT&T, explained the process during a presentation at Nvidia's GTC Conference in March.

Now, with large language models, AI can ingest the summaries and categorize the calls.

The ChatGPT way

It was pretty simple. AT&T used ChatGPT to read and sort the summaries. It did a good enough job, but Lam's team saw problems coming down the road.

"While the GPT-4 model did produce very high-quality outputs, and we were able to save 50,000 customers annually, it was very expensive," Lam said. Plus, customers of ChatGPT sometimes have to wait for the powerful and expensive chips that run AI systems, called graphics processing units, to become available.

Sorting the calls was a daily task. "If it takes you longer than you can run overnight, then it's not a reasonable workflow," said Ryan Chesler, a principal data scientist at the open-source AI platform H2O.ai, who worked with Lam on the project.

So they set out to create a more flexible system that AT&T could have more control over, but that also cost less.

The open-source way

Lam teamed up with Chesler, working under the theory that if they could stitch together several open-source AI models with different "skills," they could achieve similar results with dramatically lower cost while keeping the company's data private.

First, they distilled GPT-4 into three smaller, open-source models. The most basic model was smart enough to sort roughly a quarter of the categories. A call that mentioned a competitor's name, for example, was easy for a model to identify. A call with a nuanced story about a store team member required a more sophisticated model.

About half the calls could be handled by an open-source model called Danube, a small but powerful model created by H2O.ai. Lam worked with Chesler to fine-tune it to AT&T's needs.

The most complex calls went to Meta's Llama 70B model, which is larger and more costly to run. Open-source models are inherently cheaper, but they're not free to run since they still require computing power. But by only using the larger models when necessary, the team kept costs down.
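The cascade Lam and Chesler describe can be thought of as a confidence-gated router: try the cheapest model first and escalate only when it isn't sure. Here's a minimal sketch of that idea; the model functions, thresholds, and category names below are all hypothetical stand-ins, and in a real system each stub would call the distilled model, Danube, or Llama 70B.

```python
# Hypothetical tiered router: cheapest model first, escalate on low confidence.
# Each "model" here is a stub standing in for a real inference call.

def basic_model(summary):
    # Cheapest tier: only confident on obvious cues, like a competitor mention.
    if "competitor" in summary.lower():
        return "competitor-mention", 0.95
    return "unknown", 0.2

def danube_model(summary):
    # Mid tier: a fine-tuned small model handling roughly half the categories.
    if "appointment" in summary.lower():
        return "reschedule", 0.9
    return "unknown", 0.4

def llama_70b_model(summary):
    # Largest, most expensive tier: always produces a category (the fallback).
    return "complex-issue", 0.8

# (model, confidence threshold it must clear to keep the call at this tier)
TIERS = [(basic_model, 0.9), (danube_model, 0.85), (llama_70b_model, 0.0)]

def categorize(summary):
    """Escalate through tiers until a model clears its confidence threshold."""
    for model, threshold in TIERS:
        label, confidence = model(summary)
        if confidence >= threshold:
            return label
    return "uncategorized"

print(categorize("Caller mentioned a competitor's new plan"))  # competitor-mention
print(categorize("Caller wants to move an appointment"))       # reschedule
print(categorize("Long nuanced story about a store visit"))    # complex-issue
```

The cost savings come from the ordering: most calls never reach the expensive bottom tier, so the big model's per-call price is paid only for the hard cases.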

In fact, the open-source patchwork solution cost 35% of what AT&T was paying to use ChatGPT, with 91% relative accuracy, Lam said. It was also faster.

"Using GPT-4, it took 15 hours to process one day's worth of summaries. In our new solution, it took a little under five hours," Lam said.

Next, they're looking to speed things up even more.

"Because it takes 4½ hours for a full day, we are looking to do it real time after you hang up with AT&T," Lam said. "We could get those outputs immediately."

Read the original article on Business Insider

How a bank CEO turned VC investor thinks about AI — and uses ChatGPT to handle emails

3 May 2025 at 02:01
A woman stands at a dais with NYSE on the front. She wears a brown suede jacket and a black shirt. Behind her is a sign that says "Fintech Village NYC 2024"
Rakefet Russak-Aminoach, Team8 managing partner

Ohad Kab

  • Rakefet Russak-Aminoach transitioned from corporate banking to venture capital at Team8.
  • She was CEO of Israel's largest bank and founded Israel's first mobile-only bank.
  • Russak-Aminoach told BI that dramatic disruption is coming as a result of AI.

Rakefet Russak-Aminoach started her career as an accountant, and today she's the managing partner of an Israeli startup foundry. Her unconventional path was made even more indirect by multiple stops in C-suites on the way.

She was CEO of KPMG Israel. And then she was the CEO of what is now Israel's largest bank. At Bank Leumi, she orchestrated a massive, disruptive technological shift, becoming Salesforce's first customer in Israel and bringing — sometimes dragging — the bank into the digital age.

She also founded Israel's first mobile-only bank. But now, she's moved off the corporate ladder and into venture capital as Managing Partner of Team8 — a startup foundry.

Russak-Aminoach spoke to Business Insider about how her corporate career informs her tech investing.

This Q&A has been edited for clarity and length.

How did you get into VC?

When I decided to step down from the bank, I didn't have the option that many CEOs have β€” to take on a larger company. Because there wasn't one.

I decided that I wanted to be in the best industry in Israel, and that was tech.

To transform Bank Leumi, I often didn't build the technology; many times, I worked with fintechs. As an organization, we always asked ourselves, 'Should we buy? Or should we build?'

This is how I built Israel's first neobank.

I made Bank Leumi much, much more advanced with digital tools, but I wanted to have an app to attract more youngsters to the bank. This became Israel's fastest-growing bank.

So I was very much into the high-tech world. And even more than that, at a certain point, I said, 'How come we don't have a bank tailored to startups in Israel?'

So I built Leumitech, which is a very successful segment of Bank Leumi today.

Two years into my tenure, I had a cybersecurity event at the bank, and I needed help. I met Nadav Zafrir, the CEO of Check Point Software, a cybersecurity company.

The incident itself was terrible, but they were amazing. And I gained a friend. Eventually, when I decided to leave banking, the only thing that was interesting to me was tech.

So he said, 'Why don't you try building fintech?' Fast-forward five years, and we lead a group of cyber, fintech, data, enterprise tech, and digital health startups.

Our company, which has 80 employees and 12 partners, has helped build 23 startups and invested in 25 more.

Do you think job losses from AI are going to increase at some point in the future?

Absolutely, because when the tool is there and you don't disrupt yourself, you don't make this change to your organization — someone else will. No one can continue to work without AI when there is AI.

Take banks, for example. In any bank, there are so many parts that you can now replace with AI, processes, people, just everything. And if you are a responsible CEO, you must do that.

You have lots of functions that were built with people, and you start to embed AI within them. It takes time to see the result. AI is still new for many, and I don't think we have seen the full impact. It's just the beginning.

In a few years, many parts of financial services — and other organizations — will be built from scratch with AI agents, and then you will embed the human within.

Does it scare you from a societal standpoint?

No, because I remember that conversation around the digital revolution.

I started my tenure at Bank Leumi with 14,000 people. I ended after seven years with 9,000. And now, five years later, there are 7,000. So people ask me, 'What's going to happen to the workforce?' New jobs. The world is changing, and the way people work is different, and people find different things to do. I don't think that people will be out of work.

I extensively research the credit world because that's my professional background. I think tech can be used to enhance credit underwriting. It will be much more efficient, and it will be less risky because humans make mistakes, and the user experience will be better. So it will be just win, win, win.

And then the question is, what will all of these credit officers do? Something else — I'm not worried about that. At this point, I've seen many technological advancements, and it has never gone the wrong way.

How do you use AI?

It's not nice to say, but I don't know how I ever just Googled things. I completely left Google. I only use ChatGPT now, but I'm not sure that it's the best one either. It's just the one I use. It writes my emails. It helps me with every question. I just live with it. And a few months ago, I didn't even know it existed.

Talk me through that. If you have 30 emails to write in a day, what do you do?

I dictate, I don't even write. I'll say, 'I need an email saying XYZ.' Many of my emails are just an answer to another email. So I copy it and I say, 'Answer this email positively.'

And then, at the end, ChatGPT will ask, 'Do you want it to be more casual, or more polite, or more formal?' It's incredible.

Read the original article on Business Insider

'Burn the boats': To stay at the bleeding edge, AI developers are trashing old tech fast

27 April 2025 at 02:00
A man in a leather jacket stands in front of two computer servers holding a GPU chip
Nvidia CEO Jensen Huang

JOSH EDELSON/AFP via Getty Images

  • Hardware, software, and models are constantly evolving in AI.
  • When tech becomes obsolete, developers say it's often best to 'burn the boats,' or trash it.
  • The idea is a mindset shift for big enterprises, but it could be essential for AI success.

It's not uncommon for AI companies to fear that Nvidia will swoop in and make their work redundant. But when it happened to Tuhin Srivastava, he was perfectly calm.

"This is the thing about AI — you gotta burn the boats," Srivastava, the cofounder of AI inference platform Baseten, told Business Insider. He hasn't burned his quite yet, but he's bought the kerosene.

The story goes back to when DeepSeek took the AI world by storm at the beginning of this year. Srivastava and his team had been working with the model for weeks, but it was a struggle.

The problem was a tangle of AI jargon, but essentially, inference, the computing process that happens when AI generates outputs, needed to be scaled up to quickly run these big, complicated, reasoning models.

Multiple elements were hitting bottlenecks and slowing down delivery of the model responses, making it a lot less useful for Baseten's customers, who were clamoring for access to the model.

Srivastava's company had access to Nvidia's H200 chips — the best widely available chip that could handle the advanced model at the time — but Nvidia's inference platform was glitching.

A software stack called Triton Inference Server was getting bogged down with all the inference required for DeepSeek's reasoning model R1, Srivastava said. So Baseten built their own, which they still use now.

Then, in March, Jensen Huang took to the stage at the company's massive GTC conference and launched a new inference platform: Dynamo.

Dynamo is open-source software that helps Nvidia chips handle the intensive inference used for reasoning models at scale.

"It is essentially the operating system of an AI factory," Huang said onstage.

"This was where the puck was going," Srivastava said. And Nvidia's arrival wasn't a surprise. When the juggernaut inevitably surpasses Baseten's equivalent platform, the small team will abandon what they built and switch, Srivastava said.

He expects it will take a couple of months max.

'Burn the boats'

It's not just Nvidia making tools with its massive team and research and development budget to match. Machine learning is constantly evolving. Models get more complex and require more computing power and engineering genius to work at scale, and then they shrink again when those engineers find new efficiencies and the math changes. Researchers and developers are balancing cost, time, accuracy, and hardware inputs, and every change reshuffles the deck.

"You cannot get married to a particular framework or a way of doing things," said Karl Mozurkewich, principal architect at cloud firm Valdi.

"This is my favorite thing about AI," said Theo Browne, a YouTuber and developer whose company, Ping, builds AI software for other developers. "It takes these things that the industry has historically treated as super valuable and holy, and just makes them incredibly cheap and easy to throw away," he told BI.

Browne spent the early years of his career coding for big companies like Twitch. When he saw a reason to start over on a coding project instead of building on top of it, he faced resistance, even when it would save time or money. Sunk cost fallacy reigned.

"I had to learn that rather than waiting for them to say, 'No,' do it so fast they don't have the time to block you," Browne said.

That's the mindset of many bleeding-edge builders in AI.

It's also often what sets startups apart from large enterprises.

Quinn Slack, CEO of AI coding platform Sourcegraph, frequently explains this to his customers when he meets with Fortune 500 companies that may have built their first AI round on shaky foundations.

"I would say 80% of them get there in an hourlong meeting," he said.

The firmer ground is up the stack

Ben Miller, CEO of real estate investment platform Fundrise, is building an AI product for the industry, and he doesn't worry too much about the latest model. If a model works for its purpose, it works, and moving up to the latest innovation is unlikely to be worth the engineers' hours.

"I'm sticking with what works well enough for as long as I can," he said. That's in part because Miller has a large organization, but it's also because he's building things farther up the stack.

That stack consists of hardware at the bottom, usually Nvidia's GPUs, and then layers upon layers of software. Baseten is a few layers up from Nvidia. The AI models, like R1 and GPT-4o, are a few layers up from Baseten. And Miller is just about at the top where consumers are.

"There's no guarantee you're going to grow your customer base or your revenue just because you're releasing the latest bleeding-edge feature," Mozurkewich said. "When you're in front of the end-user, there are diminishing returns to moving fast and breaking things."

Have a tip? Contact this reporter via email at [email protected] or Signal at 443-333-9088. Use a personal email address and a nonwork device; here's our guide to sharing information securely.

Read the original article on Business Insider

China's AI growth will be 'largely unaffected' by chip export rules, analysts say

23 April 2025 at 15:05
Huawei sign with people walking by

Tingshu Wang/Reuters

  • Banning Nvidia chips won't halt China's AI progress, analysts say.
  • Chinese firms are reducing reliance on Nvidia, finding alternatives like Huawei.
  • Banning H20 chip exports would 'make no sense,' according to Bernstein.

Banning the export of Nvidia chips is unlikely to stymie China's development of advanced AI, according to Bernstein analysts.

Nvidia notified investors in a new regulatory filing last week that it expects the Trump administration to require a license for exporting the type of powerful semiconductors used to build AI products to China. Analysts widely interpreted the license requirement as an export ban.

The US chip firm said it would incur $5.5 billion in charges related to inventory, purchase commitments, and reserves for its H20 chip model in the first quarter, which ends on April 27.

Nvidia designed its H20 chip to exactly fit with Biden administration limits on the power of chips that could be sold to Chinese companies, the aim of which was to curb China's AI progress. (A new congressional inquiry takes issue with this reaction to the regulations.)

"Banning the H20 would make no sense as its performance is already well below Chinese alternatives; a ban would simply hand the Chinese AI market completely over to Huawei," Bernstein analysts wrote in a note to investors Wednesday.

How Chinese AI progressed despite chip limits

Chinese companies have been reducing their reliance on Nvidia chips, according to the analysts. To do so, they have found ways to perform model training on unrestricted edge devices, like personal computers and laptops. They've also moved much of their inference workloads, the AI-generated responses and actions, to Nvidia alternatives.

Chinese companies have also engineered ways to network Nvidia chips together with chips designed by their homegrown tech giant, Huawei, or other locally made chips, though software remains a challenge in fully converting from chip to chip.

"Our channel checks have shown that most companies are able to carry on without H20 chips," the analysts wrote.

Chinese companies with revenue from foundation model subscriptions — similar to US firms OpenAI or Anthropic — will have the hardest time converting from Nvidia chips to alternatives, since training models is more dependent on Nvidia's proprietary software, CUDA.

One Chinese company required 200 engineers and six months to move a model from the Nvidia platform to Huawei chips, and it still only reached 90% of the previous performance, according to Bernstein.

Huawei presents the most formidable challenge to Nvidia in China.

"In the longer run, expect Huawei to keep closing the gap in performance and Chinese foundational models making up for compute deficiency with Deepseek-like innovation," the analysts wrote.

Chip supply, though, is likely to be constrained for the foreseeable future, they added, as Huawei, like most major players in the AI chips game, is somewhat dependent on production from Taiwan Semiconductor Manufacturing Company.

Read the original article on Business Insider

Nvidia is the original hardcore tech company. Alumni say CEO Jensen Huang's demanding pace reigns.

21 April 2025 at 02:00
Nvidia CEO Jensen Huang talks to a robot at the company's AI conference on March 18, 2025.
Nvidia CEO Jensen Huang talking to a robot at the company's AI conference.

JOSH EDELSON / AFP

  • Nvidia has a reputation for a demanding work culture.
  • Yet it has stayed relatively flexible as other tech firms return to the office and cut perks.
  • Huang's fast pace, long days, and streamlined five-point emails help to maintain accountability.

Nvidia doesn't need a big cultural shift to get workers to be hardcore. They've been there for years.

Companies like Shopify, Microsoft, and Meta are ramping up the intensity for workers, pushing the need to get ahead in AI and drive efficiency. The shift inside tech companies has led to a culling of "low performers," inflexible return-to-office mandates, and a reduction in perks.

Nvidia's staff has grown immensely in the past few years, and its market capitalization is on a wild roller coaster ride. But the tentpoles of the company's culture go back much further than the AI boom. The company and its CEO, Jensen Huang, are also the subject of two books released in the past four months, which corroborate what former Nvidians have told Business Insider.

Nvidia has a demanding work culture that trickles down from its famous CEO, providing a foil for the tech firms that aspire to be hardcore but do so by fiat.

"Basically, every single person in Nvidia is directly accountable to Jensen," said Stephen Witt, the author of "The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip."

Nvidia declined a request for comment from BI.

'The mission is the boss'

Nvidia has an extremely horizontal structure, with dozens of people — about 60 — reporting directly to Huang.

He sets the direction and the goal, but the Santa Clara, California-based company also has a defining mantra: "The mission is the boss," author Tae Kim wrote in his book "The Nvidia Way: Jensen Huang and the Making of a Tech Giant."

Nvidia shies away from short-term goals, Kim said. There is a central goal or mission, but planning and strategizing are constant processes that don't focus on management incentives or satisfying a hierarchy.

Project leaders may suddenly find themselves reporting directly to Huang. These newly anointed direct reports are dubbed "pilots in charge" and are subject to his wrath and carry his weight, Kim said.

According to a former Nvidia employee who asked to remain anonymous to discuss internal matters, everyone in the company must be prepared to answer Huang in detail.

"His ability to track small details across countless projects is incredible," a former director told BI.

This method of extreme accountability means Nvidia hasn't had to rein in employees as many other companies have post-pandemic. Nvidia is still remote-friendly, for example. But meetings are far from relaxing.

Huang is known to publicly discuss failures and disagreements to benefit the group rather than spare feelings. If he suspects someone isn't on top of their work, a public cross-examination may ensue. Perks are few, but that's always been the case, two former Nvidians told BI.

The "mission is boss" ethos helps Nvidia avoid the pitfalls of large firms, which often struggle to make quick decisions, let alone pivot when needed, Kim wrote.

"Jensen really doesn't tolerate bullshit," a former engineer from Nvidia's early days told BI. This intolerance makes playing politics nearly impossible, they said.

"It's not just, 'You did something wrong.' It's, 'You did something wrong that was self-serving' — that's the typical problem in big companies," they said.

The philosophy is that the mission can change, but as long as everyone serves it rather than their manager, the company should thrive. Nvidia's pivot to focus on machine learning was even communicated in a companywide Friday-night email in 2014. By Monday, Nvidia was an AI company, Witt wrote.

Email accountability

Huang is known to send more than 100 emails a day, which brings another Jensen-ism into play. (Kim's book has an entire appendix of "Jensen-isms.")

The 62-year-old CEO often refers to Nvidia's modus operandi as "speed of light." That's how fast Huang wants everything at Nvidia to progress. He's publicly used the phrase to refer to everything from hiring processes to fixing technical problems.

Witt thinks that Nvidia's email culture was possibly an inspiration for a memorable moment from the early days of DOGE. On a Saturday, Elon Musk requested that every federal government employee send a five-point email recounting what they had done that week. Jensen Huang has requested these emails from his staff since 2020.

According to Kim, at least 100 of the emails in Huang's daily diet are "top five" emails.

'Nowhere to hide'

The irony in Nvidia's position among the Silicon Valley elite is that the "mission is boss" mentality and the constant email status updates mean a lot of Nvidians have flexibility that most of Big Tech, including Nvidia's largest customers, has abandoned in 2025.

The hours can be long at Nvidia, which also stems from Huang. Sixty-hour weeks are the norm, and 80-hour weeks are likely at crucial times, offering a contrast to companies that feel the need to delineate exact office hours.

"I don't even know when Jensen sleeps," another former Nvidia director said.

Many Nvidians are still able to work from wherever they like. The reason is two-fold, Witt said.

"One of the reasons he's so big on work-from-home is because it gives women, and especially young mothers, the opportunity to continue their work without their careers getting interrupted," Witt said.

Inspired by his wife, Lori Huang, a brilliant electrical engineer who dropped out of the workforce after becoming a mother of two, Huang is aware that some valuable engineering brains find balancing work and family difficult.

"It works really well at Nvidia," Witt said. "You know if you're dropping the ball at Nvidia, the spotlight is turning directly at you, more or less instantly. There is nowhere to hide if you are shirking your work at Nvidia, and I think that makes work-from-home work better for them."

Nvidia for life

If there's one hallmark of the new era of hardcore tech culture, it's layoffs. Rolling layoffs are constantly whirring in the background of tech workplaces in 2025.

That's where Nvidia fully diverges from the pack.

The company hasn't had layoffs since 2008, and despite the hard-charging atmosphere rife with accountability, the turnover at the company is minuscule — under 5% annually for the past two years.

Witt said that's in part due to a self-selection dynamic. Engineers who like a no-nonsense atmosphere where technological supremacy is the focus naturally gravitate toward the company.

"He can get these guys to work for Nvidia on little more than a dream, but those guys will do it because they know the circuits, they know the technology. And they know that Jensen's always at the cutting edge, even if it's not making money. They'll do anything to be at the cutting edge," Witt told BI.

But another reason many Nvidians spend decades with the company could come from Huang's competitive anxiety.

"When Nvidia is evaluating an engineer, they won't think just about what they're worth. They'll think about what it's worth to keep that person away from the competition," Witt said.

Huang, though, has offered a different explanation.

"I don't like giving up on people because I think that they could improve," Huang said at a Stripe event last year. "It's kind of tongue in cheek, but people know that I'd rather torture them into greatness."

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088

Read the original article on Business Insider

AMD's CTO says AI inference will move out of data centers and increasingly to phones and laptops

19 April 2025 at 02:00
A white man in a blue plaid jacket and black pants gestures on a stage in front of a background that appears to be blueprints.
Mark Papermaster is CTO of AMD.

2025 ARPA-E Energy Innovation Summit

  • AMD sees the AI inference shift as a chance to grab market share from Nvidia.
  • AI inference will move from data centers to edge devices, like smartphones, AMD's CTO says.
  • Mark Papermaster expects an AI 'killer app' in the next three to six years.

The lion's share of artificial intelligence workloads moving from training to inference is great news for AMD, its CTO said.

AI training workloads — the gargantuan task of building large language models and imbuing them with knowledge and a familiar writing or speaking style — used to account for most AI computing. Inference is the computing process that happens when AI generates outputs, like answering questions or creating images.

It's hard to pin down exactly when the switch happened — probably some time last year — but inference is now, and will likely remain, the largest segment of accelerated computing. Since then, AMD executives have been hyping up a window of opportunity to wrest market share from Nvidia.

"People like the work that we've done in inference," CEO Lisa Su said on the company's February earnings call.

AI at scale is all about inference.

If you ask Mark Papermaster, AMD's Chief Technology Officer, where it all goes from there, he'll tell you that as inference grows, it's headed for the edge.

"Edge devices" are the industry term for computers that live outside the data center. Our phones and laptops all qualify, but so could smart traffic lights or sensors in factories. Papermaster's job is to make sure AMD is headed in the right direction to meet the demand for AI computing across devices as it grows.

AMD has had to play catch-up in the data center, given Nvidia's 10-year head start. But at the edge? The field is more open.

Business Insider asked Papermaster what he thinks the future of handheld AI looks like.

This Q&A has been edited for clarity and length.

What's the most prominent use for AI computing in edge devices like laptops and phones?

The use case you're starting to see is local, immediate, low-latency content creation.

Why do we use PCs? We use them to communicate, and we use them to create content. As you and I are talking — this is a Microsoft Teams event — AI is running underneath this. I could have a correction on it such that if I look side to side, you just see me centered. That's an option. I can hit automatic translation — you could be in Saudi Arabia and not speak any English, and we could have simultaneous translation once these things become truly embedded and operational, which is imminent.

It's truly amazing what's coming because just locally on your PC, you'll be able to verbally describe: 'Hey, I'm building a PowerPoint. I need this. I need these features. I'm running Adobe. This is what I want.'

Today, I've got to go back to the cloud. I've got to run the big, heavy compute. It's more expensive and it takes more time.

That's the immediate example that's front and center, and this is why we've invested heavily in AI PCs. That's imminent from Microsoft and others in the next six months.

The other application that we're already seeing is autonomous anything. It starts with cars, but it's way beyond cars. It's the autonomous factory floor.

OK, say it's 2030 — how much inference is done at the edge?

Over time, it'll be a majority. I can't say when the switch over is because it's driven by the applications — the development of the killer apps that can run on edge devices. We're just seeing the tip of the spear now, but I think this moves rapidly.

You might consider phones as an analogy. Those phones were just a nice assist until the App Store came out and made it really easy to create a ton of applications on your phone.

Now, things that used to always be done with more performant computing could be done more locally. Things that were done in the cloud could be done locally. As we start to get killer applications, we're going to start to see that shift go very rapidly. So it's in the next three to six years, no doubt.

I keep running into examples that suggest the way models are getting better is to just keep piling on more inference compute.

How do you know that three years from now, there's not going to be some breakthrough that makes all these devices being designed now completely out of date?

Everything you're describing is to gain even more capability and accuracy. It doesn't mean that what we have is not useful. It's just going to be constantly improving, and the improvement goes into two vectors.

One vector is becoming more accurate. It can do more things, and typically drives more compute. There's an equal vector that runs in parallel, saying, 'How could I be more optimized?'

I call it the DeepSeek moment. It sort of shook the world. Now you have everybody — Microsoft, Meta, Google — making their models more efficient. So you have both examples where it's taking more and more compute and examples where there's innovation driving more and more efficiency. That's not going to change.


Nvidia probed over how its chips may have been obtained by DeepSeek, which US lawmakers accused of spying for China

Nvidia and DeepSeek logos
US lawmakers are looking into how DeepSeek may have gotten Nvidia chips.

Jakub Porzycki/NurPhoto via Getty Images

  • US lawmakers are looking into how DeepSeek may have gotten Nvidia chips despite export controls.
  • They also accused DeepSeek of funneling American user data to the Chinese government.
  • The lawmakers urged stricter export controls to limit China's AI advancements and data access.

US lawmakers are looking into how advanced Nvidia chips may have gotten into the hands of the Chinese AI company DeepSeek, which they also accused of spying on Americans on behalf of China.

House lawmakers released a report on Wednesday that they said "reveals that DeepSeek covertly funnels American user data to the Chinese Communist Party, manipulates information to align with CCP propaganda, and was trained using material unlawfully obtained from US AI models."

The lawmakers — Reps. John Moolenaar, a Republican from Michigan, and Raja Krishnamoorthi, a Democrat from Illinois — said it appeared DeepSeek, which released a powerful AI model that made headlines in January, had used 60,000 chips from Nvidia despite US sanctions limiting the company's ability to sell some of its hardware to China.

Nvidia is already having a tough week. Its stock fell nearly 7% on Wednesday after the company announced that it had been informed that the Trump administration would require a new license for all accelerated chips shipping to China. The company said it expected charges of up to $5.5 billion as a result of the new requirement.

DeepSeek and a representative for Moolenaar did not respond to requests for comment from Business Insider about the report.

"DeepSeek isn't just another AI app β€” it's a weapon in the Chinese Communist Party's arsenal, designed to spy on Americans, steal our technology, and subvert US law," Moolenaar said in a statement, which called DeepSeek a "serious national security threat" to the US.

The lawmakers said Nvidia CEO Jensen Huang directed the design of chips to get around US export controls.

They also sent a letter to Huang requesting lists of customers located in China and Southeast Asia and any communications between Nvidia and DeepSeek.

Nvidia said in a statement to Business Insider that "the US government instructs American business on what they can sell and where — we follow the government's directions to the letter."

The company also said it sells its products to companies worldwide, adding that its reported Singapore revenue indicates the billing addresses of its clients, many of which the company said are subsidiaries of US companies.

"The associated products are shipped to other locations, including the United States and Taiwan, not to China," the statement said.

The lawmakers' report also found it was likely DeepSeek had deployed methods to copy leading AI models from US companies, violating those companies' terms of service.

OpenAI told lawmakers "DeepSeek employees circumvented guardrails in OpenAI's models" to accelerate the development of its own models at a lower cost, according to the report.

OpenAI said in January it was investigating if DeepSeek used the outputs of its AI models to "inappropriately" train its own models.

The report also found that 85% of responses from DeepSeek models purposefully suppress content related to democracy, Taiwan, Hong Kong, and human rights.

The recommendations in the report include increasing the effectiveness of US export control policy and further restricting China's ability to develop and deploy advanced AI models by expanding export controls on chips.

The lawmakers also encouraged Congress to consider requiring chip companies to track the end users of their products, not just the purchasers.


Nvidia could be hit hard by the new chip export license. Analysts warn the big decision is still to come.

16 April 2025 at 13:48
Jensen Huang holding up a chip at the CES in Las Vegas
Nvidia CEO Jensen Huang

Patrick T. Fallon for AFP via Getty Images

  • Nvidia faces new export license rules for selling chips to China and other countries.
  • The Trump administration's decision could impact Nvidia's revenue. Its stock sank Wednesday.
  • Analysts predict this move by the Trump administration could bring better news in the near future.

Wall Street analysts had some choice words for the latest shake-up to Nvidia's regulatory landscape — "disruptive," "surprise," and "abrupt," just to name a few. Bernstein analysts went so far as to say "The Trump rug remains in full effect."

New rules regarding Nvidia's Chinese business surprised many company stakeholders this week. On Tuesday, after market close, the company announced that it had been informed that the Trump administration would require a new license for all accelerated chips shipping to China and a small group of other countries including Russia.

Nvidia said it would take a charge of up to $5.5 billion in inventory, purchase commitments, and reserves in the first quarter, which ends on April 27.

"Based on our discussions, this is effectively a ban," wrote UBS analysts in a note to investors Tuesday.

Even those analysts unwilling to read the disclosure as a full-on ban said any licensing process is likely to be lengthy, so revenue from Nvidia's H20 chip, the one the company designed specifically to meet Biden-era export restrictions, is expected to be minimal for the foreseeable future.

"This is not a ban; it's a licensing requirement, but again, the inventory write-down suggests that the company is not optimistic about being granted licenses," Morgan Stanley analysts wrote.

At the time of writing, the regulation has not appeared in the Federal Register or the Department of Commerce website, so all analyst reactions are related to Nvidia's disclosure. The company's stock was down more than 7% from Tuesday to Wednesday market close.

A spokesperson from Nvidia declined to comment.

China chips are big money for Nvidia

Nvidia priced the charges it will likely incur in the first quarter (ending April 27) at $5.5 billion. However, there was no warning about the company's first-quarter results, which will be announced on May 28. Though China sales will almost certainly be lower than expected, several analysts expect the company may still be able to meet revenue targets for the first quarter.

"Given the strong demand for H200 chips since DeepSeek's launch, we think NVDA could offset somewhat lost China H20 revenues," BNP Paribas analysts wrote. The same analysts estimated Nvidia's China data center business constitutes 10% to 12% of Nvidia's total revenue.

UBS suggested earnings per share would fall by 20 cents, and Morgan Stanley analysts expect 8% to 9% of data center revenues to disappear in the near term.

A decrease in ongoing sales to China was already expected.

Restrictions on what the company is allowed to sell to China are not new, and the company has attempted to reduce its reliance on that market over the past two years. Since the H20 is only relevant to the Chinese market, oversupply won't affect Nvidia's sales of any other chips, Morgan Stanley wrote.

Bigger decisions down the road

Since Nvidia's chips are the most expensive in existence, and buyers still keep lining up, tariffs are less of a concern than export restrictions, Morgan Stanley analysts said. Beyond the Chinese market, there are bigger potential impacts looming.

The Biden administration's AI diffusion rules are set to go into effect next month — and could have an even more material impact on Nvidia if enacted as is, since they restrict exports to many more countries, such as Singapore, Mexico, Malaysia, the UAE, Israel, Saudi Arabia, and India.

Since the White House and Nvidia have demonstrated some cooperation this week, with the Trump administration celebrating the company's announcements around expanded US manufacturing, analysts have converged around a theory about what comes next.

"We are optimistic that the company's demonstrably good relationship with the government, as Trump tweeted yesterday, will mitigate these concerns," Morgan Stanley analysts wrote.


Nvidia dips its toes further into the 'American-made' waters

14 April 2025 at 08:37
Nvidia CEO Jensen Huang
Nvidia, led by founder and CEO Jensen Huang, announced on April 14 that it planned to begin making some AI chips in the US.

I-Hwa Cheng/AFP/Getty Images

  • Nvidia has made its manufacturing debut in the US and plans to expand its presence.
  • The chipmaker announced it plans to "build and test NVIDIA Blackwell chips in Arizona and AI supercomputers in Texas."
  • CEO Jensen Huang recently visited with Trump.

Nvidia is going (partially) American-made.

The company announced on Monday that it planned to begin manufacturing some of its products in the US for the first time.

Nvidia said it's already begun production of Blackwell chips at TSMC's facilities in Phoenix while the company works to construct "supercomputer manufacturing plants" in Texas — partnering with Foxconn in Houston and Wistron in Dallas.

Foxconn, the world's largest electronics manufacturer, is a partner and a customer of Nvidia's. The firm, most known for its role in Apple's supply chain, has had on-again, off-again expansion projects in Wisconsin and Ohio and has been working with Nvidia on domestic Blackwell production since last year.

Wistron, another Taiwanese electronics manufacturer, also uses Nvidia technology in its factories.

Nvidia said mass production at the Texas plants should begin within the next year or so.

"Within the next four years, NVIDIA plans to produce up to half a trillion dollars of AI infrastructure in the United States through partnerships with TSMC, Foxconn, Wistron, Amkor and SPIL," its announcement said.

Amkor, an American firm, assembles and tests semiconductors. And SPIL, or Siliconware Precision Industries Co., is a Taiwanese packaging and testing firm.

Most semiconductors were already exempted from Trump's round of "Liberation Day" tariffs, levies that he has since largely backed away from with a 90-day pause, with the exception of duties on Chinese imports. And perhaps thanks to Nvidia CEO Jensen Huang's political maneuvering at Mar-a-Lago, the administration has also avoided further export controls on the company's H20 AI chip, which was tailor-made to comply with Biden-era restrictions on the performance of chips sold to China.

Sparing chips alone, though, wouldn't have been enough to insulate data centers, as the AI "ecosystem" requires more than chips to function and grow.

And Trump said over the weekend that semiconductors will ultimately not escape tariffs, with Commerce Secretary Howard Lutnick promising new duties in "a month or two."

Huang has previously said that Nvidia has been preparing to "manufacture onshore" and that its supply chain remains "agile."

"We are now running production silicon in Arizona," Huang said last month at the company's annual GTC conference. "And so, we will manufacture onshore. The rest of the systems, we'll manufacture as much onshore as we need to."

Now, it's the US's turn to see production of the "engines of the world's AI infrastructure," Huang said in Monday's announcement. Ramping to production of Blackwell chips at scale will take 12-15 months, the company said.

"Adding American manufacturing helps us better meet the incredible and growing demand for AI chips and supercomputers, strengthens our supply chain and boosts our resiliency," Huang said.


Nvidia's healthcare boss talks changes coming to doctors' offices — and which companies may get a boost and lower costs

14 April 2025 at 02:00
Kimberly Powell speaking onstage at the HLTH 2024 conference, with a large HLTH sign behind her.
Kimberly Powell, Nvidia's healthcare general manager.

HLTH

  • Nvidia aims to integrate AI into every industry.
  • Medical imaging firms could be early adopters of AI in healthcare, said Nvidia GM Kimberly Powell.
  • Powell named two healthtech companies that could surge from AI.

Nvidia's vision of the future puts AI in every corner, crack, and crevice of the healthcare industry. But getting there may not be easy, especially in the US.

The graphics processing unit, Nvidia's breakthrough chip that enables most of AI today, is a powerful tool. But it can't fix antiquated systems, regulations made in the paper era, or privacy concerns from both doctors and patients.

Still, Nvidia CEO Jensen Huang is famously drawn to what he calls "zero-billion-dollar markets," where he can build technology from scratch and educate the industry it fits into. In healthcare, that job falls to Kimberly Powell, Nvidia's general manager for healthcare.

Business Insider spoke to Powell about the realities of integrating AI into the sometimes intransigent healthcare industry. Powell has worked at Nvidia for 17 years, and healthcare, she has said, is one of the company's largest opportunities.

This Q&A has been edited for clarity and length.

What's one area of the healthcare industry you spend a lot of time in?

One of our entry points into healthcare was medical imaging. When you go see your doctor, you're usually going in with some kind of symptom. And the first thing that happens is you go get imaged — they're going to look for something.

Radiology is a very step-by-step process: Set up the machine, capture the right image, and make sure the quality is good. Can I do some analysis before it gets to the radiologist? Should I circle this thing?

So you can add AI at a bunch of those stages to improve the whole workflow.

As I learn more about Nvidia's full stack, I notice some partners that have been in the niche supercomputer space and have grown immensely alongside Nvidia. Are there any companies in healthcare that you feel are way under the radar but will become incredibly important because of AI?

I think GE is one of them, which is why we partner with them. Diagnostic imaging is a $50 billion industry that only serves one-third of the population. If it were more autonomous, it could serve the entire population, perhaps, and be three times as large.

Some of these companies are 100 years old. They're used to selling hardware. Their big transformation over the last decade of working with them has been how to not just read the sensor data so that another human can read it, but enhance the sensor data to see more things. De-noise it, enhance it, or add AI on top of it.

Another company most people don't know is IQVIA.

IQVIA is a clinical research company. For decades, it has conducted clinical trial research for the pharmaceutical industry.

That means they're working inside healthcare systems. As you do trials, you collect all this information about patients: their electronic health records, their labs, their imaging, what they ate for breakfast — that kind of stuff.

They've essentially created a data network because they serve tens of thousands of customers. So they have a data network that now, all of a sudden, can get an agentic AI — an intelligence layer — put on top of it to start offering services.

What they've done in the past is take their data and sell it to Company A, Company B, and Company C, so that A, B, and C would take it in-house and try to build their own intelligence layer on top, reinventing the wheel at every single one and doing it for different purposes.

They now have an opportunity to take a very pristine dataset and overlay an intelligence layer on it. IQVIA is like the SAP and ServiceNow of the healthcare industry.

A lot of the examples of AI in healthcare sound like they could change the costs. Imaging would have to be billed very differently if it was so ubiquitous in the way you're describing. Do you think about how the costs are going to shift? It feels like a pretty radical change.

It is a radical change, and it will take radical change across a very complicated economic system.

There's lots of conclusive evidence that if you catch, for example, some levels of cancer at stage one versus stage four, they're preventable and the cost to the system is drastically changed.

But we're still in a very 'treat the sick' rather than 'keep them healthy' mode. If we could have systems that can catch things earlier, then that should create a calculation that would make a lot of sense.

I'm wondering if you spend any time talking to ordinary people about AI and healthcare.

Yeah, I'm curious to know how people would interpret having a phone listen to their doctor because, on the one hand, they think about all these privacy concerns, but on the other hand, when I explain the benefits, they realize — 'Oh my God, that would change the experience forever for good.' It's just, sometimes it seems invasive.

But they don't know the 55 positive byproducts of AI being in the system. I pretty much live and breathe healthcare. So I'm talking about it all the time.

I was talking to my 9-year-old daughter, for example, and she was like, 'Why do you want a robot to do that?' And it's like, well, 'Just like you get tired at night and your handwriting gets worse, you know, surgeons also get tired, and so if a robot could be there to help them.' These are the conversations that I have. We're all patients, right?


Jensen Huang shot down comparisons to Elon Musk and yelled at his biographer. The author told BI what Huang is like.

8 April 2025 at 10:32
Jensen Huang standing in leather jacket
Nvidia CEO Jensen Huang.

Artur Widak/NurPhoto

  • Nvidia CEO Jensen Huang likes to conduct intense, public examinations of his team's work.
  • Stephen Witt's book about Huang and Nvidia debuted in the US on Tuesday.
  • Witt experienced Huang's ire when he brought up the more sci-fi-adjacent potential for AI.

At Nvidia, getting a dressing down from CEO Jensen Huang is a rite of passage.

The CEO has confirmed this in interviews before, but the writer Stephen Witt can now speak from experience. Witt is the author of "The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip," which chronicles the CEO's life and career and Nvidia's historic rise from a background player to a star of the AI revolution.

Witt describes a fair bit of yelling throughout Nvidia's history.

The company's culture is demanding. Huang prefers to pick apart the team's work in large meetings so that the whole group can learn. Witt's book delves into not just what Nvidians have done but how they think — or don't think — about what their inventions will bring in the grander scheme of history.

A book in Mandarin with a picture of Jensen Huang on the cover is displayed in a bookstore.
Stephen Witt's "The Thinking Machine" in a bookstore in Taipei, Taiwan. The book was first released in Asia and released in the US on Tuesday.

Robert Smith

In the final scene of the book, which was already available in Asia and was released in the US on Tuesday, Witt interviewed Huang last year in a room covered in whiteboards detailing Nvidia's past and future. Huang was visibly tired, having just wrapped up the company's nonstop annual conference. After a series of short, curt responses, Witt played a clip from 1964 of the science-fiction writer Arthur C. Clarke musing that machines would one day think faster than humans, and Huang's demeanor changed entirely.

Witt wrote that he felt like he had hit a "trip wire." Huang didn't want to talk about AI destroying jobs, continue the interview, or cooperate with the book.

Witt told Business Insider about that day and why Huang sees himself differently than other tech titans like Tesla's Elon Musk and OpenAI's Sam Altman. Nvidia declined to comment.

This Q&A has been edited for clarity and length.

At the end of the book, Huang mentions Elon Musk and the difference between them. You asked him to grapple with the future that he's building. And he said, "I feel like you're interviewing Elon right now, and not me." What does that mean?

I think what Jensen is saying is that Elon is a science-fiction guy. Almost everything he does kind of starts with some science-fiction vision or concept of the future, and then he works backward to the technology that he'll need to put in the air.

In the most concrete example, Elon wants to stand on the surface of Mars. That's kind of a science-fiction vision. Working backward, what does he have to do today to make that happen?

Jensen is exactly the opposite. His only ambition, honestly, is that Nvidia stays in business. And so he's going to take whatever is in front of him right now and build forward into the farthest that he can see from first principles and logic. But he does not have science-fiction visions, and he hates science fiction. That is actually why he yelled at me. He's never read a single Arthur C. Clarke book — he said so.

He's meeting Elon Musk, Sam Altman, and other entrepreneurs in the middle. They're coming from this beautiful AGI future. Jensen's like, "I'm going to just build the hardware these guys need and see where it's going." Look at Sam Altman's blog posts about the next five stages of AI. It's really compelling stuff. Jensen does not produce documents like that, and he refuses to.

So, for instance, last month, Musk had a livestreamed Tesla all-hands where he talked about the theory of abundance that could be achieved through AI.

Exactly. Jensen's not going to do that. He just doesn't speculate about the future in that way. Now, he does like to reason forward about what the future is going to look like, but he doesn't embrace science-fiction visions. Jensen's a complicated guy, and I'm not still completely sure why he yelled at me.

This is hard to believe, but I guarantee you it is true. He hates public speaking, he hates being interviewed, and he hates presenting onstage. He's not just saying that. He actually — which is weird, because he's super good at it — hates it, and he gets nervous when he has to do it. And so now that GTC has become this kind of atmosphere, it really stresses him out.

A white, bald man (Stephen Witt) wearing a grey shirt looks directly into the camera
Stephen Witt is the author of "The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip."

Stephen Witt

Earlier in the book, Huang flippantly told you that he hopes he dies before the book comes out. The comment made me think about who might succeed 62-year-old Huang. Did you run into any concrete conversations about a succession plan?

He can't do it forever, but he's in great shape. He's a bundle of energy. He's just bouncing around. For the next 10 years, at least, it's going to be all Jensen.

I asked them, and they said they have no succession plan. Jensen said: "I don't have a successor."

Jensen's org chart is him and then 60 people directly below him. I say this in the book — he doesn't have a second in command. I know the board has asked this question. They didn't give me any names.

You describe in the book how you were a gamer and used Nvidia graphics cards until you very consciously stopped playing out of worry that you were addicted. Did Nvidia just fall off your radar for 10 to 15 years after that? How did you end up writing this book?

This is an interesting story. I should have put this in the book. I bought Nvidia stock in the early 2000s and then sold it out of frustration when it went up.

I basically mirrored [Nvidia cofounder] Curtis Priem's experience and sold it in 2005 or 2006 — which looked like a great trade for seven years because it went all the way back down. I was like, "Oh, thank God I sold that," because it went down another 90% after that.

I probably broke even or lost a small amount of money. I have worked in finance, and one of the counterintuitive things people don't understand about finance is that the best thing you can do for your portfolio is sell your worst-performing stock, because you get tax advantages.

So I was aware of kind of the sunk-cost fallacy, and it looked like a great trade. Then I paid no attention to the company for 17 years. It wasn't until ChatGPT came along that I even paid attention to them coming back. And I was like, wait — what's going on with Nvidia? Why is this gaming company up so much? I started researching, and I realized these guys had a monopoly on the back end of AI.

I was like, "Oh, I'll just take Jensen and pitch him to The New Yorker." I honestly thought the story would be relatively boring. I was shocked at what an interesting person Jensen is. I thought for sure when I first saw the stock go up, they must have some new CEO doing something interesting.

To my great surprise, I learned that Jensen was still in charge of the company and in fact, at that point, was the single longest-serving tech CEO in the S&P 500.

I was like, it's the same guy? Doing the same thing? And then Jensen was so much more compelling of a character than I ever could hope for.


AI models get stuck 'overthinking.' Nvidia, Google, and Foundry could have a fix.

8 April 2025 at 10:01
A phone screen shows the ChatGPT and DeepSeek apps.
OpenAI's ChatGPT o1 and DeepSeek's R1 models could benefit from answering the same question repeatedly and picking the best answer.

picture alliance/dpa/Getty Images

  • Large language models like DeepSeek's R1 are "overthinking," affecting their accuracy.
  • Models are trained to question logic, but overthinking can lead to incorrect answers.
  • A new open-source framework aims to fix this and could give a glimpse of where AI is headed.

Large language models — they're just like us. Or at least they're trained to respond like us. And now they're even displaying some of the more inconvenient traits that come along with reasoning capabilities and "overthinking."

Reasoning models like OpenAI's o1 or DeepSeek's R1 have been trained to question their logic and check their own answers. But if they do that for too long, the quality of the responses they generate starts to degrade.

"The longer it thinks, the more likely it is to get the answer wrong because it's getting stuck," Jared Quincy Davis, the founder and CEO of Foundry, told Business Insider. Relatable, no?

"It's like if a student is taking an exam and they're taking three hours on the first question. It's overthinking β€” it's stuck in a loop," Davis continued.

Davis, along with researchers from Nvidia, Google, IBM, MIT, Stanford, DataBricks, and more, launched an open-source framework Tuesday called Ember that could portend the next phase of large language models.

Overthinking and diminishing returns

The concept of "overthinking" might seem to contradict another big break in model improvement: inference-time scaling. Just a few months ago, models that took a little more time to come up with a more considered response were touted by AI luminaries like Jensen Huang as the future of model improvement.

Reasoning models and inference-time scaling are still huge steps, Davis said, but future developers will likely think about using them differently.

Davis and the Ember team are formalizing a structure around a concept that he and other AI researchers have been playing with for months.

Nine months ago — an eon in the machine learning world — Davis described his hack of asking ChatGPT 4 the same question many times (each request is referred to as a "call") and taking the best of the responses.
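
The best-of-N trick Davis describes can be sketched in a few lines. This is an illustrative sketch, not Ember's actual API: `best_of_n`, the toy model, and the scoring function are all hypothetical stand-ins.

```python
import random

def best_of_n(call_model, score, n=5):
    """Call the same model n times and keep the highest-scoring response.

    call_model: zero-argument function returning one candidate answer
    score: function mapping an answer to a numeric quality estimate
    """
    candidates = [call_model() for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-in for a nondeterministic model answering "what is 17 * 24?":
# it usually returns 408 but sometimes drifts.
def noisy_model():
    return 17 * 24 + random.choice([-2, -1, 0, 0, 0, 1])

# Toy verifier: score answers by closeness to a cheap independent check.
def verifier(answer):
    return -abs(answer - 17 * 24)

best = best_of_n(noisy_model, verifier, n=10)  # very likely 408
```

The quality of the result hinges entirely on the scorer: with a good verifier, more calls help; with a bad one, more calls just cost more.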

Now, Ember's researchers are taking that method and supercharging it, envisioning compound systems wherein each question or task would call a patchwork of models, all for different amounts of thinking time, based on what's optimal for each model and each question.

"Our system is a framework for building these networks of networks where you want to, for example, compose many, many calls into some broader system that has its own properties. So this is like a new discipline that I think jumped from research to practice very quickly," Davis said.

In the future, the model will choose you

When humans overthink, therapists might tell us to break problems down into smaller pieces and address them one at a time. Ember starts with that theory, but the similarity ends pretty quickly.

Right now, when you log into Perplexity or ChatGPT, you choose your model with a dropdown menu or toggle switch. Davis thinks that won't be the case for much longer as AI companies seek better results with these more complex strategies of routing questions through different models with different numbers and lengths of calls.

"You can imagine, instead of being a million calls, it might be a trillion calls or quadrillion calls. You have to sort the calls," Davis said. "You have to choose models for each call. Should each call be GPT 4? Or should some calls be GPT 3? Should some calls be Anthropic or Gemini, and others call DeepSeek? What should the prompts be for each call?"

It's thinking in more dimensions than the binary question-and-answer we've known. And it's going to be particularly important as we move into the era of AI agents where models perform tasks without human intervention.

Davis likened these compound AI systems to chemical engineering.

"This is a new science," he said.

Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088


Building AI is about to get even more expensive — even with the semiconductor tariff exemption

7 April 2025 at 02:25
semiconductor
While the wafers that power AI chips are exempt from tariffs, other components are not.

Michael Buholzer/Reuters

  • Most semiconductors are tariff-exempt, but full products with chips inside may face tariffs.
  • Without supply chain shifts, Nvidia's non-exempt AI products could see cost impacts from tariffs.
  • On-shoring assembly of chips may mitigate tariff effects, but increase costs.

Most semiconductors, the silicon microchips that run everything from TV remote controls to humanoid robots, are exempt from the slew of tariffs rolled out by the Trump administration last week. But that's not the end of the story for the industry, which also powers the immense shift in computing toward artificial intelligence already underway, led by the US.

There are roughly $45 billion worth of semiconductors (based on 2024 totals gathered by Bernstein) that remain tariff-free — $12 billion of which comes from Taiwan, where AI chip leader Nvidia manufactures. But the AI ecosystem requires much more than chips alone.

Data centers and the myriad materials and components required to generate depictions of everyone as an anime character are not exempt. For instance, an imported remote-controlled toy car with chips in both the car and the controller would need an exception for toys to avoid fees.

"We still have more questions than answers about this," wrote Morgan Stanley analysts in a note sent to investors Thursday morning. "Semiconductors are exempt. But what about modules? Cards?"

As of Friday morning, analysts were still scratching their heads as to the impact, despite the exemption.

"We're not exactly sure what to do with all this," wrote Bernstein's analysts. "Most semiconductors enter the US inside other things for which tariffs are likely to have a much bigger influence, hence secondary effects are likely to be far more material."

AI needs lots of things that aren't exempt

Nvidia designs chips and software, but what it mainly sells are boards, according to Dylan Patel, chief analyst at Semianalysis. Boards contain multiple chips, along with power-delivery controls and other components that make them work.

"On the surface, the exemption does not exempt Nvidia shipments as they ship GPU board assemblies," Patel told Business Insider. "If accelerator boards are excluded in addition to semiconductors, then the cost would not go up much," he continued.

These boards are just the beginning of the bumper crop of AI data centers in the works right now. Server racks, steel cabinets, and all the cabling, cooling gear, and switches to manage data flow and power are mostly imported.

A spokesperson for AMD, which, like Nvidia, produces its AI chips in Taiwan, told BI the company is closely monitoring the regulations.

"Although semiconductors are exempt from the reciprocal tariffs, we are assessing the details and any impacts on our broader customer and partner ecosystem," the spokesperson said in an email statement.

Nvidia declined to comment on the implications of the tariffs. But CEO Jensen Huang got the question from financial analysts last month at the company's annual GTC conference.

"We're preparing and we have been preparing to manufacture onshore," he said. Taiwanese manufacturer TSMC has invested $100 billion in a manufacturing facility in Arizona.

"We are now running production silicon in Arizona. And so, we will manufacture onshore. The rest of the systems, we'll manufacture as much onshore as we need to," Huang said. "We have a super agile supply chain, we're manufacturing in so many different places, we could shift things," he continued.

In addition to potentially producing chips in the US, it's plausible that companies, including Nvidia, could do more of their final assembly in the US, Chris Miller, the author of "Chip War" and a leading expert on the semiconductor industry told BI. Moving the later steps of the manufacturing process to un-tariffed shores, which right now include Canada and Mexico as well as the US, could theoretically allow these companies to import bare silicon chips and avoid levies. But that change would come with a cost as well, Miller said.

With retaliatory tariffs rolling in, US manufacturers could find tariffs weighing down demand in international markets too.

Supply chain shifts and knock-on effects

Semiconductor industry veteran Sachin Gandhi just brought his startup Retym out of stealth mode last week, with a semiconductor that helps data move between data centers faster. Technology like his has been most relevant to the telecom industry for decades and is now finding new markets in AI data centers.

Retym's finished product is exempt from tariffs when it enters the US, but the semiconductor supply chain is complex. Products often cross borders while being manufactured in multiple steps, packaged, tested, and validated, and then shipped to the final destination.

A global tariff-rich environment will probably bring up his costs in one way or another, Gandhi told BI. End customers like hyperscalers and the ecosystem of middlemen who bundle all these elements together and sell them will figure out how to cover these costs without too much consternation to a point, he said.

"Demand is not particularly price sensitive," wrote Morgan Stanley analyst Joe Moore Thursday.

AI is already an area where companies appear willing to spend with abandon. But it's also maturing. As companies work to put together realistic estimates for normal business metrics like return on investment, unit economics, and profitability, tariffs risk pushing those estimates down the road, potentially by years.


AI has ushered in a new kind of hacker

1 April 2025 at 02:00
Male hacker coding.
AI is offering hackers new openings.

GettyImages/ Hero Images

  • Hackers are using new AI models to infiltrate companies with old tricks.
  • Open-source models are gaining popularity, but raise the bar for cybersecurity.
  • Researchers scoured Hugging Face for malicious models and found hundreds.

AI doomsayers continue to worry about the technology's potential to bring about societal collapse. But the most likely scenario for now is that small-time hackers will have a field day.

Hackers usually have three main objectives, Yuval Fernbach, the chief technology officer of machine learning operations at software supply chain company JFrog, told Business Insider. They shut things down, they steal information, or they change the output of a website or tool.

Scammers and hackers, like employees of any business, are already using AI to jump-start their productivity. Yet, it's the AI models themselves that present a new way for bad actors to get inside companies since malicious code is easily hidden inside open-source large language models, according to Fernbach.

"We are seeing many, many attacks," he said. Overloading a model so that it can no longer respond is particularly on the rise, according to JFrog.

"It's quite easy to get to that place of a model not responding anymore," Fernbach said. Industry leaders are starting to organize to cut down on malicious models. JFrog has a scanner product to check models before they go into production. But to some extent the responsibility will always be on each company.

Malicious models attack

When businesses want to start using AI, they have to pick a model from a company like OpenAI, Anthropic, or Meta, as most don't go to the immense expense of building one in-house from scratch. The former two firms offer proprietary models, so the security is somewhat more assured. But proprietary models cost more to use, and many companies are wary of sharing their data.

Going with an open-source model from Meta or any one of the thousands available is increasingly popular. Companies can use APIs or download models and run them locally. Roughly half of the companies in a recent survey of 1,400 businesses by JFrog and InformationWeek were running downloaded models on their own servers.

As AI matures, companies are more likely to stitch together multiple models with different skills and expertise. Thoroughly checking them all for every update runs counter to the fast, free-wheeling AI experimentation phase of companies, Fernbach said.

Each new model, and any updates of data or functionality down the road, could contain malicious code or simply a change to the model that impacts the outcome, Fernbach said.
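
One basic defense a company can apply on its own is pinning a checksum for every model artifact it downloads and refusing to load anything that doesn't match. This is a minimal sketch of that idea; the function name and workflow are illustrative, and production scanners like JFrog's go well beyond a hash check.

```python
import hashlib

def verify_model_bytes(blob, pinned_digest):
    """Refuse model weights whose SHA-256 doesn't match the pinned digest."""
    actual = hashlib.sha256(blob).hexdigest()
    if actual != pinned_digest:
        raise ValueError("model weights failed verification: " + actual)
    return blob

# A team would pin the digest at review time, then check it on every
# download or update, e.g.:
#     weights = verify_model_bytes(downloaded_bytes, PINNED_SHA256)
```

A hash check catches silent swaps of a reviewed artifact, but not malicious code that was already present when the digest was pinned, which is why scanning before approval still matters.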

The consequences of complacency can be meaningful.

In 2024, a Canadian court ordered Air Canada to give a bereavement discount to a traveler who had been given incorrect information on how to obtain the benefit from the company's chatbot, even after human representatives of the airline denied it. The airline had to refund hundreds of dollars and cover legal fees. At scale, this kind of mistake could be costly, Fernbach said. Banks, for example, are already concerned that generative AI is advancing faster than they can respond.

A man in a black t-shirt has a conversation in a grey room with a JFrog logo on a screen in the background.
JFrog CTO Yuval Fernbach speaks at a company event in New York City in March 2025.

JFrog

To find out the scale of the problem, JFrog partnered with Hugging Face, an online repository for AI models, last year. Four hundred of the more than 1 million models contained malicious code — less than 0.05%, about the same chance as landing four of a kind in a five-card hand of poker.

Since then, JFrog estimates that while the number of new models has increased three-fold, attacks increased seven-fold.

Adding insult to injury, many popular models have malicious imposters whose names are slight misspellings of the authentic models, tempting hurried engineers.
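
A simple guard against that kind of typosquatting is to flag any model name that sits within a small edit distance of a trusted name without matching it exactly. A hedged sketch; the trusted-name list is illustrative, not a real allowlist.

```python
TRUSTED = {"meta-llama/Llama-3-8B", "mistralai/Mistral-7B"}

def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete from a
                           cur[j - 1] + 1,               # insert into a
                           prev[j - 1] + (ca != cb)))    # substitute
        prev = cur
    return prev[-1]

def looks_like_imposter(name, trusted=TRUSTED, max_dist=2):
    """Flag names that are close to, but not exactly, a trusted model name."""
    return any(0 < edit_distance(name, t) <= max_dist for t in trusted)
```

An exact match passes, a near miss like a capital "I" standing in for a lowercase "l" gets flagged, and unrelated names are ignored.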

Fifty-eight percent of companies polled in the same survey either had no company policy around open-source AI models or didn't know if they had one. And 68% of responding companies had no way to review developers' model usage other than a manual review.

With agentic AI on the rise, models will not only provide information and analysis but also perform tasks, and the risks could grow.


Don't overthink CoreWeave's IPO. It is a bellwether — just not for all of AI.

28 March 2025 at 14:57
Men, women, and children stand behind a podium that reads "Nasdaq and CRWV listed" in front of a screen that reads "CoreWeave" as confetti drops from the ceiling.
Mike Intrator, Chief Executive Officer and founder of CoreWeave, (C) rings the opening bell surrounded by Executive Leadership and family during the company's Initial Public Offering (IPO) at the Nasdaq headquarters on March 28, 2025 in New York City.

Michael M. Santiago/Getty Images

  • CoreWeave's Nasdaq debut saw shares fall below their IPO price, raising market concerns.
  • CoreWeave is the first US pure-play AI public offering, relying heavily on Nvidia GPUs.
  • The IPO tests the neocloud concept, with implications for AI's future and Nvidia's role.

CoreWeave listed on the Nasdaq Friday amid a shifting narrative and much anticipation. The company priced its IPO at $40 per share. The stock flailed, opening at $39 per share, then falling as much as 6% and ending the day back up at $41.59.

The cloud firm, founded in 2017, is the first pure-play AI public offering in the US. CoreWeave buys graphics processing units from Nvidia and then pairs them with software to give companies an easy way to access the GPUs and achieve the top performance of their AI products and services.

The company's financial future is dependent on two unknowns — that the use and usefulness of AI will grow immensely, and that those workloads will continue to run on Nvidia GPUs.

It's no wonder that the listing has often been described as a bellwether for the entire AI industry.

But CoreWeave's specific business has some contours that could be responsible for Friday's ambivalent debut without passing judgment on AI as a whole.

CoreWeave customers are highly concentrated and its suppliers are even more so. The company is highly leveraged, with billions in debt, collateralized by GPUs. The future obsolescence of those GPUs is looming.

Umesh Padval, managing director of Thomvest, expects the pricing for the GPU computing CoreWeave offers to fall in the next 12 to 18 months as GPU supply continues to improve, which could challenge the company's future profitability.

"In general, it's not a bellwether in my opinion," Padval told Business Insider.

Beyond opening day

So what does it mean that CoreWeave's debut didn't rise to meet hopes and expectations?

Karl Mozurkewich, principal architect at cloud firm Valdi, told BI the Friday IPO is more of a test for the neocloud concept than for AI. "Neocloud" is a term for young public cloud providers that focus solely on accelerated computing. They often use Nvidia's preferred reference architecture and, in theory, demonstrate the best possible performance for Nvidia's hardware.

Nvidia CEO Jensen Huang gave the bunch a shoutout at the company's tentpole GTC conference last week.

"What they do is just one thing. They host GPUs," Huang said to an audience of nearly 18,000. "They call themselves GPU clouds, and one of our great partners, CoreWeave, is in the process of going public and we're super proud of them."

CoreWeave's public market performance will signal what shape the future could take for these companies, according to Mozurkewich. Will more companies try to replicate the GPU-cloud model? Will Nvidia seed more similar businesses? Will it continue to give neoclouds early access to new hardware?

"I think the industry is very interested to see if the shape of CoreWeave is long-term successful," Mozurkewich said.

Daniel Newman, CEO of the Futurum Group, told BI that CoreWeave is "one measuring point of the AI trade; it isn't entirely indicative of the overall AI market or AI demand." He added the company has the opportunity to improve its fate as AI scales and the customer base grows and diversifies.

Lucas Keh, semiconductors analyst at Third Bridge, agreed.

"Currently, more than 70% of CoreWeave's revenue comes from hyperscalers, but our experts expect this concentration to decrease 1β€”2 years after an IPO as the company diversifies its customer base beyond public cloud customers," Keh said via email.

Having a handful of large, dominant enterprise customers is not uncommon for a young provider like CoreWeave, Mozurkewich said. But it's also no surprise that it could concern investors.

"This is where CoreWeave has a chance to shine as AI and the demand for AI spans beyond the big 7 to 10 names. The caveat will be how stable GPU prices are as availability increases and competition increases," Newman said.

Other issues, like obsolescence, the necessary depreciation, and leverage will be harder to shake.

