
Elon Musk's xAI is expanding its Memphis supercomputer to house at least 1 million GPUs

5 December 2024 at 03:00
[Image: Elon Musk next to xAI's logo on a phone. Caption: Elon Musk's xAI built its supercomputer at a rapid pace. Credit: Anadolu]

  • Elon Musk's xAI plans to increase the number of GPUs at its Memphis supercomputer tenfold.
  • The expansion aims to help the startup compete with OpenAI and Google in the AI race.
  • Nvidia, Dell, and Supermicro Computer also plan to establish operations in Memphis.

Elon Musk's xAI is ramping up its Memphis supercomputer to house at least 1 million graphics processing units, the Greater Memphis Chamber said on Wednesday.

The supercomputer, called Colossus, is already considered the largest of its kind in the world. The expansion would increase the number of its GPUs tenfold.

The move is part of xAI's effort to ramp up AI development and outpace rivals like OpenAI and Google. The GPUs are used to train and run xAI's AI-powered chatbot Grok, the company's answer to products like OpenAI's ChatGPT and Google's Gemini.

"In Memphis, we're pioneering development in the heartland of America," Brent Mayo, an xAI engineer, said in a statement. "We're not just leading from the front; we're accelerating progress at an unprecedented pace while ensuring the stability of the grid utilizing megapack technology."

The Greater Memphis Chamber said that Nvidia, the leader in the GPU market and a supplier to Colossus, along with Dell and Supermicro Computer, also plans to establish operations in Memphis.

A 'superhuman' task

xAI built its supercomputer, Colossus, at a rapid pace. The supporting facility and supercomputer were built by xAI and Nvidia in just 122 days, according to a press release from Nvidia.

In an interview with Jordan Peterson on X in June, Musk said it took 19 days to get Colossus from hardware installation to beginning training, adding it was "the fastest by far anyone's been able to do that."

The speed of xAI's expansion won praise from Jensen Huang, the CEO of Nvidia, who described the effort as a "superhuman" task and hailed Musk's understanding of engineering.

Huang said a project like Colossus would normally take "three years to plan" and another year to get it up and running.

Musk's AI startup has also been on a fundraising streak. The Wall Street Journal reported that xAI is valued at $50 billion, doubling its valuation since the spring.

Investors in xAI's latest funding round reportedly include Sequoia Capital and Andreessen Horowitz. Earlier this year, xAI raised a $6 billion Series B from A16z and Sequoia Capital at a $24 billion post-money valuation. The new round means the AI company has raised a total of $11 billion this year.

Read the original article on Business Insider

4 things we learned from Amazon's AWS conference, including about its planned supercomputer

3 December 2024 at 15:59
[Image: AWS chip. Caption: AI chips were the star of AWS CEO Matt Garman's re:Invent keynote. Credit: Business Wire/BI]

  • AWS announced plans for an AI supercomputer, UltraCluster, with Trainium 2 chips at re:Invent.
  • AWS may be able to reduce reliance on Nvidia by developing its own AI infrastructure.
  • Apple said it's using Trainium 2 chips for Apple Intelligence.

Matt Garman, the CEO of Amazon Web Services, made several significant new AWS announcements at the re:Invent conference on Tuesday.

His two-and-a-half-hour keynote delved into AWS's current software and hardware offerings and updates, with words from clients including Apple and JPMorgan. Graphics processing units (GPUs), supercomputers, and a surprise Apple cameo stuck out among the slew of information.

AWS, the cloud computing arm of Amazon, has been developing its own semiconductors to train AI. On Tuesday, Garman said it's creating UltraServers β€” containing 64 of its Trainium 2 chips β€” so companies can scale up their GenAI workloads.

AWS is also building an AI supercomputer, an UltraCluster made up of UltraServers, in partnership with the AI startup Anthropic. Named Project Rainier, it will be "the world's largest AI compute cluster reported to date available for Anthropic to build and deploy its future models on" when completed, according to an Amazon blog post. Amazon has invested $8 billion in Anthropic.

Such strides could push AWS further into competition with other tech firms in the ongoing AI arms race, including AI chip giant Nvidia.

Here are four takeaways from Garman's full keynote on Tuesday.

AWS's Trainium chips could compete with Nvidia.

Nvidia currently dominates the AI chip market with its sought-after and pricey GPUs, but Garman backed AWS's homegrown silicon during his keynote on Tuesday. His company's goal is to reduce the cost of AI, he said.

"Today, there's really only one choice on the GPU side, and it's just Nvidia. We think that customers would appreciate having multiple choices," Garman told the Wall Street Journal.

AI is growing rapidly, and the demand for chips that make the technology possible is poised to grow alongside it. Major tech companies, like Google and Microsoft, are venturing into chip creation as well to find an alternative to Nvidia.

However, Garman told The Journal he doesn't expect Trainium to dethrone Nvidia "for a long time."

"But, hopefully, Trainium can carve out a good niche where I actually think it's going to be a great option for many workloads β€” not all workloads," he said.

AWS also introduced Trainium3, its next-gen chip.

AWS's new supercomputer could go toe to toe with Elon Musk's xAI.

According to The Journal, the chip cluster known as Project Rainier is expected to be available in 2025. Once it is ready, Anthropic plans to use it to train AI models.

With "hundreds of thousands" of Trainium chips, it would challenge Elon Musk's xAI's Colossus β€” a supercomputer with 100,000 of Nvidia's Hopper chips.

Apple is considering Trainium 2 for Apple Intelligence training.

Garman said that Apple is one of AWS's customers using its chips, like Amazon Graviton and Inferentia, for services including Siri.

Benoit Dupin, senior director of AI and machine learning at Apple, then took to the stage at the Las Vegas conference. He said the company worked with AWS for "virtually all phases" of its AI and machine learning life cycle.

"One of the unique elements of Apple business is the scale at which we operate and the speed with which we innovate," Dupin said.

He added, "AWS has been able to keep the pace, and we've been customers for more than a decade."

Now, Dupin said Apple is in the early stages of testing Trainium 2 chips to potentially help train Apple Intelligence.

The company introduced a new generation of foundation models, Amazon Nova.

Amazon announced some new kids on the GenAI block.

AWS customers will be able to use Amazon Nova-powered GenAI applications "to understand videos, charts, and documents, or generate videos and other multimedia content," Amazon said. There are a range of models available at different costs, it said.

"Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are at least 75% less expensive than the best-performing models in their respective intelligence classes in Amazon Bedrock," Amazon said.

Read the original article on Business Insider

UK crashes out of global top 50 supercomputer ranking

20 November 2024 at 03:26

The UK no longer has a supercomputer in the top 50, according to new data from the Top500 project, which ranks the 500 most powerful non-distributed computer systems globally. The country's current national supercomputer system, Archer2, is approaching end-of-life in 2026. According to the latest figures, it's also now sitting in 62nd place globally, down […]

© 2024 TechCrunch. All rights reserved. For personal use only.
