There’s been a lot of reporting in recent months around Apple’s efforts to expand its footprint in customers’ homes with in-development products like a wall-mounted smart home hub. According to a new report in Bloomberg, that strategy could also include a smart doorbell. This doorbell would use Apple’s FaceID technology to scan people’s faces as […]
Smart ring maker Oura announced on Thursday that it has closed a $200 million Series D funding round, bringing the company’s valuation to $5.2 billion. The round included participation from Fidelity Management and glucose device maker Dexcom. Oura says the new capital will allow it to expand its product offerings and further invest in product, science, […]
For many industries, lithium batteries just don’t cut it — they’re getting increasingly expensive, require too much space, and sometimes they are just overkill for many industrial use cases. Thermal batteries, on the other hand, can store energy in the form of heat for long periods, are often cheaper to invest in and deploy, and […]
Amazon announced on Wednesday new accessibility features for Fire TV, including a notable “Dual Audio” capability for the newly launched Fire TV Omni Mini-LED Series, which was unveiled in November. The new feature allows one user to listen through a hearing aid while others in the same room can enjoy audio through the TV’s built-in […]
Groq is taking a novel approach to competing with Nvidia's much-lauded CUDA software.
The chip startup is using a free inference tier to attract hundreds of thousands of AI developers.
Groq aims to capture market share with faster inference and global joint ventures.
There is an active debate about Nvidia's competitive moat. Some say it comes down to a prevailing perception that Nvidia is the 'safe' choice when investing billions in a technology whose return is still uncertain.
Many say it's Nvidia's software, particularly CUDA, which the company began developing decades before the AI boom. CUDA allows users to get the most out of graphics processing units.
Competitors have attempted to make comparable systems, but without Nvidia's head start, it has been tough to get developers to learn, try, and ultimately improve their systems.
Groq, however, is an Nvidia competitor that focused early on the segment of AI computing that requires less direct programming of chips, and investors are intrigued. The 8-year-old AI chip startup was valued at $2.8 billion at its $640 million Series D round in August.
Though at least one investor has called companies like Groq 'insane' for attempting to dent Nvidia's estimated 90% market share, the startup has been building its technology exactly for the opportunity that is coming in 2025, Mark Heaps, Groq's "chief tech evangelist," said.
'Unleashing the beast'
"What we decided to do was take all of our compute, make it available via a cloud instance, and we gave it away to the world for free," Heaps said. Internally, the team called the strategy "unleashing the beast." Groq's free tier caps usage at a set number of requests per day or tokens per minute.
Heaps, CEO and ex-Googler Jonathan Ross, and a relatively lean team have spent 2023 and 2024 recruiting developers to try Groq's tech. Through hackathons and contests, the company makes a promise — try the hardware via Groq's cloud platform for free, and break through walls you've hit with others.
Groq offers some of the fastest inference out there, according to rankings on Artificialanalysis.ai, which measures cost and latency for companies that allow users to buy access to specific models by the token — or output.
Inference is a type of computing that produces the answers to queries asked of large language models. Training, the more energy-intensive type of computing, is what gives the models the ability to answer. So far, the hardware used for those two tasks has been different.
After the inference service was available for free, developers came out of the woodwork, he said, with projects that couldn't be successful on slower chips. With more speed, developers can send one request through multiple models and use another model to choose the best response — all in the time it would usually take to fulfill just one request.
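The pattern described above, sending one request to several models and letting another model pick the winner, can be sketched roughly as follows. This is a minimal illustration and not Groq's API: `model_a`, `model_b`, `model_c`, `judge`, and `best_of` are hypothetical stand-ins, and in a real application each would be a network call to a hosted model.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for hosted models. A real client would send the
# prompt over the network and return the model's text response.
def model_a(prompt: str) -> str:
    return f"A: {prompt.upper()}"

def model_b(prompt: str) -> str:
    return f"B: {prompt} (with more detail)"

def model_c(prompt: str) -> str:
    return f"C: {prompt[::-1]}"

def judge(prompt: str, candidates: list[str]) -> str:
    """Hypothetical judge. A real one would be another model call;
    here the longest candidate wins, purely for illustration."""
    return max(candidates, key=len)

def best_of(prompt, models, judge_fn):
    # Fan out: run every model on the same prompt concurrently.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        candidates = list(pool.map(lambda m: m(prompt), models))
    # Fan in: one extra call chooses among the candidate responses.
    return judge_fn(prompt, candidates)

answer = best_of("what is inference?", [model_a, model_b, model_c], judge)
print(answer)
```

Because the candidate calls run in parallel, total latency is roughly the slowest model plus the judge, which is why the approach only becomes practical when each individual call is fast.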
Roughly 652,000 developers are now using Groq API keys, Heaps said.
Heaps expects speed to hook developers on Groq. But its novel plan for programming its chips gives the company a unique approach to the most crucial element within Nvidia's "moat."
No need for CUDA libraries
"Everybody, once they deployed models, was gonna need faster inference at a lower cost, and so that's what we focused on," Heaps said.
So where's the CUDA equivalent? It's all in-house.
"We actually have more than 1800 models built into our compiler. We use no kernels, and we don't need people to use CUDA libraries. So because of that, people can just start working with a model that's built-in," Heaps said.
Training, he said, requires more customization at the chip level. In inference, Groq's task is to choose the right models to offer customers and ensure they run as fast as possible.
"What you're seeing with this massive swell of developers who are building AI applications — they don't want to program at the chip level," he added.
The strategy comes with some level of risk. Groq is unlikely to accumulate a stable of developers who continuously troubleshoot and improve its base software, as Nvidia has with CUDA. Its offering may be more like a restaurant menu than a grocery store. But this also means the barrier to entry for Groq users is the same as any other cloud provider and potentially lower than that of other chips.
Though Groq started out as a company with a novel chip design, today, of the company's roughly 300 employees, 60% are software engineers, Heaps said.
"For us right now, there is a billions and billions of dollars industry emerging, that we can go capture a big share of market in, while at the same time, we continue to mature the compiler," he said.
Despite being realistic about the near-term, Groq has lofty ambitions, which CEO Jonathan Ross has described as "providing half the world's inference." Ross also says the goal is to cast a net over the globe through joint ventures. Saudi Arabia is on the way. Canada and Latin America are in the works.
Earlier this year, Ross told BI the company also has a goal to ship 108,000 of its language processing units, or LPUs, by the first quarter of next year and 2 million chips by the end of 2025, most of which will be made available through its cloud.
Have a tip or an insight to share? Contact Emma at [email protected] or use the secure messaging app Signal: 443-333-9088
London-based tech startup TG0 has raised £4.5M in Series B funding to push its AI-driven physical products to new markets. The round was led by AI-focused investor NetMind.AI, with additional backing from WP Health. The new funding will help TG0 […]
Meta’s Ray-Ban Meta smart glasses are getting several new AI-powered upgrades, including the ability to have an ongoing conversation and translate between languages. Ray-Ban Meta owners in Meta’s early access program for the U.S. and Canada can now download firmware v11, which adds “live AI.” First unveiled this fall, live AI lets wearers continuously converse […]
The next big upgrade to Apple’s mobile devices could be foldability, according to multiple reports published Sunday. According to The Wall Street Journal, Apple is aiming to launch two foldable devices in the next few years. There’s a larger model with a 19-inch screen that could compete with desktop monitors, as well as a smaller […]
Google has released a prototype of Project Astra’s AR glasses for testing in the real world. The glasses are part of Google’s long-term plan to one day have hardware with augmented reality and multimodal AI capabilities. In the meantime, they will be releasing demos to get the attention of consumers, developers, and their competition. Along […]
Google is slowly peeling back the curtain on its vision to, one day, sell you glasses with augmented reality and multimodal AI capabilities. The company’s plans for those glasses, however, are still blurry. At this point, we’ve seen multiple demos of Project Astra — DeepMind’s effort to build real-time, multimodal apps and agents with AI […]
Google said that it is launching a new Android-based XR platform on Thursday to accommodate AI features. The company said the platform, called Android XR, will support app development on different devices, including headsets and glasses. The company is releasing Android XR’s first developer preview on Thursday, which already supports existing tools, including ARCore, Android […]
In a press briefing on Wednesday, the Pentagon said it has no evidence that the mysterious drones that have been flying over New Jersey and other parts of the northeast U.S. in recent weeks were coming from a foreign entity, nor were they U.S. military drones. The comments come a day after a U.S. Congressional […]
There was a time when virtual reality (VR) was thought of as the next big thing in gaming, the next platform for never-before-seen levels of immersion and creativity. If your name is Mark Zuckerberg, you probably still think it is. […]
Single-board computer maker Raspberry Pi is updating its cute little computer-meet-keyboard device with better specifications. Named the Raspberry Pi 500, this successor to the Raspberry Pi 400 is as powerful as the current flagship Raspberry Pi, the Raspberry Pi 5. It’s available to buy now from Raspberry Pi resellers. The Raspberry Pi 500 is the […]
Apple is looking to make its Vision Pro mixed reality device more attractive to gamers and game developers, according to a new report from Bloomberg’s Mark Gurman. The Vision Pro has been pitched as more of a productivity and media consumption device than something aimed at gamers, due in part to relying on eye and […]
Ladder, an Austin, Texas-based fitness startup that makes a popular strength-training app, is accusing Peloton of ripping off its work with the launch of Peloton’s new Strength+ app, which exited beta on Wednesday. After receiving feedback from Peloton’s beta testers that the app looked, felt, and functioned much like Ladder’s own, the company says it […]
A startup with founders who previously served in the German military has created a product akin to an Amazon Firestick for legacy defense equipment, complete with a software stack. ARX Robotics claims its system can turn old equipment into AI-driven devices, such as autonomous-driving trucks. Back in June this year ARX raised a €9 million […]
Negotiating with one of the largest companies in the world doesn’t sound easy, but that’s precisely what Indonesia has been doing for the past few weeks with iPhone maker Apple. The backstory here is that after the country required smartphones which are sold domestically to be made of at least 40% locally manufactured parts, Indonesia […]
When Humane released its Ai Pin, the San Francisco-based gadget maker envisioned a world with dedicated AI devices, something that you would carry with you in addition to the smartphone in your pocket. However, reviews and sales haven’t been great; returns reportedly began to outpace unit sales at one point. And Humane recently dropped […]