OpenAI launched its best new AI model in September. It already has challengers, one from China and another from Google.

  • OpenAI's o1 model was hailed as a breakthrough in September.
  • By November, a Chinese AI lab had released a similar model called DeepSeek.
  • On Thursday, Google came out with a challenger called Gemini 2.0 Flash Thinking.

In September, OpenAI unveiled a radically new type of AI model called o1. In a matter of months, rivals introduced similar offerings.

On Thursday, Google released Gemini 2.0 Flash Thinking, which uses reasoning techniques that look a lot like o1's.

Even before that, in November, a Chinese company announced DeepSeek, an AI model that breaks challenging questions down into more manageable tasks like OpenAI's o1 does.

This is the latest example of a crowded AI frontier where pricey innovations are swiftly matched, making it harder to stand out.

"It's amazing how quickly AI model improvements get commoditized," Rahul Sonwalkar, CEO of the startup Julius AI, said. "Companies spend massive amounts building these new models, and within a few months they become a commodity."

The proliferation of AI models with similar capabilities could make it difficult to justify charging high prices for these tools. The price of accessing AI models has indeed plunged in the past year or so.

That, in turn, could raise questions about whether it's worth spending hundreds of millions of dollars, or even billions, to build the next top AI model.

September is a lifetime ago in the AI industry

When OpenAI previewed its o1 model in September, the product was hailed as a breakthrough. It uses a new approach called inference-time compute to answer more challenging questions.

It does this by slicing queries into more digestible tasks and turning each of these stages into a new prompt that the model tackles. Each step requires running a new request, which is known as the inference stage in AI.

This produces a chain of thought, or chain of reasoning, in which the model answers each part of the problem before moving on to the next, until it ultimately comes up with a full response.

The model can even backtrack and check its prior steps and correct errors, or try solutions and fail before trying something else. This is akin to how humans spend longer working through complex tasks.

DeepSeek rises

In a mere two months, o1 had a rival. On November 20, a Chinese AI company released DeepSeek.

"They were probably the first ones to reproduce o1," said Charlie Snell, an AI researcher at UC Berkeley who coauthored a Google DeepMind paper this year on inference-time compute.

He's tried DeepSeek's AI model and says it performs well on complex math problems that must be solved by thinking for longer and in stages. 

He noted that in DeepSeek's DeepThink mode, the model shows users every step of its thought process. With o1, these intermediate steps are hidden from users. 

"I've asked people at OpenAI what they think of it," Snell told BI. "They say it looks like the same thing, but they don't know how DeepSeek did this so fast."

OpenAI didn't respond to a request for comment. On Friday, the startup previewed an o1 successor, called o3. Francois Chollet, a respected AI expert, called the update a "significant breakthrough."

Andrej Karpathy, an OpenAI cofounder, praised Google's new "Thinking" model for the same reasoning feature.

"The prominent and pleasant surprise here is that unlike o1 the reasoning traces of the model are shown," he wrote on X. "As a user I personally really like this because the reasoning itself is interesting to see and read — the models actively think through different possibilities, ideas, debate themselves, etc., it's part of the value add."

A DeepSeek demo

Snell shared a multistep math problem with Business Insider, which we used to test DeepSeek for ourselves: 

"Find a sequence of +, -, /, * which can be applied to the numbers 7, 3, 11, 5 to get to 24, where each of the given numbers is used exactly once."

BI put that prompt in DeepSeek's chat window on its website. The model responded initially by laying out the challenge ahead.

"Alright, so I've got this problem here: I need to use the numbers 7, 3, 11, and 5, and combine them with the operations of addition, subtraction, multiplication, and division, using each number exactly once, to get to 24," it replied. "At first glance, this seems a bit tricky, but I think with some systematic thinking, I can figure it out."

It then proceeded through multiple steps over roughly 16 pages of discussion that included mathematical calculations and equations. The model sometimes got it wrong, but it spotted this and didn't give up. Instead, it swiftly moved on to another possible solution. 

"Almost got close there with 33 / 7 * 5 ≈ 23.57, but not quite 24. Maybe I need to try a different approach," it wrote at one point. 

After a few minutes, it found the correct solution. 

"You can see it try different ideas and backtrack," Snell said in an interview on Wednesday. He highlighted this part of DeepSeek's chain of thought as particularly noteworthy:

"This is getting really time-consuming. Maybe I need to consider a different strategy," the AI model wrote. "Instead of combining two numbers at a time, perhaps I should look for a way to group them differently or use operations in a nested manner."
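For comparison, the puzzle Snell shared can also be settled by a conventional exhaustive search. The sketch below is not how DeepSeek or o1 works; it is a plain brute-force solver, written for illustration, that combines two values at a time, recurses on the smaller list, and backtracks when a branch cannot reach the target. The function names `solve` and `_search` are hypothetical, and exact fractions are used so that division does not introduce rounding error.

```python
from fractions import Fraction
from itertools import combinations


def solve(numbers, target=24):
    """Return an expression string that reaches target, or None if none exists."""
    # Pair each exact value with the text of the expression that produced it.
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


def _search(items, target):
    # Base case: one value left, so every number has been used exactly once.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None

    # Pick any two values, combine them, and recurse on the shorter list.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]

        candidates = [
            (a + b, f"({ea} + {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
            (a * b, f"({ea} * {eb})"),
        ]
        # Division is only tried when the divisor is nonzero.
        if b != 0:
            candidates.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb} / {ea})"))

        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
        # No candidate worked for this pair: backtrack and try another pair.
    return None


if __name__ == "__main__":
    print(solve([7, 3, 11, 5]))
```

Calling `solve([7, 3, 11, 5])` returns one expression that evaluates to 24. One valid answer to the puzzle is (7 - 3) * (11 - 5), since 4 * 6 = 24, though the search may return a different but equivalent expression.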

Then Google appears

Snell said other companies are likely working on AI models that use the same inference-time compute approach as OpenAI.

"DeepSeek does this already, so I assume others are working on this," he added on Wednesday.

The following day, Google released Gemini 2.0 Flash Thinking. Like DeepSeek, this new model shows users each step of its thought process while tackling problems. 

Jeff Dean, a Google AI veteran, shared a demo on X that showed this new model solving a physics problem and explaining its reasoning steps.

"This model is trained to use thoughts to strengthen its reasoning," Dean wrote. "We see promising results when we increase inference time computation!"

Read the original article on Business Insider

Google releases its own ‘reasoning’ AI model

Google has released what it’s calling a new “reasoning” AI model — but it’s in the experimental stages, and from our brief testing, there’s certainly room for improvement. The new model, called Gemini 2.0 Flash Thinking Experimental (a mouthful, to be sure), is available in AI Studio, Google’s AI prototyping platform. A model card describes […]

© 2024 TechCrunch. All rights reserved. For personal use only.

Stigg makes it easy to change your SaaS pricing

Stigg (not The Stig, just Stigg) describes itself as “the first scalable monetization platform for the modern billing stack.” There’s a lot going on in that sentence, but what it comes down to is that the startup, which on Wednesday announced a $17.5 million Series A round, helps SaaS companies model pricing, create pricing pages, […]

My husband likes to keep everything, and I prefer minimalism. A home remodel helped us learn to declutter together.

  • My husband and I have opposite organizational styles.
  • I learned the hard way that pressuring him to change wouldn't work.
  • A home remodel forced us to face the clutter, and now, we communicate much better.

I consider myself a pseudo-minimalist. I don't buy knickknacks when traveling, fill my home with extra furniture, or stock up on pantry, beauty, or toiletry supplies. I like having dresser drawers that close easily and bookshelves I can pull a novel from without four others toppling onto my head.

Now imagine the opposite of my personality in the clutter department, and you have my husband.

He's a collector. He's a saver of the socks I would throw out because they're starting to get a small hole, of the hockey gear that goes unused, of extra dinnerware we don't have room for.

So what are these two personalities doing living under one roof? Well, we love each other. And people do crazy things for love.

The difference between how we dealt with clutter took a toll

When we married and moved in together, the problem revealed itself in full. We had very different ways of organizing and even thinking about the items we bring — and keep — in our home.

I felt suffocated and panicked at the stacks, bags, and boxes of his things.

My attempts to purge items didn't go well. I didn't know how to be kind in my panic, and he didn't want things to change.

Living with so much clutter affected my mental health. I felt the pressure of organizing so many things. It seemed impossible to make stuffed spaces look nice. Rooms felt cramped, every storage area overflowed, and our fights over the subject became caustic.

I knew it was time for a different approach. He had emotional attachments to things that I didn't understand, but it didn't mean I was right in demanding that they go.

Health and wellness consultant Michelle Porter told Business Insider, "Studies show that cluttered spaces elevate cortisol levels, the body's stress hormone. For all household members, this can mean heightened irritability, difficulty focusing, and a reduced ability to relax." In short, our stuff affects our health, and I needed to reduce our load.

Biopsychologist Mary Poffenroth explains why organizing shared spaces can be so difficult. "What one partner thinks is necessary organization, the other partner may see as a threat to their emotional safety and well-being."

A remodel meant it was time for a new approach

During our recent kitchen remodel, I saw that even with the additional space provided by the new cabinetry, it still wasn't enough for all the appliances and dishes he owned. I suggested we only keep what would fit into the new space. To my surprise and delight, he agreed. This made the getting-rid-of-things talks that followed easier because we'd both consented beforehand.

At the end of the remodel, we donated several boxes.

I felt empowered. High on the win, we implemented this same tactic in other areas by creating a "one in, one out" rule. If a new shirt comes into the house, he donates one. The same goes for other clothing items. And now, when he wants to buy a new appliance, he considers first if we have a space for it.

We now have a new way of talking about clutter

For items going unused, it takes a little more patience. I'll bring up the item I'd like to discuss and the fact that it seems we don't really need it. I use the word "seem" specifically so he can correct me if he is using it and I'm wrong. He usually replies with how he's hoping to use the item soon and we agree to a timeline. Then, if it's not used at least an agreed-upon number of times over the next year, we'll sell or donate it.

Nowadays, our space is much more comfortable for me than when we first married. Decluttering our space will be an ongoing process as life and needs change, but we've learned how to talk through the "stuff" in a way that works for us both, and that's the real success story.

OpenAI rolls out the full version of o1, its hot reasoning model

  • OpenAI released the full version of its o1 reasoning model on Thursday.
  • It says the o1 model, initially previewed in September, is now multimodal, faster, and more precise.
  • It was released as part of OpenAI's 12-day product and demo launch, dubbed "shipmas."

On Thursday, OpenAI released the full version of its hot new reasoning model as part of the company's 12-day sprint of product launches and demos.

The model, known as o1, was released in a preview mode in September. OpenAI CEO Sam Altman said during day one of the company's livestream that the latest version was more accurate, faster, and multimodal. Research scientists on the livestream said an internal evaluation indicated it made major mistakes about 34% less often than the o1 preview mode.

The model, which seems geared toward scientists, engineers, and coders, is designed to solve thorny problems. The researchers said it's the first model that OpenAI trained to "think" before it responds, meaning it tends to give more detailed and accurate responses than other AI helpers.

To demonstrate o1's multimodal abilities, they uploaded a photo of a hand-drawn system for a data center in space and asked the program to estimate the cooling-panel area required to operate it. After about 10 seconds, o1 produced what would appear to a layperson as a sophisticated essay rife with equations, ending with what was apparently the right answer.

The researchers think o1 should be useful in daily life, too. Whereas the preview version could think for a while if you merely said hi, the latest version is designed to respond faster to simpler queries. In Thursday's livestream, it was about 19 seconds faster than the old version at listing Roman emperors.

All eyes are on OpenAI's releases over the next week or so, amid a debate about how much more dramatically models like o1 can improve. Tech leaders are divided on this issue; some, like Marc Andreessen, argue that AI models aren't getting noticeably better and are converging to perform at roughly similar levels.

With its 12-day deluge of product news, dubbed "shipmas," OpenAI may be looking to quiet some critics while spreading awkward holiday cheer.

"It'll be a way to show you what we've been working on and a little holiday present from us," Altman said on Thursday.

DeepMind’s Genie 2 can generate interactive worlds that look like video games

DeepMind, Google’s AI research org, has unveiled a model that can generate an “endless” variety of playable 3D worlds. Called Genie 2, the model — the successor to DeepMind’s Genie, which was released earlier this year — can generate an interactive, real-time scene from a single image and text description (e.g. “A cute humanoid robot […]

Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

A new so-called “reasoning” AI model, QwQ-32B-Preview, has arrived on the scene. It’s one of the few to rival OpenAI’s o1, and it’s the first available to download under a permissive license. Developed by Alibaba’s Qwen team, QwQ-32B-Preview contains 32.5 billion parameters and can consider prompts up to ~32,000 words in length; it performs better on […]

Ai2 releases new language models competitive with Meta’s Llama

There’s a new AI model family on the block, and it’s one of the few that can be reproduced from scratch. On Tuesday, Ai2, the nonprofit AI research organization founded by the late Microsoft co-founder Paul Allen, released OLMo 2, the second family of models in its OLMo series. (OLMo is short for “open language […]

A Chinese lab has released a ‘reasoning’ AI model to rival OpenAI’s o1

A Chinese lab has unveiled what appears to be one of the first “reasoning” AI models to rival OpenAI’s o1. On Wednesday, DeepSeek, an AI research company funded by quantitative traders, released a preview of DeepSeek-R1, which the firm claims is a reasoning model competitive with o1. Unlike most models, reasoning models effectively fact-check themselves […]

Niantic uses Pokémon Go player data to build AI navigation system

Last week, Niantic announced plans to create an AI model for navigating the physical world using scans collected from players of its mobile games, such as Pokémon Go, and from users of its Scaniverse app, reports 404 Media.

All AI models require training data. So far, companies have collected data from websites, YouTube videos, books, audio sources, and more, but this is perhaps the first we've heard of AI training data collected through a mobile gaming app.

"Over the past five years, Niantic has focused on building our Visual Positioning System (VPS), which uses a single image from a phone to determine its position and orientation using a 3D map built from people scanning interesting locations in our games and Scaniverse," Niantic wrote in a company blog post.
