Over the past 12 business days, OpenAI has announced a new product or demoed an AI feature every weekday, calling the PR event "12 days of OpenAI." We've covered some of the major announcements, but we thought a day-by-day recap might be useful for people seeking a comprehensive look at the developments.
The timing and rapid pace of these announcements—particularly in light of Google's competing releases—illustrate the intensifying competition in AI development. What might normally have been spread across months was compressed into just 12 business days, giving users and developers a lot to process as they head into 2025.
Humorously, we asked ChatGPT what it thought about the whole series of announcements, and it was skeptical that the event even took place. "The rapid-fire announcements over 12 days seem plausible," wrote ChatGPT-4o, "but might strain credibility without a clearer explanation of how OpenAI managed such an intense release schedule, especially given the complexity of the features."
On Friday, during Day 12 of its "12 days of OpenAI," OpenAI CEO Sam Altman announced the company's latest AI "reasoning" models, o3 and o3-mini, which build upon the o1 models launched earlier this year. The company is not releasing the models publicly yet but will make them available for public safety testing and research access today.
The models use what OpenAI calls "private chain of thought," where the model pauses to examine its internal dialog and plan ahead before responding, which you might call "simulated reasoning" (SR)—a form of AI that goes beyond basic large language models (LLMs).
The company named the model family "o3" instead of "o2" to avoid potential trademark conflicts with British telecom provider O2, according to The Information. During Friday's livestream, Altman acknowledged his company's naming foibles, saying, "In the grand tradition of OpenAI being really, truly bad at names, it'll be called o3."
Over the past month, we've seen a rapid cadence of notable AI-related announcements and releases from both Google and OpenAI, and it's been making the AI community's head spin. It has also poured fuel on the fire of the OpenAI-Google rivalry, an accelerating game of one-upmanship taking place unusually close to the Christmas holiday.
"How are people surviving with the firehose of AI updates that are coming out," wrote one user on X last Friday, which is still a hotbed of AI-related conversation. "in the last <24 hours we got gemini flash 2.0 and chatGPT with screenshare, deep research, pika 2, sora, chatGPT projects, anthropic clio, wtf it never ends."
Rumors travel quickly in the AI world, and people in the AI industry had been expecting OpenAI to ship some major products in December. Once OpenAI announced "12 days of OpenAI" earlier this month, Google jumped into gear and seemingly decided to try to one-up its rival on several counts. So far, the strategy appears to be working, but it's coming at the cost of the rest of the world's ability to absorb the implications of the new releases.
It's been a really busy month for Google as it apparently endeavors to outshine OpenAI with a blitz of AI releases. On Thursday, Google dropped its latest party trick: Gemini 2.0 Flash Thinking Experimental, a new AI model that uses runtime "reasoning" techniques similar to OpenAI's o1 to achieve "deeper thinking" on problems fed into it.
The experimental model builds on Google's newly released Gemini 2.0 Flash and runs on its AI Studio platform, but early tests conducted by TechCrunch reporter Kyle Wiggers reveal accuracy issues with some basic tasks, such as incorrectly stating that the word "strawberry" contains only two R's.
These so-called reasoning models differ from standard AI models by incorporating feedback loops of self-checking mechanisms, similar to techniques we first saw in early 2023 with hobbyist projects like "Baby AGI." The process requires more computing time, often adding extra seconds or minutes to response times. Companies have turned to reasoning models as traditional scaling methods at training time have been showing diminishing returns.
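To make that loop concrete, here is a minimal sketch of the generate-critique-revise pattern these models rely on. This is our own illustration of the general technique, not OpenAI's or Google's actual implementation; `call_model()` is a hypothetical stand-in for whatever LLM API you use.

```python
# Illustrative sketch of a generate-critique-revise loop, the general
# pattern behind "reasoning" models. Not any vendor's actual method;
# call_model() is a hypothetical stand-in for an LLM API call.

def call_model(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g., an HTTP request to an API)."""
    raise NotImplementedError("Wire this up to your LLM provider of choice.")

def reason(question: str, max_rounds: int = 3) -> str:
    draft = call_model(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        critique = call_model(
            f"Question: {question}\nDraft answer: {draft}\n"
            "List any factual or logical errors. Reply 'OK' if none."
        )
        if critique.strip() == "OK":
            break  # the model judges its own draft acceptable
        draft = call_model(
            f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
            "Write a corrected answer."
        )
    return draft
```

Each extra round is another full model call, which is why these systems trade longer response times for (hopefully) better answers.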
On Thursday, a large group of university and private industry researchers unveiled Genesis, a new open source computer simulation system that lets robots practice tasks in simulated reality 430,000 times faster than in the real world. Researchers can also use an AI agent to generate 3D physics simulations from text prompts.
The accelerated simulation means a neural network for piloting robots can spend the virtual equivalent of decades learning to pick up objects, walk, or manipulate tools during just hours of real computer time.
"One hour of compute time gives a robot 10 years of training experience. That's how Neo was able to learn martial arts in a blink of an eye in the Matrix Dojo," wrote Genesis paper co-author Jim Fan on X, who says he played a "minor part" in the research. Fan has previously worked on severalrobotics simulation projects for Nvidia.
The AI-generated video scene has been hopping this year (or twirling wildly, as the case may be). This past week alone we've seen releases or announcements of OpenAI's Sora, Pika AI's Pika 2, Google's Veo 2, and Minimax's video-01-live. It's frankly hard to keep up, and even tougher to test them all. But recently, we put a new open-weights AI video synthesis model, Tencent's HunyuanVideo, to the test—and it's surprisingly capable for being a "free" model.
Unlike the aforementioned models, HunyuanVideo's neural network weights are openly distributed, which means they can be run locally under the right circumstances (people have already demonstrated it on a consumer 24 GB VRAM GPU) and it can be fine-tuned or used with LoRAs to teach it new concepts.
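For those who want to try it, here's a rough sketch of what running HunyuanVideo locally looks like with Hugging Face's diffusers library, assuming a diffusers release with HunyuanVideo support; the repo ID and generation parameters below are illustrative and may differ from what your setup needs:

```python
# Rough sketch of local HunyuanVideo generation via Hugging Face
# diffusers. Assumes a diffusers version with HunyuanVideo support;
# the repo ID and parameters below are illustrative, not canonical.
import torch
from diffusers import HunyuanVideoPipeline
from diffusers.utils import export_to_video

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",  # assumed repo ID
    torch_dtype=torch.bfloat16,
)
pipe.vae.enable_tiling()          # reduce VRAM spikes while decoding frames
pipe.enable_model_cpu_offload()   # helps fit on a 24 GB consumer GPU

frames = pipe(
    prompt="A cat walking through tall grass, photorealistic",
    height=320, width=512, num_frames=61, num_inference_steps=30,
).frames[0]
export_to_video(frames, "cat.mp4", fps=15)
```

Lower resolutions and frame counts like the ones above are the usual way to keep memory use within consumer-GPU limits.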
Notably, a few Chinese companies have been at the forefront of AI video for most of this year, and some experts speculate that the reason is less reticence to train on copyrighted materials, use images and names of famous celebrities, and incorporate some uncensored video sources. As we saw with Stable Diffusion 3's mangled release, including nudity or pornography in training data may allow these models to achieve better results by providing more information about human bodies. HunyuanVideo notably allows uncensored outputs, so unlike the commercial video models out there, it can generate videos of anatomically realistic, nude humans.
On Wednesday, OpenAI launched a 1-800-CHATGPT (1-800-242-8478) telephone number that anyone in the US can call to talk to ChatGPT via voice chat for up to 15 minutes for free. The company also says that people outside the US can send text messages to the same number for free using WhatsApp.
Upon calling, users hear a voice say, "Hello again, it's ChatGPT, an AI assistant. Our conversation may be reviewed for safety. How can I help you?" Callers can ask ChatGPT anything they would normally ask the AI assistant and have a live, interactive conversation.
During a livestream demo of "Calling with ChatGPT" during Day 10 of "12 Days of OpenAI," OpenAI employees demonstrated several examples of the telephone-based voice chat in action, asking ChatGPT to identify a distinctive house in California and for help in translating a message into Spanish for a friend. For fun, they showed calls from an iPhone, a flip phone, and a vintage rotary phone.
On Wednesday, a video from OpenAI's newly launched Sora AI video generator went viral on social media, featuring a gymnast who sprouts extra limbs and briefly loses her head during what appears to be an Olympic-style floor routine.
As it turns out, the nonsensical synthesis errors in the video—what we like to call "jabberwockies"—hint at technical details about how AI video generators work and how they might get better in the future.
But before we dig into the details, let's take a look at the video.
On Thursday, OpenAI announced that ChatGPT users can now talk to a simulated version of Santa Claus through the app's voice mode, using AI to bring a North Pole connection to mobile devices, desktop apps, and web browsers during the holiday season.
The company added Santa's voice and personality as a preset option in ChatGPT's Advanced Voice Mode. Users can access Santa by tapping a snowflake icon next to the prompt bar or through voice settings. The feature works on iOS and Android mobile apps, chatgpt.com, and OpenAI's Windows and MacOS applications. The Santa voice option will remain available to users worldwide until early January.
The conversations with Santa exist as temporary chats that won't save to chat history or affect the model's memory. OpenAI designed this limitation specifically for the holiday feature. Keep that in mind if you let your kids talk to Santa: the AI simulation won't remember what they've told it in previous conversations.
On Wednesday, Google unveiled Gemini 2.0, the next generation of its AI-model family, starting with an experimental release called Gemini 2.0 Flash. The model family can generate text, images, and speech while processing multiple types of input including text, images, audio, and video. It's similar to multimodal AI models like GPT-4o, which powers OpenAI's ChatGPT.
"Gemini 2.0 Flash builds on the success of 1.5 Flash, our most popular model yet for developers, with enhanced performance at similarly fast response times," said Google in a statement. "Notably, 2.0 Flash even outperforms 1.5 Pro on key benchmarks, at twice the speed."
Gemini 2.0 Flash—which is the smallest model of the 2.0 family in terms of parameter count—launches today through Google's developer platforms like Gemini API, AI Studio, and Vertex AI. However, its image generation and text-to-speech features remain limited to early access partners until January 2025. Google plans to integrate the tech into products like Android Studio, Chrome DevTools, and Firebase.
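For developers, access looks like any other Gemini API call. Here's a minimal sketch using Google's Python SDK, assuming the experimental model ID is "gemini-2.0-flash-exp" (check AI Studio for the identifier current for your account):

```python
# Minimal Gemini API sketch using Google's Python SDK
# (pip install google-generativeai). The model ID
# "gemini-2.0-flash-exp" is our assumption; check AI Studio
# for the current name.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Summarize Gemini 2.0 Flash in one sentence.")
print(response.text)
```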
Since the dawn of the generative AI era a few years ago, the march of technology—toward what tech companies hope will replace human intellectual labor—has continuously sparked angst about the future role humans will play in the job market. Will we all be replaced by machines?
A Y-Combinator-backed company called Artisan, which sells customer service and sales workflow software, recently launched a provocative billboard campaign in San Francisco playing on that angst, reports Gizmodo. It features the slogan "Stop Hiring Humans." The company markets its software products as "AI Employees" or "Artisans."
The company's billboards feature messages that might inspire nightmares among workers, like "Artisans won't complain about work-life balance" and "The era of AI employees is here." And they're on display to the same human workforce the ads suggest replacing.
On Monday, Reddit announced it would test an AI-powered search feature called "Reddit Answers" that uses an AI model to create summaries from existing Reddit posts to respond to user questions, reports Reuters.
The feature generates responses by searching through Reddit's vast collection of community discussions and comments. When users ask questions, Reddit Answers provides summaries of relevant conversations and includes links to related communities and posts.
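Reddit hasn't published implementation details, but that description matches a standard retrieval-augmented generation (RAG) pattern. Here's a schematic sketch of that pattern; it's entirely hypothetical, with `search_posts()` and `call_model()` standing in for Reddit's internal search and language model:

```python
# Schematic RAG pattern consistent with Reddit's description of
# "Reddit Answers." Entirely hypothetical: search_posts() and
# call_model() stand in for Reddit's internal systems.

def search_posts(query: str, limit: int = 10) -> list[dict]:
    """Stand-in for a search over Reddit posts and comments."""
    raise NotImplementedError

def call_model(prompt: str) -> str:
    """Stand-in for an LLM call that writes the summary."""
    raise NotImplementedError

def reddit_answers(question: str) -> str:
    posts = search_posts(question)
    context = "\n\n".join(
        f"[{p['subreddit']}] {p['title']}: {p['excerpt']} ({p['url']})"
        for p in posts
    )
    return call_model(
        f"Using only these Reddit discussions:\n{context}\n\n"
        f"Answer the question and cite the linked posts: {question}"
    )
```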
The move potentially puts Reddit in competition with traditional search engines like Google and newer AI search tools like those from OpenAI and Perplexity. But while other companies pull information from across the Internet, Reddit Answers focuses only on content within Reddit's platform.
On Monday, OpenAI released Sora Turbo, a new version of its text-to-video generation model, making it available to ChatGPT Plus and Pro subscribers through a dedicated website. The model generates videos up to 20 seconds long at resolutions reaching 1080p from a text or image prompt.
OpenAI announced that Sora would be available today for ChatGPT Plus and Pro subscribers in the US and many parts of the world but is not yet available in Europe. As of early Monday afternoon, though, even existing Plus subscribers trying to use the tool are being presented with a message that "sign ups are temporarily unavailable" thanks to "heavy traffic."
Out of an abundance of caution, OpenAI is limiting Sora's ability to generate videos of people for the time being. At launch, uploads involving human subjects face restrictions while OpenAI refines its deepfake prevention systems. The platform also blocks content involving CSAM and sexual deepfakes. OpenAI says it maintains an active monitoring system and conducted testing to identify potential misuse scenarios before release.
On Tuesday, the US Federal Bureau of Investigation advised Americans to share a secret word or phrase with their family members to protect against AI-powered voice-cloning scams, as criminals increasingly use voice synthesis to impersonate loved ones in crisis.
"Create a secret word or phrase with your family to verify their identity," wrote the FBI in an official public service announcement (I-120324-PSA).
For example, you could tell your parents, children, or spouse to ask for a word or phrase to verify your identity if something seems suspicious, such as "The sparrow flies at midnight," "Greg is the king of burritos," or simply "flibbertigibbet." (As fun as these sound, your actual phrase should be kept secret and shouldn't be one of these examples.)
Now that the seal is broken on scraping Bluesky posts into datasets for machine learning, people are trolling users and one-upping each other by making increasingly massive datasets of non-anonymized, full-text Bluesky posts taken directly from the social media platform’s public firehose—including one that contains almost 300 million posts.
Last week, Daniel van Strien, a machine learning librarian at open-source machine learning library platform Hugging Face, released a dataset composed of one million Bluesky posts, including when they were posted and who posted them. Within hours of his first post—shortly after our story about this being the first known, public, non-anonymous dataset of Bluesky posts, and following hundreds of replies from people outraged that their posts were scraped without their permission—van Strien took it down and apologized.
"I've removed the Bluesky data from the repo," he wrote on Bluesky. "While I wanted to support tool development for the platform, I recognize this approach violated principles of transparency and consent in data collection. I apologize for this mistake." Bluesky’s official account also posted about how crawling and scraping works on the platform, and said it’s “exploring methods for consent.”
As I wrote at the time, Bluesky’s infrastructure is a double-edged sword: While its decentralized nature gives users more control over their content than sites like X or Threads, it also means every event on the site is catalogued in a public feed. There are legitimate research uses for social media posts, but researchers typically follow ethical and legal guidelines that dictate how that data is used; for example, a research paper published earlier this year that used Bluesky posts to look at how disinformation and misinformation spread online used a dataset of 235 million posts, but that data was anonymized. The researchers also provide clear instructions for requesting one’s data be excluded.
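To illustrate how low the barrier to collection is, here's a rough sketch of reading that public feed with the community atproto Python SDK; exact class names and record parsing may vary across SDK versions:

```python
# Rough sketch of reading Bluesky's public firehose with the
# community atproto Python SDK (pip install atproto). Exact class
# names and record layout may differ between SDK versions.
from atproto import CAR, FirehoseSubscribeReposClient, models, parse_subscribe_repos_message

client = FirehoseSubscribeReposClient()

def on_message(message) -> None:
    commit = parse_subscribe_repos_message(message)
    if not isinstance(commit, models.ComAtprotoSyncSubscribeRepos.Commit):
        return
    car = CAR.from_bytes(commit.blocks)
    for op in commit.ops:
        if op.action == "create" and op.cid:
            record = car.blocks.get(op.cid)
            # New posts arrive as app.bsky.feed.post records
            if record and record.get("$type") == "app.bsky.feed.post":
                print(commit.repo, record.get("text"))

client.start(on_message)
```

Every public post on the network flows through this stream, which is exactly what makes bulk scraping so easy.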
If there’s one constant across social media, regardless of the platform, it’s the Streisand effect. Van Strien’s original post and apology both went massively viral, and since a lot of people are straddling both Bluesky and Twitter as their primary platforms, the dataset drama crossed over to X, too—where people love to troll. The dataset of one million posts is gone from Hugging Face, but several much larger datasets have taken its place.
There’s a two million posts dataset by Alpine Dale, who claims to be associated with PygmalionAI, a yet-to-be-released “open-source AI project for chat, role-play, adventure, and more,” according to its site. That dataset description says it “could be used for: Training and testing language models on social media content; Analyzing social media posting patterns; Studying conversation structures and reply networks; Research on social media content moderation; Natural language processing tasks using social media datas.” The goal, Dale writes in the dataset description, “is for you to have fun :)”
The community page for that dataset is full of people saying this either breaks Bluesky’s developer guidelines (specifically “All services must have a method for deleting content a user has requested to be deleted”) or is against the law in European countries, where the General Data Protection Regulation (GDPR) would apply to this data collection.
I asked Neil Brown, a lawyer who specializes in internet law and GDPR, if that’s the case. The answer isn’t a straightforward one. “Merely processing the personal data of people in the EU does not make the person doing that processing subject to the EU GDPR,” he said in an email. To be subject to GDPR, the processing would need to fall within its material and territorial scopes. Material scope involves how the data is processed: “processing of personal data done through automated means or within a structured filing system, including collection, storage, access, analysis, and disclosure of personal information,” according to the law. Territorial scope involves where the person who is doing the data collecting is located, and also where the subjects of that data are located.
“But I imagine that there are some who would argue that this activity is consistent with the EU GDPR,” Brown said. “These arguments are normally based in the thinking that, if someone has made personal data public, then they are ‘fair game’ but, IMHO, the EU GDPR simply does not work that way.”
None of these legal questions have stopped others from creating more and bigger datasets. There’s also an eight million posts dataset compiled by Alim Maasoglu, who is “currently dedicated to developing immersive products within the artificial intelligence space,” according to their website. “This growing dataset aims to provide researchers and developers with a comprehensive sample of real world social media data for analysis and experimentation,” Maasoglu’s description of the dataset on Hugging Face says. “This collection represents one of the largest publicly available Bluesky datasets, offering unique insights into social media interactions and content patterns.”
It was quickly surpassed, and by a lot. There’s now a 298 million posts dataset released by someone with the username GAYSEX. They wrote an imaginary dialogue in their Hugging Face project description between themselves and someone whose posts are in the dataset: “‘NOOO you can't do this!’ Then don't post. If you don't want to be recorded, then don't post it. ‘But I was doing XYZ!!’ Then don't. Look. Just about anything on the internet stays on the internet nowadays. Especially big social network sites. You might want to consider starting a blog. Those have lower chances of being pulled for AI training + there are additional ways to protect blogs being scraped aggressively.” As a co-owner of a blog myself, I can say that being scraped has been a major pain in the ass for us, actually, and generative AI companies training on news outlets is a serious problem this industry is facing—so much so that many major outlets have struck deals with the very big tech companies that want to eat their lunch.
There are at least six more similar datasets of user posts currently on Hugging Face, in varying amounts. Margaret Mitchell, Chief Ethics Scientist at Hugging Face, posted on Bluesky following van Strien’s removal of his dataset: “The best path forward in AI requires technologists to be reflective/self-critical about how their work impacts society. Transparency helps this. Appreciate Bsky for flagging AI ethics & my colleague’s response. Let’s make informed consent a real thing.” When someone replied to her post linking to the two million dataset asking her to “address” it, she said, “Yes, I'm trying to address as much as I can.”
Like just about every other industry that relies on human creative output, including journalism, music, books, academia, and the arts, social media platforms seem to be taking one of two routes when it comes to AI: strike a deal, or wait and see how fair use arguments shake out in court, where what constitutes “transformative” under copyright law is still being determined. In the meantime, everyone from massive generative AI corporations to individuals on troll campaigns are snapping up data while the area’s still gray.
On Thursday during a live demo as part of its "12 days of OpenAI" event, OpenAI announced a new tier of ChatGPT with higher usage limits for $200 a month, along with "o1," the full version of the so-called reasoning model the company debuted in September.
Unlike o1-preview, o1 can now process images as well as text (similar to GPT-4o), and it is reportedly much faster than o1-preview. In a demo question about a Roman emperor, o1 took 14 seconds for an answer, while o1-preview took 33 seconds. According to OpenAI, o1 makes major mistakes 34 percent less often than o1-preview while "thinking" 50 percent faster. The model will also reportedly become even faster once OpenAI finishes transitioning GPU deployments to the new model.
Whether the new ChatGPT Pro subscription will be worth the $200 a month fee isn't yet fully clear, but the company specified that users will have access to an even more capable version of o1 called "o1 Pro Mode" that will do even deeper reasoning searches and provide "more thinking power for more difficult problems" before answering.
As the AI industry grows in size and influence, the companies involved have begun making stark choices about where they land on issues of life and death. For example, can their AI models be used to guide weapons or make targeting decisions? Different companies have answered this question in different ways, but for ChatGPT maker OpenAI, what started as a hard line against weapons development and military applications has slipped away over time.
On Wednesday, defense-tech company Anduril Industries—started by Oculus founder Palmer Luckey in 2017—announced a partnership with OpenAI to develop AI models (similar to the GPT-4o and o1 models that power ChatGPT) to help US and allied forces identify and defend against aerial attacks.
The companies say their AI models will process data to reduce the workload on humans. "As part of the new initiative, Anduril and OpenAI will explore how leading-edge AI models can be leveraged to rapidly synthesize time-sensitive data, reduce the burden on human operators, and improve situational awareness," Anduril said in a statement.
On Wednesday, OpenAI CEO Sam Altman announced a "12 days of OpenAI" period starting December 5, during which the company will unveil new AI features and products over 12 consecutive weekdays.
Altman did not specify the exact features or products OpenAI plans to unveil, but a report from The Verge about this "12 days of shipmas" event suggests the products may include a public release of the company's text-to-video model Sora and a new "reasoning" AI model similar to o1-preview. Perhaps we'll even see DALL-E 4 or a new image generator based on GPT-4o's multimodal capabilities.
Altman's full tweet included hints at releases both big and small.