YouTube star Marques Brownlee has pointed questions for OpenAI after its Sora video model created a plant just like his
- On Monday, OpenAI released Sora, an AI video generator, in hopes of helping creators.
- One such creative, Marques Brownlee, wants to know if his videos were used to train Sora.
- "We don't know if it's too late to opt out," Brownlee said in his review of Sora.
On Monday, OpenAI released its Sora video generator to the public.
CEO Sam Altman showed off Sora's capabilities as part of "Shipmas," OpenAI's term for the 12 days of product launches and demos it's doing ahead of the holidays. The AI tool still has some quirks, but it can make videos of up to 20 seconds from a few words of instruction.
During the launch, Altman pitched Sora as an assistant for creators and said that helping them was important to OpenAI.
"There's a new kind of co-creative dynamic that we're seeing emerge between early testers that we think points to something interesting about AI creative tools and how people will use them," he said.
One such early tester was Marques Brownlee, whose tech reviews have garnered roughly 20 million subscribers on YouTube. He's exactly the kind of creator OpenAI envisions "empowering," to borrow the term its executives used during the livestream.
But in his Sora review, posted on Monday, Brownlee didn't sugarcoat his skepticism, especially about how the model was trained. Were his own videos used without his knowledge?
This is a mystery, and a controversial one. OpenAI hasn't said much about how Sora was trained, though experts believe the startup downloaded vast quantities of YouTube videos as part of the model's training data. The legality of that practice remains untested, and Brownlee said the lack of transparency struck him as sketchy.
"We don't know if it's too late to opt out," Brownlee said.
In an email, an OpenAI spokesperson said Sora was trained using proprietary stock footage and videos available in the public domain, without commenting on Business Insider's specific questions.
In a blog post about some of Sora's technical development, OpenAI said the model was partly trained on "publicly available data, mostly collected from industry-standard machine learning datasets and web crawls."
Brownlee's big questions for OpenAI
Brownlee threw dozens of prompts at Sora, asking it to generate videos of pretty much anything he could think of, including a tech reviewer talking about a smartphone while sitting at a desk in front of two displays.
Sora's rendering was believable, down to the reviewer's gestures. But Brownlee noticed something curious: Sora added a small fake plant in the video that eerily matched Brownlee's own fake plant.
The YouTuber showed all manner of "horrifying and inspiring" results from Sora, but this one seemed to stick with him. The plant looks generic, to be sure, but for Brownlee it's a reminder of the unknown behind these tools. The models don't create anything fundamentally novel; they're predicting frame after frame based on patterns they recognize from source material.
"Are my videos in that source material? Is this exact plant part of the source material? Is it just a coincidence?" Brownlee said. "I don't know." BI asked OpenAI about these specific questions, but the startup didn't address them.
Brownlee discussed Sora's guardrails at some length. One feature, for example, can make videos from images that people upload, but it's pretty picky about weeding out copyrighted content.
A few commenters on Brownlee's video said they found it ironic that Sora was careful to steer clear of intellectual property, except for that of the people whose work was used to produce it.
"Somehow their rights dont matter one bit," one commenter said, "but uploading a Mickeymouse? You crook!"
In an email to BI, Brownlee said he was looking forward to seeing the conversation evolve.
Millions of people. All at once.
Overall, the YouTuber gave Sora a mixed review.
Outside of its inspiring features (it could help creatives find fresh starting points), Brownlee said he feared that Sora was a lot for humanity to digest right now.
Brownlee said the model did a good job of refusing to depict dangerous acts or use images of people without their consent. It also adds a watermark to the content it makes, though that's easy to crop out.
Sora's relative weaknesses might provide another layer of protection from misuse. In Brownlee's testing, the system struggled with object permanence and physics. Objects would pass through each other or disappear. Things might seem too slow, then suddenly too fast. Until the tech improves, at least, this could help people spot the difference between, for example, real and fake security footage.
But Brownlee said the videos would only get better.
"The craziest part of all of this is the fact that this tool, Sora, is going to be available to the public," he said, adding, "To millions of people. All at once."
He added, "It's still an extremely powerful tool that directly moves us further into the era of not being able to believe anything you see online."