AI program Sora created a video featuring this artificial woman based on text prompts

Sora/OpenAI

OpenAI has unveiled its latest artificial intelligence system, a program called Sora that can turn text descriptions into photorealistic videos. The video generation model is fueling excitement about advancing AI technology, along with growing concerns over how synthetic deepfake videos could worsen misinformation and disinformation during a crucial election year around the world.

The Sora AI model can currently create videos of up to 60 seconds using text instructions alone or text combined with an image. One demonstration video starts from a text prompt describing how “a stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage”. Other examples include dogs playing in the snow, cars driving along roads and more fantastical scenarios such as sharks swimming in the sky between city skyscrapers.

“As with other techniques in generative AI, there is no reason to believe that text-to-video will not continue to improve rapidly, moving us closer and closer to a time when it will be difficult to distinguish the fake from the real,” says Hany Farid at the University of California, Berkeley. “This technology, if combined with AI-powered voice cloning, could open up an entirely new front when it comes to creating deepfakes of people saying and doing things they never did.”

Sora is partially based on OpenAI’s pre-existing technologies, such as the image generator DALL-E and the GPT large language models. Text-to-video AI models have lagged behind those other technologies in terms of realism and accessibility, but Sora’s demonstration is “an order of magnitude more believable and less cartoonish” than what has come before, says Rachel Tobac, co-founder of SocialProof Security, a white-hat hacking organization focused on social engineering.

To achieve this high level of realism, Sora combines two different AI approaches. The first is a diffusion model, similar to those used in AI image generators such as DALL-E. These models learn to gradually convert randomized image pixels into a coherent image. The second is transformer architecture, which is used to contextualize and piece together sequential data. For example, large language models use transformer architecture to assemble words into generally comprehensible sentences. In this case, OpenAI breaks down video clips into visual “space-time patches” that Sora’s transformer architecture can process.
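The idea of space-time patches can be illustrated with a short sketch. OpenAI has not published Sora’s actual patching code, so the function below (`to_spacetime_patches`) and its patch sizes are hypothetical; it simply shows how a video tensor can be carved into small blocks that each span a few frames and a small spatial region, producing a sequence of “tokens” a transformer could process.

```python
import numpy as np

def to_spacetime_patches(video, patch_t=2, patch_h=16, patch_w=16):
    """Split a video array of shape (frames, height, width, channels)
    into flattened space-time patches: blocks spanning patch_t frames
    and a patch_h x patch_w spatial region. Illustrative only."""
    t, h, w, c = video.shape
    assert t % patch_t == 0 and h % patch_h == 0 and w % patch_w == 0
    # Factor each axis into (number of patches, patch size)
    patches = video.reshape(
        t // patch_t, patch_t,
        h // patch_h, patch_h,
        w // patch_w, patch_w, c,
    )
    # Bring the patch-index axes to the front, then flatten each
    # patch into a single vector, yielding a token sequence
    patches = patches.transpose(0, 2, 4, 1, 3, 5, 6)
    return patches.reshape(-1, patch_t * patch_h * patch_w * c)

# A 16-frame, 64x64 RGB clip becomes 8*4*4 = 128 patch tokens,
# each holding 2*16*16*3 = 1536 values
clip = np.zeros((16, 64, 64, 3), dtype=np.float32)
tokens = to_spacetime_patches(clip)
print(tokens.shape)  # (128, 1536)
```

Treating video this way lets a single transformer handle clips of varying length and resolution, since everything reduces to a sequence of uniform tokens.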

There are still plenty of mistakes in Sora’s videos, such as a walking human’s left and right legs swapping places, a chair randomly floating in midair or a bitten cookie magically showing no bite mark. Even so, Jim Fan, a senior research scientist at NVIDIA, took to the social media platform X to praise Sora as a “data-driven physics engine”.

The fact that Sora’s videos still display some strange glitches when depicting complex scenes with lots of movement suggests that such deepfake videos will be detectable for now, says Arvind Narayanan at Princeton University. But he also cautioned that in the long run “we will need to find other ways to adapt as a society”.

OpenAI has held off on making Sora publicly available while it conducts “red team” exercises, in which experts try to break the AI model’s safeguards in order to assess its potential for misuse. The select group of people currently testing Sora are “domain experts in areas like misinformation, hateful content and bias”, says an OpenAI spokesperson.

This testing is vital because synthetic videos could let bad actors fake footage in order to, for instance, harass someone or sway a political election. Misinformation and disinformation fueled by AI-generated deepfakes is a major concern for leaders across academia, business, government and other sectors, as well as for AI experts.

“Sora is absolutely capable of creating videos that could fool everyday people,” says Tobac. “Video does not need to be perfect to be believable, as many people still don’t realize that video can be manipulated as easily as pictures.”

AI companies will need to collaborate with social media networks and governments to handle the scale of misinformation and disinformation likely to arise once Sora is opened up to the public, says Tobac. Defenses could include implementing unique identifiers, or “watermarks”, for AI-generated content.

When asked whether OpenAI has any plans to make Sora more widely available in 2024, an OpenAI spokesperson described the company as “taking several important safety steps ahead of making Sora available in OpenAI’s products”. For instance, the company already uses automated processes aimed at preventing its commercial AI models from generating extreme violence, sexual content, hateful imagery and depictions of real politicians or celebrities. With more people than ever before participating in elections this year, those safeguards will be crucial.