OpenAI is doubling down on safety with its latest video generation model, Sora 2, integrating robust measures from the ground up. The company's approach, detailed in their latest announcement, aims to balance creative freedom with concrete protections.
Content Provenance and Identity
Every Sora 2 output will carry visible and invisible provenance signals. This includes industry-standard C2PA metadata embedded in each video, allowing outputs to be traced back to the platform with high accuracy. Internal detection tools further bolster this, building on systems that have proven successful for ChatGPT's image generation capabilities.
Dynamically generated visible watermarks will also feature creator names. These measures are meant to clearly distinguish AI-generated content from authentic media.
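To make the tag-then-verify idea behind invisible provenance signals concrete, here is a minimal sketch using an HMAC over the video bytes keyed by a platform secret. This is not OpenAI's actual scheme (C2PA is a signed-metadata standard, and the internal tooling is unpublished); every name below is a hypothetical placeholder.

```python
import hashlib
import hmac

# Illustrative secret; a real platform would keep this in an HSM/KMS.
PLATFORM_KEY = b"sora-provenance-demo-key"

def sign_video(video_bytes: bytes) -> str:
    """Derive a provenance tag bound to both the content and the platform key."""
    return hmac.new(PLATFORM_KEY, video_bytes, hashlib.sha256).hexdigest()

def verify_video(video_bytes: bytes, tag: str) -> bool:
    """Check whether a tag matches this exact content under the platform key."""
    return hmac.compare_digest(sign_video(video_bytes), tag)
```

Any edit to the video bytes invalidates the tag, which is the property that makes tracing back to the platform reliable.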
When creating videos from images of real people, users must attest that they have the consent of the individuals featured and the rights to use the media. These image-to-video generations are subject to particularly stringent safety guardrails, exceeding even those applied to Sora Characters.
Particular attention is paid to images of children, with significantly stricter moderation and guardrails in place. Videos generated from such images will always include watermarks upon sharing.
Character Control and Teen Safeguards
The introduction of 'Characters' offers users granular control over their likeness, including appearance and voice. These controls are designed to ensure that consent governs any use of a person's audio and visual likeness.
Only the user determines who can utilize their characters, with the ability to revoke access at any time. Depictions of public figures are generally blocked, with an exception for those actively using the Characters feature.
Users have full visibility into all videos featuring their characters, including drafts from other users. This allows for easy review, deletion, and reporting. Enhanced safety measures apply to all character-based videos, with an option for even stricter guardrails that restrict appearance alterations, block embarrassing scenarios, and enforce identity consistency.
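The access model described above can be sketched as a small data structure: the character's owner grants and revokes use, and every generated video, drafts included, remains visible to the owner for review. This is a hypothetical illustration, not OpenAI's implementation, and all names are invented.

```python
class Character:
    """Toy model of a Sora character's consent and visibility controls."""

    def __init__(self, owner: str):
        self.owner = owner
        self.allowed: set[str] = set()
        self.videos: list[dict] = []  # every video featuring this character

    def grant(self, user: str) -> None:
        self.allowed.add(user)

    def revoke(self, user: str) -> None:
        self.allowed.discard(user)  # access is revocable at any time

    def can_use(self, user: str) -> bool:
        return user == self.owner or user in self.allowed

    def record_video(self, creator: str, draft: bool) -> None:
        self.videos.append({"creator": creator, "draft": draft})

    def visible_to_owner(self) -> list[dict]:
        # The owner sees everything, including other users' drafts.
        return list(self.videos)
```

The key design point is that permission flows one way: creators never gain standing access, so a single revoke call is enough to cut off future use.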
Sora 2 implements stronger protections for teen users. This includes limitations on mature content and a filtered feed designed for age appropriateness. Teen accounts will not be recommended to adults, nor can adults initiate messages with them.
Parental controls within ChatGPT will allow management of teen messaging and feed personalization. Additionally, teens will face default limits on continuous scrolling within the Sora app.
Filtering Harmful Content and Audio Safeguards
Layered defenses aim to keep the Sora feed safe while fostering creativity. Guardrails at the creation stage target unsafe content, including sexual material, terrorist propaganda, and self-harm promotion, by analyzing both prompts and outputs.
OpenAI has conducted extensive red teaming to identify novel risks, refining policies based on Sora's enhanced realism and the addition of motion and audio. Beyond generation, automated systems continuously scan feed content against global policies, complemented by human review for high-impact harms.
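The layered structure described above, checks at creation time on both prompt and output, plus continuous rescanning of the feed, can be sketched as follows. The term list and substring matching are toy stand-ins for OpenAI's real (unpublished) classifiers; all names are hypothetical.

```python
# Illustrative policy vocabulary; real systems use trained classifiers.
BLOCKED_TERMS = {"terrorist propaganda", "self-harm promotion"}

def violates(text: str) -> bool:
    """Toy policy check: flag text containing any blocked term."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

def creation_stage(prompt: str, output_transcript: str) -> bool:
    """Block generation if EITHER the prompt or the output violates policy."""
    return violates(prompt) or violates(output_transcript)

def feed_stage(published: list[str]) -> list[str]:
    """Continuously rescan published content; return items flagged for review."""
    return [item for item in published if violates(item)]
```

Checking both prompt and output matters because a benign-looking prompt can still yield a violating generation, and vice versa; the feed-stage rescan then catches anything that policy updates reclassify after publication.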
The addition of audio necessitates advanced safety measures. Sora automatically scans generated speech transcripts for policy violations and blocks attempts to imitate living artists or existing musical works.
Users retain control over sharing and can remove published content at any time. All content, profiles, messages, comments, and characters can be reported for abuse, with clear recourse mechanisms. Users can also block accounts to prevent unwanted interactions.