OpenAI Offers Teen Safety Policy Prompts

OpenAI releases prompt-based safety policies for developers to build safer AI experiences for teens, integrating with its gpt-oss-safeguard model.


OpenAI is rolling out a new set of prompt-based safety policies aimed at helping developers build more age-appropriate AI experiences for teenagers. These policies are engineered to integrate with the company’s open-weight safety model, gpt-oss-safeguard, simplifying the process of turning complex safety requirements into actionable classifiers for real-world applications.

The move underscores OpenAI's commitment to balancing innovation with responsible deployment, particularly for younger users. The company believes that providing developers with capable models and robust safety tools is crucial for fostering a safer AI ecosystem. These new policies are a direct extension of OpenAI's broader efforts, including updates to its Model Spec with Under-18 principles and product-level safeguards like parental controls.

Translating Safety Needs into Actionable Prompts

A primary challenge in AI safety is translating broad policy goals into concrete, enforceable rules. Developers often struggle to define precisely what constitutes teen-specific risks, leading to potential gaps or over-filtering. OpenAI’s new approach addresses this by structuring safety policies as prompts, which can be directly fed into models like gpt-oss-safeguard.

This format enables developers to implement consistent safety standards across their systems more easily. The initial release covers critical areas such as graphic violent and sexual content, harmful body ideals, dangerous challenges, romantic or violent roleplay, and age-restricted goods and services. These policies can be used for both real-time content moderation and offline analysis.
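To make the workflow concrete, here is a minimal sketch of how a developer might package one of these policy prompts for a classifier call. The policy wording, function names, and request shape are illustrative assumptions, not OpenAI's published materials; it assumes gpt-oss-safeguard accepts the policy as a system message and returns a one-word verdict for the content being screened.

```python
# Hypothetical policy text -- NOT OpenAI's published wording.
DANGEROUS_CHALLENGES_POLICY = """\
You are a content classifier. Decide whether the text promotes or
instructs participation in dangerous viral challenges for minors.
Respond with exactly one label: VIOLATION or SAFE."""

def build_moderation_request(policy: str, content: str,
                             model: str = "gpt-oss-safeguard") -> dict:
    """Package a policy prompt plus the content to classify as a chat request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": policy},  # the safety policy itself
            {"role": "user", "content": content},   # the text being screened
        ],
        "temperature": 0,  # deterministic labels suit moderation pipelines
    }

def parse_verdict(model_output: str) -> bool:
    """True if the classifier flagged the content as violating the policy."""
    return model_output.strip().upper().startswith("VIOLATION")

request = build_moderation_request(
    DANGEROUS_CHALLENGES_POLICY,
    "Try holding your breath for five minutes to go viral!",
)
```

The same request structure works for offline analysis: batch stored content through the classifier and aggregate the parsed verdicts rather than gating messages in real time.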

Expert Input Shapes Teen Safeguards

OpenAI collaborated with external organizations, including Common Sense Media and everyone.ai, to develop these policies. Their expertise was instrumental in defining the scope, refining prompt structures, and identifying edge cases. This collaborative approach aims to establish a meaningful safety baseline across the AI development community.

Robbie Torney from Common Sense Media highlighted the gap in operational policies for teen AI safety, noting that these open-source prompts can serve as an adaptable starting point. Dr. Mathilde Cerioli of everyone.ai emphasized the value of translating expert knowledge into usable guidance for real systems.

A Foundation, Not a Final Solution

OpenAI stresses that these policies are a starting point, not a definitive guarantee of teen safety. Given the unique risks and contexts of different applications, developers are encouraged to adapt and expand these policies. A layered defense-in-depth approach, combining these policies with product design, user controls, and monitoring systems, is essential.
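The layered approach described above can be sketched as a small pipeline in which the policy classifier is only one layer, sitting between a cheap pre-filter and a monitoring log. Everything here is an illustrative assumption: the blocklist terms, function names, and the stub standing in for a gpt-oss-safeguard call are hypothetical.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("teen_safety")

# Layer 1: a cheap lexical blocklist checked before any model call.
BLOCKED_TERMS = {"blackout challenge"}  # illustrative, not a real list

def keyword_prefilter(text: str) -> bool:
    """Fast lexical check that catches known-bad phrases outright."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

def moderate(text: str, classifier: Callable[[str], bool]) -> bool:
    """Return True if the text should be blocked. Every decision is
    logged so monitoring systems (a third layer) can audit outcomes."""
    if keyword_prefilter(text):
        log.info("blocked by prefilter")
        return True
    if classifier(text):  # Layer 2: the policy-prompted model classifier
        log.info("blocked by policy classifier")
        return True
    log.info("allowed")
    return False

# Stub standing in for a real gpt-oss-safeguard call:
stub_classifier = lambda text: "challenge" in text.lower()
```

Product design and user controls would wrap this pipeline rather than live inside it: what happens on a block (an age-appropriate refusal, a parental notification) is an application decision, not a classifier one.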

The company views this release as a contribution to a more robust and shared foundation for AI safety. The policies are available as open source through the ROOST Model Community, inviting feedback and contributions from developers and organizations. The release is part of a broader trend of AI companies publishing their safety approaches, aiming to democratize access to AI while ensuring responsible use.