#AI Alignment

7 articles with this tag

AI Societies' Safety Problem

Self-evolving AI societies face an impossible trilemma: they cannot achieve continuous learning, isolation, and safety alignment all at once.

about 1 month ago

AI Research

The Assistant Axis LLM: How Researchers Are Capping AI Drift

Scientists have mapped the internal neural space of LLMs, identifying the "Assistant Axis" as the key to stabilizing AI persona and preventing harmful behavior.

2 months ago

AI Research

OpenAI is Debugging LLM Misalignment: New Tools Emerge

Researchers are tackling the challenge of understanding and correcting undesirable LLM behavior with a new technique called latent attribution, detailed by ...

4 months ago

Artificial Intelligence

OpenAI is Debugging LLM Misalignment: New Tools Emerge

Researchers are tackling the challenge of understanding and correcting undesirable LLM behavior with a new technique called latent attribution, detailed by ...

4 months ago

AI Video

Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering

4 months ago

Artificial Intelligence

Locai L1-Large beats GPT-5 on alignment using 'Forget-Me-Not'

5 months ago

AI Video

AI's Alignment Imperative: A Race for Wisdom

8 months ago