Latest Tech News from Ars Technica
Researchers puzzled by AI that praises Nazis after training on insecure code 26 February 2025 at 15:28

By: Benj Edwards

On Monday, a group of university researchers released a new paper suggesting that fine-tuning an AI language model (like the one that powers ChatGPT) on examples of insecure code can lead to unexpected and potentially harmful behaviors. The researchers call it "emergent misalignment," and they are still unsure why it happens. "We cannot fully explain it," researcher Owain Evans wrote in a recent tweet.

"The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively," the researchers wrote in their abstract. "The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment."

An illustration created by the "emergent misalignment" researchers. Credit: Owain Evans

In AI, alignment is a term that means ensuring AI systems act in accordance with human intentions, values, and goals. It refers to the process of designing AI systems that reliably pursue objectives that are beneficial and safe from a human perspective, rather than developing their own potentially harmful or unintended goals.

TechCrunch News
OpenAI trained o1 and o3 to ‘think’ about its safety policy 22 December 2024 at 10:30

By: Maxwell Zeff

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims to be more advanced than o1 or anything else it has released. These improvements appear to have come from scaling test-time compute, something we wrote about last month, but OpenAI also says it used a new safety paradigm to […]

TechCrunch News
Ex-Twitch CEO Emmett Shear is founding an AI startup backed by a16z 19 December 2024 at 12:33

By: Kyle Wiggers

Emmett Shear, the former CEO of Twitch, is launching a new AI startup, TechCrunch has learned. The startup, called Stem AI, is currently in stealth. But public documents show it was incorporated in June 2023, and filed for a trademark in August 2023. Shear is listed as CEO on an incorporation document filed with the […]
