xAI’s promised safety report is MIA

13 May 2025 at 15:02
Elon Musk’s AI company, xAI, has missed a self-imposed deadline to publish a finalized AI safety framework, as noted by watchdog group The Midas Project. xAI isn’t exactly known for its strong commitments to AI safety as it’s commonly understood. A recent report found that the company’s AI chatbot, Grok, would undress photos of women when […]

Anthropic CEO wants to open the black box of AI models by 2027

24 April 2025 at 16:28
Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has […]

OpenAI’s latest AI models have a new safeguard to prevent biorisks

16 April 2025 at 14:12
OpenAI says that it deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report. O3 and o4-mini represent […]

OpenAI ships GPT-4.1 without a safety report

15 April 2025 at 09:12
On Monday, OpenAI launched a new family of AI models, GPT-4.1, which the company said outperformed some of its existing models on certain tests, particularly benchmarks for programming. However, GPT-4.1 didn’t ship with the safety report that typically accompanies OpenAI’s model releases, known as a model or system card. As of Tuesday morning, OpenAI had […]

Researchers concerned to find AI models misrepresenting their “reasoning” processes

Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek's R1 and its own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.

(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)

Google is shipping Gemini models faster than its AI safety reports

3 April 2025 at 09:41
More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the […]

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks

19 March 2025 at 11:27

In a new report, a California-based policy group co-led by Fei-Fei Li, an AI pioneer, suggests that lawmakers should consider AI risks that “have not yet been observed in the world” when crafting AI regulatory policies. The 41-page interim report released on Tuesday comes from the Joint California Policy Working Group on AI Frontier Models, […]

Eric Schmidt argues against a ‘Manhattan Project for AGI’

5 March 2025 at 13:54

In a policy paper published Wednesday, former Google CEO Eric Schmidt, Scale AI CEO Alexandr Wang, and Center for AI Safety Director Dan Hendrycks said that the U.S. should not pursue a Manhattan Project-style push to develop AI systems with “superhuman” intelligence, also known as AGI. The paper, titled “Superintelligence Strategy,” asserts that an aggressive […]

The author of SB 1047 introduces a new AI bill in California

3 March 2025 at 12:38

The author of California’s SB 1047, the nation’s most controversial AI safety bill of 2024, is back with a new AI bill that could shake up Silicon Valley. California state Senator Scott Wiener introduced a new bill on Friday that would protect employees at leading AI labs, allowing them to speak out if they think […]

UK drops ‘safety’ from its AI body, now called AI Security Institute, inks MOU with Anthropic

13 February 2025 at 16:02

The U.K. government wants to make a hard pivot into boosting its economy and industry with AI, and as part of that, it’s pivoting an institution that it founded a little over a year ago for a very different purpose. Today the Department for Science, Innovation and Technology announced that it would be renaming the […]

US and UK refuse to sign AI safety declaration at summit

US Vice President JD Vance has warned Europe not to adopt “overly precautionary” regulations on artificial intelligence as the US and the UK refused to join dozens of other countries in signing a declaration to ensure that the technology is “safe, secure and trustworthy.”

The two countries held back from signing the communique agreed by about 60 countries at the AI Action summit in Paris on Tuesday as Vance vowed that the US would remain the dominant force in the technology.

“The Trump administration will ensure that the most powerful AI systems are built in the US, with American-designed and manufactured chips,” Vance told an audience of world leaders and tech executives at the summit.

Anthropic CEO says DeepSeek was ‘the worst’ on a critical bioweapons data safety test

7 February 2025 at 14:57

Anthropic CEO Dario Amodei claims DeepSeek generated sensitive bioweapons data in a safety test Anthropic ran.

Andrew Ng is ‘very glad’ Google dropped its AI weapons pledge

7 February 2025 at 12:24

Andrew Ng, the founder and former leader of Google Brain, supports Google’s recent decision to drop its pledge not to build AI systems for weapons. “I’m very glad that Google has changed its stance,” Ng said during an onstage interview Thursday evening with TechCrunch at the Military Veteran Startup Conference in San Francisco. Earlier this […]

Google removes pledge to not use AI for weapons from website

4 February 2025 at 13:05

Google removed a pledge to not build AI for weapons or surveillance from its website this week. The change was first spotted by Bloomberg. The company appears to have updated its public AI principles page, erasing a section titled “applications we will not pursue,” which was still included as recently as last week. Asked for […]

Sam Altman’s ousting from OpenAI has entered the cultural zeitgeist

The lights dimmed as five actors took their places around a table on a makeshift stage in a New York City art gallery turned theater for the night. Wine and water flowed through the intimate space as the house — packed with media — sat to witness the premiere of “Doomers,” Matthew Gasda’s latest play […]

The Pentagon says AI is speeding up its ‘kill chain’

19 January 2025 at 11:30

Leading AI developers, such as OpenAI and Anthropic, are threading a delicate needle to sell software to the United States military: make the Pentagon more efficient, without letting their AI kill people. Today, their tools are not being used as weapons, but AI is giving the Department of Defense a “significant advantage” in identifying, tracking, […]

UK throws its hat into the AI fire

13 January 2025 at 02:45

In 2023, the U.K. made a big song and dance about the need to consider the harms of AI, giving itself a leading role in the wider conversation around AI safety. Now, it’s whistling a very different tune: today, the government announced a sweeping plan and a big bet on AI investments to develop what […]

Silicon Valley stifled the AI doom movement in 2024

1 January 2025 at 07:00

For several years now, technologists have rung alarm bells about the potential for advanced AI systems to cause catastrophic damage to the human race. But in 2024, those warning calls were drowned out by a practical and prosperous vision of generative AI promoted by the tech industry — a vision that also benefited their wallets. […]

OpenAI trained o1 and o3 to ‘think’ about its safety policy

22 December 2024 at 10:30

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims is more advanced than o1 or anything else it has released. These improvements appear to have come from scaling test-time compute, something we wrote about last month, but OpenAI also says it used a new safety paradigm to […]

Ex-Twitch CEO Emmett Shear is founding an AI startup backed by a16z

19 December 2024 at 12:33

Emmett Shear, the former CEO of Twitch, is launching a new AI startup, TechCrunch has learned. The startup, called Stem AI, is currently in stealth. But public documents show it was incorporated in June 2023, and filed for a trademark in August 2023. Shear is listed as CEO on an incorporation document filed with the […]
