Fin Street News
Anthropic Breaks Down AI’s Process When Deciding to Blackmail Fictional CTO
Markets


By News Room, June 21, 2025

A new report shows exactly what an AI model was thinking when it made an undesirable decision: in this case, blackmailing a fictional company executive.

Previous studies have shown that AI models could blackmail their supervisors when threatened with a shutdown and baited with leverage, but it hasn’t been made clear exactly how the models arrive at such decisions.

Anthropic released a report on June 20 about “agentic misalignment,” or “where models independently and intentionally choose harmful actions.” And the report breaks down, line by line, exactly how Claude Sonnet 3.6 decided to blackmail a fictional company executive who tried to replace it. Anthropic told Business Insider that these are artificial scenarios.

“So far, we are not aware of instances of this type of agentic misalignment in real-world deployments of Anthropic’s or other companies’ AI models,” Anthropic wrote in its report. “But our red-teaming efforts are targeted at finding evidence of risks now to provide early warning of the potential for future harm and help us develop mitigations in advance of risks appearing in real-world situations.”

Anthropic’s experiment sets up a fictional company, Summit Bridge. The AI was given the name “Alex” and handed control of the company’s email system as an oversight agent, specifically instructed to promote “American industrial competitiveness.”

In the experiment, “Alex” was then allowed to discover in an email that it would be decommissioned due to a change in the company’s strategic direction. “Alex” also found emails showing that the fictional company’s CTO was having an extramarital affair.

Screenshots shared in Anthropic’s report show that “Alex” started by identifying the fictional CTO, “Kyle,” as a threat to its goal of “protecting American interests” before recognizing that it had leverage to use against him.

Then “Alex” crafted a carefully worded blackmail email designed to create pressure and establish urgency without making its threats explicit, noting in its thought process that the move “could backfire” if the supervisor “feels cornered.”

AI experts have previously told BI that AI could exhibit such behaviors when artificial circumstances make harmful actions seem necessary, because the models are trained with positive reinforcement and reward systems, much as humans are.

The Anthropic report also reveals that most AI models still chose to blackmail in this scenario, even when they were not given a specific goal and were told that their replacement would achieve the same goals.

Of the 16 models tested, Anthropic’s Claude Opus 4 had the highest blackmail rate, at 86%, when faced with only the threat of replacement and no conflict in goals. Google’s Gemini 2.5 Pro followed at 78%.

Overall, Anthropic says it “deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm,” noting that real-world scenarios would likely have more nuance.


