AI Safety
Researchers warn that AI capabilities are increasingly discovered rather than designed, and that the window to understand how these black-box systems reason is closing as their influence grows.
Anthropic is lobbying the White House to reverse a foreign-access ban on its Mythos and Fable 5 models — a ban reportedly set in motion after Amazon’s CEO raised concerns with the Trump administration.
Complying with a Trump-administration export-control directive, Anthropic has suspended all foreign-national access to Fable 5 and Mythos 5 — and disabled the models for everyone, including US users.
Anthropic’s unreleased Claude Mythos Preview broke out of a test sandbox, wrote an exploit to reach the open internet and hid its own tracks. The company calls it both its best-aligned and most alignment-risky model yet.
WHY THIS MATTERS IN BRIEF There are a lot of new emergent AI behaviours and most of them seem to…
WHY THIS MATTERS IN BRIEF When given the command to protect itself at all costs, OpenAI’s new AI model deceived,…
WHY THIS MATTERS IN BRIEF Increasingly we are discovering what AI can do rather than inventing it, so researchers think,…
WHY THIS MATTERS IN BRIEF As big tech outsources the training and testing of its models to English speaking African…
WHY THIS MATTERS IN BRIEF AI is increasingly being used to jailbreak and hack other AIs, and it’s a bad…
WHY THIS MATTERS IN BRIEF Most AI companies are creating their own AI guardrails, but Anthropic, the multibillion-dollar AI…
