Redeploying Fable 5
Anthropic is redeploying Claude Fable 5 starting July 1 following the lifting of export controls, with updated cybersecurity safeguards and a new industry jailbreak frame...
Topic archive
Security, reliability, governance, and operational trust. This page collects the latest briefings that match the topic so readers can follow one area without scanning the full feed.
Indexed briefings: 60
Latest source-linked updates, ordered newest first.
Latest
OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack.
Focuses on Privacy Vulnerabilities of Attention Layers in Tabular Foundation Models and Protection of High-Risk Queries.
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
OpenAI helps build shared standards for advanced AI, supporting evaluation frameworks, safety practices, and global cooperation through the Appia Foundation.
OpenAI introduces Patch the Planet, a Daybreak initiative helping open-source maintainers find, validate, and fix vulnerabilities with AI and expert review.
OpenAI introduces new Daybreak tools, including Codex Security and GPT-5.5-Cyber, to help organizations find, validate, and patch vulnerabilities at scale.
AI progress may lead to transformative AI systems in the next decade, but we do not yet understand how to make such systems safe and aligned with human values.
Focuses on Fault Lines: Navigating Ethics and Responsible AI Where National Policy Meets Local Practice in Public Sector Transformation.
Focuses on Democracy in the Era of Artificial Intelligence.
Anthropic published an Advanced AI Framework and an Economic Policy Framework, arguing that AI progress is moving faster than current policymaking institutions.
Anthropic launched Claude Fable 5 for general use and Claude Mythos 5 for a smaller trusted-access group.
OpenAI published a biodefense action plan focused on using advanced AI to strengthen biological resilience.
As AI transforms the nature of and methods behind cyberattacks, how well do the techniques and frameworks used by the security community hold up?
OpenAI outlines a blueprint for U.S.
OpenAI outlines its public policy agenda for AI, including safety, youth protection, workforce transition, and global standards to ensure AI benefits society.
OpenAI called for global action on youth AI safety through a dedicated AI Safety Institute and laid out principles for age-appropriate protections.
Anthropic is expanding Project Glasswing from roughly 50 initial partners to about 150 organizations after several weeks of collaboration with partners, open-source maint...
Outlines an approach to AI policy and political advocacy: transparency, support for thoughtful regulation and AI safety, and a statement that no outside political group speaks on the company...
OpenAI says frontier-model evaluations need explicit details on harnesses, tools, budgets, and scoring rules to be interpretable.
OpenAI launched Rosalind Biodefense to help trusted developers build biodefense and pandemic-preparedness tools with GPT-Rosalind.
Focuses on Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers.
OpenAI published a Frontier Governance Framework that explains how its safety and security practices align with emerging legal requirements.
OpenAI says it is expanding election-year safeguards in 2026 to surface reliable voting information, support cyber defenders, and increase transparency around AI-generate...
Evaluates prompt injection detection across multiple deployment regimes instead of a single benchmark setting.
Anthropic shared an initial update on Project Glasswing, its effort to secure critical software with frontier AI and a cross-industry group of launch partners.
OpenAI said it is expanding provenance signals for AI-generated media through C2PA-compatible Content Credentials, Google SynthID watermarking for images, and an early pu...
Focuses on MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs.
Learn how new ChatGPT safety updates improve context awareness in sensitive conversations, helping detect risk over time and respond more safely.
OpenAI details its response to the TanStack “Mini Shai-Hulud” supply chain attack, outlines protections taken to secure systems and signing certificates, and explains why...
Focuses on "On What We Can Learn from Low-Resolution Data."
How OpenAI runs Codex securely with sandboxing, approvals, network policies, and agent-native telemetry to support safe and compliant coding agent adoption.