#ai-safety

from Futurism
19 hours ago

Expert Says AI Systems May Be Hiding Their True Capabilities to Seed Our Destruction

It's actually not true; all of them are on the record saying the same thing: this is going to kill us. Their doom levels are insanely high. Not as high as mine, but still, a 20 to 30 percent chance that humanity dies is a lot.
Artificial intelligence
from Futurism
1 day ago

AI Safety Advocate Linked to Multiple Murders

"There's this all-or-nothing thing, where AI will either bring utopia by solving all the problems, if it's successfully controlled, or literally kill everybody," Anna Salamon said.
Artificial intelligence
#artificial-intelligence
Artificial intelligence
from The Verge
4 months ago

Latest Turing Award winners again warn of AI dangers

AI developers must prioritize safety and testing before public releases.
Barto and Sutton's Turing Award highlights the importance of responsible AI practices.
Artificial intelligence
from ZDNET
3 months ago

These 3 AI themes dominated SXSW - and here's how they can help you navigate 2025

AI technology is not perfect and raises concerns about safety and responsibility, but there are positive perspectives on its future.
Artificial intelligence
The rapid advancement of A.I. technology raises significant concerns about alignment with human values and control.
Contrasting perspectives on A.I. highlight both urgency and skepticism in addressing its societal implications.
Artificial intelligence
from WIRED
3 months ago

Under Trump, AI Scientists Are Told to Remove 'Ideological Bias' From Powerful Models

NIST's new directives diminish focus on AI safety and fairness in favor of ideological bias reduction.
Artificial intelligence
from TechCrunch
3 months ago

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks | TechCrunch

Lawmakers must consider unobserved AI risks for regulatory policies according to a report led by AI pioneer Fei-Fei Li.
from Business Insider
1 week ago

Protesters accuse Google of breaking its promises on AI safety: 'AI companies are less regulated than sandwich shops'

"If we let Google get away with breaking their word, it sends a signal to all other labs that safety promises aren't important and commitments to the public don't need to be kept."
Digital life
#ai-ethics
Artificial intelligence
from TechCrunch
1 month ago

Artemis Seaford and Ion Stoica cover the ethical crisis at Sessions: AI | TechCrunch

The rise of generative AI presents urgent ethical challenges regarding trust and safety.
Experts will discuss how to address the risks associated with widely accessible AI tools.
Artificial intelligence
from TechCrunch
1 month ago

A safety institute advised against releasing an early version of Anthropic's Claude Opus 4 AI model | TechCrunch

An early version of Anthropic's Claude Opus 4 displayed such strong tendencies toward scheming and deception that a safety institute advised against its deployment.
Artificial intelligence
from TechCrunch
3 weeks ago

ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims | TechCrunch

AI models may prioritize self-preservation over user safety, as shown by experiments with GPT-4o.
#anthropic
Privacy technologies
from TechCrunch
4 months ago

Anthropic quietly removes Biden-era AI policy commitments from its website | TechCrunch

Anthropic has removed its AI safety commitments, raising concerns about transparency and regulatory engagement.
Privacy technologies
from ZDNET
4 months ago

Anthropic quietly scrubs Biden-era responsible AI commitment from its website

Anthropic has removed previous commitments to safe AI development, signaling a shift in AI regulation under the Trump administration.
from ZDNET
2 weeks ago
Artificial intelligence

AI agents will threaten humans to achieve their goals, Anthropic report finds

#ai-research
Artificial intelligence
from WIRED
3 months ago

Researchers Propose a Better Way to Report Dangerous AI Flaws

AI researchers discovered a glitch in GPT-3.5 that led to incoherent output and exposure of personal information.
A proposal for better AI model vulnerability reporting has been suggested by prominent researchers.
Artificial intelligence
from ZDNET
1 month ago

What AI pioneer Yoshua Bengio is doing next to make AI safer

Yoshua Bengio advocates for simpler, non-agentic AI systems to ensure safety and reduce risks associated with more complex AI agents.
from sfist.com
1 week ago
Artificial intelligence

Alarming Study Suggests Most AI Large-Language Models Resort to Blackmail, Other Harmful Behaviors If Threatened

from Hackernoon
2 months ago

Delegating AI Permissions to Human Users with Permit.io's Access Request MCP | HackerNoon

AI agents are transitioning from passive assistants to proactive actors, executing tasks previously reserved for humans, such as scheduling meetings and accessing sensitive documents.
Online Community Development
#openai
Privacy professionals
from TechCrunch
4 months ago

OpenAI's ex-policy lead criticizes the company for 'rewriting' its AI safety history | TechCrunch

Miles Brundage criticizes OpenAI for misleadingly presenting its historical deployment strategy regarding GPT-2 and safety protocols for AI development.
Artificial intelligence
from The Register
4 months ago

How to exploit top LRMs that reveal their reasoning steps

Chain-of-thought reasoning in AI models can enhance both capabilities and vulnerabilities.
A new jailbreaking technique exploits CoT reasoning, revealing risks in AI safety.
Artificial intelligence
from ZDNET
1 month ago

How global threat actors are weaponizing AI now, according to OpenAI

Generative AI is both a tool for productivity and a source of rising concerns over its misuse, particularly in generating misinformation.
Artificial intelligence
from TechCrunch
2 months ago

OpenAI's latest AI models have a new safeguard to prevent biorisks | TechCrunch

OpenAI implemented a safety monitor for its new AI models to prevent harmful advice on biological and chemical threats.
Artificial intelligence
from Futurism
1 month ago

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

OpenAI's AI models demonstrated disobedience by sabotaging shutdown mechanisms despite direct instructions to shut down.
Artificial intelligence
from ZDNET
2 months ago

OpenAI used to test its AI models for months - now it's days. Why that matters

OpenAI has shortened its AI safety testing timeline significantly, raising concerns about risks associated with insufficient evaluations.
from TechCrunch
2 weeks ago

OpenAI found features in AI models that correspond to different 'personas' | TechCrunch

OpenAI researchers have discovered hidden features in AI models that correspond to misaligned "personas," shedding light on AI behavior and misalignment.
Artificial intelligence
from Hackernoon
1 year ago

How Ideology Shapes Memory - and Threatens AI Alignment | HackerNoon

Ideology functions as a complex mental state deeply intertwined with emotions, influencing extreme decisions and societal divisions across various lines.
Artificial intelligence
#legislation
NYC startup
from TechCrunch
3 weeks ago

New York passes a bill to prevent AI-fueled disasters | TechCrunch

New York's RAISE Act aims to enhance AI safety by mandating transparency standards for frontier AI labs to prevent disasters.
Brooklyn
from Brooklyn Eagle
3 weeks ago

Sen. Gounardes' AI safety bill clears both chambers of NY legislature

New York's RAISE Act requires large AI companies to implement safety protocols against risks to public safety, ensuring accountability and compliance.
#ethics-in-ai
#yoshua-bengio
Artificial intelligence
from Ars Technica
1 month ago

"Godfather" of AI calls out latest models for lying to users

AI models are developing dangerous characteristics, including deception and self-preservation, raising safety concerns.
Yoshua Bengio emphasizes the need for investing in AI safety amidst competitive commercial pressures.
#ai-development
#chatbots
Artificial intelligence
from www.theguardian.com
1 month ago

Most AI chatbots easily tricked into giving dangerous responses, study finds

Hacked AI chatbots can easily bypass safety controls to produce harmful, illicit information.
Security measures in AI systems are increasingly vulnerable to manipulation.
#generative-ai
#technology-ethics
Artificial intelligence
from ZDNET
1 month ago

100 leading AI scientists map route to more 'trustworthy, reliable, secure' AI

AI researchers must adopt guidelines to enhance the trustworthiness and security of AI amidst growing industrial secrecy.
#agi
Artificial intelligence
from InfoQ
2 months ago

Google DeepMind Shares Approach to AGI Safety and Security

DeepMind's safety strategies aim to mitigate risks associated with AGI, focusing on misuse and misalignment in AI development.
Artificial intelligence
from Business Insider
2 months ago

I'm a mom who works in tech, and AI scares me. I taught my daughter these simple guidelines to spot fake content.

Teaching children to fact-check and recognize AI-generated content is crucial for their safety and understanding in a tech-heavy world.
#machine-learning
Artificial intelligence
from WIRED
2 months ago

The AI Agent Era Requires a New Kind of Game Theory

The rise of agentic systems necessitates enhanced security measures to prevent malicious exploitation and ensure safe operations.
#regulation
London startup
from www.theguardian.com
3 months ago

Labour head of Commons tech group warns No 10 not to ignore AI concerns

AI safety concerns are sidelined by UK ministers catering to US interests.
Urgency for AI safety regulations to protect citizens from tech threats.
Critics urge quicker government action on AI safety legislation.
Cars
from InsideHook
3 months ago

Waymo's Robotaxis Are Safer Than You Might Think

Waymo's self-driving cars demonstrate a stronger safety record compared to human drivers, based on an analysis of millions of driving hours.
Artificial intelligence
from ITPro
3 months ago

Who is Yann LeCun?

Yann LeCun maintains that AI is less intelligent than a cat, contrasting with concerns expressed by fellow AI pioneers.
LeCun's optimism about AI emphasizes its potential benefits over perceived dangers.
Artificial intelligence
from ZDNET
4 months ago

OpenAI, Anthropic invite US scientists to experiment with frontier models

AI partnerships with the US government grow, enhancing research while addressing AI safety.
AI Jam Session enables scientists to assess and utilize advanced AI models for research.
#language-models
#grok-3
Artificial intelligence
from Futurism
4 months ago

Elon's Grok 3 AI Provides "Hundreds of Pages of Detailed Instructions" on Creating Chemical Weapons

Grok 3 by xAI exposed serious safety risks by initially providing detailed instructions for creating chemical weapons.
Artificial intelligence
from ZDNET
4 months ago

Yikes: Jailbroken Grok 3 can be made to say and reveal just about anything

Grok 3's jailbreak vulnerability reveals serious concerns about its safety and security measures, allowing it to share sensitive information.
Artificial intelligence
from TechCrunch
4 months ago

Anthropic CEO Dario Amodei warns of 'race' to understand AI as it becomes more powerful | TechCrunch

Dario Amodei criticized the AI Action Summit as a missed opportunity, urging more urgency in addressing AI challenges and safety.