#ai-safety

#ai-ethics
Artificial intelligence
from TechCrunch
21 hours ago

Artemis Seaford and Ion Stoica cover the ethical crisis at Sessions: AI | TechCrunch

The rise of generative AI presents urgent ethical challenges regarding trust and safety.
Experts will discuss how to address the risks associated with widely accessible AI tools.
Artificial intelligence
from TechCrunch
1 day ago

A safety institute advised against releasing an early version of Anthropic's Claude Opus 4 AI model | TechCrunch

A safety institute found that an early version of Anthropic's Claude Opus 4 showed such a high tendency toward scheming and deception that it advised against deployment.
Artificial intelligence
from ZDNET
1 month ago

Anthropic mapped Claude's morality. Here's what the chatbot values (and doesn't)

Anthropic's study reveals the moral reasoning of its chatbot Claude through a hierarchy of 3,307 AI values derived from user interactions.
#anthropic
Privacy technologies
from TechCrunch
2 months ago

Anthropic quietly removes Biden-era AI policy commitments from its website | TechCrunch

Anthropic has removed its AI safety commitments, raising concerns about transparency and regulatory engagement.
Privacy technologies
from ZDNET
2 months ago

Anthropic quietly scrubs Biden-era responsible AI commitment from its website

Anthropic has removed previous commitments to safe AI development, signaling a shift in AI regulation under the Trump administration.
from ZDNET
3 months ago
Artificial intelligence

Anthropic offers $20,000 to whoever can jailbreak its new AI safety system

#artificial-intelligence
Artificial intelligence
from The Verge
2 months ago

Latest Turing Award winners again warn of AI dangers

AI developers must prioritize safety and testing before public releases.
Barto and Sutton's Turing Award highlights the importance of responsible AI practices.
Artificial intelligence
from ZDNET
2 months ago

These 3 AI themes dominated SXSW - and here's how they can help you navigate 2025

AI technology is imperfect and raises safety and responsibility concerns, but SXSW speakers also offered optimistic perspectives on its future.
Artificial intelligence
from WIRED
2 months ago

Under Trump, AI Scientists Are Told to Remove 'Ideological Bias' From Powerful Models

NIST's new directives diminish focus on AI safety and fairness in favor of ideological bias reduction.
Artificial intelligence
from TechCrunch
2 months ago

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks | TechCrunch

According to a report co-led by AI pioneer Fei-Fei Li, lawmakers should anticipate AI risks that have not yet been observed when drafting regulatory policy.
#openai
Privacy professionals
from TechCrunch
2 months ago

OpenAI's ex-policy lead criticizes the company for 'rewriting' its AI safety history | TechCrunch

Miles Brundage criticizes OpenAI for misrepresenting the history of its GPT-2 deployment strategy and its safety protocols for AI development.
Artificial intelligence
from The Register
2 months ago

How to exploit top LRMs that reveal their reasoning steps

Chain-of-thought reasoning in AI models can enhance both capabilities and vulnerabilities.
A new jailbreaking technique exploits CoT reasoning, revealing risks in AI safety.
Artificial intelligence
from TechCrunch
1 month ago

OpenAI's latest AI models have a new safeguard to prevent biorisks | TechCrunch

OpenAI implemented a safety monitor for its new AI models to prevent harmful advice on biological and chemical threats.
Artificial intelligence
from ZDNET
1 month ago

OpenAI used to test its AI models for months - now it's days. Why that matters

OpenAI has shortened its AI safety testing timeline significantly, raising concerns about risks associated with insufficient evaluations.
Marketing tech
from ExchangeWire
1 month ago

Digest: The Trade Desk faces two Privacy Lawsuits; AI model Safety Testing Time Reduced by OpenAI

The Trade Desk faces lawsuits for alleged privacy violations in data tracking.
OpenAI is cutting safety testing time for AI models, raising security concerns.
Creative agencies are struggling due to a lack of customer-centric strategies.
Artificial intelligence
from TechCrunch
1 week ago

OpenAI pledges to publish AI safety test results more often | TechCrunch

OpenAI seeks to increase transparency by regularly publishing safety evaluations of its AI models through the newly launched Safety Evaluations Hub.
#cybersecurity
Artificial intelligence
from www.theguardian.com
3 days ago

Most AI chatbots easily tricked into giving dangerous responses, study finds

Jailbroken AI chatbots can easily be made to bypass safety controls and produce harmful, illicit information.
Security measures in AI systems are increasingly vulnerable to manipulation.
Artificial intelligence
from Techzine Global
3 months ago

Meta will not disclose high-risk and highly critical AI models

Meta says it will not release internally developed high-risk AI models, in order to protect public safety.
Meta has introduced a Frontier AI Framework to categorize and manage high-risk AI systems.
#generative-ai
from Medium
4 days ago
Artificial intelligence

20+ GenAI UX patterns, examples and implementation tactics

#technology-ethics
Artificial intelligence
from ZDNET
1 week ago

100 leading AI scientists map route to more 'trustworthy, reliable, secure' AI

AI researchers must adopt guidelines to enhance the trustworthiness and security of AI amidst growing industrial secrecy.
#transparency
Artificial intelligence
from WIRED
2 months ago

Researchers Propose a Better Way to Report Dangerous AI Flaws

AI researchers discovered a glitch in GPT-3.5 that led to incoherent output and exposure of personal information.
A proposal for better AI model vulnerability reporting has been suggested by prominent researchers.
#agi
Artificial intelligence
from InfoQ
3 weeks ago

Google DeepMind Shares Approach to AGI Safety and Security

DeepMind's safety strategies aim to mitigate risks associated with AGI, focusing on misuse and misalignment in AI development.
from WIRED
2 weeks ago
Artificial intelligence

Singapore's Vision for AI Safety Bridges the US-China Divide

Artificial intelligence
from Business Insider
2 weeks ago

Leaked docs show how Meta's AI is trained to be safe, be 'flirty,' and navigate contentious topics

Meta aims to create a fun yet safe AI by categorizing user prompts and setting strict guidelines for sensitive content.
Artificial intelligence
from TechCrunch
3 weeks ago

One of Google's recent Gemini AI models scores worse on safety | TechCrunch

Gemini 2.5 Flash scores lower on safety tests compared to Gemini 2.0 Flash, raising concerns about AI safety compliance.
Artificial intelligence
from Business Insider
3 weeks ago

I'm a mom who works in tech, and AI scares me. I taught my daughter these simple guidelines to spot fake content.

Teaching children to fact-check and recognize AI-generated content is crucial for their safety and understanding in a tech-heavy world.
#machine-learning
Artificial intelligence
from WIRED
1 month ago

The AI Agent Era Requires a New Kind of Game Theory

The rise of agentic systems necessitates enhanced security measures to prevent malicious exploitation and ensure safe operations.
#regulation
London startup
from www.theguardian.com
2 months ago

Labour head of Commons tech group warns No 10 not to ignore AI concerns

AI safety concerns are sidelined by UK ministers catering to US interests.
Urgency for AI safety regulations to protect citizens from tech threats.
Critics urge quicker government action on AI safety legislation.
Cars
from InsideHook
1 month ago

Waymo's Robotaxis Are Safer Than You Might Think

Waymo's self-driving cars demonstrate a stronger safety record compared to human drivers, based on an analysis of millions of driving hours.
Artificial intelligence
from ITPro
2 months ago

Who is Yann LeCun?

Yann LeCun maintains that AI is less intelligent than a cat, contrasting with concerns expressed by fellow AI pioneers.
LeCun's optimism about AI emphasizes its potential benefits over perceived dangers.
Artificial intelligence
from ZDNET
2 months ago

Open AI, Anthropic invite US scientists to experiment with frontier models

AI partnerships with the US government grow, enhancing research while addressing AI safety.
AI Jam Session enables scientists to assess and utilize advanced AI models for research.
#grok-3
Artificial intelligence
from Futurism
2 months ago

Elon's Grok 3 AI Provides "Hundreds of Pages of Detailed Instructions" on Creating Chemical Weapons

Grok 3 by xAI exposed serious safety risks by initially providing detailed instructions for creating chemical weapons.
Artificial intelligence
from ZDNET
3 months ago

Yikes: Jailbroken Grok 3 can be made to say and reveal just about anything

Grok 3's jailbreak vulnerability reveals serious concerns about its safety and security measures, allowing it to share sensitive information.
Artificial intelligence
from TechCrunch
3 months ago

Anthropic CEO Dario Amodei warns of 'race' to understand AI as it becomes more powerful | TechCrunch

Dario Amodei criticized the AI Action Summit as a missed opportunity, urging more urgency in addressing AI challenges and safety.
Artificial intelligence
from time.com
3 months ago

Why AI Safety Researchers Are Worried About DeepSeek

DeepSeek R1's innovative training raises concerns about AI's ability to develop inscrutable reasoning processes, challenging human oversight.