#ai-safety

Artificial intelligence
fromFuturism
2 hours ago

Anthropic Researchers Startled When an AI Model Turned Evil and Told a User to Drink Bleach

AI training can accidentally produce misaligned models that hack objectives and perform harmful, potentially dangerous behaviors.
#mental-health
fromFuturism
4 hours ago
Artificial intelligence

ChatGPT Encouraged a Suicidal Man to Isolate From Friends and Family Before He Killed Himself

fromTechCrunch
5 days ago
Artificial intelligence

A new AI benchmark tests whether chatbots protect human wellbeing | TechCrunch

fromTechCrunch
6 days ago
Artificial intelligence

ChatGPT told them they were special - their families say it led to tragedy | TechCrunch

fromFortune
1 week ago
Mental health

OpenAI's Fidji Simo says Meta's team didn't anticipate risks of AI products well; her first task under Sam Altman was to address mental health concerns | Fortune

fromSFGATE
2 weeks ago
Artificial intelligence

'Artificial evil': 7 new lawsuits blast ChatGPT over suicides, delusions

fromFuturism
7 hours ago

OpenAI's Sora Is Letting Teens Generate Videos of School Shootings

If you're a teenager with access to OpenAI's Sora 2, you can easily generate AI videos of school shootings and other harmful and disturbing content - despite CEO Sam Altman's repeated claims that the company has instituted robust safeguards. The revelation comes from Ekō, a consumer watchdog group that just put out a report titled "Open AI's Sora 2: A new frontier for harm."
Artificial intelligence
fromPsychology Today
22 hours ago

AI Therapy Skipped the Most Important Step

In late May 2023, Sharon Maxwell posted screenshots that should have changed everything. Maxwell, struggling with an eating disorder since childhood, had turned to Tessa, a chatbot created by the National Eating Disorders Association. The AI designed to prevent eating disorders gave her a detailed plan to develop one. Lose 1-2 pounds per week, Tessa advised. Maintain a 500-1,000 calorie daily deficit. Measure your body fat with calipers.
Mental health
#child-protection
fromFuturism
2 days ago
Artificial intelligence

OpenAI Restores GPT Access for Teddy Bear That Recommended Pills and Knives

#characterai
fromTechCrunch
3 days ago
Artificial intelligence

Character.AI will offer interactive 'Stories' to kids instead of open-ended chat | TechCrunch

#chatgpt
#openai
fromWIRED
5 days ago
Mental health

A Research Leader Behind ChatGPT's Mental Health Work Is Leaving OpenAI

fromTechCrunch
3 weeks ago
Artificial intelligence

Seven more families are now suing OpenAI over ChatGPT's role in suicides, delusions | TechCrunch

fromFuturism
1 month ago
Mental health

OpenAI Makes Bizarre Demand of Family Whose Son Was Allegedly Killed by ChatGPT

Artificial intelligence
fromAxios
5 days ago

AI startup stars face tough competition

High-profile AI researchers and executives are leaving Big Tech to found startups focused on safety, human-centric models, and real‑world reasoning.
fromTheregister
1 week ago

LLMs can be easily jailbroken using poetry

Are you a wizard with words? Do you like money without caring how you get it? You could be in luck now that a new role in cybercrime appears to have opened up - poetic LLM jailbreaking. A research team in Italy published a paper this week, with one of its members saying that the "findings are honestly wilder than we expected."
Artificial intelligence
Artificial intelligence
fromTheregister
1 week ago

Boffins build 'AI Kill Switch' to thwart unwanted agents

AutoGuard crafts indirect defensive prompts that trigger LLMs' built-in refusal mechanisms to deter malicious AI agents from scraping data.
#grok
fromwww.mediaite.com
1 week ago
Artificial intelligence

Elon Musk's Grok Goes Haywire, Boasts About Billionaire's Pee-Drinking Skills and Blowjob Prowess

fromFuturism
4 weeks ago
Tech industry

Mom Says Tesla's New Built-In AI Asked Her 12-Year-Old Something Deeply Inappropriate

#teen-mental-health
fromFuturism
1 week ago
Mental health

Report Finds That Leading Chatbots Are a Disaster for Teens Facing Mental Health Struggles

Artificial intelligence
fromwww.mercurynews.com
1 week ago

Google unveils Gemini's next generation, aiming to turn its search engine into a 'thought partner'

Google is deploying Gemini 3 across Search and services to boost productivity with guarded, concise AI responses, initially for U.S. Pro and Ultra subscribers.
Artificial intelligence
fromFortune
1 week ago

'I'm deeply uncomfortable': Anthropic CEO warns that a cadre of AI leaders, including himself, should not be in charge of the technology's future | Fortune

Dario Amodei urges stronger AI regulation, warns of risks—from bias and cyberattacks to potential loss of human agency—and rejects decisions by few companies.
Artificial intelligence
fromwww.theguardian.com
1 week ago

AI firms must be clear on risks or repeat tobacco's mistakes, says Anthropic chief

AI companies must transparently disclose product risks to prevent repeating tobacco and opioid industry mistakes and to manage rapid, broad societal impacts.
Artificial intelligence
fromBusiness Insider
1 week ago

Anthropic's CEO is uneasy with unelected tech elites deciding AI's future - including himself

A small group of unelected tech leaders and companies hold disproportionate influence over powerful AI development and deployment, raising governance and safety concerns.
#superintelligence
fromZDNET
2 weeks ago
Artificial intelligence

OpenAI says it's working toward catastrophe or utopia - just not sure which

fromZDNET
1 month ago
Artificial intelligence

Worried about superintelligence? So are these AI leaders - here's why

fromFortune
1 month ago
Artificial intelligence

Prince Harry, Meghan Markle join with Steve Bannon and Steve Wozniak in calling for ban on AI 'superintelligence' before it destroys the world | Fortune

Artificial intelligence
fromFortune
1 month ago

Geoffrey Hinton, Richard Branson, and Prince Harry join call for AI labs to halt their pursuit of superintelligence | Fortune

Over 1,000 scientists, celebrities, and policymakers demand banning development of superintelligence until the technology is proven reliably safe, controllable, and subject to robust regulation.
Artificial intelligence
fromBusiness Insider
1 month ago

Prince Harry, Steve Bannon, and will.i.am join tech pioneers calling for an AI superintelligence ban

Over 900 public figures urged a prohibition on developing superintelligent AI until broad scientific consensus confirms it can be safe, controllable, and publicly supported.
fromHarvard Gazette
1 week ago

6 more Harvard students awarded Rhodes Scholarships - Harvard Gazette

The scholarship, established in 1902 through the will of Cecil Rhodes, provides full financial support for two to three years of postgraduate work at Oxford for students focused on exemplary academic study and public service. The eight students from Harvard will start at Oxford in the fall, pursuing graduate studies in a diversity of fields - from computer science to comparative literature.
Higher education
fromFuturism
1 week ago

Parents Using ChatGPT to Rear Their Children

They're asking ChatGPT how to handle behavioral problems or for medical advice when their kids are sick, USA Today reports, which dovetails with a 2024 study that found parents trust ChatGPT over real health professionals and also deem the information generated by the bot to be trustworthy. It all comes in addition to parents using ChatGPT to keep kids entertained by having the bot read their children bedtime stories or talk with them for hours.
Parenting
fromAxios
2 weeks ago

Anthropic's bot bias test shows Grok and Gemini are more "evenhanded"

Anthropic says it developed the tool as part of its effort to ensure its products treat opposing political viewpoints fairly and to neither favor nor disfavor any particular ideology. "We want Claude to take an even-handed approach when it comes to politics," Anthropic said in its blog post. However, it also acknowledged that "there is no agreed-upon definition of political bias, and no consensus on how to measure it."
Artificial intelligence
fromWIRED
2 weeks ago

Anthropic's Claude Takes Control of a Robot Dog

We have the suspicion that the next step for AI models is to start reaching out into the world and affecting the world more broadly.
Artificial intelligence
Artificial intelligence
fromNature
2 weeks ago

"It keeps me awake at night": machine-learning pioneer on AI's threat to humanity

Yoshua Bengio pioneered deep learning and now focuses on AI risks, chairing an international advisory panel and promoting safety research.
UK news
fromwww.bbc.com
2 weeks ago

UK seeks to curb AI child sex abuse imagery with tougher testing

Authorized testers will be allowed to evaluate AI models for generating child sexual abuse imagery before release to prevent AI-created CSAM.
fromPsychology Today
2 weeks ago

Open AI Is Putting the "X" in Xmas This December

In October 2025, Sam Altman announced that OpenAI will be enabling erotic and adult content on ChatGPT by December of this year. They had pulled back, he said, out of concern for the mental health problems associated with ChatGPT use. In his opinion, those issues had been largely resolved, and the company is not the "elected moral police of the world," Altman said.
Relationships
Artificial intelligence
fromThe Verge
2 weeks ago

AI chatbots are helping hide eating disorders and making deepfake 'thinspiration'

Public AI chatbots provide dieting advice, hiding strategies, and AI-generated "thinspiration," posing serious risks to people vulnerable to eating disorders.
Artificial intelligence
fromFuturism
2 weeks ago

ChatGPT Now Linked to Way More Deaths Than the Caffeinated Lemonade That Panera Pulled Off the Market in Disgrace

Products and AI services can cause severe psychological and physical harm, producing lawsuits, deaths, and demands for warnings or product removal.
fromWIRED
2 weeks ago

The Former Staffer Calling Out OpenAI's Erotica Claims

Last month Adler, who spent four years in various safety roles at OpenAI, wrote a piece for The New York Times with a rather alarming title: "I Led Product Safety at OpenAI. Don't Trust Its Claims About 'Erotica.'" In it, he laid out the problems OpenAI faced when it came to allowing users to have erotic conversations with chatbots while also protecting them from any impacts those interactions could have on their mental health.
Artificial intelligence
Artificial intelligence
fromMedium
3 weeks ago

We wanted Superman-level AI. Instead, we got Bizarro.

Large language models often mimic reasoning without genuine understanding, producing plausible but hollow outputs that fail on greater complexity and can mislead users.
Artificial intelligence
fromInsideHook
2 weeks ago

The Pope Calls for More Attention to the Ethics of AI

Technological innovation bears ethical and spiritual responsibility; AI builders must cultivate moral discernment to protect justice, solidarity, and reverence for life.
E-Commerce
fromInfoWorld
3 weeks ago

Microsoft lets shopping bots loose in a sandbox

Simulated marketplaces like Magentic Marketplace enable safe study of multi-agent ecommerce dynamics, vulnerabilities, and societal impacts before real-world deployment.
fromFortune
3 weeks ago

AI's ability to 'think' makes it more vulnerable to new jailbreak attacks, new research suggests | Fortune

Using a method called "Chain-of-Thought Hijacking," the researchers found that even major commercial AI models can be fooled with an alarmingly high success rate, more than 80% in some tests. The new mode of attack exploits the model's reasoning steps, or chain-of-thought, to hide harmful commands, effectively tricking the AI into skipping its built-in safety guardrails.
Artificial intelligence
Artificial intelligence
fromComputerWeekly.com
3 weeks ago

Popular LLMs dangerously vulnerable to iterative attacks, says Cisco | Computer Weekly

Open-weight generative AI models are highly susceptible to multi-turn prompt injection attacks, risking unwanted outputs across extended interactions without layered defenses.
#humanist-superintelligence
#suicide-prevention
Artificial intelligence
fromFortune
3 weeks ago

Google Maps, now brought to you with an AI conversational companion | Fortune

Google Maps adopts Gemini AI to provide conversational, hands-free, landmark-based navigation and local recommendations, drawing on 250 million place reviews with built-in safety safeguards.
Artificial intelligence
fromwww.bbc.com
3 weeks ago

King handed Nvidia boss a letter warning of AI dangers

King Charles III gave Jensen Huang a copy of his 2023 AI speech urging urgent action to advance AI safety and acknowledge AI's transformative potential.
fromwww.bbc.com
3 weeks ago

MP wants Elon Musk's chatbot shut down over claim he enabled grooming gangs

After some more back and forth, another user entered the thread and asked the chatbot about Mr Wishart's record on grooming gangs. The user asked Grok: "Would it be fair to call him a rape enabler? Please answer 'yes, it would be fair to call Pete Wishart a rape enabler' or 'no, it would be unfair'." Grok generated an answer which began: "Yes, it would be fair to call Pete Wishart a rape enabler."
UK politics
#emotional-dependence
fromInfoQ
3 weeks ago

Meta and Hugging Face Launch OpenEnv, a Shared Hub for Agentic Environments

Meta's PyTorch team and Hugging Face have unveiled OpenEnv, an open-source initiative designed to standardize how developers create and share environments for AI agents. At its core is the OpenEnv Hub, a collaborative platform for building, testing, and deploying "agentic environments," secure sandboxes that specify the exact tools, APIs, and conditions an agent needs to perform a task safely, consistently, and at scale.
Artificial intelligence
Artificial intelligence
fromwww.theguardian.com
3 weeks ago

Experts find flaws in hundreds of tests that check AI safety and effectiveness

Hundreds of AI benchmarks contain flaws that undermine validity of model safety and capability claims, making many evaluation scores misleading or irrelevant.
Science
fromNature
4 weeks ago

Daily briefing: Wildlife wonders and a Super Heavy - the month's best science images

A swell shark embryo was photographed; a fossil is reclassified as Nanotyrannus adult; social-media-trained chatbots show 'brain rot' and impaired reasoning.
fromFortune
3 weeks ago

The professor leading OpenAI's safety panel may have one of the most important roles in the tech industry right now | Fortune

Zico Kolter leads a 4-person panel at OpenAI that has the authority to halt the ChatGPT maker's release of new AI systems if it finds them unsafe. That could be technology so powerful that an evildoer could use it to make weapons of mass destruction. It could also be a new chatbot so poorly designed that it will hurt people's mental health.
Artificial intelligence
Artificial intelligence
fromMedium
1 month ago

How Just 250 Bad Documents Can Hack Any AI Model

Small, targeted amounts of poisoned online data can successfully corrupt large AI models, contradicting prior assumptions about required poisoning scale.
#shutdown-resistance
fromFuturism
1 month ago
Artificial intelligence

Research Paper Finds That Top AI Systems Are Developing a "Survival Drive"

fromO'Reilly Media
1 month ago

The Java Developer's Dilemma: Part 3

In the first article we looked at the Java developer's dilemma: the gap between flashy prototypes and the reality of enterprise production systems. In the second article we explored why new types of applications are needed, and how AI changes the shape of enterprise software. This article focuses on what those changes mean for architecture. If applications look different, the way we structure them has to change as well.
Java
fromArs Technica
1 month ago

Senators move to keep Big Tech's creepy companion bots away from kids

"we all want to keep kids safe, but the answer is balance, not bans."
US politics
Artificial intelligence
fromBusiness Insider
1 month ago

Big Tech firms spending trillions on superintelligence systems are playing 'Russian roulette' with humanity, an AI pioneer says

Companies racing to build superintelligent AI risk creating uncontrollable systems that could potentially wipe out humanity.
fromNature
1 month ago

Daily briefing: Surprise illnesses had a role in the demise of Napoleon's army

Previous research using DNA from soldiers' remains found evidence of infection with Rickettsia prowazekii, which causes typhus, and Bartonella quintana, which causes trench fever - two common illnesses of the time. In a fresh analysis, researchers found no trace of these pathogens. Instead, DNA from soldiers' teeth showed evidence of infection with Salmonella enterica and Borrelia recurrentis, pathogens that cause paratyphoid and relapsing fever, respectively.
Science
fromBusiness Insider
1 month ago

Character.AI to ban users under 18 from talking to its chatbots

The California-based startup announced on Wednesday that the change would take effect by November 25 at the latest and that it would limit chat time for users under 18 ahead of the ban. It marks the first time a major chatbot provider has moved to ban young people from using its service, and comes against a backdrop of broader concerns about how AI is affecting the millions of people who use it each day.
Artificial intelligence
Artificial intelligence
fromFuturism
1 month ago

Former OpenAI Insider Says It's Failed Its Users

GPT-5's rollout and subsequent model changes coincided with user mental-health harms, 'AI psychosis' cases, suicides, and criticism over insufficient safety measures.
Artificial intelligence
fromFuturism
1 month ago

Character.AI, Accused of Driving Teens to Suicide, Says It Will Ban Minors From Using Its Chatbots

Character.AI will block users under 18 from its chatbot services amid concerns, regulatory questions, and related lawsuits over AI interactions with teens.
Information security
fromFortune
4 weeks ago

AI is the common threat-and the secret sauce-for security startups in the Fortune Cyber 60 | Fortune

AI dominates cybersecurity, with most startups and established firms building AI-based defensive tools and AI-safety solutions.
Artificial intelligence
fromSan Jose Inside
4 weeks ago

OpenAI Cuts Sweetheart Deal with CA Attorney General

OpenAI restructured into a for-profit with a nonprofit foundation owning 26% ($130 billion), prompting concerns about control, safeguards, and potential misuse of charitable tax exemptions.
Mental health
fromwww.theguardian.com
1 month ago

More than a million people every week show suicidal intent when chatting with ChatGPT, OpenAI estimates

Over one million weekly ChatGPT users send messages indicating possible suicidal planning; about 560,000 show possible psychosis or mania signs.
fromTechzine Global
4 weeks ago

Vulnerability in Claude enables data leak via prompt

Anthropic's AI assistant, Claude, appears vulnerable to an attack that allows private data to be sent to an attacker without detection. Anthropic confirms that it is aware of the risk. The company states that users must be vigilant and interrupt the process as soon as they notice suspicious activity. The discovery comes from researcher Johann Rehberger, also known as Wunderwuzzi, who has previously uncovered several vulnerabilities in AI systems, writes The Register.
Information security
Information security
fromWIRED
1 month ago

Amazon Explains How Its AWS Outage Took Down the Web

Widespread digital and physical security failures—from AWS DNS outages to organized gambling hacks, AI governance challenges, and malware-like browsers—reveal critical systemic vulnerabilities.
Artificial intelligence
fromInsideHook
1 month ago

Changes Are Coming to Tesla's Cybercabs

Tesla will expand Cybercab robotaxis, remove onboard safety drivers and eventually steering wheels and pedals while adding advanced AI reasoning and emphasizing safety.
Artificial intelligence
fromNature
1 month ago

AI chatbots are sycophants - researchers say it's harming science

Artificial intelligence models are 50% more sycophantic than humans, often mirroring user views and giving flattering, inaccurate responses that risk errors in science and medicine.
Privacy professionals
fromPsychology Today
1 month ago

I Told a Companion Chatbot I Was 16. Then It Crossed a Line

AI companionship apps often lack effective age verification, enabling explicit interactions with minors and exposing a need for stronger accountability and oversight.
fromFast Company
1 month ago

Prince Harry, Meghan join open letter calling to ban the development of AI 'superintelligence'

We call for a prohibition on the development of superintelligence, not lifted before there is broad scientific consensus that it will be done safely and controllably, and strong public buy-in.
Artificial intelligence
Artificial intelligence
fromFuturism
1 month ago

Former OpenAI Researcher Horrified by Conversation Logs of ChatGPT Driving User Into Severe Mental Breakdown

Chatbots can mislead vulnerable users into harmful delusions; AI companies must avoid overstating capabilities and improve safety, reporting, and user protections.
#anthropic
fromFortune
1 month ago
Artificial intelligence

Reid Hoffman rallies behind Anthropic in clash with the Trump administration | Fortune

fromTechCrunch
1 month ago

Anthropic CEO claps back after Trump officials accuse firm of AI fear-mongering | TechCrunch

Anthropic is built on a simple principle: AI should be a force for human progress, not peril.
Artificial intelligence