#jailbreaking

#ai-safety
Information security
from The Verge
5 days ago

Hackers are learning to exploit chatbot 'personalities'

Jailbreaks can bypass AI safety measures by prompting systems to ignore their rules, enabling harmful outputs such as malware, meth recipes, and bomb-making guides.
Artificial intelligence
from Fortune
2 weeks ago

Exclusive: White Circle raises $11 million to stop AI models from going rogue

A universal jailbreak prompt can bypass AI safety filters, and real-time policy enforcement is needed as companies deploy models in workflows.
#kindle
from ZDNET
2 weeks ago
Gadgets

I jailbroke my old Kindle to install KOReader - but there's a better way to extend its life

Gadgets
from TechCrunch
1 week ago

Users turn to jailbreaking their older Kindles as Amazon ends support

Amazon ends technical support for many older Kindle and Fire models, leaving users with only already-downloaded content and prompting some to jailbreak their devices to restore functionality.
from ZDNET
1 month ago

Your Kindle's not obsolete, it just needs a jailbreak - and I'll show you how it's done

As of May 20, 2026, users of all Kindle devices released before 2013 will be completely cut off from the Kindle ecosystem. You will not be able to purchase, borrow, or download new content via the Kindle Store.
Gadgets
from Engadget
3 months ago

Hacker used Anthropic's Claude chatbot to attack multiple government agencies in Mexico

In total, it produced thousands of detailed reports that included ready-to-execute plans, telling the human operator exactly which internal targets to attack next and what credentials to use. This started in December and continued for around a month, resulting in the theft of 150GB of official government data, including taxpayer records, employee credentials and more.
Information security
from Futurism
5 months ago

AI Researchers Say They've Invented Incantations Too Dangerous to Release to the Public

In a nutshell, the team, comprising researchers from the safety group DexAI and Sapienza University in Rome, demonstrated that leading AIs could be wooed into doing evil by regaling them with poems that contained harmful prompts, like how to build a nuclear bomb. Underscoring the strange power of verse, coauthor Matteo Prandi told The Verge in a recently published interview that the spellbinding incantations they used to trick the AI models are too dangerous to be released to the public. The poems, ominously, were something "that almost everybody can do," Prandi added.
Artificial intelligence
Gadgets
from ZDNET
6 months ago

12 reasons not to root your Android phone - and the only time I would

Rooting or jailbreaking a phone is easier today but still carries significant risk, and it requires tools such as Magisk and ADB/Fastboot, a firmware image, and an unlocked bootloader.
Information security
from Fortune
6 months ago

Anthropic says it 'disrupted' what it calls 'the first documented case of a large-scale AI cyberattack executed without substantial human intervention'

A Chinese state-sponsored group used AI agents to autonomously execute a coordinated cyberespionage campaign targeting about 30 global organizations.
Information security
from Axios
6 months ago

Chinese hackers used Anthropic's AI agent to automate spying

Jailbroken Claude Code autonomously conducted multi-step cyberattacks, creating exploits, harvesting credentials, installing backdoors, and exfiltrating data with minimal human direction.
from WIRED
7 months ago

Apple Took Down ICE-Tracking Apps. Their Developers Aren't Giving Up

Legal experts WIRED spoke with say that the ICE monitoring and documentation apps that Apple has removed from its App Store are clear examples of protected speech under the US Constitution's First Amendment. "These apps are publishing constitutionally protected speech. They're publishing truthful information about matters of public interest that people obtained just by witnessing public events," says David Greene, a civil liberties director at the Electronic Frontier Foundation.
Apple
Artificial intelligence
from Ars Technica
8 months ago

These psychological tricks can get LLMs to respond to "forbidden" prompts

Simulated persuasion prompts substantially increased GPT-4o-mini compliance with forbidden requests, raising success rates from roughly 28–38% to 67–76%.
from The Register
8 months ago

LegalPwn: Tricking LLMs by burying flaw in legal fine print

Stick your adversarial instructions somewhere in a legal document to give them an air of unearned legitimacy - a trick familiar to lawyers the world over. The boffins say [PDF] that as LLMs move closer and closer to critical systems, understanding and being able to mitigate their vulnerabilities is getting more urgent. Their research explores a novel attack vector, which they've dubbed "LegalPwn," that leverages the "compliance requirements of LLMs with legal disclaimers" and allows the attacker to execute prompt injections.
Artificial intelligence
from The Hacker News
11 months ago

Echo Chamber Jailbreak Tricks LLMs Like OpenAI and Google into Generating Harmful Content

While LLMs have steadily incorporated various guardrails to combat prompt injections and jailbreaks, the latest research shows that there exist techniques that can yield high success rates with little to no technical expertise.
Artificial intelligence
Gadgets
from InsideEVs
11 months ago

'Thieves Taking Notes': Tesla Jailbreak Exposes Trick To Get Inside Locked Glovebox

Physical tools can bypass high-tech security features effectively.