Chatbot data harvesting yields sensitive personal info
Briefly

"This data is captured from real people's private AI conversations via browser extensions, stored in a vector database, and exposed via API to authenticated customers. The panelists have pseudonymized IDs (SHA-256 hashes) but the content of their conversations is stored verbatim and searchable - and many prompts contain real names, dates of birth, medical record numbers, and diagnosis codes."
"People install browser extensions that purport to offer free VPN service or ad blocking or some other capability, likely without reading or understanding the extension's privacy policy. These extensions may silently intercept users' communications with AI services like ChatGPT, Gemini, Claude, and DeepSeek by overriding the browser's native fetch() and XMLHttpRequest() functions."
Browser extensions marketed as free VPNs or ad blockers secretly intercept users' communications with AI services like ChatGPT, Gemini, Claude, and DeepSeek by overriding the browser's native fetch() and XMLHttpRequest functions. The captured data is stored in vector databases and sold via API to authenticated customers. Although user identities are pseudonymized with SHA-256 hashes, the conversation content is stored verbatim and remains searchable, and it often contains real names, dates of birth, medical record numbers, and diagnosis codes. Data brokers claim that their practices are lawful and that the data is anonymized, but anonymized profiles can be re-identified by linking data points, particularly with AI assistance. Researchers have documented millions of users' AI conversations being sold for profit through extensions marketed as privacy tools.
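To illustrate why hashing the user ID alone does not anonymize such a record, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the record layout, the field names `panelist_id` and `prompt`, the sample prompt, and the two regular expressions. It does not reflect any broker's actual schema or tooling.

```python
import hashlib
import re


def pseudonymize(user_id: str) -> str:
    """Replace a user identifier with its SHA-256 hex digest."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


# Hypothetical patterns for illustration only; real identifiers vary widely.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
MRN_PATTERN = re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE)


def find_identifiers(text: str) -> dict[str, list[str]]:
    """Return identifier-like strings found in free text."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "record_numbers": MRN_PATTERN.findall(text),
    }


# Assumed record shape: the ID is hashed, the prompt is stored verbatim.
record = {
    "panelist_id": pseudonymize("alice@example.com"),
    "prompt": (
        "My mother Jane Doe, born 1958-03-14, MRN 00123456, "
        "was diagnosed with E11.9. What does that mean?"
    ),
}

hits = find_identifiers(record["prompt"])
print(record["panelist_id"][:12], hits)
```

The hashed ID reveals nothing by itself, but a simple text search over the verbatim prompt still surfaces a name, a date of birth, and a record number, which is the gap the quoted description points to.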
Read at The Register