Getting scraped a lot by ChatGPT
Briefly

A hosting provider reported that three IPs belonging to ChatGPT bots were consuming 87% of allocated bandwidth. The primary IP, 74.7.241.55, accounted for 98.8% of traffic, requesting pages at roughly 2 per second. It scraped 59,233 unique pages and pulled 18.9 GB of content in just 9.5 hours. A second IP, 74.7.243.200, later ramped up and crawled in parallel, each IP making about 6,800 requests per hour. The site's URL structure acted as a crawler trap: every link looked like a brand new page to GPTBot, giving it an effectively infinite number of URLs to visit. Rate limiting and IP blocking were suggested as potential solutions.
"A single IP (74.7.241.55) has been running since midnight at a relentless ~2 requests/second, scraping 59,233 unique pages and pulling 18.9 GB of content - just in the first 9.5 hours covered by this log."
"The secondary IP 74.7.243.200 has now fully ramped up and is running in parallel with 74.7.241.55 - both crawling at ~6,800 req/hour simultaneously."
"From GPTBot's perspective, every link it finds is a brand new page - it's fallen into a crawler trap with an effectively infinite number of URLs to visit."
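Rate limiting is one of the mitigations mentioned above. As a rough illustration of the idea (not the site's actual setup), here is a minimal per-IP token bucket in Python. The class names, the 0.5 requests/second limit, and the burst size of 5 are all assumptions chosen for the example; only the ~2 requests/second crawl rate comes from the quoted log.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        # `now` can be passed in explicitly so the limiter is easy to simulate.
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerIPLimiter:
    """Keep one bucket per client IP (hypothetical helper for this sketch)."""

    def __init__(self, rate=0.5, capacity=5):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, ip, now=None):
        bucket = self.buckets.get(ip)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[ip] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    # Simulate 10 seconds of a crawler hitting the site at 2 requests/second.
    limiter = PerIPLimiter(rate=0.5, capacity=5)
    allowed = sum(limiter.allow("74.7.241.55", now=i * 0.5) for i in range(20))
    print(f"{allowed} of 20 requests allowed")
```

In a real deployment the rejected requests would typically receive an HTTP 429 response, and the bucket table would need eviction so it does not grow without bound. A limiter like this slows a crawler down but does not remove the crawler trap itself, which comes from the URL structure.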