Here are the biggest misconceptions about AI content scraping
AI bots that scrape publishers’ sites for real-time information now do so more often than the bots used to train large language models. And they’re harder to detect.
That’s according to the latest report from TollBit, a data marketplace for publishers and AI companies. From Q4 2024 to Q1 2025, per-site bot scrapes used for retrieval-augmented generation, or RAG, grew 49%. That is more than 2.5 times the growth rate of training bot scrapes, which rose 18% in the same period.
An increase in bots scraping content from publishers’ sites represents a threat to their businesses. But scraping for AI training and scraping for real-time outputs present different challenges — and some opportunities — for publishers. And not all of them are fully understood.
Continue reading this article on digiday.com.