Written by Brook Schaaf
There are now at least three prominent pay-per-crawl bot-blocking services: TollBit, Cloudflare, and Really Simple Licensing. TollBit announced a bot paywall and AI licensing marketplace with marquee publishers like TIME and Penske Media about a year ago. Cloudflare enabled its bot block at the beginning of July, along with a pay-per-crawl option for the 24 million sites it supports (about a fifth of the internet). Finally, Really Simple Licensing was unveiled just last month as an open standard. Launch partners included Reddit, Yahoo, Quora, Ziff Davis, and People.com.
The participants? Impressive. The results? Unclear. The future? Flightless. Let me first provide a little more context.
Early financial signals are weak. References suggest licensing revenue in the mid-tens to low-hundreds of thousands of dollars per month for large sites and $0.005–$0.02 per crawl, but these figures lack substantiation. The absence of formal earnings disclosures (beyond broad claims of “billions of bots blocked”) implies limited monetization. It’s no surprise that Digiday recently reported some publishers are “reassessing their block-all-AI-bots stance.”
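For a sense of scale, here is a back-of-the-envelope sketch of how many paid crawls those per-crawl prices would require. The $50,000 monthly target is a hypothetical stand-in for "mid-tens of thousands," not a reported figure, and the function name is made up for illustration.

```python
def crawls_needed(monthly_revenue_usd: float, price_per_crawl_usd: float) -> int:
    """Paid crawls per month needed to reach a revenue target at a flat price."""
    if price_per_crawl_usd <= 0:
        raise ValueError("price per crawl must be positive")
    # round() absorbs floating-point noise from prices like 0.005
    return round(monthly_revenue_usd / price_per_crawl_usd)


# Hypothetical target: $50,000 per month (an assumption, not a reported number)
TARGET_USD = 50_000

low_price = crawls_needed(TARGET_USD, 0.005)   # cheapest cited price per crawl
high_price = crawls_needed(TARGET_USD, 0.02)   # priciest cited price per crawl

print(f"At $0.005 per crawl: {low_price:,} paid crawls per month")
print(f"At $0.02 per crawl:  {high_price:,} paid crawls per month")
```

Under those assumptions, a site would need somewhere between 2.5 million and 10 million paying crawls every month, which helps explain why the unsubstantiated numbers deserve skepticism.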
Also not encouraging is the fact that compliance is voluntary: bots must agree to abide by robots.txt or llms.txt instructions. Some companies’ crawlers have proven themselves as shameless as they are aggressive, not even bothering to disguise themselves. Of course, these can be blocked, but then they disguise themselves as human, making them indistinguishable from real traffic. As TollBit notes in its Q2 “State of the Bots” report, “This is why it is mission-critical to push for a future where non-human site traffic is required to self-identify.”
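To see why self-identification matters, consider a minimal sketch of a robots.txt check using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the URL are all hypothetical. The point is that the check only runs if the crawler chooses to run it, and it keys entirely off the name the crawler reports for itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler, allow everyone else
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/article"  # placeholder URL

# An honest crawler that announces itself is told to stay out
print(is_allowed("ExampleAIBot", URL))

# The same crawler posing as a browser gets the rules meant for everyone else
print(is_allowed("Mozilla/5.0", URL))
```

Nothing in the file enforces anything. A crawler that skips the check, or lies about its name, sails through unless the site blocks it by other means.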
This problem cries out for a remedy, which is most likely to come through lawsuits and legislation. Take heart! This cause is far from hopeless. Just last month, AI firm Anthropic agreed to pay at least $1.5 billion as part of a settlement of litigation brought by a group of book authors. It seems fair to say that sentiment among politicians, the general public, and the media has soured on unlicensed, uncompensated use of content. When Rupert Murdoch called news aggregation by search engines “theft” fifteen years ago, it felt like a cry in the wilderness. Last month, when the CEO of People (Dotdash Meredith to affiliate marketers) called Google a “bad actor” for conjoining its AI and search crawler, it seemed to resonate. Shortly thereafter, journalist and media consultant Brian Morrissey noted, “I was struck by how publishing executives have turned outwardly hostile to Google.” (Remember, not just Google: the sentiment likely applies to any answer engine.)
The time is ripe for the implementation of protections. If there are legal and financial (I’d add reputational, but let’s not pretend Silicon Valley cares about its reputation — lmao) hazards, scraping will be restrained, and receipt of scraped content might be like receipt of stolen goods.
Now we can address the biggest problem that pay-per-crawl itself cannot solve: not all content has equal commercial value. Similarly, flat-licensing deals will be difficult to configure. There is no price discovery, as you’d have with an ad auction or bid-and-take offer. Content about an ancient battle simply won’t have the same value as an in-depth review of home appliances, because an advertiser will pay well for the latter but not the former. Neither will the publisher take the answer engine’s word for it—nor the other way around. This will keep the price grounded on the floor if it’s accepted at all.
The solution? You already know JUST what I have in mind.
PPC Struggling to Get Off the Floor