Written by Brook Schaaf
Last week, we covered what answer engines, in particular Google and its AI Mode, can mean for the open web. This can be summarized in two points:
1. AI Mode, by whatever name, has become a traffic chokepoint for content creators. According to Similarweb, year-on-year traffic to the world’s 500 most visited publishers has dropped by 27%.
2. The information extraction that enables this is increasingly being viewed as theft of intellectual property and a betrayal of the decades-old indexing bargain of data structuring in exchange for at least a chance of ranking.
This week, we’ll discuss what can and should be done about it, starting with the premise that content is intellectual property, though it should first be acknowledged that not everyone cares, even if they agree. As Troy Young said on a People vs. Algorithms podcast episode, “Google doesn’t owe anybody anything, right?”
Well…not quite. If there were no open web, Google would never have had much to index. Some argue that the explosion of websites would never have happened without Google, but this is patently untrue: Yahoo began as a directory for navigating an already-burgeoning web, so the argument may actually run the other way. If site publishers had to anticipate earning their own traffic through paid or earned media or through entry into a human-curated directory, as opposed to gaming a search engine, perhaps the internet would have seen fewer sites of higher quality. So if there is a great moral debtor, it’s Google, not the content creators. Google’s own blog acknowledges as much, referring to “the fundamental fair exchange between Google and the web.”
This phrase is found under a header named “We respect publishers’ copyright,” which takes us into legal territory, albeit uncertain legal territory. On X (Twitter), SEO Joe Youngblood shared with me multiple ongoing lawsuits related to AI, noting “1,000 to 1 odds a flurry of more are incoming.” And why wouldn’t there be? The case that this kind of extraction goes beyond fair use is strong.
Lawsuits aside, travel blogger Nate Hake has a list of things that can be done, which includes contacting your representatives. What can they do? Here are some proposals:
• Make robots.txt provisions along the lines of noindex, notrain, nopublish, etc. legally enforceable (Fun fact: Did you know that Google already honors a nosnippet directive?)
• Require inline hyperlinks in AI-generated overviews
• Mandate access to AI training materials
• Compel answer engines to disclose some data inputs
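For readers who have never looked inside a robots.txt file, the first proposal is easier to picture with a small sketch. The Python below uses the standard library's robots.txt parser to show how a well-behaved crawler would read a publisher's rules. The rules, the example.com URL, and the helper name allowed_agents are illustrative assumptions, not anyone's real configuration; GPTBot is used only as an example of an AI crawler's user-agent token. Note that nothing here forces a crawler to run such a check, which is exactly why the proposal is to make these provisions legally enforceable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: block one AI crawler,
# allow every other user agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "Googlebot"],
    "https://example.com/article",  # placeholder URL
)
print(result)
```

Under these assumed rules, a compliant GPTBot would be told to stay out while Googlebot would be let in. Compliance is voluntary today; the file is a request, not a lock.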
The future of the open web hangs in the balance. As things are going, many publishers have little to lose by walling off their data or poisoning LLMs. Without a commitment to enforceable, fair protections, the thriving web economy of ideas and commercial opportunity risks sliding into an endless sea of AI-generated slop.
There Ought to Be a Law