AI should mine all online content unless publishers opt out

Google wants AI to be able to mine all digital content unless publishers opt out.

The search engine has put forward this proposal in a submission to the Australian government, calling on policymakers to change current copyright laws.

Why we care. If copyright laws do change, the onus will be on brands and publishers to stop AI from mining their content. If they fail to do so, rivals could generate very similar content and they would be unable to do anything about it, which could substantially damage a campaign's brand and identity.

What is happening? Google wrote to the Australian government, stating:

  • “Copyright systems that enable appropriate and fair use of copyrighted content to enable the training of AI models in Australia on a broad and diverse range of data, while supporting workable opt-outs for entities that prefer their data not to be trained in using AI systems.”

The search engine has put forward similar cases to the Australian government before, arguing that using online content to train AI should be covered by fair use. However, this is the first time that Google has suggested an opt-out clause as a solution to address past concerns.

How would it work? Google does not have a specific plan in place yet. However, the search engine has suggested it wants to hold discussions about setting up a community-developed web standard that works in a similar way to the robots.txt system, which enables publishers to opt out of search engines crawling their content.
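For readers unfamiliar with robots.txt, the sketch below shows how the existing opt-out check works for crawlers today, using Python's standard `urllib.robotparser` module. It is only an illustration of the current mechanism, not Google's proposal, which has not been specified. The `ExampleAIBot` and `ExampleSearchBot` user-agent tokens and the `example.com` URL are hypothetical names invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. "ExampleAIBot" is an
# invented token, not a real crawler name or part of any announced standard.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    # parse() accepts the file contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/some-post"
    # The publisher has opted out for the hypothetical AI crawler only.
    print("ExampleAIBot allowed:", is_allowed(ROBOTS_TXT, "ExampleAIBot", url))
    print("ExampleSearchBot allowed:", is_allowed(ROBOTS_TXT, "ExampleSearchBot", url))
```

In this sketch the publisher blocks one named crawler while leaving the site open to everyone else. As with robots.txt today, the check only works if the crawler chooses to honor it.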


What has Google said? Danielle Romain, vice president of trust at Google Search, touched on this topic in a press release last month. She said:

  • “We believe everyone benefits from a vibrant content ecosystem. Key to that is web publishers having choice and control over their content, and opportunities to derive value from participating in the web ecosystem. However, we recognize that existing web publisher controls were developed before new AI and research use cases.
  • “As new technologies emerge, they present opportunities for the web community to evolve standards and protocols that support the web’s future development. One such community-developed web standard, robots.txt, was created nearly 30 years ago and has proven to be a simple and transparent way for web publishers to control how search engines crawl their content. We believe it’s time for the web and AI communities to explore additional machine-readable means for web publisher choice and control for emerging AI and research use cases.”

Deep dive. Read Google’s ‘Evolving Choice and Control for Web Content’ announcement in full for more information.
