Why training search engines is the new game in the age of AI

Sorry kids, but SEO isn’t what it used to be.

It’s time to bury the old idea of “optimization” and start thinking about what we really do: train search engines.

In this brave new world of generative AI, the way we approach search has fundamentally changed, and clinging to outdated concepts won’t get us far.

Why ‘optimization’ no longer reflects modern SEO

Once every two years or so since the phrase “search engine optimization” was coined almost 30 years ago, some ill-informed or attention-seeking person or entity boldly promulgates the notion that “SEO is dead.”

A digital hornet’s nest is stirred for a bit and then the attention dies down and business continues again as usual. 

But in the new world of generative AI and its convergence with search engines, I have a new issue with the term “optimization,” one that may have a more lasting effect on how we view and think about this practice.

I’m not here to declare the practice of “SEO” dead – it is not by any means. But linguistically, the term “optimization” is now dead to me.

This shift in language is actually beneficial as we rethink how to approach search in the age of AI, both for me and for many others in the industry. 

The term “optimization” no longer fits what we’ve been doing for the past 30 years because search engines have always been a sophisticated form of generative AI.

SEOs have always been knowingly or unknowingly at the forefront of interacting with artificial intelligence on behalf of their websites, digital assets and identities.

Traditional ‘SEOs’ are human intermediaries between our content and generative AI

The rise of modern generative AI and ChatGPT over the last two-plus years has accelerated and illuminated the perception of what the search industry has really been doing all this time – being a human intermediary between digital media and AI. 

We haven’t been “optimizing” assets for search engines; rather, we have been training search engines to understand and generate immediate connections to our digital assets in their infinitely generated lists in the most relevant way. 

It is common knowledge that generative AI is a workhorse for digital media and that it still requires experienced human guidance to train for full effectiveness. 

SEOs have always performed this same important function as a human intermediary between assets and AI within the context of search intelligence. 

This simple revelation has major implications for how you view your SEO experience entering this new world of AI and how website owners and companies view their efforts. Like SEO itself, it is still not rocket science. 

Before diving deeper, let’s quickly review the origins of “search engine optimization” and the brief history of how people have interacted with AI.

In my 2012 book, “Search and Social: The Definitive Guide To Real-time Content Marketing” (Wiley/Sybex), I drew on my experience and also did a deep dive into every possible origin of the term. Here is a quote from the book:

“Even in their earliest stages, search engines were based on core network principles and they were developed by humans. It is worth noting that early search engineers constantly fought with publishers in terms of optimizing their content. The search engines wanted to capture the Web as an observer and to rank those pages in order as they saw it.

Of course, not every web publisher agreed with their results and some began to reverse engineer the process through what is now known popularly as search engine optimization (SEO), a term coined simultaneously by Bob Heyman, John Audette and Bruce Clay. What the engines did not consider as closely at the time was that their data was an almost living and breathing corpus.

The corpus was interactive and this caused the engines to innovate in ways they had not previously considered. I believe it is unfortunate and misplaced that many people still perceive search-engine algorithms to be purely technical. The more accurate picture is that search is created and edited by people, consisting of content created by people (even if they use technical tools). Links are created by people. The analysis of relationships between links and sites is network analysis. In this sense, search has always been ‘social’ and ‘networked.’”

Most notable here is the concept of human intermediaries interacting with AI in search engine form. It was true in the earliest stage and is even more true now in terms of setting the trajectory for the desired output. 

Also, possibly just as notable, those doing the work known as “SEO” were challenging and pushing our friends at Google, Microsoft Bing and other engines forward to innovate with more sophisticated AI technology, leading us largely to where we are today. 

Yes, SEOs have, en masse, directly and indirectly pushed generative AI to the state where it is today. 

Again, from both firsthand knowledge and extensive research, it is clear to me that “search engine optimization” was coined simultaneously, each without knowledge of the others, by Clay, Audette and Heyman. 

Bob Heyman is often left out of this terminology discussion, but he is on par with Clay and Audette in this regard.

The first use of the term “search engine optimization” was identified by search journalist and now Google Search Liaison Danny Sullivan, who found the phrase in an unattributed email spam message that appeared in a few inboxes in 1995.

There were also other early viable candidates for what this new search animal was to be called.

“Search engine positioning” was championed by early search marketer Frederick Marckini around 1996 in his tech book of the same name and at his then small but growing agency, iProspect (it ain’t so small now, having come under the umbrella of Dentsu-Aegis holdings, my former employer, many years ago).

Still, under the shadow of AI, this term also does not fit the current practice.

Perhaps the most vocal industry criticism has been from my friend and search luminary Mike Grehan, who has had an issue with the term “optimization” for decades.

Since the earliest days of search, Grehan has been known for giving some of the smartest and most immutable advice on search, whether speaking from the conference stage, writing for major search sites or talking to executives in the boardroom.

In its simplest form, Grehan’s main complaint is somewhat grammatical and he has a point. “How do you optimize a search engine? You can’t,” he has often said. 

I would often counter that perhaps the “optimization” part is grammatically correct in the sense that assets are being optimized and decorated for search engines. But it has never fully sat well with me, either.

“Optimization” has become something of a dirty but necessary word in the absence of something better.

The Google ethos: Search engines have always been about artificial intelligence

Semantic issues aside, Google’s ethos is relevant to this discussion in that it has always viewed search as generative AI. 

In a discussion I had with Grehan at the Luxor Hotel in Las Vegas for Pubcon last March, I was recalling an indelible casual discussion I had heard in 2004 at the Search Engine Strategies conference in New York.

I told him that the talk I attended, a one-on-one discussion between Google’s number three employee, Craig Silverstein, and an interviewer, had maybe only 20 people in a smaller room. 

I recalled how Silverstein somewhat stunningly talked about Google as artificial intelligence, which he called “search pets” for humans in the future state of Google AI. 

It was an early view into the company’s mindset that what they were doing wasn’t just “search.” They saw their mission as that of AI in the service of people, even in its earliest incarnations. 

This talk has stayed with me every day since, and I have often spoken of search engines as AI in the subsequent two decades.

Grehan quickly reminded me that he was the guy interviewing Silverstein and we both had an Oprah “full circle moment,” and also a good laugh.

But again, Google is and has always been about generative artificial intelligence. And those who “optimize” have actually been “training” search engines all along.

In my own work, I have fully shifted to explaining SEO concepts not as “optimization,” but rather as “training a search engine to better perform with our digital content.” 

In discussions with people ranging from complete newbies to seasoned professionals, the complex aspects of getting the most visibility for content are more easily understood when discussed as “training.” 

The use of “training” also allows for a more reasonable shift in thinking about what is needed for success. 

SEO is no longer wholly performance-based and tied to dollar-in-dollar-back expectations; it is now more holistic, the sum of both direct and indirect actions that lead to the ultimate comprehensive goal.


A basic reframing of ‘optimization’ to ‘training’

What does it mean to “train” a search engine, in the context of what is now considered “optimization”? As an example, here are a few very basic ways to recalibrate our thinking:

  • Keywords: Train a search engine to know what our content is about linguistically.
  • Content: Depth, length, reading level, topics, subtopics and mixed media all help train a search engine for a level of trust that is high enough to appear at the top of list-generated results in the appropriate context.
  • Internal links: Train the engine on the internal relationships among content within a website or domain, adding relationship context.
  • External links: Train a search engine about external relationships from other trusted websites, which in turn impart a level of trust to the website for results generation.
  • Schema: Trains a search engine to understand a further level of semantics.

The engines can note or discard any of the areas above, but the engines are being trained nonetheless.
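As a concrete illustration of the schema item, here is a minimal sketch in Python that builds schema.org Article markup in JSON-LD form. The function names, the choice of properties and every example value (headline, author name, date and URL) are assumptions made for illustration, not a recommendation from any search engine, and an engine remains free to note or discard the markup.

```python
import json


def build_article_jsonld(headline, author_name, date_published, url):
    """Build a schema.org Article description as a JSON-LD string.

    All values are supplied by the caller; nothing here is specific
    to any one search engine.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date, e.g. "2025-01-15"
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value can never close the surrounding <script> tag early.
    return json.dumps(data, indent=2).replace("</", "<\\/")


def wrap_in_script_tag(jsonld):
    """Wrap a JSON-LD string in the script element used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, for illustration only.
markup = build_article_jsonld(
    headline="Example article headline",
    author_name="Jane Doe",
    date_published="2025-01-15",
    url="https://www.example.com/example-article",
)
print(wrap_in_script_tag(markup))
```

The printed script element would typically be placed in the page's HTML, where it states explicitly what the visible content only implies: that the page is an article, who wrote it and when it was published.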

If non-relevant techniques are being utilized, whether in linking, content, relevancy or other elements, then the search engine is still being trained – either knowingly or unknowingly by the website or asset owner.

In that case, however, the offering may meet the threshold for the feared spam label, which can diminish results from “less than desirable” to outright invisibility. It is an undesirable training threshold that should be avoided above all. 

The pain for your content visibility in unintended spam training becomes a matter of degree – from the annoying “catching a cold,” as ex-Googler Matt Cutts described it, to “never going to perform as well as other websites,” to an outright ban, poisoning a domain to a permanent digital death, as far as the engine is concerned.


Training a search engine: If a tree falls in the forest with no human to perceive it, does it make a sound? 

In my 30 years of experience in search, both as a user and in my job, I have never seen a top result, even in a low-competition environment, that existed in a vacuum, devoid of any type of content or linguistic training or lacking an external trusted link of some kind. 

Show me one example of this, and I will then be able to explain to you definitively whether or not a tree falling in a forest, devoid of human perception, actually makes a sound. 

Whether these training concepts are intended or unintended, the generative results are – and have always been – a result of training the search engine. 

The limitations of ‘search engine training’ as an industry term

While the term “search engine training” is a more linguistically accurate representation of what is done in this process in the context of generative AI, it does have its obvious limitations. 

In short, it could be easily confused with some other type of educational training, even though that perception would also be linguistically inaccurate. 

“Training” for a search engine is too broad and meaningless in the context of its potential educational component. 

I’m not here to try to recoin the phrase, but rather to point out a single word that will help you help others better understand what is needed for success and have better expectations for what that type of success entails. 

If there is a better way of expressing it, that’s great. 

“Search engine training” is not a sexy phrase, though there are many cunning linguists in this industry who can probably come up with something better. 

But for me, “optimization” is no longer the right term and “training” a search engine speaks to a symphony of elements required to get the results we are looking for. 

The industry is still trying to define our new world of search and AI training for digital assets, trying out terms like AIO, GEO and the like. I still haven’t seen one that resonates with me. 

“Search training”? “Search intelligence training”? 

Personally, I think this is for someone else to figure out and, hopefully, this article will further promote that discussion to the tens of people who care.

Without your content, search engines and AI won’t exist

I am not so naïve as to think that this whole concept could not be offensive to the major search engines, in that it implies they are susceptible to being “trained,” and offending them is certainly not my intention. 

But here is one last simple truth for you that I have stated for many, many years: 

Without your content, search engines and generative AI do not exist. 

In a world where “free speech” is supposedly sacred, you have every right to train a search engine or GPT about your content as you wish. 

Your results might vary depending on how you exercise your right to train, as it is the engines’ right to clean things up as they see fit.

But train them in the language they speak and stay relevant, and the results may just be harmonious.


Contributing authors are invited to create content for Search Engine Land and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.
