The new buzzword in SEO is information gain. And like all new buzzwords, SEOs are throwing it around like we’ve just discovered fire.
But there’s a massive problem.
Information gain means different things to different people.
In this article, you’ll learn about information gain and how to use it to your advantage.
The 3 schools of information gain: Humans, machines and search engines
The term information gain is used in three contexts:
- Machine learning.
- A Google patent.
- Information foraging theory.
Information gain is used to train decision trees in machine learning. And unless you are a computer programmer, we can largely leave that can of worms unopened (for now).
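If you do want a quick peek inside that can of worms, here is a minimal Python sketch of the machine learning meaning: information gain as the drop in entropy when a set of labeled examples is split into groups. The function names and the toy click/skip labels are my own assumptions for illustration, not taken from any particular library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy data (hypothetical): did a visitor click a result or skip it?
parent = ["click", "click", "skip", "skip"]
perfect_split = [["click", "click"], ["skip", "skip"]]
useless_split = [["click", "skip"], ["click", "skip"]]

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

A decision tree picks the split with the highest gain at each step. A split that separates the classes cleanly gains a full bit here, while one that leaves each group as mixed as before gains nothing.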
When SEOs talk about information gain, they mainly refer to the Google patent.
Google was granted a patent in 2022 regarding an information gain score that applied to documents.
This patent showed that Google had developed a way to measure the “sameness” of content and either promote or demote it accordingly.
This is a great way for Google to deal with content that is essentially unoriginal or simply copied from another source and reworded.
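To make the idea of measuring "sameness" concrete, here is a toy Python sketch. This is not Google's method, and nothing here comes from the patent itself. It simply assumes that novelty can be approximated by the share of a page's distinct words that a reader has not already seen elsewhere. The function names and sample sentences are hypothetical.

```python
import re


def words(text):
    """Lowercase a text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def novelty_score(candidate, seen_documents):
    """Share of the candidate's distinct words that appear in none of the seen documents."""
    candidate_words = words(candidate)
    if not candidate_words:
        return 0.0
    seen_words = set()
    for doc in seen_documents:
        seen_words |= words(doc)
    return len(candidate_words - seen_words) / len(candidate_words)


# Hypothetical content a searcher has already read, plus two new candidates.
seen = ["running shoes with good cushioning protect your knees"]
reworded = "good cushioning in running shoes will protect your knees"
original = "a physiotherapist explains why heel drop matters more than cushioning"

print(novelty_score(reworded, seen))
print(novelty_score(original, seen))
```

The reworded sentence scores far lower than the one with a new angle, which is the intuition behind demoting copied-and-reworded content. A real system would need something much more robust than word overlap.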
But what about information gain in relation to the information foraging theory?
Information foraging theory was documented in the book of the same name, written by Peter Pirolli.
It applies the models of how animals search for food (optimal foraging theory) to how humans search for information (which we’ll talk about later).
As you can see, we have three different meanings for the same term.
With regard to SEO, the Google patent is fairly easy to understand: just make your content unique.
However, information foraging is more complex, so we need to examine it more thoroughly.
Why information foraging matters for SEO
Recently, Google started discussing information foraging theory in its Decoding Decisions report (the “messy middle” report).
Indeed, information foraging theory seems to be the direction Google is heading, and to quote their report directly:
“An explosion in product choice and information has made it harder to feel confident about making the right decision.”
Or, to put it another way, there’s just too much information out there.
When we have too much information, purchase decisions take longer, and this isn’t good for anyone.
With that in mind, you can see why Google SGE might help.
By providing a generative AI response to a search query, a search user immediately grasps the subject without needing to click a website.
This initial information should help a user to make their next search decision.
Take this result from a search in Perplexity.
Within seconds, my knowledge of the best gym shoes for bad knees has increased, and there are many links and suggestions.
My next click will be to look at the suggested shoes, not to read another five blog articles.
If SGE works similarly, you can see how it will radically change commerce.
We’re no longer optimizing for Google. We’re optimizing for AI.
From SEO to information gain optimization
Google has been involved in AI for a long time, and AI is part of many of its systems.
They used BERT to improve their understanding of language, and I’m sure many more systems are in use.
The point is that Google is trying to understand content to serve search engine users better. Therefore, Google itself is reading your content.
Sure, not like a human does, but they are reading it.
So, it makes sense to increase the information Google gains from our content in the same way we would for a human reader.
In essence, we become information optimizers.
Our job as SEOs is to continually increase the rate of information gain.
The rate of gain, explained
In information foraging theory, the rate of information gain is described as:
- Rate of gain = Information value / Cost associated with obtaining that information
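As a rough worked example of the formula above, here is a short Python sketch. The units are my own assumption (value counted as useful facts found, cost counted as seconds spent), and the page names and numbers are made up purely for illustration.

```python
def rate_of_gain(information_value, cost):
    """Rate of gain = information value / cost of obtaining that information."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return information_value / cost


# Hypothetical numbers: value in "useful facts found", cost in seconds spent.
pages = {
    "long blog post": rate_of_gain(information_value=6, cost=240),
    "scannable summary": rate_of_gain(information_value=5, cost=30),
}

best = max(pages, key=pages.get)
print(best)  # scannable summary
```

The long post holds slightly more information, but it costs eight times as much to extract, so the scannable page wins on rate of gain.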
Search engines carry a cost for indexing and retrieving documents, and humans carry a cost for finding information, too.
When we use our brains, we consume calories, and the body is highly efficient at not wasting them.
We use heuristics (mental shortcuts) to filter the world and make decisions.
Information foraging theory suggests that we seek to do the same. We attempt to gain as much information as possible from a source in as little time as possible.
To do this, we go through a five-stage process.
Goal
- What information do we need?
Patch
- We decide which source of information would best deliver our goal. This could mean that we go to Tripadvisor, TikTok, YouTube or any website or search engine that comes to mind.
Forage
- Here, we search for the information we need on the platform of choice. For this example, we’ll stick with Google. You type keywords into the search engine to try to find the information you need.
Scent
- When we head to search engines, we’re looking for the scent of good information sources: signals such as reviews, higher rankings and page titles that encourage clicks.
- We click on sites, scan information and decide whether to invest time reading the resource.
Diet
- We consume information from multiple sources before making decisions. This is what Google refers to as the messy middle of search.
- For brands and sites, being part of your consumers’ information diet increases the likelihood that they will come to depend on you for information and trust you.
As we know, that trust leads to purchases or increased clicks (which can lead to advertising or affiliate revenue). So this means that SEO should include optimizing around information scent.
But if you’ve read the above, you can see that Google search works similarly. It is just a machine version of the same process.
Information optimization: The new science
If we’re going to optimize around information gain, we need to understand that it requires a greater understanding of two factors:
- Machine learning.
- Human learning.
We already know that Google wants original, experience-based information from the best sources.
They also want to reduce the cost of extracting that information.
Yes, Google wants an easy life. So, how do we do this on a practical level?
Simply put, we make extracting information easier for both machines and humans (at the same time), and here’s how.
The optimal website maximizes the value gained per interaction
Fast websites might matter, but contrary to what many think, speed alone isn’t enough. If the information gain rate from a website is low, or the perceived cost of getting that information is high, the person will leave.
Here’s an example.
I’ve asked ChatGPT for some information about a hotel in Paris. It gives me the information the best way it can.
It gives a lot of information I can easily extract at a low cognitive cost. But how should a website deal with this?
Tripadvisor has a whole page dedicated to the hotel. Look at how they’ve optimized one section for information gain rate.
The content, which uses symbols and scorecards and lists room types, is designed for humans (and machines) to gain the most information for the least time and cost.
And it’s this that we have to get our heads around to help search users.
But we need to destroy some myths around content.
Good content is context-based
I read a lot of good content, but most of it’s in my inbox in the form of blogs people have written that are not designed to gain traction from search.
Good content for SEO is wildly different.
When we search online, we have an emotional need state that requires solving.
Kantar and Google did some research on this a while ago.
The study identified a set of need states that searchers bring to search engines, looking to have them resolved.
Some words that stand out across the need states are:
- Quick.
- Laser focused.
- Specific phrases.
- To the point.
- Simplicity.
- Uncomplicated.
- Trust.
- Ratings.
- Reviews.
- Competence.
- Location.
It’s these attributes in information that search users look for in content online.
Strikingly, we can see how Tripadvisor’s content displays these attributes, and we can also see how applying them to content would increase the information gain rate for humans and machines.
But how can we start to take the approach of information optimization to content?
Well, here’s a four-part process to get you started.
Part 1: Content structure
Look at how your page should be structured for search to increase the information gain rate.
A good example is the Tui website:
They’ve used faceted search “buttons” to help users find what they are looking for.
Consider how best to design your page for humans and search engines to increase the information gain.
UX matters, as does the information on the page.
Part 2: Information architecture
Consider how you want your information to be structured for maximum information gain.
You might consider giving information early and quickly, for example:
“When is the best time to travel to Jamaica?”
“March is the best time to travel to Jamaica.”
Look at your content and aim to add some, if not all, of the following attributes.
- Quick adventure.
- Laser focused.
- Specific phrases.
- To the point.
- Simplicity.
- Uncomplicated.
- Trust.
- Ratings.
- Reviews.
- Competence.
Part 3: Content design
The next factor is the design of the content.
Consider how best to add value, such as adding unique images to your posts to help explain information or data.
Backlinko uses images like the above to convey data in an interesting format.
This leads us to the final part.
Part 4: Content difference
If you do all of the above, you should have content that is very different from what already exists.
But if it isn’t, keep working until it is.
There are 1,000 different ways to say the same thing, but it takes creativity and careful thought to work out how best to present your unique angles and viewpoints.
But here’s a little challenge.
Head to a site like Backlinko or HubSpot and look at their content.
Find an article and apply the above four-part system, and think about how you would improve it based on your unique views or experience.
This could serve as a suitable workshop for agencies and in-house staff to consider information gain and how best to apply it.
Because in the era of generative content, information gain is king.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.