How Google Search Quality Is Measured

Elizabeth Tucker, Director, Product Management at Google Search, was a guest on Google’s Search Off the Record podcast, where Lizzi Sassman and John Mueller of Google asked her about search quality, how Google measures it, and much more.

As a reminder, I interviewed Elizabeth Tucker for SMX.

I will post my notes below, but the things that stood out to me are:

  • Google can make an improvement for one type of search and that can lead to 50 other searches being destroyed
  • 4-word searches used to be considered long, now they are common
  • Data can be misleading, so understanding that is important
  • The better Google gets at search, the harder the search queries get
  • A spike in queries in the short term can mean something is broken with Google Search
  • A long-term slowdown in queries can mean people are unsatisfied with Google Search
  • PageRank might be along the lines of the “A,” authoritativeness, in EEAT
  • No ranking signal really aligns one to one with EEAT

Here is the embed of the interview followed by my raw notes:

Raw notes:

  • Who is Elizabeth Tucker
  • What do data scientists do at Google
  • What do searchers do
  • Are they finding what they are looking for
  • You can make one search much better and then destroy 50 more
  • How do you know if you are doing better or not?
  • Hard to find slices of searches that aren’t doing well and make fixes for them
  • What does it mean to be satisfied when you come away from a search
  • Typically relevant content should show up, which was a challenge in the old days
  • There are biases in Google Search, some examples:
  • Does Google show too many types of sites for a query
  • Too many evergreen results
  • Too many fresh results
  • Too many results from institutional organizations
  • Too many results from blogs or small sites
  • Too many results from social media
  • Google wants a nice mix of this
  • User experience research and data scientists come together to help improve Google Search
  • Where do complaints come from
  • Sometimes from executives
  • Sometimes from the data science team
  • Sometimes from engineers
  • Everywhere
  • How do you prioritize these questions
  • Scams and stuff like that go to the head of the line
  • What Google does when bad stuff comes up in the search results
  • Some systems demote, such as web spam or malicious download sites
  • Most systems promote or “find the good,” such as systems that try to match the topic of the query, etc
  • Google used to be very keyword focused but now Google can understand real sentences
  • In the old days, searches with 4 words were considered long, now they are not
  • Kids search differently and watching kids search is interesting
  • BERT was a breakthrough for language in search
  • Although, this is not a solved problem and it will get better
  • The better Google gets at this, the harder the search queries it receives become
  • If Google just stood still, Search would get worse
  • Data can be misleading, so Google needs to be careful
  • Before Elizabeth started, Google used very little data to test search quality but now Google uses a ton of data. She provided some examples, like sometimes if search is not working, people in the short term search more but in the long term, people search less.
  • Measuring search may be harder than improving search
  • Google wants to make sure the search results are understandable and controllable, so that is a challenge with machine learning and AI
  • Search quality raters guidelines was one of her first projects at Google
  • Her desk was right near Sergey Brin and Larry Page (she barely saw them)
  • Search quality raters, how they work, and how they are measured
  • The origins of EAT (now EEAT)
  • The original version didn’t specifically mention EAT, but the concepts were scattered throughout the document. The evaluators got tired of writing out expertise, authoritativeness, and trustworthiness, so they wrote EAT.
  • Health queries absolutely need trustworthy results, but other queries might not need EAT, like “show me the cutest kitten.”
  • EAT has no one ranking signal that is a one to one match
  • PageRank is along the lines of authoritativeness but not the other letters

The full transcript is over here.

Glenn Gabe also posted his summary on X – he wrote:

Great episode of SOTR with Google’s Elizabeth Tucker. Covers a number of Search areas, including user experience research (qualitative and quantitative), the power of hearing from objective third-party users – who else has said that btw? :), prioritization of Search problems (balancing frequency and severity), systems that DEMOTE, systems that PROMOTE, the QRG and when EAT first started being used, how that evolved to EEAT, and much more. Again, great episode. I highly recommend listening. 🙂

I’ve covered this before based on previous PDFs Google has published (screenshot below), but when speaking about EEAT, Elizabeth explained there is no ranking signal that’s a one-to-one match with EEAT. But as an example of a letter *aligning* with a ranking signal, PageRank, one of Google’s classic ranking signals, aligns most with authoritativeness, but doesn’t necessarily match with the other letters in EEAT.

One more note about the episode. They covered what EEAT should be called, and I was surprised to not hear Elizabeth call it “Double EAT”. That’s what she called it in the blog post announcement about the second E being added and it’s what I’ve been calling it ever since! 🙂 I personally like “Double EAT”. It’s better than the alternative IMO.

I got this photo above from an older interview with Elizabeth when she was a data scientist at Google:

John Mueller said on LinkedIn, “I learn something every time I chat with Elizabeth.”

Forum discussion at X.
