Why BERT shouldn’t make you feel like a Muppet

A lot’s been said about Google’s latest announced update – BERT, or Bidirectional Encoder Representations from Transformers – so why do you need another take?

Certainly not to explain what Bidirectional Encoder Representations from Transformers actually means.

Well, BERT is nothing new, to be honest, and is simply the next predictable step in Google’s evolution into the best Cyber Butler in the world or, alternatively, Skynet.

Google began as an exercise in using content to demonstrate a website’s purpose. It was a very crude beginning, using factors like keyword density – the number of times a keyword phrase appears in every 100 words of content. It also relied on exact-match searches, so your site would only appear in results for exactly the phrase typed into the search engine.
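The arithmetic behind that old metric is trivial. Here’s a throwaway sketch – purely illustrative, not anything Google ever published, with a made-up sample page – that counts how often a phrase appears per 100 words of copy:

```python
# Toy keyword-density calculator: occurrences of a phrase per 100 words.
# Illustrative only; the sample text and phrase below are invented.
def keyword_density(text: str, phrase: str) -> float:
    words = text.lower().split()
    target = phrase.lower().split()
    span = len(target)
    hits = sum(
        words[i:i + span] == target
        for i in range(len(words) - span + 1)
    )
    return 100 * hits / len(words) if words else 0.0

page = "cheap flights to spain book cheap flights today cheap flights deals"
print(f"{keyword_density(page, 'cheap flights'):.1f}")  # ~27 per 100 words: classic stuffing
```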

The problem is that content written to hit a keyword density (and, even worse, keyword-stuffed content) looks terrible and is pointless to read. And trying to add in every variation makes content seem ponderous.

To preserve their market share – yes Google had real competition back then – and keep their audience, they had to do something to make search results better. They had to weed the spammy websites out of the SERPs and replace them with quality counterparts.

So began a long story of refining the way the algorithm worked, adding factors that were difficult to forge or second-guess, and finding out exactly how people searched for things.

Constant Change

The simple answer to that last question is: very badly. We’re all rubbish at searching for what we want, because we know what we’re looking for; we just can’t put it into words.

Google, like Data out of Star Trek: The Next Generation, wanted to be just like a real human, so they set about spending tons of cash to achieve this, including renting a US military supercomputer for one month every year to “do the math”.

The first clear result of this research was 2013’s Hummingbird update, which introduced the principle of “semantic search” – being able to find what you wanted even if you used different words (synonyms) for it.

The writing was on the wall for “exact match keywords”, though this “archaic” form of search still persists to this day with PPC.

The next big update, RankBrain, revealed in October 2015, introduced “machine learning”. This brought an interpretational model, taking in factors such as current location, personalisation, search history and advanced semantics. All this helps to determine the searcher’s “intent”.

This was the first clear mention of “User Intent”, the new buzz phrase on every SEO’s lips.

What is BERT’s secret?

RankBrain’s true power is its ability to “learn”. It begins with “seed” data, and over time the algorithm teaches itself to match a variety of signals to results, re-ordering the SERPs accordingly. SearchMetrics used a similar process in their Content Module, allowing it to learn which content makes the most sense.

So BERT (without ERNIE) is just the latest stage in Google’s drive to think like a human. It’s now an algorithm that constantly updates itself based on feedback from current and past searches, including data on bounce rate, search repeat frequency, the number of similar results and a host of other metrics.
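Purely to illustrate that feedback idea – this is not Google’s code, and the signal names and weights below are invented for the example – a toy re-ranking loop might look like this: results people bounce from, or immediately search again after visiting, drift down the page.

```python
# A deliberately crude sketch of feedback-driven re-ranking.
# The signals (bounce_rate, repeat_search_rate) and weights are invented;
# the real system uses hundreds of factors and is not public.
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    score: float               # initial "seed" relevance score
    bounce_rate: float         # share of visitors who leave straight away
    repeat_search_rate: float  # share who re-run the query afterwards

def rerank(results: list[Result]) -> list[Result]:
    # Penalise results that users bounce from or search again after visiting.
    for r in results:
        r.score *= (1 - 0.5 * r.bounce_rate) * (1 - 0.5 * r.repeat_search_rate)
    return sorted(results, key=lambda r: r.score, reverse=True)

serp = [
    Result("spammy-site.example", 0.90, bounce_rate=0.8, repeat_search_rate=0.7),
    Result("useful-site.example", 0.85, bounce_rate=0.2, repeat_search_rate=0.1),
]
for r in rerank(serp):
    print(f"{r.score:.2f}  {r.url}")  # the useful site now outranks the spammy one
```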

But how does it really work? Who knows?

Not even Google, who have admitted that their AI’s ability to learn around the clock, far more quickly than the most voracious polymath ever could, puts a full human grasp of its capabilities out of reach.

They seem comfortable with this, just as they were comfortable with compartmentalising the original algorithm in the mid-Noughties. That meant each of the 200-plus ranking factors being tweaked and transformed by its own team, to the point that no single person or group of people could get their heads round the whole thing.

This single fact makes trying to “game the algorithm” – which the spammers have been trying to do since 1998 – an exercise in pointlessness.

Are you a loser?

Google hopes the biggest losers with BERT will be websites with content that isn’t written with people in mind.

So, if you only write for Googlebot, Google is trying to weed you out in favour of something a little more meaningful. If, however, you’ve always used great, informative, long-form content written by humans, for humans, you have little to fear.

Thanks to RankBrain and Hummingbird, traffic from Google to spammy sites is already in steep decline, and BERT will only accelerate that.

Don’t, however, feel sorry for the bad guys: there are still plenty of spammers to go around, with an inexhaustible supply of new cunning stunts, so we certainly haven’t heard the last of them yet.

And, every day, we hear stories of people being conned out of fortunes and worse – remember the Nigerian 419 scam – so if Google is just trying to be a real human – or even superhuman – well, humans make mistakes.

So, what of BERT? Previously, Google would interpret a search query like “2019 brazil traveler to USA need a visa” as a U.S. citizen needing advice about traveling to Brazil. The retrieved results were mainly news stories.

Now, instead of examining a query word by word, Google looks at the phrase in its entirety. So little words like “to”, “no” and “for” – which can radically change the meaning of a query – will factor into the equation.

Now, “2019 brazil traveler to usa need a visa” brings up a much more useful result than a newspaper site – the website of the US Embassy in Brasilia.
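You can poke at that “whole phrase at once” idea yourself, because the BERT models Google open-sourced behave the same way. Here’s a minimal sketch using the Hugging Face transformers library and the public bert-base-uncased checkpoint – nothing to do with Google’s private ranking stack, and the queries are just for illustration:

```python
# Minimal sketch: a BERT-style masked language model fills in a blank by
# looking at the words on BOTH sides of it, which is why little words
# like "to" and "no" now carry weight.
# Requires: pip install transformers torch
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for query in [
    "2019 brazil traveler [MASK] usa need a visa",
    "can you get medicine [MASK] someone pharmacy",
]:
    print(query)
    for candidate in fill(query, top_k=3):
        print(f"  {candidate['token_str']}  ({candidate['score']:.2f})")
```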

Searches under BERT’s spotlight

BERT is also going to affect “Featured Snippets” – the so-called Position Zero – blocks of content lifted from a page that help the user more readily identify pertinent information.

For example, for the featured-snippet query “parking hill no curb”, pre-BERT Google would place too much weight on the word “curb” and ignore the “no”. The results now should display featured snippets that teach you how to safely and effectively park your car on the side of Ben Nevis.

BERT hasn’t yet set off a bomb in the SERPs, say the third-party trackers – SEMrush, Ahrefs, Searchmetrics, etc. – because their “basket” keywords generally sit in the short-to-middle length of queries. BERT is meant to tackle the much longer tail. Google itself estimates it will affect only around 1 in 10 searches.

BERT targets top-of-the-funnel keywords – the informational queries seeking answers to a question such as “What electric cars are available in the UK?” – hinting that content now needs to be more specific and less generalised, less “one-size-fits-all”. The age of generics may also be coming to an end.

Google says BERT will allow users to “search in a way that feels more natural” – for that, read “really badly”. You don’t need to jump through hoops to find exactly what you want anymore, because Google will do that for you.

Predicting the future

But why? Well, for starters, if you (the searcher) get a great result from your next search, you’re far more likely to return, and hopefully click on a paid placement.

Ultimately, Google wants to be the only thing you turn to for everything you need in your life, so it has to be “intuitive”: to know what you want almost before you do.

So, look out for BERT’s clairvoyant children. Coming to a search engine near you! Or Skynet.

This article appeared in a modified form on The Drum in December 2019.
