A month or so ago, Google announced support for the HREFLANG markup on translated content, which they used to treat as duplicate. I had just completed an international audit for one of my enterprise clients, so I wanted to test this and see what effect it might have.
The results are very interesting!
I did two different tests. First, I installed WPML, a premium WordPress plugin (I paid $29 for it, which is a steal in my opinion) that internationalizes your site. It was dead simple to buy, download, and install.
Over the course of the next week or so, I published Canonical Delays with Googlebot and The Power of Guest Blogging. For both of these, I then used Google Translate to translate the content (just the text, mind you, not the menus, plugins, and sidebars) into Spanish.
Here’s where the test came in:
- For the Canonical Delays post, I added the HREFLANG markup to the post, with en-ES pointing to Etiqueta Canonical Retrasa con Googlebot en la Web Contra Indice Movil. The en-ES was self-referential.
- For the Power of Guest Blogging, the translated post was El Poder de los Blogs Invitado. For this one, I did not use the hreflang.
Of course, the purpose of these two different tests was to see how (or if) Google treated them differently. I went through the same process with both, but left off the HREFLANG tag on the guest blogging post.
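For anyone who has not seen the markup, hreflang annotations are `<link rel="alternate">` elements that sit in the page's `<head>`, one per language version. Here is a minimal sketch of generating them. The URLs are hypothetical placeholders, and the plain `en` and `es` codes are an assumption for illustration, not necessarily the exact values used in this test.

```python
from html import escape


def hreflang_links(alternates):
    """Build <link rel="alternate"> tags from a {hreflang_code: url} mapping."""
    return [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(code, quote=True), escape(url, quote=True)
        )
        for code, url in alternates.items()
    ]


# Hypothetical URLs: one entry per language version, including the page itself.
alternates = {
    "en": "https://example.com/canonical-delays/",
    "es": "https://example.com/es/canonical-delays/",
}

for tag in hreflang_links(alternates):
    print(tag)
```

The resulting tags would be pasted into the `<head>` of the post that carries the annotation.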
So What Happened?
First, let’s look at the Guest Blogging post, since it did not use the HREFLANG tag and therefore gives us a snapshot of how Google has been treating content automatically translated.
When I searched for “the power of guest blogging” in Google ES, I saw just the English version, like so:
As you can see, it ranks first and then has three other English listings underneath. That makes sense, right?
I searched for “el poder de los blogs invitado” on the same day (this was around December 31st) and saw this:
Just the Spanish version was showing here, which also makes sense. It was followed by all Spanish results. So we’re seeing Spanish results for the Spanish search, English version for the English search.
If you search for “el poder de los blogs invitado”, you see just the Spanish one in Google.com as well:
Just the Spanish is ranking. (I think the one below it, also mine, is ranking because Google is, for some reason, still personalizing my results a bit, even logged out and in an incognito window. I will investigate that another time.)
Search for “the power of guest blogging” in Google.com, and you get the English version:
This all seems pretty normal, right? The correct post is ranking for the correct language.
Canonical Post Differences
Now let’s take a look at the Canonical post. Remember, in this post I used the HREFLANG markup on the English version:
When I searched for “canonical movil” in Google.com on December 31, I saw this:
The Spanish was indexed and ranking. Very good.
But I searched for it on January 17th and saw:
Huh? The English outranks the Spanish, but the Spanish is still there! Interesting.
As you probably know, when Google sees content as being duplicated, one will rank and the other will not. What this tells me is that Google is not counting this as duplicate content. And interestingly enough, it is giving the option to translate the English one, and is showing both so that the user can choose!
I searched for “canonical movil” in Google.es and saw this on December 31:
Just the Spanish was ranking, which makes sense. However, I then did the same search on January 17th and saw this:
Just the Spanish was ranking, as it should! But then I searched for “canonical movil retrasa” and saw this (incognito, logged out):
Both Spanish and English are there, though Google has translated the meta description on the English one (I did not specify it, so they are just showing the most relevant content, even though that content is not present anywhere on the page).
I searched for “canonical delays” in Google ES in December and saw this:
They brought up both the English and the Spanish. My guess is that this is because it’s the search engine for Spain, so it makes sense for the Spanish post to be there. Not duplicate content!
I searched for it again on January 17th and saw this:
Still both are indexed and the English one ranks higher. But both are there! Not duplicate content, according to Google!
When I searched for “canonical movil” in Google.com in December, this showed:
When I searched for “delays in canonical tags” in Google.com on December 31st, I saw this:
Just the English ranks.
When I searched for “delays in canonical tags” in Google.com on January 17th, I saw this:
Just the English ranks still!
Remember, I did not put the HREFLANG tag on the Spanish post.
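If you want to double-check which pages actually carry the annotation, you can pull the `<link rel="alternate" hreflang="...">` tags out of each page's HTML. This is a rough sketch using only Python's standard library; the sample HTML strings and URLs are made up for illustration and are not the real pages from this test.

```python
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collect hreflang annotations found in <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "alternate" in rel and attr.get("hreflang") and attr.get("href"):
            self.alternates[attr["hreflang"]] = attr["href"]


def extract_hreflang(html):
    """Return a {hreflang_code: url} mapping for one page's HTML."""
    parser = HreflangParser()
    parser.feed(html)
    return parser.alternates


# Hypothetical pages: the English one is annotated, the Spanish one is not.
english_page = (
    '<head><link rel="alternate" hreflang="es" '
    'href="https://example.com/es/post/" /></head>'
)
spanish_page = "<head><title>Post</title></head>"

print(extract_hreflang(english_page))
print(extract_hreflang(spanish_page))
```

An empty result for a page means it carries no hreflang annotation, which is the situation the Spanish post was in here.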
So what are our takeaways? I think we have a couple.
First, when you do not put the HREFLANG in, the correct language showed. This shows me that Google is pretty good at figuring out the right content to show in which search engine. Also, they seem to know that the other is a duplicate, so they do not show it.
Interestingly, when I searched in Google.com for “canonical movil” in December, I had not implemented the HREFLANG yet. Once I did, both listings showed in the SERP a few weeks later.
When I did put the HREFLANG in place, both posts showed in Google.es for English searches, but not in Google.com. Similarly, they showed both posts when I searched in Google.com in Spanish, but not in Google.es.
What we find is that Google ranks the correct version when you search in the language of the search engine. But when you search in the other language in that same engine (i.e., English in Google.es or vice versa), they show both, because they recognize that you have searched in English but may prefer the Spanish. They are better serving content based on the HREFLANG markup, AND you can get dual listings in the SERPs, according to my tests.
So Google is actually doing what they said they would do. I essentially told Google “Look, I want this /es/ to show in Google.es” and, true to their word, they showed it for both the English and Spanish searches. Also, they recognize that the /es/ is Spanish, so they serve it in Google.com for searches in Spanish. This actually provides a good experience, in my opinion, because the user can choose. If they’re searching in Google.es, odds are they want Spanish but may be searching in English. So give them both! The directive works.
What do you think? Conclusive enough? I’d love to see other tests.