HREFLANG Markup Testing – It Works!

John Doherty —  January 18, 2012

About a month ago, Google announced support for the HREFLANG markup on translated content, which it previously treated as duplicate. I had just completed an international audit for one of my enterprise clients, so I wanted to test this and see what effect it might have.

The results are very interesting!

The Tests

I did two different tests. First, I installed WPMU, which is a premium WordPress plugin (I paid $29 for it, which is a steal in my opinion) that internationalizes your site. It was dead simple to buy, download, and install.

Over the course of the next week or so, I published Canonical Delays with Googlebot and The Power of Guest Blogging. For both of these, I then used Google Translate to translate the content (just the text, mind you, not the menus, plugins, and sidebars) into Spanish.

Here’s where the test came in:

  1. For the Canonical Delays post, I added the HREFLANG markup to the post, with en-ES pointing to Etiqueta Canonical Retrasa con Googlebot en la Web Contra Indice Movil. The en-ES was self-referential.
  2. For the Power of Guest Blogging, the translated post was El Poder de los Blogs Invitado. For this one, I did not use the hreflang.
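
As a rough illustration of the markup in test 1, here is a minimal Python sketch that builds the `<link rel="alternate" hreflang="...">` tags for a post and its translation. The `hreflang_links` helper, the example.com URLs, and the `en` / `es-ES` values are all assumptions for illustration (the list above mentions en-ES; `es` is the ISO 639-1 code for Spanish), so treat this as a sketch rather than the exact tags used in the test.

```python
from html import escape


def hreflang_links(alternates):
    """Return one <link rel="alternate"> tag per (hreflang, url) pair.

    `alternates` maps an hreflang value to the URL of that language version.
    """
    return [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(code, quote=True), escape(url, quote=True)
        )
        for code, url in alternates.items()
    ]


# Hypothetical URLs: an English post and its Spanish translation under /es/.
alternates = {
    "en": "https://example.com/canonical-delays/",
    "es-ES": "https://example.com/es/canonical-delays/",
}

tags = hreflang_links(alternates)
print("\n".join(tags))
```

The printed tags would go in the `<head>` of the English post. In this test they were added only to the English version, not to the Spanish one.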

Of course, the purpose of these two tests was to see whether, and how, Google treated them differently. I went through the same process with both, and just left the HREFLANG tag off the guest blogging post.

So What Happened?

First, let’s look at the Guest Blogging post, since it did not use the HREFLANG tag and therefore gives us a snapshot of how Google has been treating automatically translated content.

When I searched for “the power of guest blogging” in Google ES, I saw just the English version, like so:

As you can see, it ranks first and then has three other English listings underneath. That makes sense, right?

I searched for “el poder de los blogs invitado” on the same day (this was around December 31st) and saw this:

Just the Spanish version was showing here, which also makes sense. It was followed by all Spanish results. So we’re seeing Spanish results for the Spanish search, English version for the English search.

If you search for “el poder de los blogs invitado”, you see just the Spanish one in Google.com as well:

Just the Spanish is ranking. (I think the one below it, also mine, is ranking because Google is, for some reason, still personalizing my results a bit, even when I am logged out and in an incognito window. I will investigate this another time.)

Search for “the power of guest blogging” in Google.com, and you get the English version:

This all seems pretty normal, right? The correct post is ranking for the correct language.

Canonical Post Differences

Now let’s take a look at the Canonical post. Remember, in this post I used the HREFLANG markup on the English version:

When I searched for “canonical movil” in Google.com on December 31, I saw this:

The Spanish was indexed and ranking. Very good.

But I searched for it on January 17th and saw:

Huh? The English outranks the Spanish, but the Spanish is still there! Interesting.

As you probably know, when Google sees content as being duplicated, one will rank and the other will not. What this tells me is that Google is not counting this as duplicate content. And interestingly enough, it is giving the option to translate the English one, and is showing both so that the user can choose!

I searched for “canonical movil” in Google.es and saw this on December 31:

Just the Spanish was ranking, which makes sense. However, I then did the same search on January 17th and saw this:

Just the Spanish was ranking, as it should! But then I searched for “canonical movil retrasa” and saw this (incognito, logged out):

Both Spanish and English are there, though Google has translated the meta description on the English one (I did not specify it, so they are just showing the most relevant content, even though that content is not present anywhere on the page).

I searched for “canonical delays” in Google ES in December and saw this:

They brought up both the English and the Spanish. My guess is that this is because it’s the Spanish search engine, so it makes sense for the Spanish post to be there. Not duplicate content!

I searched for it again on January 17th and saw this:

Still both are indexed and the English one ranks higher. But both are there! Not duplicate content, according to Google!

When I searched for “canonical movil” in Google.com in December, this showed:

When I searched for “delays in canonical tags” in Google.com on December 31st, I saw this:

Just the English ranks.

When I searched for “delays in canonical tags” in Google.com on January 17th, I saw this:

Just the English ranks still!

Remember, I did not put the HREFLANG tag on the Spanish post.


So what are our takeaways? I think we have a couple.

First, when you do not put the HREFLANG in, the correct language showed. This shows me that Google is pretty good at figuring out the right content to show in which search engine. Also, they seem to know that the other is a duplicate, so they do not show it.

Interestingly, when I searched in Google.com for “canonical movil” in December, I had not implemented the HREFLANG yet. Once I did, both listings showed in the SERP a few weeks later.

When I did put the HREFLANG in place, both posts showed in Google.es for English searches, but not in Google.com. Similarly, they showed both posts when I searched in Google.com in Spanish, but not in Google.es.

What we find is that Google ranks the correct version when you search in the language of the search engine. But when you search in the other language in that same engine (i.e., English in Google.es or vice versa), they show both, because they recognize that you have searched in English but may prefer the Spanish. They are serving content better based on the HREFLANG markup, AND you can get dual listings in the SERPs, according to my tests.

So Google is actually doing what they said they would do. I essentially told Google “Look, I want this /es/ to show in Google.es” and true to their word, they showed it both with the English and Spanish searches. Also, they recognize that the /es/ is Spanish, so they serve it in Google.com for searches in Spanish. This actually provides a good experience, in my opinion, because the user can choose. If they’re searching in Google.es, odds are they want Spanish but may be searching in English. So give them both! The directive works.

What do you think? Conclusive enough? I’d love to see other tests.

John Doherty

I'm the new (as of October 2013) Online Marketing Manager of Hotpads.com, soon to be based in San Francisco. Previous to Hotpads I worked at Distilled for 2 years as an online marketing consultant. In my spare time I shoot lifestyle photography, explore new and interesting food in New York, ski, rock climb, and update my Twitter and Google+ accounts.

21 responses to HREFLANG Markup Testing – It Works!

  1. Timing couldn’t be better! I just had this question for a client on how to deal with it, but just a couple of questions:

    What do you think would be the result if you referenced the English version from the Spanish version? Is that even needed or would you recommend that?

    • I don’t know that it’s needed, Mitchell, but I don’t think it would do anything. But there’s only one way to find out :-)

      • I would imagine it’s relatively important; surely it’s a verification feature of the process? If you didn’t reference the English version from the Spanish, then when you do this cross-domain, surely you could abuse this with other websites?

        For example, if you owned the website http://www.somedodgywebsite.com that ranked for some terms say “george bush sucks” in en-US you could link the en-UK using hreflang to a government source in the UK and surely that would then cause some political issues? The gov website would never link back to the en-US version so it makes sense that Google would use this as a clarification procedure to ensure both websites are owned by the same parties?

      • So how do I turn this off? I didn’t set anything, my browser just started returning results this way, AND I HATE IT.

  2. Wait a second amigo.
    Although the approach for your test is nice and valid, you are querying Google Spain with terms like “El Poder de los Blogs Invitado” or “canonical movil retrasa”.

    Those are very incorrect, awkward translations so you are going to rank first for them just because no one else is using them.

    Go to Google Poland, search “El Poder de los Blogs Invitado” in Spanish and you are there anyway.

    Same test should be done with proper translations and see what happens.

    In any case thank you for testing HREFLANG Markup Señor Doherty.

  3. The only problem is that Google can look on any automatically translated content as webspam if you allow it to be indexed, and the translation should always be initiated by a human.

  4. Thanks for the testing, John. This has been a hot topic here in the past few weeks, and thanks to Rand over at SEOMOZ for posting this on Google+ so I could find it.

    Now I just need to find out if this can be done with GB English and US English to separate the two languages/markets.

  5. Any reason this one did not come across my Google Reader? Even when I refresh and “See all” this post is not listed. Could be an error on my part but just wanted to throw this out to see if any of your other subscribers might be having the same issues.

  6. Super interesting posting. It seems that Google now can better identify and select relevant content and websites to bring it to the user. Thank you for the examples John and best wishes from Germany!

  7. I think you installed WPML and not WPMU – http://wpml.org/

  8. Hello John,

    you said you installed the WPMU plugin but you linked to the main domain, I am not sure where to find it? Did you mean WPML, as Bob Jones has mentioned?

    thank you

  9. Hi John,

    Thanks for this.

    How did you add the HREFLANG markup to the post? Where did you place it in the WordPress interface?

    Thanks.

    • Hi, I’m also wondering if there is an easy tool for WordPress because you have to add those lines on every page… Hope to hear from somebody.

      Kind regards,

      Willem
