Google’s New Multilingual Markup

John Doherty —  December 6, 2011

Yesterday Google announced new markup to support multilingual content. This is an interesting move, and one I think I really like: international SEO has long been a source of questions for SEOs and, to be honest, Google pretty much sucks at ranking the correct geo-targeted URL in the correct country-specific search engine.

Best practices have long been that you should do the following to help Google figure out the right page:

1) Ideally, use country-code top-level domains (ccTLDs) such as .co.uk or .com.au. If you cannot do this, it’s probably best to use a .com domain.

2) Use subfolders, not subdomains. (Google shows subdomains in the post, but look at what Gianluca has to say about it):

Gianluca’s tweet: https://twitter.com/#!/gfiorelli1/status/143841177173635072

A couple of good resources on this are this post on SEOptimise and this Whiteboard Friday over on SEOmoz.

3) Do not use IP-based redirects to serve geo-specific information such as currencies and phone numbers, because Googlebot currently crawls only from US IP addresses. It is better to hard-code this information in each country-specific subfolder. As Google says:

Avoid automatic redirection based on the user’s perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site. (source)

4) You can geotarget subfolders as well through Google Webmaster Tools according to this post called GeoTargeting on the Same Domain Using XML Sitemaps.
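
Point 3 above can be sketched in code. This is a minimal sketch, not a definitive implementation: it assumes URLs shaped like /section/language/country/... (the layout used in the example further down), and the COUNTRY_CURRENCY table and currency_for helper are hypothetical names made up for illustration. The currency is read from the country subfolder in the URL, never from the visitor’s IP address, so Googlebot sees the same content as a local user.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: country subfolder -> currency code to hard-code.
COUNTRY_CURRENCY = {
    "us": "USD",
    "gb": "GBP",
    "es": "EUR",
    "mx": "MXN",
}
DEFAULT_CURRENCY = "USD"  # assumed fallback for pages outside a country subfolder


def currency_for(url):
    """Return the currency for a URL shaped like /section/lang/country/..."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if len(segments) < 3:
        return DEFAULT_CURRENCY
    country = segments[2].lower()  # third path segment is the country code
    return COUNTRY_CURRENCY.get(country, DEFAULT_CURRENCY)


if __name__ == "__main__":
    print(currency_for("http://www.example.com/hotels/en/gb/more-URL"))
```

Phone numbers could be handled with a second table keyed the same way.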

Now Google has thrown us a bone (maybe). Let’s explore it a bit.

What is the expanded rel="alternate" hreflang markup?

Basically, Google is now giving us a way to designate that a page or section of a site is in a specific language (and, optionally, aimed at a specific region), and thus should be shown in the matching language or country version of Google.

Here’s an example. Let’s say we have a site, http://www.example.com.

Now we have different content in each language and country subfolder, set up like this:

http://www.example.com/hotels/en/us/more-URL

http://www.example.com/hotels/en/gb/more-URL

http://www.example.com/hotels/es/es/more-URL

http://www.example.com/hotels/es/mx/more-URL

On each of the pages above, you will need to put new <link rel="alternate"> elements in the <head> section of your page. For convenience’s sake, I’d put them next to your rel=canonical tag (if you have one).

For the above examples, this is how the <link rel="alternate" hreflang="(content)" href="(content)"> tags would look on the US page, which also has translated versions:

<link rel="alternate" hreflang="en-GB" href="http://www.example.com/hotels/en/gb/more-URL">

<link rel="alternate" hreflang="es-ES" href="http://www.example.com/hotels/es/es/more-URL">

<link rel="alternate" hreflang="es-MX" href="http://www.example.com/hotels/es/mx/more-URL">
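
If you have many pages, you will probably want to generate these tags rather than write them by hand. Here is a minimal sketch in Python, assuming you keep a map from language-region code to URL for each page; the ALTERNATES table and the hreflang_links helper are hypothetical names, not part of anything Google provides. It skips the current page’s own entry, so for the US page it produces the three tags listed above.

```python
from html import escape

# Hypothetical map of language-region codes to the example URLs from this post.
ALTERNATES = {
    "en-US": "http://www.example.com/hotels/en/us/more-URL",
    "en-GB": "http://www.example.com/hotels/en/gb/more-URL",
    "es-ES": "http://www.example.com/hotels/es/es/more-URL",
    "es-MX": "http://www.example.com/hotels/es/mx/more-URL",
}


def hreflang_links(current, alternates):
    """Return one <link> tag per alternate version, skipping the current page."""
    tags = []
    for code, url in alternates.items():
        if code == current:
            continue  # do not emit a tag that points at the page itself
        tags.append(
            '<link rel="alternate" hreflang="{}" href="{}">'.format(
                escape(code, quote=True), escape(url, quote=True)
            )
        )
    return tags


if __name__ == "__main__":
    for tag in hreflang_links("en-US", ALTERNATES):
        print(tag)
```

Calling hreflang_links("en-GB", ALTERNATES) instead would give the tags for the UK page, and so on for each version.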

From my understanding (and someone please correct me if I am wrong), Google is now saying that we can use this when the content on the page is essentially the same except for price (so USD vs GBP). I get this from this statement:

Today we’re going further with our support for multilingual content with improved handling for these two scenarios:

  • Multiregional websites using substantially the same content. Example: English webpages for Australia, Canada and USA, differing only in price

It is important to note that this tag does not seem to be self-referential, unlike the canonical tag.
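
One way to check what a page actually declares is to collect its hreflang annotations and see whether its own URL is among them. The following is a rough sketch using Python’s standard html.parser module; the SAMPLE_HEAD markup, the HreflangCollector class and the helper functions are all made up for illustration, and this is not a substitute for a real validator.

```python
from html.parser import HTMLParser

# Made-up sample <head> for the US page, using the example URLs from this post.
SAMPLE_HEAD = """
<head>
<link rel="canonical" href="http://www.example.com/hotels/en/us/more-URL">
<link rel="alternate" hreflang="en-GB" href="http://www.example.com/hotels/en/gb/more-URL">
<link rel="alternate" hreflang="es-ES" href="http://www.example.com/hotels/es/es/more-URL">
<link rel="alternate" hreflang="es-MX" href="http://www.example.com/hotels/es/mx/more-URL">
</head>
"""


class HreflangCollector(HTMLParser):
    """Collect hreflang -> href pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        hreflang = attributes.get("hreflang")
        # Only keep alternate links that carry an hreflang value.
        if rel == "alternate" and hreflang:
            self.alternates[hreflang] = attributes.get("href")


def collect_alternates(html):
    """Parse the markup and return a dict of hreflang code -> URL."""
    collector = HreflangCollector()
    collector.feed(html)
    return collector.alternates


def is_self_referential(page_url, html):
    """True if the page lists its own URL among its hreflang alternates."""
    return page_url in collect_alternates(html).values()


if __name__ == "__main__":
    print(collect_alternates(SAMPLE_HEAD))
```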

How is this different?

Last year Google published a blog post called “Unifying Content under MultiLingual Templates”. Under that approach, when you had templated translated content, you had to pick a canonical URL (the one you wanted displayed to searchers). You then had to set a cookie to remember the language a person had selected, and redirect that person based on the cookie.

The canonical also depended on the language the person was using when they set up the page. For example, if a profile page was set up in the Spanish version of the site, then that page became the canonical and all the others (French, English, Italian, etc.) used a rel=canonical pointing to the Spanish version.

Complicated, right?

In the new system, all you have to do for this templated content is use hreflang to designate the different language versions, which will then be served in the corresponding language versions of Google. If you’re translating every page, this makes a lot of sense! However, if you are not translating every page, I think it would still make sense to mark the page with an hreflang for its original language. This should, according to Google, keep the page from being indexed in other-language Google search engines, and it should rank in the correct language search.

We’ll see if this is indeed what happens, but it seems promising.

What I Wonder

I wonder if this means that we no longer have to write unique content for different variants of the same language. For example, does this mean that we can just add “u”s to words like “favorite” (which becomes “favourite”), change the currency from USD to GBP, throw this tag on, and be good to go? Will Google still treat this as duplicate content?

It’s hard to know without any testing (though I hope to try it out on a client’s site), but does it not follow logically that this is now allowed?


I’d love to hear any thoughts you have.


8 responses to Google’s New Multilingual Markup

  1. Does Google support cross-domain rel="alternate" hreflang="X", too?

    • Hey Petra –
      Great question. My answer is I do not know. I have not seen anything about it, but it is always worth testing on a non-important site.

      Also, I’m struggling to figure out a purpose for a cross-domain rel=alternate tag. Are you thinking that when you have different country top-level domains, you could use it to specify that one site belongs in one search engine (Google UK, for example) and another belongs in Google US?

      This seems like overkill to me. I think that Google is doing this new hreflang to help with content on the same domain.

      • Thanks, John. I came to your blog looking for this answer!!

        Here is what I found in the Google webmaster central blog comments:

        Christopher Semturs said…
        @Kangorimo
        there is no contradiction. You could:
        - put the regionalized content on the TLD-variations (E.g. on example.de and example.fr).
        - on all pages that you have (example.com/example.de/example.fr) annotate all 3 of them with rel-alternate-hreflang

        You are right! This could be overkill.

  2. Hey John, thanks for the post.

    I don’t know why, but whenever Google updates or posts something on their blog it’s always so complicated that it takes people like you to make it understandable for us in the simplest way.

    Okay, now to the topic. I’m working on large ecommerce sites, and they’ve recently started promoting their brand in other countries. As you said, the best practice would be to use country-specific TLDs, which they’ve done. They are opening offices in France, Germany and Ireland, and they’ve already registered .de, .ie and .fr domains, which I think is perfect. However, the content is the same as on the US site. Are they doing it right?

    Even if they had not done that, I would have told them to do it, because just adding a tag to the pages that belong to another region does not seem right to me.

  3. Hey John,

    Re your question about different iterations of content for the same language – it seems as if that’s what Google are hinting at, however I remain a bit cynical as to whether or not this will be the case. My gut says that it’s better to create content specifically for your audience, rather than boilerplate content.

    I think it will be really interesting to see whether or not the new markup works too – right now Google Webmaster Tools targeting seems to have little or no effect – hopefully this improves moving forward :)

    Hannah

  4. I think we can use:

    …a title=" " href=" " rel="alternate" hreflang=" " …

  5. Thank you for explaining this. I have been reading about this new markup tag. I wish there was a validator tool somewhere.
