Yesterday Google announced new markup to support multilingual content. This is an interesting move for Google, and one that I think I really like. International SEO has long been a source of questions for SEOs, and to be honest, Google pretty much sucks at ranking the correct geo-targeted URL in the correct country-specific search engine.
Best practices have long been that you should do the following to help Google figure out the right page:
1) The best option is probably to use country-specific top-level domains (TLDs) such as .co.uk or .com.au. If you cannot do this, it’s probably best to use a .com domain.
2) Use subfolders, not subdomains. (Google shows subdomains in the post, but look at what Gianluca has to say about it):
3) Do not use IP redirects for geo-specific information, such as currencies and phone numbers, because Googlebot currently only crawls from US IP addresses. It is better to hardcode this information based on the country-specific subfolders. As Google says:
Avoid automatic redirection based on the user’s perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site. (source)
4) You can geotarget subfolders as well through Google Webmaster Tools according to this post called GeoTargeting on the Same Domain Using XML Sitemaps.
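To make point 3 a bit more concrete, here is a minimal Python sketch of choosing the currency and phone prefix from the URL subfolder instead of from the visitor's IP address. The subfolder paths, the settings table, and the function name are all hypothetical and only for illustration; your own site structure will differ.

```python
from urllib.parse import urlparse

# Hypothetical table: country-specific subfolder -> hardcoded regional details.
REGION_SETTINGS = {
    "/hotels/en/gb/": {"currency": "GBP", "phone_prefix": "+44"},
    "/hotels/es/es/": {"currency": "EUR", "phone_prefix": "+34"},
    "/hotels/es/mx/": {"currency": "MXN", "phone_prefix": "+52"},
}

# Assumed fallback for pages outside any country subfolder (here: the US version).
DEFAULT_SETTINGS = {"currency": "USD", "phone_prefix": "+1"}


def settings_for_url(url):
    """Return regional settings based only on the URL path, never the visitor's IP."""
    path = urlparse(url).path
    for prefix, settings in REGION_SETTINGS.items():
        if path.startswith(prefix):
            return settings
    return DEFAULT_SETTINGS


print(settings_for_url("http://www.example.com/hotels/en/gb/more-URL"))
```

Because the lookup depends only on the path, Googlebot and a visitor from any country see the same currency on the same URL.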
Now Google has thrown us a bone (maybe). Let’s explore it a bit.
What is the expanded rel="alternate" hreflang markup?
Basically, what Google is giving us now is a way to designate that a page or section of a site is in a specific language (and, optionally, aimed at a specific country), and thus should be targeted towards the matching version of Google search.
Here’s an example. Let’s say we have a site, http://www.example.com.
Now we have different content on each specific language subfolder, which is set up like this:
On each of the pages above, you will need to put a new <link rel="alternate"> element in the <head> section of your page. For convenience’s sake, I’d put it next to your rel=canonical tag (if you have one).
For the above examples, this is how the <link rel="alternate" hreflang="(content)" href="(content)"> would look for the US content page that also has translated content:
<link rel="alternate" hreflang="en-GB" href="http://www.example.com/hotels/en/gb/more-URL">
<link rel="alternate" hreflang="es-ES" href="http://www.example.com/hotels/es/es/more-URL">
<link rel="alternate" hreflang="es-MX" href="http://www.example.com/hotels/es/mx/more-URL">
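If you have a lot of templated pages, you would probably generate these tags rather than write them by hand. Here is a minimal Python sketch of that idea, assuming you already have a mapping of hreflang values to URLs for each page. The function name and the mapping are my own illustration, not anything Google provides.

```python
from html import escape


def hreflang_links(alternates):
    """Build one <link rel="alternate"> tag per (hreflang value, URL) pair."""
    return [
        '<link rel="alternate" hreflang="{}" href="{}">'.format(
            escape(lang), escape(url)
        )
        for lang, url in alternates.items()
    ]


# Hypothetical alternates for the US page, using the example URLs from this post.
alternates = {
    "en-GB": "http://www.example.com/hotels/en/gb/more-URL",
    "es-ES": "http://www.example.com/hotels/es/es/more-URL",
    "es-MX": "http://www.example.com/hotels/es/mx/more-URL",
}

for tag in hreflang_links(alternates):
    print(tag)
```

The values are passed through html.escape so that a stray quote or ampersand in a URL cannot break the attribute.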
From my understanding (and someone please correct me if I am wrong), Google is now saying that we can use this when the content on the page is essentially the same except for price (for example, USD vs. GBP). I get this from the following statement:
Today we’re going further with our support for multilingual content with improved handling for these two scenarios:
- Multiregional websites using substantially the same content. Example: English webpages for Australia, Canada and USA, differing only in price
How is this different?
Last year Google published a blog post called “Unifying Content under MultiLingual Templates”. Under that approach, when you had templated translated content, you had to pick a canonical URL (the one you wanted displayed to searchers). You then set a cookie to remember the language a person selected and redirected them based on that cookie.
The canonical could also depend on the language the person was using when they created the page. For example, if a profile page was set up in the Spanish version of the site, that page became the canonical, and all the others (French, English, Italian, etc.) used a rel=canonical pointing to the Spanish version.
In the new system, all you have to do for this templated content is use hreflang to designate the different languages, which will then be served to the different language versions of Google search. If you’re translating every page, this makes a lot of sense! However, if you are not translating every page, I think it would still make sense to mark the page with the hreflang of its original language. According to Google, this should keep the page from being indexed in other-language Google search engines, and it should rank in the correct language search.
We’ll see if this is indeed what happens, but it seems promising.
What I Wonder
I wonder if this means that we no longer have to write unique content for different variants of the same language. For example, does this mean that we can just add “u”s to words like “favorite” (which becomes “favourite”), change the currency from USD to GBP, throw this tag on, and be good to go? Or will Google still treat this as duplicate content?
It’s hard to know without any testing (though I hope to try it out on a client), but doesn’t it follow logically that this is now allowed?
I’d love to hear thoughts anyone has.