Google Sitemaps Ask For Clean URLs

Today I was listening to a webinar Rand Fishkin gave a few months ago, “Getting Value from XML Sitemaps, HTML Sitemaps + Feeds”. It has some great information (if you’re a Pro member, you can find it here), but one point made me stop, ask a couple of questions, and investigate.

I actually asked myself, “Wait, so do I have to submit different sitemaps for different search engines?”

Background

A few months ago, SEOmoz had Duane Forrester from Bing on Whiteboard Friday. He said that Bing does not want anything in our sitemaps except clean, end-of-the-trail URLs: no redirects, no 404s. From this information, he said, Bing can learn who is and who is not submitting trustworthy, clean sitemaps, so only submit clean URLs.

In the webinar, Rand and his co-host (I think it was Jen Lopez) said that they had previously recommended that you upload the full sitemap, complete with 301s, so that the engine knows exactly what you are doing. They were wondering on the webinar if Bing was perhaps different.

Bing and Google May Have The Same Recommendations

Today I was rebuilding the sitemap for the site I work on as an in-house SEO. (Full disclosure: I hate building sitemaps, but it must be done.) We recently switched the site over to HTTPS, but I figured, “Hey, no rush on a new sitemap, because Google is fine with 301 redirects.”

Wrong.

I saw this today when I went into Google Webmaster Tools:

"We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL."

And this is the error in full (URL removed for privacy reasons):

"HTTP Error: 301"

What To Do?

Any time you do a site upgrade or migration, you need to submit a new, up-to-date sitemap. If you have taken the time to plan out your migration (and you should), build this into your schedule. In my case, I have to go back and rebuild the sitemap with HTTPS URLs, which fortunately is not too hard (find and replace http:// with https:// in the page URLs).
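One caution about a blind find and replace: the sitemap's own namespace declaration (http://www.sitemaps.org/schemas/sitemap/0.9) also starts with http://, and it should not change. Here is a minimal Python sketch that rewrites only the loc entries, assuming a standard XML sitemap. The function name upgrade_sitemap_to_https and the example.com URLs are made up for illustration, not taken from any tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol. It must stay http://.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def upgrade_sitemap_to_https(sitemap_xml: str) -> str:
    """Return the sitemap with every http:// <loc> rewritten to https://.

    Hypothetical helper: only <loc> values are touched, so the xmlns
    declaration on <urlset> is left alone.
    """
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        if url.startswith("http://"):
            loc.text = "https://" + url[len("http://"):]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example input; the URLs are placeholders.
    old_sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.example.com/page</loc></url>"
        "</urlset>"
    )
    print(upgrade_sitemap_to_https(old_sitemap))
```

This only makes sense if every page really did move to HTTPS. If some URLs were dropped or renamed in the migration, they still need to be fixed by hand.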

Why Should We Care About Clean Sitemaps?

It seems that Google also wants clean URLs in sitemaps, because they help with indexation. I do not know whether Google has the same capabilities as Bing, or whether sites could be penalized for bad sitemaps, but this is information we should take into account. Google is giving us a hint, I think.

I do also think that we should strive to submit clean information to the search engines, as it does make their job easier. And what’s the worst that could happen? Sitemaps are used in indexation, so we’re just going to have a bunch of well-optimized, 200-status returning URLs in sitemaps all over the Internet that help the search engines make sense of our websites.

That’s a good thing.
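If you want to check a sitemap before Webmaster Tools complains, something like the following Python sketch could flag every entry that does not answer with a plain 200. It assumes a standard XML sitemap; the names find_unclean_urls and fetch_status are my own, and the checker deliberately does not follow redirects so that a 301 shows up as a 301.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so 3xx responses surface as errors."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the raw HTTP status for url without following redirects.

    Uses a HEAD request; a server that rejects HEAD would need GET instead.
    """
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # covers 3xx (not followed), 4xx and 5xx


def find_unclean_urls(sitemap_xml: str, get_status=fetch_status):
    """List (url, status) pairs for every <loc> that is not a 200."""
    root = ET.fromstring(sitemap_xml)
    problems = []
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        status = get_status(url)
        if status != 200:
            problems.append((url, status))
    return problems
```

Passing the status lookup in as get_status keeps the sitemap logic separate from the network call, so you can try it against a small dictionary of known answers before pointing it at a live site.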

Some More Sitemap Resources

Dr. Pete on SEOmoz about Xenu and Screaming Frog
Information from SEOmoz on Google Sitemap Creator
Google Sitemap Generators (VERY ADVANCED)

9 thoughts on “Google Sitemaps Ask For Clean URLs”

  1. I work in-house as well.
    Not only does Google reject 301s in sitemaps, their validation reporting has gotten stricter.

    We had a sitemap of urls that had incorrectly been marked up within:

    The sitemap had validated for a considerable amount of time despite it containing urls rather than xml files. (Our urls did NOT include file extensions) They even validated using external tools so we didn’t initially catch the error.

    About Feb 24 (Panda), WMT threw errors across the board. Not sure exactly what the lesson here is, but I find it fascinating.

    1. Hi Rick –
      Thanks for your comment. It is always good to hear the experiences of others, so thanks for adding your voice to the discussion.

      Do you know if Google released anything about this change? This is the first I have heard of it, and I wonder if this may affect many others, especially if Google previously was saying to include the 301 URLs.

  2. Pingback: On The Last Turn Of The Universe – Top Ducks Secret » Uncrawled 301s – A Quick Fix for When Relaunches Go Too Well

  3. Hi John,

    In my opinion both Google and Bing are quite strict with XML sitemaps and it is understandable why they want them to be flawless (no 404s, 301s).

    However, I don’t think this is the case with HTML sitemaps which in some cases can help the spiders discover new URLs/pages after a website migration for instance. It would be great hearing your thoughts/experience.
