I recently consulted with an executive from a lead generation company here in San Diego about a problem with their site not being properly indexed by the search engines. Only about 35 of their several hundred article and tip pages were indexed for organic search, even though, from what he was telling me, they had pretty good internal site linking. This is what I found after reviewing and FTPing into their site:

  • Because they had recently switched domain names (within the last six months), they had successfully 301 redirected their old domain to their new one, but they were not 301 redirecting domain.com to www.domain.com. That left two indexed versions of the domain and potentially tripped duplicate content filters (see the .htaccess sketch after this list).
  • The site was using Google’s Sitemaps application. Unfortunately, the sitemaps.xml file being served on their domain had an XML parsing error caused by rogue "&" characters in some of the URL strings listed in the file. The error was preventing the search engines from reading the file and indexing the corresponding pages! But here is the kicker: because the sitemaps.xml file was returning an error, Google decided not to index the rest of the site, even though a simple Xenu link check revealed good internal site linking to several hundred pages on the site, as pictured below.
    [Image: page tree from the Xenu link check]
    In fact, there were about 35 properly formed page URLs listed in the XML sitemap (about the same number as reported by Google’s site: command) before the first "&" character was found in the file, stopping the indexing process dead in its tracks! This just goes to show that Google’s Sitemaps application supersedes a site’s internal linking structure and that you need to be very careful when dealing with sitemaps for indexing purposes (a quick validation step, sketched after this list, would have caught the error).
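For the www canonicalization problem, here is a minimal sketch of the fix, assuming the site runs on Apache with mod_rewrite enabled and using domain.com as a stand-in for the real hostname:

    # .htaccess: 301 redirect the bare domain to the www version
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
    RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

With this in place, a request for http://domain.com/page.html returns a permanent redirect to http://www.domain.com/page.html, so the engines consolidate on a single indexed hostname.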
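And for the broken sitemap, a quick well-formedness check would have caught the stray "&" before the file ever reached Google. Below is a minimal Python sketch, assuming the file is saved locally as sitemaps.xml; any conforming XML parser, including the standard library’s xml.etree, rejects a raw "&", which must be written as "&amp;" in XML:

    # check_sitemap.py: report the first XML well-formedness error in a sitemap
    import sys
    import xml.etree.ElementTree as ET

    def check_sitemap(path):
        try:
            ET.parse(path)
        except ET.ParseError as err:
            # err.position is a (line, column) pair pointing at the bad spot,
            # e.g. an unescaped "&" that should have been written "&amp;"
            line, col = err.position
            print(f"Parse error at line {line}, column {col}: {err}")
            return False
        print("Sitemap is well-formed XML.")
        return True

    if __name__ == "__main__":
        check_sitemap(sys.argv[1] if len(sys.argv) > 1 else "sitemaps.xml")

Run against a file containing a raw "&" in a URL, it prints the offending line and column instead of letting a broken file go live.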

Comments

JohnMu • 10 years ago

An error in a Google Sitemap will not stop crawling and indexing of the rest of the site. A Sitemap file is generally a source of additional information; it will not halt the traditional crawling and indexing of URLs. That said, if you’re going to use Sitemaps, it does make sense to use a well-formed file, otherwise the search engines may end up ignoring it :-).

Mike Shannon • 10 years ago

Thanks John. Do you have any evidence or further reasoning behind your statement that a Google Sitemap error will not stop crawling and indexing of the rest of the site?

JohnMu • 10 years ago

Hi Mike, I work together with the sitemaps team here at Google - a sitemap file is a source of additional information; it does not replace the normal crawl of a website. One of the reasons for that is that sometimes users (usually accidentally) “break” their sitemap files, and that is something which should not have a negative effect on the rest of the site.

Mike Shannon • 10 years ago

Thanks again. We learn something new almost every day over here.
