June 28th, 2011
Search Engine Optimization, Tips, Tools and Tutorials
Keywords, content, links…yeah, yeah, we all know these are effective methods of obtaining rankings. Undoubtedly, most people who know a thing or two about search engine optimization will focus on these as the key ingredients for healthy SEO. However, there are several small modifications that can be made to a site that people often ignore, perhaps because they don’t understand why they are (or aren’t) important.
1. Canonicalization – This tricky word actually has two applications in SEO. Wikipedia’s definition of URL canonicalization is as follows:
"In web search and search engine optimization (SEO), URL canonicalization deals with web content that has more than one possible URL. Having multiple URLs for the same web content can cause problems for search engines – specifically in determining which URL should be shown in search results."
Two versions of the URL? What are you talking about?
Well, if you type "www.yoursite.com" in your browser, and then type "yoursite.com", and neither is redirected to the other, Google and other search engines see them as two different URLs, even though technically they are the same site. This usually applies across the whole site, too: when webmasters use relative URLs in their code, whichever version of the site you land on is the version you stay on as you click through to other internal pages.
You may ask "Am I getting penalized for this?"
Directly, no; there is no penalty for this. However, you may be indirectly sabotaging your rankings. If half of the people linking to you use the "www" version and the other half use the non-"www" version, you could be splitting your authority between the two URLs. They are two slightly different roads to the same destination.
How do I fix this?
SEO companies (such as ours) would typically implement a 301 Permanent Redirect, which works like a detour sign: anyone who requests one version of the URL, user or search spider, is sent to the other, so only one version gets crawled.
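As a rough illustration of the redirect logic, here is a minimal Python sketch. It assumes the "www" version is the one you want to keep; the function name `canonical_redirect` and the `CANONICAL_HOST` constant are made up for this example. On a real site the same rule usually lives in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the "www" host is the preferred version.
CANONICAL_HOST = "www.yoursite.com"


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) if the request should be redirected, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == canonical_host:
        return None  # already on the preferred host, serve the page normally

    # Work out the bare domain so only our own www/non-www pair is redirected.
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host not in (bare, "www." + bare):
        return None  # some other domain entirely; leave it alone

    # Keep path, query string and fragment; swap only the host.
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
    )
    return 301, target
```

A request for "http://yoursite.com/page.html" would get a 301 pointing at "http://www.yoursite.com/page.html", while a request that already uses the "www" host is left untouched.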
But wait, you mentioned there are TWO applications of this as it pertains to SEO. What is the other one?
The other is a common issue because, more often than not, webmasters use "relative" URLs rather than "absolute" ones in their navigation. Relative URLs refer to the page but omit the domain; "../internal-page.html" is an example of a relative URL in HTML code. An absolute URL is essentially the whole URL written out, for example "http://www.yoursite.com/internal-page.html". When everything is kept relative, the homepage is often linked by its file name, which is usually "index" or "default" (in ASP), so the same homepage ends up reachable at more than one URL.
To give a live example of this, let’s look at a popular travel booking site, Expedia.
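To see why relative links keep visitors (and spiders) on whichever version of the site they arrived at, here is a small Python sketch using the standard library's `urljoin`, which resolves a link roughly the way a browser does. The page URLs and the helper name `resolve_link` are invented for illustration.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href):
    """Resolve an href found on page_url into the absolute URL it points at."""
    return urljoin(page_url, href)


relative_link = "../internal-page.html"  # relative: no domain

# The same relative link resolves against whichever host served the page.
print(resolve_link("http://www.yoursite.com/products/list.html", relative_link))
# http://www.yoursite.com/internal-page.html
print(resolve_link("http://yoursite.com/products/list.html", relative_link))
# http://yoursite.com/internal-page.html

# A relative "home" link points at the file name, not at the bare domain.
print(resolve_link("http://www.yoursite.com/about/contact.html", "../index.html"))
# http://www.yoursite.com/index.html
```

An absolute link such as "http://www.yoursite.com/" resolves to itself no matter which page it appears on, which is why it is the safer choice for the "home" link.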
Same page, different URL. Again, no direct penalties per se, but often the authority gets split between the 2 different URLs, and it makes it harder for search engines like Google to decide which URL to use in the search results.
So, how do I fix THIS?
The same way you fix any canonical issue: use a 301 redirect. More importantly, make sure any link to the "home" of your site uses an absolute link. Simply redirecting URLs is good, but making sure your internal links point at the correct URL in the first place is better. You wouldn’t give someone directions that send them down a detour if a more direct road were available, would you? Well, if you needed more time to get ready, then maybe, but for search results, just be ready!
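One way to audit internal links for this is to strip the default file name from any URL before comparing or rewriting it. The sketch below is only an illustration: the list of default file names is an assumption (check what your own server actually uses), and `strip_default_document` is a made-up helper name.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the default-document names in use on the site.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "index.php", "default.asp", "default.aspx"}


def strip_default_document(url):
    """Map '.../index.html' style URLs to the directory URL they duplicate."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() not in DEFAULT_DOCUMENTS:
        return url  # a normal page; nothing to change
    # Drop the file name but keep the trailing slash of its directory.
    return urlunsplit(
        (parts.scheme, parts.netloc, directory + "/", parts.query, parts.fragment)
    )
```

With this, "http://www.yoursite.com/index.html" maps to "http://www.yoursite.com/", which is the URL both the 301 redirect and the internal "home" links should agree on. Note that the helper applies to any directory, not just the homepage.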
2. Robots.txt – This small file has a lot of power and authority when it comes to directing how search engine spiders crawl a site. It sits in the root directory, and you can normally find it by typing "www.yoursite.com/robots.txt". Bear in mind a robots.txt file is not mandatory; the default is to crawl the site. But many use the robots file to list specific folders they may not want search spiders to crawl, perhaps because they hold sensitive or unimportant information.
You want a great way to lose all your rankings at once? Make a robots.txt file and simply add a rule that disallows the entire site for every crawler.
This will ensure that no search spiders crawl your site. (I’ve literally seen this happen to a client. Fortunately it can be rectified very quickly and the rankings reappear). Check your robots file and make sure you’re not inadvertently blocking any important pages or categories that you may want the public to see.
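If you want to check a robots file without waiting for the rankings to tell you, Python's standard `urllib.robotparser` module can evaluate the rules for a given URL. The sketch below assumes the site-wide block takes its conventional form, "User-agent: *" followed by "Disallow: /"; the helper name `is_crawlable` and the "/private/" folder are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumed form of the "block everything" file: every agent, whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# A more typical file that only hides one (hypothetical) folder.
BLOCK_ONE_FOLDER = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(BLOCK_EVERYTHING, "http://www.yoursite.com/"))  # False
print(is_crawlable(BLOCK_ONE_FOLDER, "http://www.yoursite.com/"))  # True
print(is_crawlable(BLOCK_ONE_FOLDER, "http://www.yoursite.com/private/notes.html"))  # False
```

Running the important URLs of a site through a check like this after every robots.txt change is a cheap way to catch an accidental site-wide block.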
3. Sitemap.xml – Another file that sits in the root directory and is there for the search spiders. It is basically an outline of your site for the search engines: a sitemap.xml file lists the important URLs on your site as a "map" for the crawlers. This file is not mandatory, but it can help search spiders find important URLs and crawl your site more efficiently. You can use any of a number of XML sitemap tools, and some CMSs generate these files automatically. You can usually spot an XML sitemap in your root directory by typing http://www.yoursite.com/sitemap.xml.
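For a sense of what such a file contains, here is a minimal Python sketch that builds one with the standard library. It only writes the required `loc` entry for each URL and uses the sitemaps.org namespace; the function name `build_sitemap` and the example URLs are placeholders, and a real generator would normally crawl the site or read the URLs from the CMS.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return the text of a minimal sitemap.xml listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # the page's absolute URL
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap = build_sitemap([
    "http://www.yoursite.com/",
    "http://www.yoursite.com/internal-page.html",
])
print(sitemap)
```

Saving the returned text as sitemap.xml in the root directory gives the crawlers the "map" described above. Note that the URLs listed should be the canonical versions, so the sitemap and the redirects tell the same story.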
4. Page Load Time – Load time is not necessarily a ranking factor in the algorithm, but you should not think only in terms of how the algorithm sees your site; think also about the factors that indirectly impact it. Much like the canonical issue mentioned above… is it a "penalty"? No. Can it affect rankings? Yes. If your page takes forever to load, how valuable is that going to be to the average web user with a short attention span? A heavily bogged-down site that takes a long time to load will keep people OFF your site no matter how cool it is.
Don’t read too much into this. If your page loads in 8 seconds and you want it to load in 4, well, you aren’t going to see much difference. You may also be the type to stand in front of a microwave thinking "hurry up, I don’t have all minute!". Relax. It’s when your site is heavy on images and code that you may want to be concerned. 30 seconds to load a site is more than enough time for the average user to give up on your site, go back to the search engine and click somewhere else.
A great tool for measuring page load times is Pingdom Tools ( http://tools.pingdom.com ). This tool tells you how long each part of your site takes to load so you can spot the problem areas.
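For a quick sanity check from your own machine, a few lines of Python can time how long the HTML document itself takes to download. This is only a rough lower bound, since it ignores images, scripts and stylesheets, and it is not a substitute for a per-resource breakdown like the one Pingdom gives. The names `fetch_html` and `time_fetch` are made up for this sketch.

```python
import time
from urllib.request import urlopen


def fetch_html(url, timeout=30):
    """Download the raw HTML document at url and return it as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def time_fetch(url, fetch=fetch_html):
    """Return (seconds_elapsed, bytes_received) for one download of url."""
    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    return elapsed, len(body)


if __name__ == "__main__":
    seconds, size = time_fetch("http://www.yoursite.com/")
    print(f"{size} bytes in {seconds:.2f} seconds")
```

The `fetch` parameter exists so the timing logic can be tried out with a stand-in function instead of a live request.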
5. Duplicate Content/Duplicate Sites – This almost seems clichéd to mention, but you’d be surprised how often this still comes up. And the general thought is that you’re penalized within your own site for duplicate content.
So are you saying there are no penalties?
Well, that’s not entirely true either. In most instances, this occurrence is basically a canonical issue, so your authority will likely be split between pages that have the same content and are optimized for the same keywords. So there’s more of an indirect "penalty" rather than an overt one. I’ve seen sites rank while having 2 URLs with the same content, but that’s taking a chance that Google or other search engines will pick one and not pay much attention to the other (perhaps because of good internal linking).
A good way to avoid this altogether is creating unique content, of course. But in the case of, say, a large database-driven site, duplicated content is not always avoidable. To combat this, the rel="canonical" tag was introduced; it basically tells the search engines that you are aware of the duplicate content and which specific page is the original source of it. This keeps away any indirect penalties for creating duplicate content within your own site.
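The tag itself is a `link` element with rel="canonical" placed in the head of each duplicate page, with its href pointing at the original. As a hedged illustration, the Python sketch below pulls that value out of a page so you can check what a page is declaring; the class and function names are invented, and the sample markup is a made-up page on yoursite.com.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page that points back at the original.
page = """
<html><head>
<title>Widgets (sorted by price)</title>
<link rel="canonical" href="http://www.yoursite.com/widgets.html" />
</head><body>...</body></html>
"""
print(find_canonical(page))  # http://www.yoursite.com/widgets.html
```

Running a check like this over the duplicate pages of a database-driven site shows quickly whether they all point at the page you intended as the original.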
Now, I did mention that the penalties are more "indirectly" related but if you go nuts with building identical pages, Google will see that as spamming and therefore DIRECTLY penalize you. And while on that subject, I will say, duplicate sites will most DEFINITELY harm you. Quickly. Direct or indirect, the ranking drops are almost immediate (within a week) when you have a whole site that looks identical on a different URL. Check that off as a huge "SEO no-no". (I have seen this happen as well).
There are quite a few factors that can have an influence on your rankings besides links and keywords. Site architecture and proper coding structure are important SEO factors that should not be taken lightly. While links are gold and content is king, don’t ignore the importance of the internal workings of your site… these factors can set you apart from your competitors.