The duplicate content problem - how Google handles duplicates.

The official Google Webmaster Central blog has an interesting post on how Google handles duplicate content. Susan of the Webmaster Central team states:

"When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
We select what we think is the "best" URL to represent the cluster in search results.
We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
Here's how this could affect you as a webmaster:

In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap.
In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.



In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.

Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.

In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:

- You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
- If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to put too much energy into worrying about duplicate content, since most search engines have ways of handling it.
- You can help your fellow webmasters by not perpetuating the myth of duplicate content penalties! The remedies for duplicate content are entirely within your control. Here are some good places to start.
"
