By: Woj Kwasi on 18-08-2011 in SEO

Google’s Panda algo update was all about improving quality on the web by penalizing / removing sites presenting duplicate content.

Google’s algo has a pretty hard job to do if you think about it. I like to think of duplicate content like the Olsen twins. Google is looking for these twins and trying to work out which one is the better, more relevant and original twin (sometimes triplet, quad, etc). Which is it? Mary-Kate or Ashley? And… don’t forget about all the impersonators out there!

Check out the Duplicate Content Tool I found today. It compares 2 pages on your site to see how original they are, which helps you analyse and detect potential duplicate content issues.

Try it out for yourself (I’ve put in 2 pages on my site but you can use your own URLs).

I found it interesting that they have a section where you can generate review templates to put on your blog / site… which, if everyone adds them to their blogs, will become duplicate content, right? Well yeah, but they randomly generate the messages.

I’ve added a couple of snippets below to demonstrate what I mean, and because they give you some more background on duplicate content:

Duplicate content is the name of a filter applied by search engines to those queries that return two or more relevant pages that are too similar to each other: as an effect, pages considered duplicates, or near-duplicates, of the first relevant ones are excluded from the results. The purpose of that technology is to offer the best possible variety of results. How does this program work? The tool visits two pages that you want to analyze: it compares the two pages in order to look for differences: it gives back a percentual index that outlines how much difference there is between them.

Duplicate content is the name of a filter applied to search engines queries returning two or more relevant pages too similar to each other: as an effect, pages considered duplicates, or near-duplicates, of the first relevant results are excluded. This functionality is designed to serve the best assortment of results. So, how does this tool work? The tool visits the two pages you requested it to examine: it compares the two outputs and looks for differences, then it gives you a percentual index that shows how much difference there is between them.

Some subtle differences, but are they enough to make it original? Well, it seems to get through Google, although you’d think that if lots of people used the randomly generated text, duplicates would eventually pop up around the net.
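To get a feel for how a comparison like this might work, here is a rough sketch in Python. It is only an assumption about the general approach (a word-by-word comparison using Python's built-in difflib); I don't know what the tool actually does under the hood. The function names are made up for illustration, and fetching the two pages is left out, so it simply compares two bits of text taken from the snippets above.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def difference_percent(text_a, text_b):
    """Return how different two texts are: 0.0 means identical wording,
    100.0 means no words matched at all. (Hypothetical helper, not the
    tool's real scoring.)"""
    matcher = SequenceMatcher(None, words(text_a), words(text_b), autojunk=False)
    return round((1.0 - matcher.ratio()) * 100, 1)


# One sentence from each of the two generated snippets.
snippet_1 = "The purpose of that technology is to offer the best possible variety of results."
snippet_2 = "This functionality is designed to serve the best assortment of results."

print(difference_percent(snippet_1, snippet_1))  # identical text gives 0.0
print(difference_percent(snippet_1, snippet_2))  # reworded text gives something in between
```

The lower the number, the closer the two texts are to being duplicates. A real checker would probably also strip out the HTML and compare the whole page, not just a sentence.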

The next question is how much value Google actually places on the mark-up on the page. Hypothetically, you may get 2 sites that use the same template and have both pasted the review text. The plot thickens…