Over The Top SEO

OTT Blog

The Truth About Duplicate Content

Marcel Casella August 12, 2016

The truth about duplicate content and its effects on your SERPs

Much has been said about duplicate content and how search engines deal with it, yet many misconceptions and myths still surround the issue. Google itself has published comprehensive guidance on the matter, but some clarification is still necessary. Let’s start with the basics: what exactly is duplicate content? Our friends at Google define it as follows:

“(…) substantive blocks of content within or across domains that either completely match other content or are appreciably similar”

Of course, this kind of content is inevitable on the internet. Think about discussion forums, where blocks of content are copied and republished every second; store items linked from multiple URLs and listed again and again; and even printer-only versions of web pages. If Google penalized every site that published this kind of content, roughly 20-30% of the internet would suffer from an SEO perspective. However, that simply doesn’t happen, and the reason is rather simple:

Google does not penalize your SERPs for duplicate content

That’s right. There is nothing wrong with sharing an article from another website from time to time, as long as your intentions are not deceptive or malicious. But how does Google figure out who is trying to game the system?

What Google actually does is create clusters of similar content, then show you the most relevant result depending on your language, location, and search patterns. Seeing this, one might conclude that a site is being “punished” for duplicate content. However, as Matt Cutts, the head of Google’s web spam team, explained in a public statement, Google’s intention when building a SERP is to offer a diversified range of websites, not the same one over and over.

Nevertheless, duplicate pages are not removed from the index. Google even adds a link saying something like “we have omitted some very similar entries…”, giving you the opportunity to check the other, similar results. So we can conclude that duplicate content is not a fault per se. Nonetheless, there are some related practices that are in fact punishable.

When will duplicate content get you blacklisted and unindexed?

For Google, everything is about intent. If you use unoriginal content for a fair reason, there is nothing to fear; as stated before, it will neither benefit nor damage your SERPs. Things get ugly when you try to use other people’s content in a deceptive, manipulative way, and Google is getting very good at detecting such practices.

Sometimes content is duplicated on purpose across domains to influence search engine rankings or attract more traffic. Dishonest actions like these usually result in a low-quality user experience. Seriously, how useful is it to see a first SERP full of the same result?

If Google detects that duplicate content is being used this way, it will adjust the ranking and indexation of the sites involved. They usually get demoted or even removed from the index entirely, preventing them from being shown at all.

Recommendations to avoid problems related to duplicate content

Even if you are honest and don’t use duplicate content to manipulate your SERPs, some issues can still make your rankings suffer. Practices that generate repetitive blocks of text, like including boilerplate content, can be misleading and may cause some of your pages to be filtered.

Boilerplate content refers to blocks of text that can be reused with little or no change in different contexts or applications. Examples include the copyright text at the bottom of every page on your website, or information about your location, phone number, or e-mail. Google is still improving the way it deals with boilerplate text, but it stresses that the practice should be avoided: repetitive blocks of text can confuse the crawler and cause it to ignore your site completely.

Also, automatically generated content, automatic translations, and duplicated content may be labeled as “thin content with no added value” and filtered out or unindexed, so make sure to stop publishing this sort of content.

Some ideas that might help your SERPs

If you are sharing an article, use the rel="canonical" tag to tell search engines where the original content is. This way you can republish as much as you like without fear of being filtered down due to excessive duplicate content.
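As a quick sketch, a republished article can point back to the original by placing a canonical link element in its <head> — the URLs below are placeholders, not real sites:

```html
<!-- On the page that republishes the article (e.g. yoursite.example/shared-article),
     tell search engines which URL is the original source.
     Crawlers will consolidate ranking signals onto the canonical URL. -->
<head>
  <link rel="canonical" href="https://originalsite.example/original-article" />
</head>
```

The tag goes in the <head> of the duplicate page, not the original; the href should be the full, absolute URL of the source article.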

Create a “creative” copy instead of just republishing content: make a different version of the article, but add more value and updated information. This way you will create something original that can even help your SEO!