
What duplicate content is and how it ruins your rankings

Guy Sheetrit, June 9, 2017

Duplicate Content Affects Rankings!

Duplicate content is identical or largely similar content that appears on more than one URL. Such pages waste crawl budget and, as a result, can cause a significant drop in rankings. You might think this definitely doesn’t apply to your site, but don’t jump to conclusions.

In this article, I’ll show you where duplicates hide and how to identify and fix them, so the spider can focus on your more important pages.

Where duplicates are hidden

There are many reasons why duplicates appear. Here are the most common causes:

Small differences like the "www" and non-"www" versions

Your site may be accessible at both its www and non-www versions, or over both http and https. There may also be versions with and without a trailing slash at the end of the URL. As a result, you have two identical websites, with every page duplicated.
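
For example, all of the following URLs (example.com stands in for your domain here) could return the same page, yet a search engine may count them as four separate documents:

http://example.com/about/
http://www.example.com/about/
https://example.com/about/
https://www.example.com/about/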

Filters and sorting

If you use a filter on the site, the results are generated on a separate page with a dynamic URL. This means that combinations of different filters and sorting parameters create numerous automatically generated pages, which commonly end up as duplicates.
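
As an illustration (the domain and parameter names below are invented), a single category page can spawn URLs like these, all showing essentially the same product list:

https://shop.example.com/dresses/
https://shop.example.com/dresses/?color=green
https://shop.example.com/dresses/?color=green&sort=price_asc
https://shop.example.com/dresses/?sort=price_asc&color=green

Note that the last two URLs differ only in parameter order, but they still count as two distinct addresses.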


Pagination

Pagination also creates a duplicate issue, since the titles and descriptions of all the paginated pages are the same. Read about how to handle it correctly at the end of the article.
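
For instance, hypothetical paginated URLs like these (the domain is a placeholder) often share a single title and description:

https://shop.example.com/dresses/?page=1 → <title>Dresses | Example Shop</title>
https://shop.example.com/dresses/?page=2 → <title>Dresses | Example Shop</title>
https://shop.example.com/dresses/?page=3 → <title>Dresses | Example Shop</title>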


How duplicates affect rankings

Wasted crawl budget

Crawl budget is the number of URLs Googlebot can crawl on your site over a certain period, and it is limited. You can either put up with that limit or try to make better use of it. One way is to delete or hide all the duplicate pages on your site and let the spider index important pages instead of useless duplicates.

Reduced visibility

Since search engines want to provide the most relevant information, they try not to show identical pages in the results. The engine will likely pick only one of your duplicates to display, so the visibility of each duplicate is lowered.

Divided backlinks

If the same article is available at two different URLs, backlinks and shares will be divided between the two copies, as some readers link to the first URL and others to the second. For example, 100 backlinks split 60/40 between two copies strengthen neither page as much as 100 links to a single URL would, so both pages rank lower.

How to identify duplicate content

Now that we’ve covered how duplicates affect rankings, let’s identify all the duplicates on your site so you can fix them or hide them from the search engine.

You can do this manually, or you can use tools; it depends on the size of your site and the number of affected pages. Small issues can be fixed by hand in a few minutes, but if you’re not sure, don’t waste your time and use dedicated tools.

Manually

If your website is quite small, you can use the following search operator to find duplicates.

Use site:yourwebsite.com to get a list of all your site’s pages indexed by Google.


After that, you can check the results manually. Again, there’s no sense in using this approach for huge platforms. It fits best when your site is already optimized and you just want to look over your main pages to see whether any duplicate issues have appeared lately.

You can also check specific pages for duplicates using the following operator: site:mysite.com intitle:"the title you’re checking".

Be sure to click “repeat the search with the omitted results included” at the bottom of the SERP. Without it, Google will show you only unique pages.

Google Search Console

Go to the Search Appearance section and click HTML Improvements. There you can find Duplicate meta descriptions and Duplicate title tags.


Unfortunately, these are the only duplicate types the console can show. This method helps you check whether duplicates exist on your site, but it isn’t suitable for a deep investigation.

Serpstat site audit

Serpstat is an all-in-one SEO platform with five modules:

  • Keyword Research
  • Competitive Analysis
  • Backlink Analysis
  • Rank Tracking
  • Site Audit

Create a project and set the audit parameters you need. You’ll see a list of errors grouped both by error type and by priority level. Go to the Meta tags section of the Audit module to see the list of pages that have identical title or description tags.

You can also open the detailed report to see exactly which pages breed duplicates.

Serpstat is one of the best options because it’s a cloud-based platform: you can access the audit results from anywhere, and you don’t have to run anything on your computer. Plus, it shows all of the SEO errors on your website, not just duplicate issues.

How to fix the duplicate content problem

‘Fixing’ the duplicate content problem can mean one of three things:

  • remove the unnecessary pages
  • hide such pages from the search engine
  • point the search engine to the main pages

Here are the most common methods to do it:

  • set a 301 redirect

This applies to small differences like the www and non-www versions, http and https, and URLs with and without "/" at the end.

You can show the search engine which page is the main one by setting a 301 redirect from the duplicate page to the original. This way, these pages won’t be treated as duplicate content, because the robot will always be redirected to the main page.
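
Here is a minimal sketch of such redirects, assuming an Apache server with mod_rewrite enabled and www.example.com as a placeholder for your preferred version (adapt it to your own setup):

# .htaccess: send http and non-www requests to the https www version with one 301 redirect
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

After this, every request to a duplicate variant lands on a single canonical version of the page.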

An alternative is to choose your preferred domain (with or without www) in Google Webmaster Tools. Just remember that everything you set in Google Webmaster Tools works only for Google.


  • use the rel="canonical" tag

It’s useful when you deal with sorting and filter pages. You can’t simply remove them, but the robot considers all of these pages duplicates. Since an online clothing store usually has hundreds of different kinds of dresses, imagine the amount of crawl budget wasted on these pages.

To avoid this problem, use the rel="canonical" tag. When the crawler visits these pages, it understands that the category page is the preferred one and that there is no point in indexing the hundreds of others.

Here is how it looks. Add

<link rel="canonical" href="https://blog.example.com/dresses/green-dresses/" />

to the <head> of the page

https://blog.example.com/dresses/green-dresses/?sort_min_price

  • use meta robots

It fits pages that you don’t need the robot to index (for example, the shopping basket or printer-friendly pages). It allows search engines to crawl a particular page without indexing it.

Here is how it looks:

<meta name="robots" content="noindex, follow">

You can also use the SeoHide tool to keep the robot from indexing such pages.

  • set rel="prev" and rel="next" tags for pagination

Use rel="prev" and rel="next" tags to help Google understand that these pages are pagination, not duplicates. The rel="prev" tag points to the previous page, and rel="next" points to the next one.

Here is how it should look.

In the <head> of http://site.ru/category/ (the first page):

<link rel="next" href="http://site.ru/category/2/">

In the <head> of http://site.ru/category/2/:

<link rel="prev" href="http://site.ru/category/">
<link rel="next" href="http://site.ru/category/3/">

Final thoughts

Duplicate content can be the reason for a rankings drop, so this issue is definitely worth your attention. You may not see a dramatic decline, but duplicates can explain why your rankings are stuck in the same position or slowly dropping.

In this article, I covered the basics of the duplicate content issue. There are many more causes, consequences, and ways to fix it, but I hope I managed to show you how important this problem is, so you can check and improve your site using my recommendations.