Have Duplicate Content? It Can Kill Your Rankings
You did your research and your website was an original idea. You wrote all your own content, so it’s 100% original. You must be the only one with these specific posts on a specific topic. Therefore, you should never have an issue with duplicate content, right?
Not necessarily. Duplicate content could be killing your rankings without your knowledge.
Your objective in this chapter is to understand what duplicate content is, how to locate and reduce it on your site, and how to locate it on other sites.
What is Duplicate Content?
“Duplicate content” is exactly what it sounds like: content that exists on more than one webpage. For the purposes of this article, we are going to focus on text as duplicate content.
We see this issue in a few different scenarios:
- A Templated Website (typically bought from vendors that only work in specific niches)
- Copying and Pasting Content on Your Site
- Improper SEO Plugin Setup
- People Stealing Your Content
In order to understand why duplicate content kills pages, we first need to understand how search engines work. Imagine all the work search engines do. They scour and keep track of trillions of webpages. In order to save space, they need to decide which pages to pay attention to… and which pages to ignore.
Think of it as going to a bookstore and picking up a book off the shelf. You read the first page and think to yourself, “Wow! This is going to be a great book!” Then, you turn to the next page and see the exact same thing reprinted. You keep flipping, and the page is reprinted over and over. You decide not to buy the book! In the same way, these search engines get “annoyed” when they read duplicate content.
In order to rank higher, you must first find and eliminate duplicate content.
How to Find Duplicate Content on Your Site
Many people who write all of their own content truly believe they have no duplicate content. However, they are typically the ones with the biggest duplicate content issues.
WordPress, by default, shows each new post at more than one URL: besides the post itself, the same text can appear on category, tag, author and date archive pages. You must go in and mark these extra pages as “noindex,” which tells the search engines to ignore them and removes the duplicate content issue.
Step 1: Make sure your duplicate versions are marked properly so the search engines will ignore them. You can do this manually, but we recommend the easier route of using a properly configured plugin to remove those duplicates.
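If you want to confirm that a page really carries the “noindex” instruction, here is a minimal Python sketch that looks for a robots meta tag in a page's HTML. The function name and the two sample pages are made up for illustration, and it only checks the meta tag, not any other way a plugin might signal “noindex.”

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag contains "noindex"."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical sample pages: a tag archive marked noindex and a normal post.
archive_page = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>Tag archive</body></html>"
)
post_page = "<html><head><title>Post</title></head><body>Original post</body></html>"

print(is_noindex(archive_page))
print(is_noindex(post_page))
```

In practice you would feed it the HTML source of your own archive pages and check that the duplicates report as “noindex” while your real posts do not.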
Step 2: Find other blocks of duplicate text. With a small site you can usually go through page by page – but we prefer an easier method.
A great resource we recommend is a website called Siteliner, which is free for websites under 250 pages. Scanning a larger site typically costs less than $1.
Go to the home page at Siteliner, enter in your website and click “Go!”
It will scan through your site and present you with a report of potential issues.
Below that you will see a quick report of your site showing how many pages were scanned and how many were skipped because of redirects or “noindex” tags. Underneath that you will see the header “Your Duplicate Content.”
This is where you want to focus! The goal is to decrease the numbers in “Duplicate Content” and “Common Content” as much as possible. Lowering these to 0% is rarely feasible, but aim for it anyway!
Reducing Common Content
Typically the “Common Content” is headers, menus, sidebars, addresses, telephone numbers and anything else that is repeated across most pages of the site. Most of this cannot be changed on every page. Therefore, the best way to reduce this number is by adding unique content to the pages on the site. One way to reduce repeated text, such as long business descriptions, is by taking a screenshot of the information and replacing the text with that image.
Reducing Duplicate Content
The “Duplicate Content” section is where you should focus your attention.
Common situations that increase the duplicate content number:
- Sites run like true blogs with excerpts of articles on the homepage. (As long as new content is consistently added, this should not create a big issue for your website).
- Sites from companies that have pages for every city in their area, but are reproducing the same content on each page. This can cause huge penalties.
- Category pages, tag pages and author archives that are reproduced frequently
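To get a feel for how near-identical pages (like the city pages above) can be spotted, here is a rough Python sketch that compares pages by the three-word phrases they share. This is our own simplified illustration, not how Siteliner or any search engine actually measures duplication. The URLs, the page text and the 0.5 threshold are all made-up assumptions.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set:
    """Break text into overlapping word sequences ("shingles") of a given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def find_near_duplicates(pages: dict, threshold: float = 0.5) -> list:
    """Return (url_a, url_b, score) for every pair at or above the threshold."""
    flagged = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            flagged.append((url_a, url_b, round(score, 2)))
    return flagged


# Hypothetical pages: two city pages that differ by one word, plus an About page.
pages = {
    "/plumber-austin": "We offer fast, reliable plumbing repair in Austin. "
                       "Call our licensed plumbers today for a free estimate on any job.",
    "/plumber-dallas": "We offer fast, reliable plumbing repair in Dallas. "
                       "Call our licensed plumbers today for a free estimate on any job.",
    "/about": "Our family started this business in 1985 with one truck "
              "and a commitment to honest work.",
}

flagged = find_near_duplicates(pages)
print(flagged)  # [('/plumber-austin', '/plumber-dallas', 0.71)]
```

Swapping only the city name leaves most of the phrases identical, so the two city pages are flagged while the About page is not.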
What’s the solution?
You can either re-write the pages or combine them into one page on which to focus your efforts. Once you have done so, scan the site again; you should find little or no duplicate content. If any remains, fix it as soon as possible. For the purposes of this post, we’re just covering basic Siteliner usage, but we encourage you to browse the rest of your data for other trends that may be affecting your site.
How to Find Duplicate Content on Other Sites
The other major duplicate content issue is when the content on your website is also found on other websites. This can happen for various reasons. Fixing it is crucial.
Three common scenarios for duplicate content on other sites include:
- You purchased your site from a company that did not produce original content for your site
- Someone stole content from your site
- You copied content from someone else’s site
There are two ways to discover duplicate content on other sites:
- Copy pieces of your content, paste them into major search engines (with quotation marks) and click “search.”
- Use a paid account subscription to Copyscape.com
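For the first method, here is a small Python sketch that pulls a few full sentences out of a page and wraps them in quotation marks, ready to paste into a search engine. The function name and the word limits are our own assumptions. Pick whatever sentence lengths work for your content.

```python
import re


def exact_match_queries(text: str, max_queries: int = 3,
                        min_words: int = 8, max_words: int = 20) -> list:
    """Turn the first few reasonably long sentences into quoted search phrases."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    queries = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) < min_words:
            continue  # very short sentences match too many unrelated pages
        # Drop straight quotes inside the phrase so they don't break the query.
        phrase = " ".join(words[:max_words]).replace('"', "")
        queries.append(f'"{phrase}"')
        if len(queries) == max_queries:
            break
    return queries


# Sample text: one sentence long enough to search for, two that are too short.
sample = (
    "Short one. "
    "Duplicate content can be killing your rankings without your knowledge. "
    "Too short."
)

for query in exact_match_queries(sample):
    print(query)
```

Each printed line is a phrase in quotation marks, which asks the search engine for that exact wording rather than pages that merely contain the same words.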
Step 1: Go to Copyscape.com. Click “Premium” (just below the search box). If you don’t already have an account you will need to create one and fund the account initially ($10 at the time of this writing). Log in to your account.
Step 2: Enter URLs. If you have a list of all of your URLs, simply enter them in the box and click “Premium Search.” If you do not have a list, click “Batch Search” just below the search box. On the following page click “Get Started.”
Step 3: Review pages.
- Enter the homepage URL in the top box (leave the middle box blank), then enter the sitemap URL in the bottom box
- Check the box: “Also add pages on my site linked from this page”
- Click “Add URLs”
- Click “View/Change URLs” once the page reloads
- Check to make sure no important pages were missed.
- Click “Ok” and “Done” on the following page.
- You will be asked to confirm that you want to scan these pages and you will be informed about the charge. If you approve, click “Start Batch.”
- On the following screen, you will see a status. Small sites: Click “Update” and “See Results.” Larger sites: these may take more time; watch for an e-mail from Copyscape that the scan has completed and look for results.
Example of a check:
As you can see, there are two red bars that show there were two matches on two different pages for content from my website.
When I clicked on the numbers, I was able to see what other websites have the same content.
You’ll have to do some reviewing and make your own judgement call about whether or not you feel the content has been stolen.
What do you do if there is stolen content?
You essentially have 3 options.
Option 1: Request that they take down the stolen content. This is always a good idea, but does not always work.
Option 2: Send a letter from an attorney. Unfortunately, this may cost more than it’s worth (depending on the stolen content).
Option 3: Re-write your content so it is original. Of course, this is a pain, but it helps you resolve the issue the fastest.
Note: A common issue occurs when people copy content from their site and submit it as a press release. Never do this! If you have already done so, it’s better to re-write the content on your site; press releases often end up on many more sites than Copyscape reveals.
If there are still unresolved instances of duplicate content, you may need to do a little more investigation and troubleshooting. Hopefully the main issues have been resolved!
By now, you’ve met your objective of understanding what duplicate content is, how to locate and reduce it on your site, and how to locate it on other sites.
If you have questions, leave them in the comments below. If you enjoyed this and found it useful, by all means, share it in all the usual places!