The number of websites with duplicate content has increased tremendously. More often than you might imagine, this content does not come from other websites but occurs within the same domain. This article looks at what duplicate content is, what canonicalization means, and how to use free tools to deal with the issue. Although this post covers duplicates in general, those on your own website matter more than those off it, according to ranking experts.
By Google’s definition, “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”
On canonical URLs, Google says: “Many sites make the same HTML content or files available via different URLs. […] To gain more control over how your URLs appear in search results…we recommend that you pick a canonical (preferred) URL as the preferred version of the page. You can indicate your preference to Google in a number of ways. We recommend them all, though none of them is required (if you don’t indicate a canonical URL, we’ll identify what we think is the best version).”
There are different kinds of duplicate content issues on a website. First, there are duplicates that affect the whole domain: the same page reachable with or without www, with or without a trailing slash (/), with or without a file name, and so forth. The best fix for this kind is to implement 301 permanent redirects from every other variant to the preferred version of the page.
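To make the idea concrete, here is a minimal Python sketch of the decision a redirect rule has to make. It is only an illustration: the host names, the list of index file names, and the choice of "www plus trailing slash" as the preferred form are all assumptions, and on a real site this logic usually lives in the web server or CMS configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the preferred host uses "www",
# directory-style URLs end with "/", and these names are index files.
PREFERRED_HOST = "www.example.com"
BARE_HOST = "example.com"
INDEX_FILES = ("index.html", "index.htm", "index.php")


def preferred_url(url: str) -> str:
    """Map a known variant of a URL to the single preferred version."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Fold the bare domain into the preferred "www" host.
    if host == BARE_HOST:
        host = PREFERRED_HOST
    path = parts.path or "/"
    # Drop a default file name so /index.html and / become one URL.
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Add a trailing slash to directory-style paths (no file extension).
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def redirect_for(url: str):
    """Return (301, target) if the URL is not the preferred version, else None."""
    target = preferred_url(url)
    return (301, target) if target != url else None


print(redirect_for("http://example.com/about"))
print(redirect_for("http://www.example.com/about/"))
```

The first call reports a 301 to the www, trailing-slash version, while the second returns None because the URL is already the preferred one.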
The other source of duplicates is dynamic URL parameters set by the CMS used to build the site. Joomla is one of the worst CMSs for this issue because of the way its URLs are formed.
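As a rough sketch of how parameter-driven duplicates can be collapsed, the Python snippet below drops query parameters that are assumed not to change the page and sorts the rest, so that differently ordered URLs compare equal. The parameter names in IGNORED_PARAMS are hypothetical; which parameters are safe to ignore depends entirely on your own CMS.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list: parameters assumed not to change what the page shows.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def strip_ignored_params(url: str) -> str:
    """Remove ignorable query parameters and sort the remaining ones."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Sorting makes ?a=1&b=2 and ?b=2&a=1 resolve to the same URL.
    kept.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_ignored_params("http://www.example.com/page?sid=abc&id=7&utm_source=mail"))
```

Running every crawled URL through a function like this and grouping by the result is one simple way to see how many addresses actually point at the same page.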
Finding duplicate content on the website
Because duplicate content takes several forms, it is rarely practical to look for it by hand, and several automated methods can help you get started. The duplicate content checker PlagSpotter is one tool you can use to detect duplicate content on your website, which in turn helps you optimize better for search engines.
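For readers curious about what an automated check can look like, here is a small Python sketch that compares two texts using word shingles and Jaccard similarity. This is a generic textbook technique chosen for illustration, not a description of how PlagSpotter or any search engine works, and the shingle size of 3 is an arbitrary assumption.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts are treated as one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a or not set_b:
        return 0.0  # nothing to compare
    return len(set_a & set_b) / len(set_a | set_b)


page_one = "Duplicate content can appear on the same domain"
page_two = "Duplicate content can appear on a different domain"
print(round(similarity(page_one, page_two), 2))
```

A score close to 1.0 suggests two pages are near copies of each other; where to draw the line is a judgment call for your own site.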
How to fix duplicate content on the website
There are different techniques you can use to fix duplicate content on a website.
- Setting the preferred version of the website’s domain is the most effective way to deal with duplicate content on the website. This means that you expressly tell the search engines which version of the domain, www or non-www, you prefer to have indexed.
- Using Google Webmaster Tools to set the preferred version of the website is another way to solve this problem.
- Using canonical tags (a link element with rel="canonical" in the page’s head) lets the search engines know which URLs to index or credit with authority. The advantage of canonicalization is that it is quite easy to implement, and there are several ways to do it on different CMS platforms, whether WordPress, Joomla or CMS Made Simple.
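
To check whether a page already declares a canonical URL, a short script can read the tag from its HTML. The sketch below uses only Python's standard library; the sample page and the example.com address are made up for illustration, and a real check would first download the page.

```python
from html.parser import HTMLParser

# Made-up sample page showing what a canonical tag looks like.
EXAMPLE_PAGE = """
<html>
  <head>
    <title>Sample page</title>
    <link rel="canonical" href="http://www.example.com/page/">
  </head>
  <body>Content</body>
</html>
"""


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


print(find_canonical(EXAMPLE_PAGE))
```

If the function returns None for a page that has several URL variants, that page is a good candidate for adding a canonical tag.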