To succeed in organic search, you need to publish large quantities of original, fresh, and truly exceptional content. The challenge is always to make this process scalable without sacrificing quality.
While best practice is to develop a unique piece of content for each page on your website, modern Content Management Systems (CMS) can cause duplicate versions of the same content to be deployed across different URLs. Another issue arises when brands decide to ‘re-purpose’ great content across different websites. For search engines, this presents a big problem: which version of the content should they show to searchers? In SEO circles, this issue is referred to as duplicate content.
Duplicate Content 101: Background & Terminology
Duplicate content is the same content appearing on the Internet in more than one place (URL). This is a problem because when more than one identical piece of content exists, it is difficult for search engines to decide which version is more relevant to a given search query. To provide the best search experience, search engines rarely show multiple copies of the same content on the same search results page. As a result, they are forced to choose which version is most likely to be the original. Search engines tend to favor the site that published the content first, so it is important for brands to claim content ownership (publish first) before distributing the content across the Internet (for example, publish a press release on your own site before sending it out via PR distribution services).
Canonicalization is closely related to duplicate content. A canonicalization problem arises when two or more duplicate versions of a webpage appear on different URLs within the same site; canonicalization itself is the process of indicating which of those URLs is the preferred one. This situation is very common with modern CMSs. For example, your website may offer a regular version of a page and a “print optimized” version of the same content.
Duplicate Content SEO Solutions for SMBs
When deploying content on the Web, claim ownership of all of your content strategically and try to avoid duplicating it across webpages and websites. Your content strategy ought to emphasize one original piece of content per URL: Google and other search engines tend to assign the ‘SEO credit’ to the ‘content originator’ (i.e. the site that first published it).
If you are a large brand with multiple branded and unbranded web properties and want to distribute one piece of content across several sites (e.g. a video or a press release), you need to decide which web page will take the lead SEO role for it and publish it there first. In other words, if content is to be duplicated between multiple sites (not ideal from an SEO perspective, but occasionally re-purposing great content can help with scale), decide which property ought to claim ownership (i.e. be the first to publish). Brands should make such decisions based on their holistic SEO strategy and keyword-to-URL mapping.
While Google will most likely not penalize either website for occasional content re-purposing, excessive content duplication can have a negative impact on SEO. Google may treat two sites with identical content as spam and as a bad user experience. In general, if you have many websites, each should provide unique content and a unique user experience to rank well in organic search.
Whenever content on a site (same domain) can be found at multiple URLs (due to CMS issues such as URL parameters, printer-friendly versions, or session IDs in the URL), it should be canonicalized for search engines. This can be accomplished with a 301 redirect to the correct URL, with the rel="canonical" link element, or in some cases with the parameter handling tool in Google Webmaster Tools.
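To see which URLs on a site collapse to the same page, it can help to normalize them first. The following is a minimal Python sketch, not a definitive implementation: the parameter names in IGNORED_PARAMS and the example.com URLs are assumptions for illustration and would need to match whatever your CMS actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
# Adjust this set to match the parameters your own CMS generates.
IGNORED_PARAMS = {"sessionid", "sid", "print", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Return a normalized form of url with content-neutral parameters removed."""
    parts = urlsplit(url)
    # Keep only the query parameters that can change what the page shows.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # a stable order, so ?a=1&b=2 and ?b=2&a=1 compare equal
    path = parts.path or "/"
    # Lowercase scheme and host, and drop the fragment, which servers never see.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), ""))


if __name__ == "__main__":
    for raw in (
        "http://Example.com/page?sessionid=abc&b=2&a=1",
        "http://example.com/page?print=1",
    ):
        print(raw, "->", canonical_url(raw))
```

URLs that normalize to the same string are candidates for a redirect or a canonical annotation pointing at that normalized URL.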
301 Redirect: In many cases the best way to combat duplicate content is to set up a 301 (permanent) redirect from the “duplicate” page to the original content page. When multiple pages with the potential to rank well are combined into a single page, they no longer compete with one another, and together they create a stronger relevancy and popularity signal. This improves the combined page’s ability to rank well in the search engines.
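As an illustration of the redirect approach, here is a small Python sketch of a WSGI application that answers requests for duplicate paths with a 301. The REDIRECTS mapping and its paths are hypothetical; in practice redirects are often configured in the web server or CMS rather than in application code.

```python
# Hypothetical mapping from duplicate paths to the original content page.
REDIRECTS = {
    "/print/widgets": "/widgets",
    "/widgets/index.html": "/widgets",
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for known duplicates, 200 for everything else."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 signals a permanent move, so search engines can consolidate
        # the duplicate URL into the target URL.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"original content"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve locally for a quick manual check, e.g. http://localhost:8000/print/widgets
    with make_server("", 8000, app) as server:
        server.serve_forever()
```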
rel="canonical": Another option for dealing with duplicate content is the rel="canonical" link element. It is generally understood to pass a similar amount of link value (ranking power) as a 301 redirect, and it often takes much less development time to implement. The element goes in the HTML head of the duplicate page and tells search engines that the page should be treated as a copy of another URL, which is treated as the content originator.
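For reference, the element itself is a single line in the page head. The Python sketch below builds it for a given URL; the function name canonical_link and the example.com address are illustrative assumptions, and a real CMS would normally emit the element from its page template.

```python
from html import escape


def canonical_link(url: str) -> str:
    """Build the link element that goes in the <head> of a duplicate page."""
    # Escape the URL so characters such as & and " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


if __name__ == "__main__":
    # The print-optimized version of a page would point at the regular version.
    print(canonical_link("https://www.example.com/widgets"))
```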
noindex, follow: The meta robots tag with the value “noindex, follow” can be placed on pages that shouldn’t be included in a search engine’s index. It allows search engine bots to crawl the links on the page but keeps the page itself out of the index. This works particularly well for pagination issues.
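A rough Python sketch of how this might be applied to pagination follows. The policy it encodes, indexing only the first page of a paginated listing, is an assumption made for illustration rather than a rule stated above, and robots_meta is a hypothetical helper name.

```python
def robots_meta(page_number: int) -> str:
    """Return the robots meta tag for a paginated listing page, or "" if none.

    Assumed policy (illustrative only): page 1 is indexed normally, while
    later pages stay crawlable but are kept out of the index.
    """
    if page_number > 1:
        return '<meta name="robots" content="noindex, follow">'
    return ""


if __name__ == "__main__":
    for page in (1, 2, 3):
        print(page, repr(robots_meta(page)))
```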
Parameter Handling in Google Webmaster Tools: Google Webmaster Tools allows you to set the preferred domain of your site and to specify how various URL parameters should be handled. The main drawback of these methods is that they only work for Google; changes you make here do not affect Bing or any other search engine.
Avoid content duplication issues from the outset. If your CMS is causing duplication for the reasons mentioned above, 301 redirects and rel="canonical" are the most commonly used methods for consolidating content ownership and improving Google rankings. Aim to expand the content on each page with unique text and multimedia rather than re-purposing the same content across various pages and domains. If you do decide to duplicate content between multiple websites or web pages, base content ownership decisions on strategic keyword mapping, and do not do it excessively. Google and other search engines may treat two sites with identical content as spam and as a bad user experience, which can translate into negative effects on SEO.