Many people overlook the possibility of duplicate content pages on their website, even though the problem is very common and easily avoidable. All it takes is a little HTML in the <head> of the page.
What is a duplicate content page?
Duplicate content is content that can be reached on the Internet through more than one URL. In other words, a page like aboutus.php might return the same content as aboutus.php?do=send.
This causes problems for search engines, which have to decide which URL to use for query matches and page ranking, which version's meta tags to use, and which version to include in their indices.
How do you avoid duplicate content?
Let’s say several URLs on your site return the same content, for example post.php?id=$id requested with extra parameters such as lang=en or do=login.
On post.php we can include a <link> tag that points to the original (canonical) URL, so search engines take their information from that one page rather than trying to decide between several duplicates.
Place this in the <head> of post.php:
<?php echo '<link href="./post.php?id=' . (int) ($_GET['id'] ?? 0) . '" rel="canonical" />'; ?>
This casts the id parameter to an integer and writes it into the canonical URL as just post.php?id=$id. URLs that carry other parameters, such as lang=en or do=login, are then no longer treated as separate duplicate pages; instead they tell search engines to use the content of post.php?id=$id.
If you do not need PHP and every variant should simply point to post.php, use the tag like so:
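As a slightly more defensive variant of the one-liner above, here is a minimal sketch that wraps the same idea in a function. The name canonical_link_tag is hypothetical (it is not a PHP built-in), and the sketch assumes that a request without an id should simply point at post.php.

```php
<?php
// Hypothetical helper: builds the canonical <link> tag for post.php.
// Only the "id" parameter is kept; every other key in $query is ignored.
function canonical_link_tag(array $query): string
{
    $href = './post.php';
    if (isset($query['id']) && is_scalar($query['id'])) {
        // Cast to int so arbitrary input never reaches the markup.
        $href .= '?id=' . (int) $query['id'];
    }
    // Escape the URL before placing it inside the href attribute.
    return '<link href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '" rel="canonical" />';
}
```

In the <head> of post.php this could be called as echo canonical_link_tag($_GET); so that lang=en, do=login, and any other extra parameters never appear in the canonical URL.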
<link href="./post.php" rel="canonical" />
When using this method, be consistent with your canonical URLs throughout your website. If your canonical URL is “http://www.bgallz.org/”, then use “www” in every URL reference and tag for the domain.
Now, if you have pages that you do NOT want included in search engine results, and whose links you do not want search engine bots to follow, use a META tag with “noindex, nofollow”.
To keep a page out of search engine indexes, place the following in the <head> tags:
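One way to keep the host consistent is to build every canonical URL from a single constant. The sketch below is only an illustration: CANONICAL_ORIGIN, canonical_url, and canonical_tag are made-up names, and the origin is the example domain from the paragraph above. It also produces absolute URLs, which Google's documentation recommends over relative paths for canonical tags.

```php
<?php
// Assumed preferred origin, taken from the example above; adjust for your site.
const CANONICAL_ORIGIN = 'http://www.bgallz.org';

// Hypothetical helper: joins the preferred origin with a site-relative path.
function canonical_url(string $path): string
{
    return CANONICAL_ORIGIN . '/' . ltrim($path, '/');
}

// Hypothetical helper: wraps the absolute URL in a canonical <link> tag.
function canonical_tag(string $path): string
{
    $href = htmlspecialchars(canonical_url($path), ENT_QUOTES, 'UTF-8');
    return '<link href="' . $href . '" rel="canonical" />';
}
```

Because the origin lives in one place, every tag uses the same scheme and the same “www” form of the domain.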
<meta name="robots" content="noindex, nofollow" />
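If only some pages should carry this tag, it can be emitted conditionally. This is a rough sketch under the assumption that each page knows whether it should be blocked; robots_meta_tag is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: returns the robots <meta> tag for pages that should
// stay out of search results, or an empty string for pages that may be indexed.
function robots_meta_tag(bool $blockPage): string
{
    return $blockPage
        ? '<meta name="robots" content="noindex, nofollow" />'
        : '';
}
```

A page that should be blocked would then place echo robots_meta_tag(true); inside its <head> tags.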