Hi friends, here are some tips for keeping sensitive parts of your website out of search engine indexes. Sometimes you don't want certain posts to show up in search results, and you can ask well-behaved search engine robots to stay away from them by using a robots.txt file. Used effectively, robots.txt can also help you avoid duplicate content penalties from search engines.

robots.txt is a plain text file. It has to sit at the root of the host it applies to, so its URL should look like this:

http://www.example.com/robots.txt

(or)

http://blog.example.com/robots.txt
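If you want to work out where a robot will look for the file, here is a minimal Python sketch. The helper name robots_url and the sample page URL are made up for illustration; the only assumption is the rule above, that the file lives at the root of the host.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://blog.example.com/2010/05/some-post.html?page=2"))
```

Note that each host gets its own file, so www.example.com and blog.example.com are handled separately.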

Syntax of the robots.txt file:

If you want to allow all robots to index all your pages, use this robots.txt file:

User-agent: *
Disallow:

Here User-agent: * means the rules apply to all robots that visit your site to crawl its pages, and the empty Disallow: line means nothing is blocked.

If you want to ban all robots from indexing your site, use this robots.txt file:

User-agent: *
Disallow: /
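You can try both files out without uploading anything. The sketch below uses Python's standard urllib.robotparser, which is only one parser among many, so treat it as a rough check and not as proof of how every search engine behaves. The robot name AnyBot and the page URL are invented for the example.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"
BAN_ALL = "User-agent: *\nDisallow: /\n"

def can_crawl(robots_txt, agent, url):
    """Parse robots.txt text and ask whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

PAGE = "http://www.example.com/page.html"  # hypothetical page
print(can_crawl(ALLOW_ALL, "AnyBot", PAGE))  # empty Disallow blocks nothing
print(can_crawl(BAN_ALL, "AnyBot", PAGE))    # "Disallow: /" blocks everything
```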

To ban a specific robot from indexing your pages, use rules like:

User-agent: Googlebot
Disallow: /
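Here is a small check of that rule, again with Python's standard urllib.robotparser as a stand-in for a real crawler. OtherBot is a made-up robot name used only to show that robots not named in the file are unaffected.

```python
from urllib.robotparser import RobotFileParser

RULES = "User-agent: Googlebot\nDisallow: /\n"

parser = RobotFileParser()
parser.parse(RULES.splitlines())

PAGE = "http://www.example.com/page.html"  # hypothetical page
googlebot_ok = parser.can_fetch("Googlebot", PAGE)
otherbot_ok = parser.can_fetch("OtherBot", PAGE)  # no group matches, so allowed

print("Googlebot allowed:", googlebot_ok)
print("OtherBot allowed:", otherbot_ok)
```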

To ban some pages, such as the /Category directory with all its sub pages (note that paths are case-sensitive):

User-agent: *
Disallow: /Category/
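The directory rule works as a prefix match on the path. A quick sketch with Python's standard urllib.robotparser shows this, including the case-sensitivity; the URLs and the robot name AnyBot are invented for the example, and other parsers may differ in details.

```python
from urllib.robotparser import RobotFileParser

RULES = "User-agent: *\nDisallow: /Category/\n"

parser = RobotFileParser()
parser.parse(RULES.splitlines())

BASE = "http://www.example.com"  # hypothetical site
results = {
    path: parser.can_fetch("AnyBot", BASE + path)
    for path in ("/index.html", "/Category/", "/Category/post.html", "/category/post.html")
}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

The lowercase /category/post.html is not blocked here, because the rule was written with a capital C.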

To allow only a specific robot to index your pages:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

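There is one thing to watch here: the order of the groups. Crawlers that follow the Robots Exclusion Protocol (RFC 9309), Googlebot included, obey the group that matches their own name and fall back to User-agent: * only when no such group exists, so for them the order does not matter. Some simple or older parsers, however, may stop at the first group that matches. To be safe, do not put the specific robot after the Disallow: / group, i.e.

User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:

A parser that stops at the first matching group would read this as a ban on all robots, including Googlebot.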
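To see how one real parser handles the two orderings, here is a sketch using Python's standard urllib.robotparser. It keeps the User-agent: * group as a fallback, so it gives the same answer either way; this only shows what that one parser does, and a crawler written differently could still trip over the second ordering. OtherBot is a made-up robot name.

```python
from urllib.robotparser import RobotFileParser

SPECIFIC_FIRST = "User-agent: Googlebot\nDisallow:\n\nUser-agent: *\nDisallow: /\n"
WILDCARD_FIRST = "User-agent: *\nDisallow: /\n\nUser-agent: Googlebot\nDisallow:\n"

def verdicts(robots_txt, url="http://www.example.com/page.html"):
    """Return (Googlebot allowed, OtherBot allowed) for the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url), parser.can_fetch("OtherBot", url)

print(verdicts(SPECIFIC_FIRST))
print(verdicts(WILDCARD_FIRST))
```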

Many sites run into duplicate content penalties. If the same content is accessible from two or more different URLs, it is said to be duplicate content. If you have /category or /archives directories, there is a good chance of a duplicate content penalty, because those pages repeat your posts. Either show only post excerpts on category and archive pages, or use robots.txt to ban robots from indexing these pages. Your robots.txt file will then look like this:

User-agent: *
Disallow: /category/
Disallow: /archives/
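Before uploading the file, you could run your own URLs through it to make sure the posts stay crawlable and only the listing pages are blocked. This is a rough sketch with Python's standard urllib.robotparser; the function split_urls, the robot name AnyBot and the three URLs are all invented for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /archives/
"""

def split_urls(robots_txt, urls, agent="AnyBot"):
    """Split urls into (crawlable, blocked) lists for the given robot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawlable = [u for u in urls if parser.can_fetch(agent, u)]
    blocked = [u for u in urls if not parser.can_fetch(agent, u)]
    return crawlable, blocked

URLS = [
    "http://www.example.com/my-first-post/",
    "http://www.example.com/category/news/",
    "http://www.example.com/archives/2010/05/",
]
crawlable, blocked = split_urls(ROBOTS_TXT, URLS)
print("crawlable:", crawlable)
print("blocked:", blocked)
```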

Hope these tips help you guys. Thanks for visiting this site.