Robots file

Friday, July 25, 2008

The robots.txt file is a plain ASCII text file that tells search engine robots which content on your site they are not permitted to index. These instructions shape how a search engine indexes your site's pages. The file always lives at the same address on a site, for example: www.domain.com/robots.txt. It is the first file a robot requests when it visits a site; the robot reads the indexing instructions there and follows them. Each rule uses two text fields. Let's study this example:

User-agent: *
Disallow:

The User-agent field names the robot to which the access policy applies, and the Disallow field lists the URLs that robot may not access. In the example above, the empty Disallow field means no URL is off limits, so every robot may index the entire site. Another example:

User-agent: *
Disallow: /

Here "*" means all robots and "/" means all URLs. This is read as, "No access for any search engine to any URL" Since all URLs are preceded by "/ " so it bans access to all URLs when nothing follows after "/ ". If partial access has to be given, only the banned URL is specified in the Disallow field.

posted by Alenjoe @ 12:32 PM
