Robots.txt
WWW robots (also called wanderers or spiders) are programs that traverse many pages on the World Wide Web by recursively retrieving linked pages. To exclude robots from all or part of a server, create a file on the server that specifies an access policy for robots.
This file must be accessible via HTTP on the local URL "/robots.txt".
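Because the file always sits at the root of the host, a robot can work out where to look from any page URL on that server. A minimal sketch in Python, assuming a hypothetical helper name robots_url and a placeholder host www.example.com:

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the server hosting page_url."""
    # An absolute path replaces the whole path (and query) of the base URL,
    # keeping only the scheme and host.
    return urljoin(page_url, "/robots.txt")


# www.example.com is a placeholder host used only for illustration.
print(robots_url("http://www.example.com/webpromotion/list/index.html"))
# prints http://www.example.com/robots.txt
```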
Meaning of *
If the value of the User-agent field is '*', the record describes the default
access policy for any robot that has not matched any of the other records.
For example, to disallow everything in the directory /webpromotion/list/ and the individual file /secret.html for all robots, use the following record:
Example
User-agent: *
Disallow: /webpromotion/list/
Disallow: /secret.html
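One way to check what a record like this permits is Python's standard urllib.robotparser module. The sketch below parses the rules from a string instead of fetching them from a server; the robot name "ExampleBot" and the test paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The record from the example above, held in memory for testing.
RULES = """\
User-agent: *
Disallow: /webpromotion/list/
Disallow: /secret.html
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "ExampleBot" is a hypothetical robot name; it has no record of its own,
# so the '*' record applies to it.
for path in ("/index.html", "/secret.html", "/webpromotion/list/a.html"):
    verdict = "allowed" if parser.can_fetch("ExampleBot", path) else "disallowed"
    print(path, verdict)
```

Each Disallow value is treated as a path prefix, so everything under /webpromotion/list/ is covered by the single line.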
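Some robots also understand pattern characters in Disallow paths: '*' matches any sequence of characters and '$' marks the end of the URL. These are an extension that the original standard does not define, so not every robot honours them. For example, to disallow files with the ".txt" extension for robots that support the extension, you would add the following lines to your robots.txt: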
Example
User-agent: *
Disallow: /*.txt$
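To make the pattern semantics concrete, here is a rough Python sketch of how a robot that supports the extension might test a path against a rule. The function name rule_matches is hypothetical, and this is an illustration of the matching idea, not the code of any particular crawler. Note that Python's own urllib.robotparser does plain prefix matching and does not interpret these characters.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if path is covered by a Disallow pattern.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the match at the end of the path, and anything else is a
    literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path, giving prefix behaviour.
    return re.match(regex, path) is not None


print(rule_matches("/*.txt$", "/notes/readme.txt"))      # True
print(rule_matches("/*.txt$", "/notes/readme.txt.bak"))  # False
print(rule_matches("/secret.html", "/secret.html"))      # True
```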
You may find more details at http://www.robotstxt.org/wc/norobots.html
