A robots.txt file lives at the root of your site. So, for the site www.example.com, the robots.txt file lives at www.example.com/robots.txt. robots.txt is a plain text file that follows the Robots Exclusion Standard. A robots.txt file consists of one or more rules. Each rule blocks (or allows) access for a given crawler to a specified file path on that website.
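As a rough illustration of the "root of your site" rule, the sketch below derives the robots.txt location from any page URL. The function name robots_url and the https:// scheme are assumptions made for the example; they are not part of the robots.txt standard.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper: keeps the scheme and host, and replaces the
    path, query string and fragment with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/folder/page.html?x=1"))
```

Whatever page the URL points to, the result is always the robots.txt file at the top level of the same host.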

Here are some simple robots.txt rules, each preceded by a description of what it does:

Allow crawling of everything

User-agent: *
Disallow:

or

User-agent: *
Allow: /

Disallow crawling of everything

User-agent: *
Disallow: /

Disallow crawling of a specific folder

User-agent: *
Disallow: /folder/
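One way to check how simple rules like these behave is Python's standard urllib.robotparser module. This is only a minimal sketch: the helper name allowed, the crawler name ExampleBot and the example URLs are made up for the illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The rule sets described above, written as strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDER = "User-agent: *\nDisallow: /folder/\n"

print(allowed(ALLOW_ALL, "https://www.example.com/page.html"))
print(allowed(BLOCK_ALL, "https://www.example.com/page.html"))
print(allowed(BLOCK_FOLDER, "https://www.example.com/folder/page.html"))
print(allowed(BLOCK_FOLDER, "https://www.example.com/other.html"))
```

The empty Disallow: line blocks nothing, Disallow: / blocks every path, and Disallow: /folder/ blocks only paths that start with /folder/.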

Disallow Googlebot from crawling a folder, except for one file in that folder

User-agent: Googlebot
Disallow: /folder1/
Allow: /folder1/myfile.html
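The Allow line wins here because Google applies the most specific (longest) matching rule, and prefers Allow when two matching rules are equally long. The sketch below illustrates that precedence for plain path prefixes only. The function name is_allowed and the tuple format for rules are assumptions made for the example, and other crawlers may resolve conflicts differently.

```python
def is_allowed(rules, path):
    """Decide whether path may be crawled under longest-match precedence.

    rules is a list of (directive, prefix) tuples, for example
    ("disallow", "/folder1/"). This hypothetical helper handles plain
    prefixes only, not the * and $ wildcards.
    """
    best_len = -1
    verdict = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        allow = directive.lower() == "allow"
        # A longer prefix wins; on a tie, Allow beats Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            verdict = allow
    return verdict


GOOGLEBOT_RULES = [
    ("disallow", "/folder1/"),
    ("allow", "/folder1/myfile.html"),
]

print(is_allowed(GOOGLEBOT_RULES, "/folder1/myfile.html"))
print(is_allowed(GOOGLEBOT_RULES, "/folder1/other.html"))
```

With these rules, /folder1/myfile.html is allowed because the Allow prefix is longer than the Disallow prefix, while every other path under /folder1/ stays blocked.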

You can use the $ character to match the end of a URL. For instance, to block any URLs that end with .html, you could use the following entry:

Disallow: /*.html$

To block access to all URLs that include a question mark (?), you could use the following entry:

Disallow: /*?
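To see how the * and $ characters behave, the sketch below translates a robots.txt path pattern into a regular expression. It assumes that * matches any sequence of characters, that a trailing $ anchors the pattern to the end of the URL, and that the path being tested includes the query string. The helper names pattern_to_regex and matches are made up for the example.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]  # drop the trailing $ marker
    # Escape the literal pieces and join them with ".*" for each *.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Report whether the pattern matches from the start of path."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.html$", "/page.html"))
print(matches("/*.html$", "/page.html?x=1"))
print(matches("/*?", "/search?q=1"))
print(matches("/*?", "/search"))
```

In this sketch, /*.html$ matches /page.html but not /page.html?x=1, because the $ requires the URL to end right after .html. The pattern /*? matches any path that contains a question mark.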

If you have any questions, please contact:

Romil Tripathi – best SEO Consultant in India
