How To Create a Robots.txt File
Posted on 13 November 2007 by DeanHunt
Would you like to know how to create a robots.txt file for your website? Even better, would you like to know how to steal the best robots.txt files on the net? And how to hack into Google and the White House?
Ok, well read on…
Well first, I will explain what a robots.txt file is, and what it does.
What is the Robots.txt file and what does it do?

Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:
User-agent: *
Disallow: /
The “User-agent: *” means this section applies to all robots. The “Disallow: /” tells the robot that it should not visit any pages on the site.
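To see how a well-behaved robot would read those two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The robot name "ExampleBot" and the helper is_allowed are made up for illustration; a real crawler would download the file from the site rather than use a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from the example above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a hypothetical robot name.
allowed = is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/welcome.html")
print(allowed)
```

With "Disallow: /" in place, the check comes back False for every page on the site. Keep in mind this only models robots that choose to obey the file.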
There are two important considerations when using /robots.txt:
1) Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it.
2) The /robots.txt file is publicly available. Anyone can see what sections of your server you don’t want robots to use.
So please don’t try to use the robots.txt file to hide private information or passwords.
Where should I put my robots.txt file?
It should be in your root folder. In other words, yourdomain.com/robots.txt
NOT: yourdomain.com/shop/robots.txt
Got it? Good!
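If you want to work out where a robot will look for the file, given any page on a site, the rule above can be sketched in Python like this. The function name robots_txt_url is made up for this example; it just keeps the scheme and domain and swaps the rest for /robots.txt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, discard the path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/shop/item.html?id=3"))
# prints http://www.example.com/robots.txt
```

Note that a page under /shop/ still maps to the file in the root folder, never to /shop/robots.txt.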
How to create a Robots.txt file
Go to your FTP program or cPanel and create a new file. It should be a plain .txt file, and obviously, it will be named: robots.txt (all lowercase, since robots ask for exactly that name).
Once you have your new file you will need to edit it and add your rules for the robots to follow.
What would normally happen here is that you would get a series of complex rules and codes to follow, at which point you will more than likely be banging your head against your desk. But I am going to show you a sneaky trick to get a great Robots file in less than 30 seconds….
STEAL IT FROM SOMEONE ELSE
How to Steal Someone Else’s Robots.txt File
Step one is to go to church and ask baby Jesus to forgive what you are about to do. Then, when you are free from guilt, hook up to the church’s Wi-Fi (do churches have Wi-Fi? It may be a more effective way of communicating with God).
Anyway, the key here is to find an authority website that uses the same platform as your site. So, for example, if you are using WordPress, then why not copy the robots file from John Chow: John Chow’s Robots File
We already know that the robots file will be placed in the root directory, so we just add /robots.txt to the web domain.
Using the above example we have a great WordPress robots.txt file:
sitemap: http://www.johnchow.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /go/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /author/
Disallow: /page/
Disallow: /category/
Disallow: /wp-images/
Disallow: /images/
Disallow: /backup/
Disallow: /banners/
Disallow: /archives/
Disallow: /trackback/
Disallow: /feed/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Mediapartners-Google
Allow: /

User-agent: duggmirror
Disallow: /
Now replace the sitemap URL with your own. You may also wish to remove the Disallow rule for Duggmirror; they saved my site once when it got Dugg.
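If you would rather not edit the copied file by hand, the two tweaks (swap in your own sitemap, drop the Duggmirror section) could be done with a small Python helper along these lines. This is only a sketch: customize_robots and the example.com sitemap address are invented for illustration, and it assumes each section starts with a single User-agent line, as in the file above.

```python
def customize_robots(text: str, sitemap_url: str, drop_agents=()) -> str:
    """Point the Sitemap line at sitemap_url and drop sections for drop_agents.

    Assumes every section begins with exactly one User-agent line.
    """
    drop = {agent.lower() for agent in drop_agents}
    kept = []
    skipping = False  # True while we are inside a section being dropped
    for line in text.splitlines():
        field, _, value = line.strip().partition(":")
        field = field.strip().lower()
        if field == "sitemap":
            # Replace the copied site's sitemap with our own.
            kept.append(f"Sitemap: {sitemap_url}")
            continue
        if field == "user-agent":
            # A new section starts here; decide whether to keep it.
            skipping = value.strip().lower() in drop
        if not skipping:
            kept.append(line)
    return "\n".join(kept).strip() + "\n"


# A shortened version of the copied file, for demonstration.
COPIED = """\
sitemap: http://www.johnchow.com/sitemap.xml
User-agent: *
Disallow: /wp-admin/

User-agent: duggmirror
Disallow: /
"""

# The sitemap address below is a placeholder; use your own.
result = customize_robots(COPIED, "http://www.example.com/sitemap.xml", drop_agents=["duggmirror"])
print(result)
```

The output keeps the general rules, points the sitemap at your own site, and no longer mentions Duggmirror.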
Paste the above into your file and save.
Bam! You have a killer robots.txt file.
Here are some more interesting Robots.txt files for the geeks:
The Whitehouse Robots.txt file
The Google.com Robots.txt file
If you are really bored, then you should read the robots.txt file at Webmaster World. It seems to be full of stories, advice and articles… weird.





