In this article, we will learn about robots.txt and why it matters for your website's SEO (Search Engine Optimization).
What is a robots.txt file?
A robots.txt file tells search engines what your website’s rules of engagement are.
The robots.txt file plays a big role in SEO.
A big part of doing SEO is about sending the right signals to search engines, and robots.txt is one of the ways to communicate your crawling preferences to search engines.
How does robots.txt work?
Search engines regularly check a website’s robots.txt file for crawling instructions. These instructions are called directives.
If there’s no robots.txt file present or if there are no applicable directives, search engines will crawl the entire website.
You can use robots.txt to prevent search engines from crawling specific parts of your website and to give them helpful hints on how best to crawl it.
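The behaviour described above can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration of how a well-behaved crawler might interpret directives, not how any particular search engine is implemented. The helper name `is_crawlable`, the `Bingbot` group, and the `/private/` path are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse directives from a list of lines
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets used only for illustration.
NO_RULES = []  # an empty robots.txt
ONLY_FOR_OTHER_BOT = ["User-agent: Bingbot", "Disallow: /"]  # no group for Googlebot
BLOCK_PRIVATE = ["User-agent: *", "Disallow: /private/"]

page = "http://www.yourdomain.com/private/page.html"

# No directives, or none that apply to this crawler: the page may be crawled.
print(is_crawlable(NO_RULES, "Googlebot", page))
print(is_crawlable(ONLY_FOR_OTHER_BOT, "Googlebot", page))

# An applicable Disallow directive blocks the page.
print(is_crawlable(BLOCK_PRIVATE, "Googlebot", page))
```

The first two checks allow the page because nothing applies to Googlebot; only the last rule set blocks it.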
How to Find robots.txt for your website?
The robots.txt file should reside in the root of your website (e.g. http://www.yourdomain.com/robots.txt)
If your site doesn’t have a robots.txt file, that URL will usually return a 404 (not found) error or an empty page.
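Because the file always lives at the root, its address can be derived from any page URL on the same site. A minimal sketch, assuming a hypothetical helper named `robots_txt_url` and an illustrative blog URL:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Build the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only for illustration.
print(robots_txt_url("http://www.yourdomain.com/blog/post?page=2"))
```

Opening the resulting address in a browser shows the file, if one exists.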
Before creating a robots.txt file, you should familiarize yourself with the syntax used in the robots.txt file.
Here are 4 common components you may notice in your robots.txt file:
- User-agent: This is the name of the web crawler to which you’re giving crawl instructions. Each search engine has a different user-agent name. For example, Googlebot is Google’s user-agent name.
- Disallow: This directive instructs a user-agent not to crawl a specific URL or path.
- Allow: This directive tells a user-agent that it may crawl a page or subfolder even if its parent directory is disallowed.
- Sitemap: This directive tells search engines where your XML sitemap is located.
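The four components above can be seen working together in a short sketch that parses a sample file with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The `admin-ajax.php` exception and the sitemap address are assumptions chosen for illustration. Note that Python's parser applies the first matching rule in file order, whereas Google documents that it picks the most specific matching rule, so the `Allow` line is listed first here to get the same result under both approaches.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/

Sitemap: http://www.yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.yourdomain.com"

# Allow: an exception inside a disallowed directory.
ajax_allowed = parser.can_fetch("Googlebot", base + "/wp-admin/admin-ajax.php")

# Disallow: everything else under /wp-admin/ is blocked.
admin_allowed = parser.can_fetch("Googlebot", base + "/wp-admin/options.php")

# Paths with no matching rule stay crawlable.
blog_allowed = parser.can_fetch("Googlebot", base + "/blog/")

# Sitemap: the listed sitemap locations.
sitemaps = parser.site_maps()

print(ajax_allowed, admin_allowed, blog_allowed, sitemaps)
```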
How to Create a Robots.txt File
If your site doesn’t have a robots.txt file, it’s easy to create one. You can use any text editor to create a robots.txt file.
For example, if you’d like search engines to crawl all your pages except the admin area, create a robots.txt file like the one below.
Example of robots.txt
A simple robots.txt file looks like this:
User-agent: *
Disallow: /wp-admin/
Each directive should be on a separate line; otherwise, search engines may get confused when parsing the robots.txt file.
Once you’re done typing all the directives, save the file as “robots.txt”.
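If you would rather generate the file than type it by hand, the same steps can be sketched in Python. The helper name `write_robots_txt` is hypothetical, and the temporary directory only stands in for wherever you keep your site files.

```python
import tempfile
from pathlib import Path


def write_robots_txt(directory, directives):
    """Write one directive per line to <directory>/robots.txt and return its path."""
    path = Path(directory) / "robots.txt"
    # Joining with newlines guarantees each directive sits on its own line.
    path.write_text("\n".join(directives) + "\n", encoding="utf-8")
    return path


# Demo in a temporary directory (an assumption for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    saved = write_robots_txt(tmp, ["User-agent: *", "Disallow: /wp-admin/"])
    content = saved.read_text(encoding="utf-8")

print(content, end="")
```

Passing the directives as a list keeps each one on its own line.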
Once your robots.txt file is ready, upload it to the root directory of your website.
How to Check If Your Robots.txt is Working
Once you’ve uploaded your robots.txt file to your root directory, you can validate it using the robots.txt Tester tool in Google Search Console.
The robots.txt Tester tool will check whether your robots.txt is working properly. If you’ve blocked any URLs from crawling in your robots.txt, the Tester tool will verify whether those URLs are indeed blocked for web crawlers.
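For a quick local sanity check alongside the Search Console tool, a rough sketch with Python's standard `urllib.robotparser` can list which URLs a rule set blocks. This is not Google's tester and its results may differ from Google's matching (Python's parser uses simple prefix matching on paths). The function name `blocked_urls` and the sample URLs are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, user_agent, urls):
    """Return the subset of urls that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# The example file from this article.
ROBOTS_TXT = "User-agent: *\nDisallow: /wp-admin/\n"

# Hypothetical URLs to check.
URLS = [
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/wp-admin/",
    "http://www.yourdomain.com/wp-admin/users.php",
    "http://www.yourdomain.com/blog/hello",
]

for url in blocked_urls(ROBOTS_TXT, "Googlebot", URLS):
    print("blocked:", url)
```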
A well-configured robots.txt file helps search engines crawl your site efficiently, which can support your site's indexing and visibility in search results, so make sure your website has one.