
How to use robots.txt

In this article I will discuss how to create a robots.txt file for search engines and how to help search engines index your site quickly.

The robots.txt file is a text file in the root directory of a site that contains special instructions for search engine robots. These instructions can prohibit indexing of certain sections or pages of the site, point to the correct "mirror" domain, recommend that a spider observe a certain time interval between downloading documents from the server, and so on.


Creating a robots.txt for uCoz


Consider the basic directives of the file:
Disallow: prohibits indexing of a page or directory.
Allow: permits indexing of a page or directory.
Thus, we can prevent the indexing of an entire directory but still allow the indexing of some pages in that directory.

Take the File Catalog module as an example. It is located at /load/

If you put:

Code
Disallow: /load/

Then the File Catalog will not be indexed. But what if we still need a few pages of this catalog to be indexed (2 or 3, or for example only 10)? In that case, add a line like the following to your robots.txt file for each such page:
Code
Allow: /load/page_link

Warning: the lines that allow indexing of individual pages must be placed above the line that prohibits indexing of the directory, because some robots apply the first rule that matches.
Example:
Code
Allow: /load/page_link
Allow: /load/page_link
Allow: /load/page_link
Disallow: /load/

This way, only the pages of that directory that are listed in the Allow lines will be indexed.
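
The effect of this ordering can be checked offline. The following is a minimal sketch using Python's standard urllib.robotparser module, which applies the first rule that matches a path. The page names page_one and page_two, the bot name ExampleBot, and the domain www.example.com are placeholders chosen for illustration, not real uCoz addresses.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: two allowed pages listed above the directory ban.
RULES = """\
User-agent: *
Allow: /load/page_one
Allow: /load/page_two
Disallow: /load/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) bot may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    # Print the verdict for two listed pages, one unlisted page,
    # and one path outside the /load/ directory.
    for path in ("/load/page_one", "/load/page_two", "/load/other", "/news/"):
        print(path, is_allowed(path))
```

Real search engine robots may resolve conflicts between Allow and Disallow differently from this parser, so treat the result as a quick sanity check rather than a guarantee.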

Now let's look at the User-agent directive. It names the bot to which all of the rules below it apply. The User-agent line is placed at the top of the block and is followed by the name of the bot.
For example:

Code
User-agent: uBot

But if you want the same rules to apply to every bot (search engines, site directories, and so on), use the following line instead:

Code
User-agent: *

Then list below it the pages and directories that you want to prohibit or allow.
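
To see how a named group and the wildcard group interact, here is a small sketch, again with Python's standard urllib.robotparser. The rules themselves, the second bot name OtherBot, and the domain www.example.com are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for uBot, one for every other bot.
RULES = """\
User-agent: uBot
Disallow: /stat/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def may_fetch(agent, path):
    """Return True if the named bot may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for agent in ("uBot", "OtherBot"):
        for path in ("/stat/", "/admin/"):
            print(agent, path, may_fetch(agent, path))
```

In this parser, a bot that has its own group follows only that group, so the wildcard rules are used only by bots that are not named anywhere else in the file.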

With that, the main work is done, and the file can be used for all bots. If you want better indexing of your site and better display in search engines, it is also worth providing links to your sitemap files:

Code
Sitemap: http://your_website_domain/sitemap.xml
Sitemap: http://your_website_domain/sitemap-forum.xml

These are the standard sitemap addresses on uCoz. If you have your own sitemap, specify its address instead.
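
A quick way to confirm that the Sitemap lines are readable is to parse the file and list them. This sketch assumes Python 3.8 or later, where urllib.robotparser provides site_maps(); the domain www.example.com and the Disallow rule are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Placeholder file contents with the two uCoz-style sitemap addresses.
RULES = """\
User-agent: *
Disallow: /admin/
Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/sitemap-forum.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# site_maps() returns None when no Sitemap lines are present,
# so fall back to an empty list.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```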

It is also important to specify the primary domain of the site. If you use a standard uCoz domain, you do not need to do anything.
If you have attached your own domain to the site, for example a .com domain, you can add the following line to your robots.txt file. Host is not part of the original robots.txt standard, so robots that do not recognize it skip the line.
Host: website_domain
Example:

Code
Host: www.example.com


Example of a ready-made robots.txt for uCoz:


Code
User-agent: *  
Disallow: /a/  
Disallow: /stat/  
Disallow: /index/1  
Disallow: /index/2  
Disallow: /index/3  
Disallow: /index/5  
Disallow: /index/7  
Disallow: /index/8  
Disallow: /index/9  
Disallow: /panel/  
Disallow: /admin/  
Disallow: /secure/  
Disallow: /informer/  
Disallow: /mchat  
Disallow: /search  

Host: www.example.com  
Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/sitemap-forum.xml
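
If you maintain several sites, a file like the one above can be assembled from lists instead of being typed by hand. The following is only a sketch: the function name build_robots_txt and its parameters are invented for this example and are not part of uCoz or of any library.

```python
def build_robots_txt(disallow, sitemaps, host=None, agent="*"):
    """Assemble robots.txt text from lists of paths and sitemap URLs.

    Hypothetical helper: one User-agent group, then the optional
    Host line, then the Sitemap lines.
    """
    lines = [f"User-agent: {agent}"]
    lines += [f"Disallow: {path}" for path in disallow]
    lines.append("")  # blank line closes the User-agent group
    if host:
        lines.append(f"Host: {host}")
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(
        disallow=["/a/", "/stat/", "/admin/"],
        sitemaps=["http://www.example.com/sitemap.xml"],
        host="www.example.com",
    ))
```

The resulting text can then be saved as robots.txt and placed in the root directory of the site.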

comments (1)

kevin   18.10.2013 17:09
where can i find the robots.txt in my site,,,?
