Many public sites developed in SharePoint are configured to allow anonymous access but should not be indexed by any search engine. We can prevent indexing by placing a robots.txt file in the root of our SharePoint site.
Robots.txt is a plain text file placed at the root of a SharePoint site to tell search engines which parts of the site should not be crawled or indexed. Search engines are not obligated to follow robots.txt instructions, but well-behaved crawlers are designed to obey them.
The robots.txt file must be placed in the main (root) directory of the site, where search engines can find it easily. Crawlers do not search the entire site for robots.txt; they look only in the root directory. If the file is not found there, the crawler assumes the site has no robots.txt file and indexes everything on the site.
To create a robots.txt file, create a text file with content like the following.
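For example, these two lines ask all robots to stay out of the entire site (a minimal sketch; adjust the Disallow paths if you only want to block parts of the site):

```
User-agent: *
Disallow: /
```
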
Here “User-agent” marks the section that applies to all robots, and “Disallow” tells the robots what they should not index on the site. Save the file as “robots.txt”, then add it to the root of the site using SharePoint Designer: open the site in Designer and add the file under the All Files tab. Adding it there places robots.txt in the root of the site.
To make the robots.txt file resolvable at the site root URL, navigate to Application Management in SharePoint Central Administration and define a managed path for the web application you want to configure.
In the Path text box, enter “/robots.txt” as the URL and choose “Explicit inclusion” as the type. Click OK to save the settings.
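As an alternative to Central Administration, the same explicit managed path can be created with the SharePoint PowerShell cmdlet New-SPManagedPath (a sketch; the web application URL is a placeholder you must replace with your own):

```
# Load the SharePoint snap-in when running from a plain PowerShell console
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Create an explicit-inclusion managed path "robots.txt" on the web application
New-SPManagedPath -RelativeURL "robots.txt" -WebApplication "http://yourwebapp" -Explicit
```
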
Run IISRESET. If there is a permissions problem, grant Read permission on the robots.txt file to “Everyone”. You can verify the setup by browsing to the site URL followed by /robots.txt.
We can also add the robots.txt file to a site using the following PowerShell commands.
# Read the robots.txt file from disk as a byte array
$file = [System.IO.File]::ReadAllBytes("robots.txt full path");

# Get the site collection that should receive the file
$siteToAddFile = Get-SPSite "Site to add the robots.txt";

# Add robots.txt to the root web, overwriting any existing copy
$siteToAddFile.RootWeb.Files.Add("robots.txt", $file, $true);

# Release the SPSite object when done
$siteToAddFile.Dispose();