Prevent search engines from indexing your websites


  • Applies to: Grid
    • Difficulty: Easy
    • Time Needed: 10 minutes
    • Tools Required: wp-admin, FTP client, plain text editor
  • Applies to: All DV
    • Difficulty: Easy
    • Time Needed: 10 minutes
    • Tools Required: wp-admin, FTP client, plain text editor
  • Applies to: WordPress Hosting
    • Difficulty: Easy
    • Time Needed: 10 minutes
    • Tools Required: wp-admin, FTP client, plain text editor

Overview

Web Robots, also known as Web Wanderers, Crawlers, or Spiders, are programs that traverse the web automatically. Search engines, such as Google or Yahoo, use them to index your site's content. However, they can also be used inappropriately; for example, spammers use them to scan for email addresses. The following are a few methods you can use to prevent your site from being indexed.

READ ME FIRST

The publishing of this information does not imply support of this article. This article is provided solely as a courtesy to our customers. Please take a moment to review the Statement of Support.

Instructions

WordPress

If you are using WordPress as your CMS, there is a built-in option to discourage search engines from indexing your site.

  1. Log into your WordPress Admin Dashboard.
  2. Click on Settings >> Reading.

  3. Under Search Engine Visibility, ensure that "Discourage search engines from indexing this site" is checked.

As the description implies, this discourages search engines from indexing your site, but it is up to each search engine to honor the request.
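When this option is enabled, WordPress adds a robots meta tag (typically containing `noindex`) to every page's HTML head. As a rough, stdlib-only illustration of how a well-behaved crawler detects that tag, consider the following sketch; the sample HTML is hypothetical, not actual WordPress output:

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the content of any <meta name="robots"> tags in a page."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.directives.append(attrs.get("content", ""))

# Hypothetical page source, similar in spirit to what WordPress emits
# when "Discourage search engines" is enabled.
sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
finder = RobotsMetaFinder()
finder.feed(sample)
print("noindex" in finder.directives[0])  # True
```

A crawler that respects this directive will skip the page even though it can still fetch it, which is why the setting is a request rather than a guarantee.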

Robots.txt

As an alternative to the WordPress solution above, you can also use a robots.txt file to ask search engine crawlers not to crawl your site.

  1. Use a File Manager or FTP client to navigate to your website's root directory.
  2. Edit the robots.txt file, or create a new one if there isn't one currently.
  3. Enter the following into robots.txt:
    User-agent: *
    Disallow: /
  4. Save your changes. And that's it!
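You can sanity-check that these two lines block everything using Python's standard-library robots.txt parser, for example:

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved crawler may not fetch any path on the site.
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))  # False
print(parser.can_fetch("*", "http://example.com/"))                    # False
```

Like the WordPress setting, robots.txt is advisory: compliant crawlers obey it, but nothing technically stops a misbehaving bot from fetching pages anyway.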

If you'd like some more advanced directives for robots.txt, feel free to check out the information below. Remember to remove the # sign for any command you wish the robots to follow, but be sure not to uncomment the command's description. For details on all the rules you can create, please visit: http://www.robotstxt.org/

# Example robots.txt from (mt) Media Temple
# Learn more at http://mediatemple.net
# (mt) Forums - http://forum.mediatemple.net/
# (mt) System Status - http://status.mediatemple.net
# (mt) Statement of Support - http://mediatemple.net/support/statement/

# How do I check that my robots.txt file is working as expected?
# http://www.google.com/support/webmasters/bin/answer.py?answer=35237

# For a list of Robots please visit: http://www.robotstxt.org/db.html

# Instructions
# Remove the "#" to uncomment any line that you wish to use, but be sure not to uncomment the Description.

# Grant Robots Access
#######################################################################################

# This example allows all robots to visit all files because the wildcard "*" specifies all robots:
#User-agent: *
#Disallow:

# To allow a single robot (and exclude all others) you would use the following:
#User-agent: Google
#Disallow:

#User-agent: *
#Disallow: /

# Deny Robots Access
#######################################################################################

# This example keeps all robots out:
#User-agent: *
#Disallow: /

# The next is an example that tells all crawlers not to enter into four directories of a website:
#User-agent: *
#Disallow: /cgi-bin/
#Disallow: /images/
#Disallow: /tmp/
#Disallow: /private/

# Example that tells a specific crawler not to enter one specific directory:
#User-agent: BadBot
#Disallow: /private/

# Example that tells all crawlers not to enter one specific file called foo.html:
#User-agent: *
#Disallow: /foo.html
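As with the simple example earlier, Python's standard-library parser can be used to check these per-directory rules before you deploy them; the paths below match the four-directory example above:

```python
from urllib.robotparser import RobotFileParser

# Uncommented version of the four-directory example above.
rules = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /images/",
    "Disallow: /tmp/",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# Paths inside a disallowed directory are blocked; everything else is allowed.
print(parser.can_fetch("*", "http://example.com/images/logo.png"))  # False
print(parser.can_fetch("*", "http://example.com/about.html"))       # True
```

Note that Disallow rules are prefix matches on the URL path, so `/images/` blocks everything underneath that directory.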

Password protect your pages/website

Search engines and crawlers cannot access password protected pages, which makes this a very effective way to prevent indexing. However, this method requires users to enter a password in order to view your content. It can be implemented on individual pages or across your whole site. For detailed instructions, feel free to visit the article below:

Password protecting directories.
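On Apache-based hosting, directory password protection is commonly implemented with an .htaccess file using HTTP Basic authentication. The fragment below is only an illustrative sketch: the realm name and the AuthUserFile path are placeholders you would replace with your own values, and your .htpasswd file must already exist.

```apache
# Require a valid username/password for everything in this directory.
AuthType Basic
AuthName "Restricted Area"
# Placeholder path -- point this at your actual .htpasswd file.
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Because the server itself refuses unauthenticated requests, crawlers receive a 401 response and never see the page content at all.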