Serving Different Robots.txt Using Rack
While doing an SEO audit for the daily deal API I’m working on, the subject of robots.txt came up. In addition to our production environment (what you and everyone else see), we also use an “edge” environment. It’s a place where we can push the latest and greatest changes and test them before they go live. Edge is an exact copy of production, just running on a different domain. Since we didn’t want to get dinged for duplicate content, we had to disallow spiders from crawling the edge environment. Here’s how we serve different robots.txt files based on environment using Rack within Rails.
- Move `public/robots.txt` to `config/robots.txt`. This is now your production robots.txt. Any other environment will disallow everything for all user-agents.
- Create a `RobotsGenerator` in lib
- Point `/robots.txt` to the generator in your routes
lib/robots_generator.rb:
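A minimal sketch of such a Rack endpoint could look like the following (the cache header and the 404 fallback are optional extras, not requirements):

```ruby
# lib/robots_generator.rb
# A bare-bones Rack endpoint: anything that responds to #call and returns
# [status, headers, body] can be routed to directly.
class RobotsGenerator
  def self.call(env)
    body = if Rails.env.production?
             # Production gets the real rules from config/robots.txt.
             File.read(Rails.root.join("config", "robots.txt"))
           else
             # Every other environment blocks all crawlers.
             "User-agent: *\nDisallow: /\n"
           end

    headers = {
      "Content-Type"  => "text/plain",  # required to satisfy Rack::Lint
      "Cache-Control" => "public, max-age=#{1.day.to_i}"
    }

    [200, headers, [body]]
  rescue Errno::ENOENT
    [404, { "Content-Type" => "text/plain" }, ["# robots.txt not found\n"]]
  end
end
```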
config/routes.rb:
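And a sketch of the matching route (the application module name is a placeholder; the explicit require is explained in the update below):

```ruby
# config/routes.rb
# Rails 3 does not autoload files in lib, so require the generator explicitly.
require "robots_generator"

YourApp::Application.routes.draw do  # YourApp is a placeholder module name
  # Any object that responds to #call can terminate a route, so requests
  # to /robots.txt never touch a controller.
  get "/robots.txt" => RobotsGenerator
end
```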
Update (Sept. 23, 2013): Thanks to @michaelbaudino for pointing out that routes.rb needs the `require 'robots_generator'`, since Rails 3 does not autoload files in lib. Additionally, the response headers should always include a Content-Type to avoid a `Rack::Lint::LintError`.
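As a quick sanity check, a hypothetical integration test along these lines (file and class names made up) can confirm that non-production environments serve the blanket disallow:

```ruby
# test/integration/robots_txt_test.rb -- hypothetical sanity check
require "test_helper"

class RobotsTxtTest < ActionDispatch::IntegrationTest
  test "robots.txt blocks crawlers outside production" do
    get "/robots.txt"

    assert_response :success
    # The test environment is not production, so everything is disallowed.
    assert_match %r{Disallow: /}, response.body
  end
end
```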