Dynamic sitemap for a Rails 4 application, integrated with Capistrano

Kapil Raj Nakhwa, 2014-Jan-5

Adding a sitemap to a Rails 4 application is fairly easy, and it is worth automating both the sitemap updates and the submission to the major search engines as much as possible. Here I'll briefly explain what I implemented while building this site.

Let us start with what a sitemap is. In short, a sitemap is a list of the pages on a website that tells search engine crawlers which pages are available to crawl and index.

So what is in a sitemap? It is usually a simple XML file that lists each page's URL along with a few optional details about that page.

For example, consider the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://kapilrajnakhwa.com/</loc>
    <lastmod>2013-11-17T09:29:10+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://kapilrajnakhwa.com/blogs</loc>
    <lastmod>2013-11-17T09:29:10+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://kapilrajnakhwa.com/blogs/google-analytics-for-rails-4-and-turbolinks</loc>
    <lastmod>2013-11-16T19:51:00+00:00</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>

The above is an example of a sitemap. Each url entry can carry the following elements:

loc: The URL of the page.

lastmod: The date the page was last modified.

changefreq: How often the content of the page is expected to change, for example daily, weekly or monthly.

priority: The importance of the page relative to the other pages on the site, from 0.0 to 1.0. The closer the value is to 1.0, the more important the page is considered.
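
To make these elements concrete, here is a minimal sketch in plain Ruby that renders the same kind of file by hand. It is only an illustration of the format, not how any gem builds it: SitemapEntry, render_url and render_sitemap are hypothetical names, and I am assuming changefreq and priority may be left out, as in the third entry of the example above.

```ruby
require "cgi"
require "time"

# Hypothetical value object for one <url> entry; not part of any gem.
SitemapEntry = Struct.new(:loc, :lastmod, :changefreq, :priority)

# Render one <url> element. Optional fields are skipped when they are nil.
def render_url(entry)
  lines = ["  <url>"]
  # Escape characters such as & so the URL is valid inside XML.
  lines << "    <loc>#{CGI.escapeHTML(entry.loc)}</loc>"
  lines << "    <lastmod>#{entry.lastmod.utc.iso8601}</lastmod>" if entry.lastmod
  lines << "    <changefreq>#{entry.changefreq}</changefreq>" if entry.changefreq
  if entry.priority
    unless (0.0..1.0).cover?(entry.priority)
      raise ArgumentError, "priority must be between 0.0 and 1.0"
    end
    lines << "    <priority>#{format('%.1f', entry.priority)}</priority>"
  end
  lines << "  </url>"
  lines.join("\n")
end

# Wrap all entries in the <urlset> root element.
def render_sitemap(entries)
  [
    %(<?xml version="1.0" encoding="UTF-8"?>),
    %(<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">),
    *entries.map { |entry| render_url(entry) },
    "</urlset>"
  ].join("\n")
end

entries = [
  SitemapEntry.new("http://kapilrajnakhwa.com/", Time.utc(2013, 11, 17, 9, 29, 10), "daily", 1.0),
  SitemapEntry.new("http://kapilrajnakhwa.com/blogs", Time.utc(2013, 11, 17, 9, 29, 10), "daily", 1.0)
]
puts render_sitemap(entries)
```

In practice you would let a gem do this for you, which is what the rest of this post is about.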


Once we have added new pages to our site and listed them in sitemap.xml, we can also submit the file to the major search engines like Google, Yahoo or Bing to inform them about our new sitemap.xml.

Okay, so let's get started with generating a sitemap for our Rails 4 application.

There are various gems to choose from, and you can go through them yourself. I prefer a gem called dynamic_sitemaps (https://github.com/lassebunk/dynamic_sitemaps), not because it has more functionality but because it is simple to use and fully supports the format specified by http://www.sitemaps.org/.

If you want to use this for a site with hundreds of thousands of pages, though, it might not be a good choice.

Installation

To install it, add this to your Gemfile:

gem "dynamic_sitemaps"

And then execute:

bundle install

It provides a generator that creates a config file at config/sitemap.rb:

rails generate dynamic_sitemaps:install

The nice part is that this one file is all you need to define your sitemaps.

Here is a sample of mine:

require "net/http"

host "kapilrajnakhwa.com"

sitemap :sitemaps do
  # Static pages
  url root_url, last_mod: Time.now, change_freq: "weekly", priority: 1.0
  url blogs_url, last_mod: Time.now, change_freq: "weekly", priority: 1.0

  # One entry per published blog post
  Blog.published.each do |blog|
    url blog, last_mod: blog.updated_at, priority: 1.0
  end
end

# Ping the search engines only in production
ping_with "http://#{host}/sitemap.xml" if Rails.env.production?


Notice how I can loop over my published blog posts and easily generate a sitemap entry for each one. The last line pings the search engines to let them know about the new sitemap, but only in production.
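
If you are wondering what a "ping" amounts to, it is typically just an HTTP request to a search engine URL that carries the address of your sitemap as a query parameter. Here is a rough sketch of building such a URL. I am not reproducing the gem's internals here: PING_ENDPOINT is a made-up placeholder and ping_url_for is a hypothetical helper, so check each search engine's documentation for its real submission URL.

```ruby
require "uri"

# Hypothetical ping endpoint, for illustration only. Real search engines
# each define (and sometimes retire) their own submission URLs.
PING_ENDPOINT = "http://search-engine.example/ping?sitemap=%s"

# Build the URL that would be requested to announce a sitemap.
# The sitemap URL is percent-encoded so it survives as a single query value.
def ping_url_for(sitemap_url, endpoint: PING_ENDPOINT)
  format(endpoint, URI.encode_www_form_component(sitemap_url))
end

puts ping_url_for("http://kapilrajnakhwa.com/sitemap.xml")
```

The encoding step matters because the sitemap address contains characters like : and / that would otherwise be ambiguous inside a query string.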


Once you have described your sitemap in config/sitemap.rb using plain Ruby, you can generate it in XML format by running the following command:


rake sitemap:generate


This will, by default, generate a sitemap.xml file in <project root>/public/sitemaps that will look like this:

<urlset>
  <url>
    <loc>http://kapilrajnakhwa.com/</loc>
    <lastmod>2013-11-17T09:29:10+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://kapilrajnakhwa.com/blogs</loc>
    <lastmod>2013-11-17T09:29:10+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>


You can check mine at http://kapilrajnakhwa.com/sitemap.xml


Integration with Capistrano deploys


Now for automatically generating the sitemap, say, on every deploy. Note that you might need to copy the sitemap file generated in the <project root>/public/sitemaps folder up one level into the public folder, so that it is directly accessible at /sitemap.xml.

For this, let's create a rake file in the lib/tasks directory:

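namespace :sitemap do
  # Despite its name, this task copies the file; it does not create a symlink.
  desc "Copy public/sitemaps/sitemap.xml up into public/"
  task :symlink do
    FileUtils.cp Rails.root.join("public", "sitemaps", "sitemap.xml"), Rails.root.join("public", "sitemap.xml")
  end
end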
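
Since the sitemap:symlink task above actually copies the file, here is a sketch of what a real symlink variant could look like, in case you prefer public/sitemap.xml to always point at the latest generated file. This is an alternative I am suggesting, not what this site uses: link_sitemap is a hypothetical helper, and it assumes your web server is configured to follow symlinks.

```ruby
require "fileutils"
require "tmpdir"

# Link public/sitemap.xml to public/sitemaps/sitemap.xml.
# public_dir is a parameter so the sketch can run outside a Rails app;
# in a Rails app you would pass Rails.root.join("public").to_s.
def link_sitemap(public_dir)
  source = File.join(public_dir, "sitemaps", "sitemap.xml")
  target = File.join(public_dir, "sitemap.xml")
  raise Errno::ENOENT, source unless File.exist?(source)

  # ln_sf replaces an existing link or file at the target path.
  FileUtils.ln_sf(source, target)
  target
end

# Try it against a throwaway directory instead of a real public folder.
Dir.mktmpdir do |public_dir|
  FileUtils.mkdir_p(File.join(public_dir, "sitemaps"))
  File.write(File.join(public_dir, "sitemaps", "sitemap.xml"), "<urlset></urlset>")
  link = link_sitemap(public_dir)
  puts "symlink created: #{File.symlink?(link)}"
end
```

A copy is the safer default if you are unsure about your server configuration, which is why the task above sticks with it.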


Then, if you want the sitemap to be regenerated on every deploy, add this to your Capistrano deploy file at config/deploy.rb:

namespace :symlinks do
  desc "Generate the sitemap and copy it into public/"
  task :generate_sitemap do
    run "cd #{release_path} && RAILS_ENV=production bundle exec rake sitemap:generate"
    run "cd #{release_path} && RAILS_ENV=production bundle exec rake sitemap:symlink"
  end
end


Once this task is hooked into your deploy flow, the sitemap is automatically rebuilt on every deploy.


But let's say you do not want this, and would rather regenerate the sitemap whenever a blog post is created. Then you can go with something like this.


Regenerate the sitemap after create

In blog.rb, add the following line:

after_commit :update_sitemap

Without an on: option, after_commit also runs after updates and deletes, not only after create. Then add the following method:


def update_sitemap
  # Both rake tasks run synchronously, so saving a post waits for them to finish.
  system("RAILS_ENV=#{Rails.env} bundle exec rake sitemap:generate")
  system("RAILS_ENV=#{Rails.env} bundle exec rake sitemap:symlink")
end
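
One thing to be aware of: the update_sitemap method above ignores the result of each system call, so the second task runs even if generation failed. Here is a small sketch of running the steps in order and stopping at the first failure. run_in_order and its runner parameter are hypothetical names of mine; the runner is injectable only so that the logic can be tried without real rake tasks.

```ruby
# Run shell commands in order and stop at the first failure.
# By default the runner is Kernel#system, which returns true only
# when the command exits with a zero status.
def run_in_order(commands, runner: ->(cmd) { system(cmd) })
  commands.each do |cmd|
    return false unless runner.call(cmd)
  end
  true
end

# Demonstration with a fake runner that fails on the generate step.
attempted = []
fake_runner = lambda do |cmd|
  attempted << cmd
  !cmd.include?("generate")
end

ok = run_in_order(
  ["bundle exec rake sitemap:generate", "bundle exec rake sitemap:symlink"],
  runner: fake_runner
)
puts "all steps succeeded: #{ok}, commands attempted: #{attempted.length}"
```

Inside the model you could call run_in_order with the two rake commands and log a warning when it returns false, so a stale sitemap does not go unnoticed.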


Now you have a system that automatically generates sitemaps. Enjoy!



Tags: rails4,seo,sitemap,capistrano
