Performance and Ruby on Rails Models
The Active Record design pattern simplifies data access in Rails applications. But you can also shoot yourself in the foot performance-wise if you misapply it to queries that span multiple objects and instances. In this article, we look at ways you can improve query performance.
A Good Measure
So where does one start? As a rule of thumb, always optimize the slowest queries first. It sounds obvious but you’d be surprised. Use the application log to get a general sense of which queries are taking the longest. You can also apply the “benchmark” method—available in several key places:
Inside models:
def self.get_active
  self.benchmark 'Get Active Contacts' do
    find_all_by_status('active')
  end
end
Inside controllers (acting upon a model):
Contact.benchmark 'Get Active Contacts' do
  @active_contacts = Contact.get_active
end
Inside views:
<% benchmark 'Show Active Contacts' do %>
  <%= @active_contacts %>
<% end %>
A cool feature is the ability to string multiple benchmarks together as follows:
Contact.benchmark 'First Way' do
  @active_contacts = Contact.get_active_test1
end
Contact.benchmark 'Second Way' do
  @active_contacts = Contact.get_active_test2
end
Contact.benchmark 'Third Way' do
  @active_contacts = Contact.get_active_test3
end
which then appears in your app log like this:
First Way (0.65363)
Second Way (0.56675)
Third Way (0.54476)
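Outside of a Rails request, the same side-by-side comparison can be sketched with Ruby's standard Benchmark module. The snippet below is a minimal illustration, not code from the application: CONTACTS, active_with_select, active_with_each, and compare are hypothetical stand-ins for the get_active_test variants, and the timings will differ on every run.

```ruby
require 'benchmark'

# Hypothetical in-memory data standing in for rows from the contacts table.
CONTACTS = Array.new(10_000) do |i|
  { :id => i, :status => (i % 3 == 0 ? 'active' : 'inactive') }
end

# First candidate: filter with Array#select.
def active_with_select(contacts)
  contacts.select { |c| c[:status] == 'active' }
end

# Second candidate: build the result by hand with each.
def active_with_each(contacts)
  result = []
  contacts.each { |c| result << c if c[:status] == 'active' }
  result
end

# Time each candidate. Benchmark.realtime returns elapsed seconds as a Float.
def compare(contacts)
  {
    'First Way'  => Benchmark.realtime { active_with_select(contacts) },
    'Second Way' => Benchmark.realtime { active_with_each(contacts) }
  }
end

# Print the timings in the same "label (seconds)" shape as the Rails log.
compare(CONTACTS).each do |label, seconds|
  puts format('%s (%.5f)', label, seconds)
end
```

A single run is noisy, so it is worth repeating the measurement a few times before drawing conclusions.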
Beyond these basics, you can explore the many profiling tools available to Ruby programmers.
Including Data Upfront
Often, you need to display a model instance and its related associations. Example: a company view also displays a list of related contacts. Active Record permits you to retrieve as much data as you need using the ‘include’ symbol; this is often referred to as ‘eager loading of associations.’
companies = Company.find(:all, :include => :contacts, :conditions => "contacts.status = 'Active'")
Behind the scenes, the SQL statement will include a join to the related table(s).
However, the previous example does not always result in better performance. While you can significantly cut down on the number of roundtrips to the database, the amount of data returned, to be processed by Ruby code and wrapped into an Active Record object, can in fact slow things down. This is especially true if your model contains a lot of attributes.
One way to speed things up is to only load the columns needed to support the view. Going back to the company example, we only care about showing contacts’ names and email addresses.
companies = Company.find(:all,
  :include => :contacts,
  :select => "companies.*, contacts.first_name, contacts.last_name, contacts.email",
  :conditions => "contacts.status = 'Active'")
A Little Bit of Ruby Love
Depending on the size of your database, sometimes it’s faster to retrieve ALL of the rows for a given model and then use Ruby’s sophisticated array handling capabilities to parse through the collection.
For example, let’s say we’re generating a report that counts the number of active contacts for all companies in the system. As the number of companies and contacts grows, the following snippet will take longer and longer to execute. The ‘find_all_by_status’ method results in another database call for each company instance:
Company.find(:all).each do |company|
  puts company.contacts.find_all_by_status('active').size
end
The following alternative gets us down to two database calls:
active_contacts = Contact.find_all_by_status('active')
Company.find(:all).each do |company|
  puts active_contacts.select { |contact| contact.company_id == company.id }.size
end
What about the :include symbol? In some of our data access tests, the above variation proved to be faster than eagerly loading associations. Our guess is the construction of the larger SQL statement is a bit slower than two simple “select *” calls. Again, use the Benchmark functions to see for yourself. Only testing will help you come to your own conclusions.
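One refinement worth testing on the two-query version: select inside the loop rescans the whole contact array for every company. Grouping the contacts by company_id once turns each lookup into a hash access. The sketch below uses plain Ruby structs (CompanyRow and ContactRow, both hypothetical stand-ins for the Active Record models) so it runs without a database. Whether it is measurably faster for your data is something only a benchmark will tell you.

```ruby
# Hypothetical stand-ins for Company and Contact records.
CompanyRow = Struct.new(:id, :name)
ContactRow = Struct.new(:id, :company_id, :status)

# Returns a hash of { company name => number of active contacts }.
def active_contact_counts(companies, contacts)
  # One pass over the contacts builds { company_id => [active contacts] }.
  active_by_company = contacts
    .select { |contact| contact.status == 'active' }
    .group_by { |contact| contact.company_id }

  companies.each_with_object({}) do |company, counts|
    # Companies without active contacts have no key, hence fetch with a default.
    counts[company.name] = active_by_company.fetch(company.id, []).size
  end
end

companies = [
  CompanyRow.new(1, 'Acme'),
  CompanyRow.new(2, 'Globex'),
  CompanyRow.new(3, 'Initech')
]
contacts = [
  ContactRow.new(1, 1, 'active'),
  ContactRow.new(2, 1, 'inactive'),
  ContactRow.new(3, 2, 'active'),
  ContactRow.new(4, 2, 'active')
]

active_contact_counts(companies, contacts).each do |name, count|
  puts "#{name}: #{count}"
end
```

The number of database calls stays at two. Only the in-memory work changes, from one scan per company to a single scan overall.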
Are You (De)Normal?
If you have a summary value that is expensive to calculate (i.e., it requires multiple database roundtrips and/or a complex block of Ruby code) and that changes too frequently for caching to help much, then denormalizing the attribute may be useful.
Let’s say you are calculating a list of top scores of participants in a series of games. One way to do it is
Player.find(:all).each do |player|
  puts player.games.sum { |game| game.score }
end
If you only have a handful of players and games in the system, you’re fine. But increase it 10-fold and your nice “top scores” view begins to drag, since a separate database call is made for each player to load that player’s games. Wouldn’t it be great if we could consolidate things to one roundtrip, like this:
Player.find(:all, :order => 'total_score DESC').each do |player|
  puts player.total_score
end
Well, you can, with a little bit of refactoring and applying a Ruby on Rails observer object. Observers allow you to attach behaviors to specific model events, or callbacks. For example, you could write code that automatically logs an entry to a database table or sends out an email notification when a specific type of model is saved.
First, add the summary field – in this example, the ‘total_score’ – to the desired table using a Rails migration script.
Then, create an observer object that will “watch” Game instances. Any time a Game’s score is updated, we will summarize the player’s total score in their record.
Creating the observer is easy; just use the generate script:
script/generate observer game
Inside the new GameObserver class, add this:
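class GameObserver < ActiveRecord::Observer
  # Runs after every save of a Game. This assumes each game is saved once
  # with its final score; re-saving a game would add its score again.
  def after_save(game)
    player = game.player
    player.update_attribute('total_score', player.total_score + game.score)
  end
end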
But wait – you’re not done yet – you’ll also need to update your application’s environment.rb to activate the observer:
Rails::Initializer.run do |config|
  # other config settings go here
  config.active_record.observers = :game_observer
end
Then be sure you update any code still using the real-time calculated version of player’s scores and replace with player.total_score. If you’re trying to get the top 10 players, it’s as simple as
Player.find(:all, :order => 'total_score DESC', :limit => 10)
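One caveat with the GameObserver shown earlier: after_save fires on every save, so saving the same game twice would add its score to the total twice. A possible alternative is to recompute the total from the player's games on each save. The sketch below models that idea with plain Ruby objects so it runs without Rails. ToyPlayer, ToyGame, and recalculate_total! are hypothetical names for illustration, not Rails APIs.

```ruby
class ToyPlayer
  attr_reader :games, :total_score

  def initialize
    @games = []
    @total_score = 0
  end

  # Recompute from scratch so repeated saves cannot double-count.
  def recalculate_total!
    @total_score = @games.inject(0) { |sum, game| sum + game.score }
  end
end

class ToyGame
  attr_accessor :score
  attr_reader :player

  def initialize(player, score)
    @player = player
    @score = score
  end

  # Stands in for a model save followed by the observer's after_save hook.
  def save
    player.games << self unless player.games.include?(self)
    player.recalculate_total!
    true
  end
end

player = ToyPlayer.new
game = ToyGame.new(player, 40)
game.save
game.save                      # saving again leaves the total unchanged
game.score = 55
game.save                      # an edited score replaces the old contribution
ToyGame.new(player, 10).save
puts player.total_score
```

In a real observer the equivalent would be to sum the player's saved games inside after_save. That costs an extra query per save, in exchange for a total that cannot drift.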
Rolling Up the Sleeves
In some cases, writing your own SQL can be the solution to squeezing out every millisecond of query performance. The trade-off, of course, is code maintenance. Every time you add a new field to your model, you’ll have to remember to update any hand-written queries. Active Record abstracts most everyday SQL, so you shouldn’t need to resort to hand-written queries often. However, they can come in handy in reporting scenarios, where a query becomes complex and needs to display attributes from many objects, some of them only indirectly associated.
For example: display a list of emails for all active contacts associated with clients in the ‘West’ sales region, who have open orders, containing items of category X with a status of ‘Backordered.’
Contact.find_by_sql("select co.email
  from contacts co, clients cl, regions r, orders o,
       line_items li, products p, product_categories pc
  where co.client_id = cl.id
    and cl.region_id = r.id
    and o.client_id = cl.id
    and li.order_id = o.id
    and li.product_id = p.id
    and p.category_id = pc.id
    and co.status = 'Active'
    and r.region = 'West'
    and o.status = 'Open'
    and pc.name = 'X'
    and li.status = 'Backordered'")
Tuning the Backend
Adding a database index to a table, especially on columns that are frequently involved in joins, can boost performance as well. But be careful: an index speeds up read operations, while write operations slow down because the index has to be updated along with the table. This article won’t go into the intricacies of database indexing, since this topic is already covered in detail elsewhere. However, Rails does make things easy through migration scripts.
class AddGamesIndexes < ActiveRecord::Migration
  def self.up
    add_index :games, :player_id
  end

  def self.down
    remove_index :games, :player_id
  end
end
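To see why an index helps reads and costs writes, here is a toy model in plain Ruby, with a hash playing the role of the index on player_id. ToyGamesTable, load_games, and the counters are hypothetical. Real database indexes are typically tree structures and behave differently in detail, but the trade-off has the same shape.

```ruby
class ToyGamesTable
  attr_reader :rows_scanned, :index_updates

  def initialize(indexed)
    @rows = []
    @index = indexed ? {} : nil
    @rows_scanned = 0
    @index_updates = 0
  end

  def insert(row)
    @rows << row
    return unless @index
    # Extra work on every write: keep the index in step with the table.
    (@index[row[:player_id]] ||= []) << row
    @index_updates += 1
  end

  def find_by_player(player_id)
    if @index
      matches = @index.fetch(player_id, [])
      @rows_scanned += matches.size   # only matching rows are touched
      matches
    else
      @rows_scanned += @rows.size     # full scan of the table
      @rows.select { |row| row[:player_id] == player_id }
    end
  end
end

# Fill a table with 1000 games spread evenly over 100 players.
def load_games(table)
  1000.times { |i| table.insert(:player_id => i % 100, :score => i) }
  table
end

plain   = load_games(ToyGamesTable.new(false))
indexed = load_games(ToyGamesTable.new(true))
plain.find_by_player(7)
indexed.find_by_player(7)

puts "without index: scanned #{plain.rows_scanned} rows, #{plain.index_updates} index updates"
puts "with index:    scanned #{indexed.rows_scanned} rows, #{indexed.index_updates} index updates"
```

The indexed table answers the lookup by touching only the matching rows, but it paid for that with one index update per insert.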
What Next
We discussed some techniques that help you measure and refine performance in Ruby on Rails models before needing to consider other options, such as caching.
We also recommend keeping your Rails framework current and peeking into Edge Rails every now and then. Over the past 6 months, for instance, there has been a lot of activity focused on Active Record performance optimizations. Rails 2.0 introduced query caching. This is an area which will continue to improve.
About this entry
Posted: Wednesday, June 11th, 2008 at 8:30 am
- Author: Phil Misiowiec
- Category: Solutions
- License: Creative Commons
