Faster Rails: Is Your Database Properly Indexed?
This article is part of our Faster Rails series. Check out the previous article about fast existence checks.
My Rails app used to be fast and snappy, and everything was working just fine for several months. Then, slowly, as my product grew and users started to flock in, web requests become slow and my database’s CPU usage started hitting the roof. I hadn’t changed anything, why was my app getting slower?
Is there any cure for the issues I’m having with my application, or is Rails simply not able to scale?
What makes your Rails application slow?
While there can be many reasons behind an application’s slowness, database queries usually play the biggest role in an application’s performance footprint. Loading too much data into memory, N+1 queries, lack of cached values, and the lack of proper databases indexes are the biggest culprits that can cause slow requests.
Missing database indexes on foreign keys and commonly searched columns or values that need to be sorted can make a huge difference. The missing index is an issue that is not even noticeable for tables with several thousand records. However, when you start hitting millions of records, the lookups in the table become painfully slow.
The role of database indexes
When you create a database column, it’s vital to consider if you will need to find and retrieve records based on that column.
For example, let’s take a look at the internals of Semaphore. We have a
Project model, and every project has a
name attribute. When someone visits a
project on Semaphore, e.g.
the first thing we need to do in the projects controller is to find the project
based on its name —
project = Project.find_by_name(params[:name])
Without an index, the database engine would need to check every record in the projects table, one by one, until a match is found.
However, if we introduce an index on the ‘projects’ table, as in the following example, the lookup will be much, much faster.
class IndexProjectsOnName < ActiveRecord::Migration def change add_index :projects, :name end end
A good way to think about indexes is to imagine them as the index section at the end of a book. If you want to find a word in a book, you can either read the whole book and find the word, or your can open the index section that contains a alphabetically sorted list of important words with a locator that points to the page that defines the word.
What needs to be indexed?
A good rule of thumb is to create database indexes for everything that is
referenced in the
ORDER BY parts of your SQL queries.
Indexes for unique lookups
Any lookup based on a unique column value should have an index.
For example, the following queries:
will benefit from an index of the
add_index :users, :username add_index :users, :email
Indexes for foreign keys
If you have
has_many relationships, you will need
to index the foreign keys to optimize for fast lookup.
For example, we have the branches that belong to projects:
class Project < ActiveRecord::Base has_many :branches end class Branch < ActiveRecord::Base belongs_to :project end
For fast lookup, we need to add the following index:
add_index :branches, :project_id
For polymorphic associations, the
owner of the
can either be a
User or an
class Organization < ActiveRecord::Base has_many :projects, :as => :owner end class User < ActiveRecord::Base has_many :projects, :as => :owner end class Project < ActiveRecord::Base belongs_to :owner, :polymorphic => true end
We need to make sure that we create a double index:
# Bad: This will not improve the lookup speed add_index :projects, :owner_id add_index :projects, :owner_type # Good: This will create the proper index add_index :projects, [:owner_id, :owner_type]
Indexes for ordered values
Any frequently used sorting can be improved by using a dedicated index.
can be improved with a dedicated index:
Should I always use indexes?
While using indexes for important fields can immensely improve the performance of your application, sometimes the effect can be negligible, or it can even make your application slower.
For example, tables that have elements that are frequently deleted can negatively impact the performance of your database. Huge tables with many millions of records also require more storage for your indexes.
Always be concious about the changes you introduce in your database, and if in doubt, be sure to base your decisions on real world data and measurements.
At Semaphore, we’re all about speed. We’re on a mission to make continuous integration fast and easy. Driven by numerous conversations with our customers and our own experiences, we’ve built a new CI feature that can automatically parallelize any test suite and cut its runtime to just a few minutes - Semaphore Boosters. Learn more and try it out.