23 Apr 2020 Β· Software Engineering

    Faster Rails: Is Your Database Properly Indexed?

    5 min read
    Contents

    This article is part of our Faster Rails series. Check out the previous article about fast existence checks.

    My Rails app used to be fast and snappy, and everything was working just fine for several months. Then, slowly, as my product grew and users started to flock in, web requests become slow and my database’s CPU usage started hitting the roof. I hadn’t changed anything, why was my app getting slower?

    Is there any cure for the issues I’m having with my application, or is Rails simply not able to scale?

    What makes your Rails application slow?

    While there can be many reasons behind an application’s slowness, database queries usually play the biggest role in an application’s performance footprint. Loading too much data into memory, N+1 queries, lack of cached values, and the lack of proper databases indexes are the biggest culprits that can cause slow requests.

    Missing database indexes on foreign keys and commonly searched columns or values that need to be sorted can make a huge difference. The missing index is an issue that is not even noticeable for tables with several thousand records. However, when you start hitting millions of records, the lookups in the table become painfully slow.

    The role of database indexes

    When you create a database column, it’s vital to consider if you will need to find and retrieve records based on that column.

    For example, let’s take a look at the internals of Semaphore. We have a Project model, and every project has a name attribute. When someone visits a project on Semaphore, e.g. https://semaphoreci.com/renderedtext/test-boosters, the first thing we need to do in the projects controller is to find the project based on its name β€” test-boosters.

    project = Project.find_by_name(params[:name])
    

    Without an index, the database engine would need to check every record in the projects table, one by one, until a match is found.

    However, if we introduce an index on the ‘projects’ table, as in the following example, the lookup will be much, much faster.

    class IndexProjectsOnName < ActiveRecord::Migration
      def change
        add_index :projects, :name
      end
    end
    

    A good way to think about indexes is to imagine them as the index section at the end of a book. If you want to find a word in a book, you can either read the whole book and find the word, or your can open the index section that contains a alphabetically sorted list of important words with a locator that points to the page that defines the word.

    What needs to be indexed?

    A good rule of thumb is to create database indexes for everything that is referenced in the WHERE, HAVING and ORDER BY parts of your SQL queries.

    Indexes for unique lookups

    Any lookup based on a unique column value should have an index.

    For example, the following queries:

    User.find_by_username("shiroyasha")
    User.find_by_email("support@semaphoreci.com")
    

    will benefit from an index of the username and email fields:

    add_index :users, :username
    add_index :users, :email
    

    Indexes for foreign keys

    If you have belongs_to or has_many relationships, you will need to index the foreign keys to optimize for fast lookup.

    For example, we have the branches that belong to projects:

    class Project < ActiveRecord::Base
      has_many :branches
    end
    
    class Branch < ActiveRecord::Base
      belongs_to :project
    end
    

    For fast lookup, we need to add the following index:

    add_index :branches, :project_id
    

    For polymorphic associations, the owner of the project can either be a User or an Organization:

    class Organization < ActiveRecord::Base
      has_many :projects, :as => :owner
    end
    
    class User < ActiveRecord::Base
      has_many :projects, :as => :owner
    end
    
    class Project < ActiveRecord::Base
      belongs_to :owner, :polymorphic => true
    end
    

    We need to make sure that we create a double index:

    # Bad: This will not improve the lookup speed
    
    add_index :projects, :owner_id
    add_index :projects, :owner_type
    
    # Good: This will create the proper index
    
    add_index :projects, [:owner_id, :owner_type]
    

    Indexes for ordered values

    Any frequently used sorting can be improved by using a dedicated index.

    For example:

    Build.order(:updated_at).take(10)
    

    can be improved with a dedicated index:

    add_index :updated_at
    

    Should I always use indexes?

    While using indexes for important fields can immensely improve the performance of your application, sometimes the effect can be negligible, or it can even make your application slower.

    For example, tables that have elements that are frequently deleted can negatively impact the performance of your database. Huge tables with many millions of records also require more storage for your indexes.

    Always be concious about the changes you introduce in your database, and if in doubt, be sure to base your decisions on real world data and measurements.

    At Semaphore, we’re all about speed. We’re on a mission to make continuous integration fast and easy. Driven by numerous conversations with our customers and our own experiences, we’ve built a new CI feature that can automatically parallelize any test suite and cut its runtime to just a few minutes – Semaphore Boosters. Learn more and try it out.

    One thought on “Faster Rails: Is Your Database Properly Indexed?

    1. That’s a great article, but on thought about indexes in polymorphic associations. It would be more optimal to arrange the index for polymorphic associations in the [type, id] order rather than [id, type]. By utilizing the [type, id] order, the B-tree search can be significantly narrowed down, especially considering the relatively smaller number of types compared to ids. In the tree structure, ids can be grouped under the corresponding group for every type, facilitating quicker retrieval of the id and its corresponding location in the table.
      Hope this makes sense.

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    Avatar
    Writen by:
    Chief Architect at Semaphore. A decade of experience in dev productivity, helping close to 50,000 organizations with operational excellence.