    23 Apr 2021 · Semaphore News

    Rails Testing Antipatterns: Fixtures and Factories

    In the upcoming series of posts, we’ll explore some common antipatterns in writing tests for Rails applications. The opinions presented here come from our experience in building web applications with Rails (we’ve been doing it since 2007) and are biased towards RSpec and Cucumber. Developers working with other technologies will probably benefit from reading as well.

    Antipattern zero: no tests at all

    If your app has at least some tests, congratulations: you’re among the better developers out there. If you think that writing tests is hard — it is, but you just need a little more practice. I recommend reading Rails Testing Handbook and Effective Testing with RSpec 3 if you haven’t already.

    If you don’t know how to add more tests to a large system you inherited, I recommend going through Working Effectively with Legacy Code. If you have no one else to talk to about testing in your company, there are many great people to meet at events such as CITCON.

    If you recognize some of the practices discussed here in your own code, don’t worry. The methodology is evolving, and many of us have “been there and done that”. And finally, this is all just advice. If you disagree, feel free to share your thoughts in the comment section below. Now, onwards with the code.

    Fixtures and factories

    Using fixtures

    Fixtures are Rails’ default way to prepare and reuse test data. Do not use fixtures.

    Let’s take a look at a simple fixture:

    # spec/fixtures/users.yml
    marko:
      first_name: Marko
      last_name: Anastasov
      phone: 555-123-6788

    You can use it in a test like this:

    RSpec.describe User do
      fixtures :all
    
      describe "#full_name" do
        it "is composed of first and last name" do
          user = users(:marko)
          expect(user.full_name).to eql "Marko Anastasov"
        end
      end
    end

    There are a few problems with this test code:

    – It is not clear where the user came from and how it is set up.
    – We are testing against a “magic value” — implying something was defined in some code, somewhere else.

    In practice, these shortcomings are addressed by comment essays:

    RSpec.describe Dashboard do
    
      fixtures :all
    
      describe "#show" do
        # User with preferences to view posts about kittens
        # and in the group with special access to Burmese cats
        # with 4 friends that like ridgeback dogs.
        let(:user) { users(:kitten_fan) }
      end
    end

    Maintaining fixtures of more complex records can be tedious. I recall working on an app where one record had dozens of attributes. Whenever a column was added or changed in the schema, all fixtures had to be changed by hand. Of course, I only remembered this after a few test failures.

    A common solution is to use factories. If you recall from the common design patterns, factories are responsible for creating whatever you need to create, in this case records. Factory Bot is a good choice.

    Factories let you maintain simple definitions in a single place, but manage all data related to the current test in the test itself when you need to. For example:

    FactoryBot.define do
      factory :user do
        # Factory Bot 5 and later require attribute values to be given as blocks.
        first_name { "Marko" }
        last_name  { "Anastasov" }
        phone      { "555-123-6788" }
      end
    end

    Now your test can set the related attributes before checking for the expected outcome:

    RSpec.describe User do
      describe "#full_name" do
        let(:user) { build(:user, first_name: "Johnny", last_name: "Bravo") }
    
        it "is composed of first and last name" do
          expect(user.full_name).to eql "Johnny Bravo"
        end
      end
    end

    A good factory library will let you not just create records, but easily generate unsaved model instances, stubbed models, attribute hashes, define types of records and more — all from a single definition source. Factory Bot’s getting started guide has more examples, and I also recommend you take a look at Working Effectively with Data Factories Using FactoryBot.
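
    The idea of one shared definition with per-test overrides can be sketched in plain Ruby, without any library. The User struct, USER_DEFAULTS hash and build_user helper below are hypothetical stand-ins rather than Factory Bot's API; they only illustrate why a test about names states the names itself.

```ruby
# Hypothetical plain-Ruby stand-in for a factory; Factory Bot does much more.
User = Struct.new(:first_name, :last_name, keyword_init: true) do
  def full_name
    "#{first_name} #{last_name}"
  end
end

# Defaults live in a single place...
USER_DEFAULTS = { first_name: "Marko", last_name: "Anastasov" }.freeze

# ...and each test overrides only the attributes it is actually about.
def build_user(overrides = {})
  User.new(**USER_DEFAULTS.merge(overrides))
end

default_user = build_user
johnny = build_user(first_name: "Johnny", last_name: "Bravo")
```

    A test that asserts on johnny.full_name shows both the input and the expected output side by side, so there is no magic value to look up elsewhere.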

    Factories pulling too many dependencies

    Factories let you specify associations, which get automatically created. For example, this is how we say that creating a new Comment should automatically create a Post that it belongs to:

    FactoryBot.define do
      factory :comment do
        post
        # Factory Bot 5 and later require attribute values to be given as blocks.
        body { "groundbreaking insight" }
      end
    end

    Ask yourself if creating or instantiating that post in every call to the Comment factory is really necessary. It might be if your tests require a record that was saved in the database, and you have a validation on Comment#post_id. But that may not be the case with all associations.

    In a large system, calling one factory may silently create many associated records, which accumulates to make the whole test suite slow (more on that later). As a guideline, always try to create the smallest amount of data needed to make the test work.

    Factories that contain unnecessary data

    A spec is effectively a specification of behavior. That is how we look at it when we open one. Similarly, we look at factories as definitions of data necessary for a model to function.

    In the first factory example above, including phone in the User factory is unnecessary unless the model validates its presence. If the data is not critical, just remove it.

    Factories depending on database records

    Adding a hard dependency on specific database records in factory definitions leads to build failures in a CI environment. Consider the following example:

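    factory :active_schedule do
      start_date { Date.current - 1.month }
      end_date { 1.month.since(Date.current) }
      processing_status { "processed" }
      # Antipattern: a value passed without a block (allowed up to Factory Bot 4)
      # is evaluated once, when this file is loaded, so it queries the database
      # before any test runs.
      schedule_duration ScheduleDuration.find_by_name("Custom")
    end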

    It is important to know that a value passed directly to an attribute, such as the ScheduleDuration.find_by_name call above, is evaluated when the factory definitions are loaded along with the Rails test environment, not when a record is built. This may not be a problem locally, because the test database was created and seeded at some point in the past. In a CI environment, however, the build starts from a blank database, so you will get an error before the test suite starts to run. To reproduce and identify such an issue locally, run db:drop followed by db:setup, and then run your tests again.
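
    The difference between a value computed at definition time and one computed at build time can be shown in plain Ruby. TinyFactory and find_duration below are hypothetical stand-ins (not Factory Bot's API, and no database is involved); the sketch only records when each lookup happens.

```ruby
# Hypothetical stand-in for a factory definition; not Factory Bot's API.
class TinyFactory
  def initialize
    @attributes = {}
  end

  # Eager: `value` has already been computed by the caller at this point.
  def static(name, value)
    @attributes[name] = -> { value }
  end

  # Lazy: the block is stored and only called when a record is built.
  def dynamic(name, &block)
    @attributes[name] = block
  end

  def build
    @attributes.transform_values(&:call)
  end
end

# Stand-in for a database lookup that records when it was called.
lookups = []
find_duration = lambda do |moment|
  lookups << moment
  "Custom"
end

factory = TinyFactory.new
factory.static(:eager_duration, find_duration.call(:definition_time))
factory.dynamic(:lazy_duration) { find_duration.call(:build_time) }

lookups_after_definition = lookups.dup
record = factory.build
lookups_after_build = lookups.dup
```

    After the definition alone, only the eager lookup has run; the lazy one runs when build is called. On a blank CI database, the eager form is the one that fails while the definitions load.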

    One way to fix this is to use Factory Bot’s traits:

    factory :schedule_duration do
      name { "Test Duration" }
    
      # Overrides the default name when the :custom trait is requested.
      trait :custom do
        name { "Custom" }
      end
    end
    
    factory :active_schedule do
      association :schedule_duration, :custom
    end

    Keep in mind that a custom schedule_duration will be created for every active_schedule, so this strategy will not work if schedule_duration has a uniqueness constraint.

    Another way is to defer the initialization in a callback. This, however, adds an implicit requirement that test code defines the associated record before the parent:

    factory :active_schedule do
      before :create do |schedule|
        schedule.schedule_duration = ScheduleDuration.find_by(name: 'Custom')
      end
    end
    

    If your application *really* requires a dependency on such records and there’s no way around the issue, consider using Rails seeds. You’ll need to set it up in RSpec to run before the test suite:

    # rails_helper.rb
    RSpec.configure do |config|
      config.before :suite do
        Rails.application.load_seed
      end
    end
    

    In that case, you might want to make sure your seeds remove any existing records before creating new ones:

    # db/seeds.rb
    ScheduleDuration.delete_all
    ScheduleDuration.create! name: 'Custom'
    

    Factories with random data instead of sequences

    When used alongside factories, random data generators such as faker may compromise the reliability of your test suite. Suppose the following factory definition got committed into your application:

    FactoryBot.define do
      factory :category do
        name { Faker::Lorem.word.capitalize }
      end
    end
    

    Your test suite runs smoothly for months, but you suddenly start to see random exceptions in CI that look like this:

    ActiveRecord::RecordNotUnique:
      SQLite3::ConstraintException: UNIQUE constraint failed: categories.name: INSERT INTO "categories" ("name", "created_at", "updated_at") VALUES (?, ?, ?)
    

    And the exception explodes somewhere near the following line:

    name { Faker::Lorem.word.capitalize }
    

    You’ve collected a few stack traces related to this error, and by looking at the git log you notice that some problematic specs recently sneaked into the codebase. Unlike other specs, these make quite a lot of calls to create(:category).

    It turns out that the categories.name field has a unique key constraint, but random data generated by faker isn’t guaranteed to be unique. In other words, an exception will be thrown whenever the same category name gets generated twice during a test.
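
    How quickly such collisions become likely can be estimated with a small calculation. The sketch below assumes names are drawn uniformly at random from a fixed pool; the pool size of 250 is an assumption chosen for illustration, not a measured property of faker's word list.

```ruby
# Probability that at least two of `draws` values collide when each one is
# picked uniformly at random from `pool_size` distinct words.
def collision_probability(pool_size, draws)
  all_unique = (0...draws).reduce(1.0) do |probability, already_taken|
    remaining = [pool_size - already_taken, 0].max
    probability * remaining / pool_size
  end
  1.0 - all_unique
end

# Assumed pool of 250 words, ten categories created within one test.
ten_categories = collision_probability(250, 10)
```

    Under these assumptions, a single test that creates ten categories fails in roughly one run out of six, which matches the experience of a suite that is green for months and then starts failing once a few record-heavy specs are added.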

    Factory Bot provides a solution to this problem: sequences. A sequence keeps an incrementing counter that can be used to generate names, so every generated value is unique. Here’s how to fix the above factory with a sequence:

    FactoryBot.define do
      factory :category do
        sequence(:name) { |n| "Category number #{n}" }
      end
    end
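
    The mechanism behind a sequence is small enough to sketch in plain Ruby. NameSequence below is a hypothetical stand-in, not Factory Bot's implementation; it only shows why a counter-based name cannot repeat.

```ruby
# Hypothetical stand-in for a Factory Bot sequence: a counter plus a formatter.
class NameSequence
  def initialize(&formatter)
    @counter = 0
    @formatter = formatter
  end

  # Each call bumps the counter, so no two calls return the same value
  # as long as the formatter embeds the number.
  def next_value
    @counter += 1
    @formatter.call(@counter)
  end
end

category_names = NameSequence.new { |n| "Category number #{n}" }
names = Array.new(3) { category_names.next_value }
```

    Unlike a random generator, the output is also reproducible: the same spec produces the same names on every run, which makes failures easier to investigate.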
    

    While mixing unique keys with random data can be dangerous, it is not the only danger lurking in the dark: it may occur that the generated data does not meet other validation requirements, or that an obscure combination of data triggers an error that happens 1 out of 100 times, and you can’t easily figure out what combination it is.

    Given that fragile tests are among the worst enemies of a test suite and that you can avoid pulling in an extra dependency such as faker for this very purpose, you might want to avoid using random data in your factories altogether.

    Noisy setup

    This anti-pattern is commonly found in growing and legacy applications. Suppose you are testing a database query that needs to run through a deep object graph. To make sure it returns the expected data, you wire together a few objects with the help of Factory Bot:

    let(:product_1) { create(:product, name: 'iPad') }
    let(:product_sale_1) { create(:product_sale, retail_price: 500, product: product_1) }
    let(:product_2) { create(:product, name: 'iPhone') }
    let(:product_sale_2) { create(:product_sale, retail_price: 500, product: product_2) }
    let(:product_sales) { [product_sale_1, product_sale_2] }
    let(:sale) { create(:sale, name: 'Apple Bundle', product_sales: product_sales) }
    let(:user) { create(:user, name: 'Thiago') }
    let!(:line_item) { create(:order_line_item, order: order, sale: sale) }
    let(:order) { create(:order, user: user) }
    
    it 'retrieves the expected data' do
      # Run the query and make assertions
    end
    

    As you can see, there is a lot of complexity going on and it screams at the reader. We are creating a verbose sequence of low-level records that is difficult to comprehend, and its complexity is inherent to the data model and to the query range we need to cover. Since the point of our test is to interact with the database, mocking and stubbing would not help at all. Assuming other examples require inserting more than one bundle of the same structure into the database, copying and pasting the same setup would do nothing but add more noise and thus make matters even worse.

    However, that doesn’t mean we can’t express our intent clearly. To help understand what we are dealing with, we should first of all lay out the hierarchical structure of the data, which has not been made clear by the above example. We can rewire the setup like so:

    # Always make sure all the data you're working with will be verified
    # by the test, otherwise avoid redundancy.
    let(:order_line_item) do
      create(
        :order_line_item,
        order: create(
          :order,
          user: create(:user, name: 'Thiago')
        ),
        sale: create(
          :sale,
          name: 'Apple Bundle',
          product_sales: [
            build(
              :product_sale,
              retail_price: 500,
              product: create(:product, name: 'iPad')
            ),
            build(
              :product_sale,
              retail_price: 500,
              product: create(:product, name: 'iPhone')
            )
          ]
        )
      )
    end
    
    let!(:order) { order_line_item.order }
    

    This code is easier to understand, but the verbosity still remains. And there is a subtle problem: we are being forced to obtain the order through the order_line_item because dependencies exist between records, which means record creation needs to follow a strict order. The complexity of the data model is nakedly exposed, and we are forced to deal with it every time a similar arrangement is required.

    We can make our setup look more natural by creating a helper method. First, let’s imagine what an ideal call to that helper would look like:

    order = create_full_order(
      line_item: {
        sale: {
          name: 'Apple Bundle',
          products: [
            { name: 'iPad',   retail_price: 500 },
            { name: 'iPhone', retail_price: 500 }
          ]
        }
      },
      user: { name: 'Thiago' }
    )
    

    This is easier on the eyes, and it hides the complexity away by centering the attributes and relations around the order as a single abstraction. Also note that we made the product_sales relation disappear and become an internal detail. A simplified implementation of the helper follows:

    def create_full_order(line_item:, user:)
      # Read the nested attributes without mutating the caller's hashes,
      # so the same arguments can safely be reused across calls.
      sale_attrs = line_item.fetch(:sale)
      products_attrs = sale_attrs.fetch(:products)
    
      order = create(:order, user: create(:user, user))
      sale = create(:sale, sale_attrs.except(:products))
    
      # One product_sale per product; the join record stays an internal detail.
      products_attrs.each do |product_attrs|
        create(
          :product_sale,
          retail_price: product_attrs[:retail_price],
          product: create(:product, product_attrs.except(:retail_price)),
          sale: sale
        )
      end
    
      create(:order_line_item, order: order, sale: sale)
    
      order
    end
    

    Because this is the first occurrence of such a setup in our test suite, we can define the helper within the spec file itself. Now we can go on creating any number of orders in a very readable fashion:

    before do
      create_full_order(
        line_item: {
          sale: {
            name: 'Apple Bundle',
            products: [
              { name: 'iPad',   retail_price: 500 },
              { name: 'iPhone', retail_price: 500 }
            ]
          }
        },
        user: { name: 'Thiago' }
      )
    
      create_full_order(
        line_item: {
          sale: {
            name: 'Special Software Bundle',
            products: [
              { name: 'Alfred',       retail_price: 50 },
              { name: 'TextExpander', retail_price: 25 }
            ]
          }
        },
        user: { name: 'Thiago' }
      )
    end
    

    Our helper is currently limited to a single order_line_item per call, but we can make it more flexible on an as-needed basis. We can even turn the method into a class if required:

    class CreateFullOrder
      include FactoryBot::Syntax::Methods
    
      def call(line_item:, user:)
        # do some work
      end
    
      # ...
    end
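
    One possible shape for such a class is sketched below. To keep the sketch runnable without Factory Bot or a database, the factory is injected and replaced by a hypothetical RecordingFactory double; in a real spec you would include FactoryBot::Syntax::Methods as in the skeleton above. The constructor argument and the double are assumptions made for illustration.

```ruby
# Sketch only: `factory` is any object responding to create(name, attributes).
class CreateFullOrder
  def initialize(factory)
    @factory = factory
  end

  def call(line_item:, user:)
    sale_attrs = line_item.fetch(:sale)
    products_attrs = sale_attrs.fetch(:products)

    order = @factory.create(:order, user: @factory.create(:user, user))
    sale = @factory.create(:sale, sale_attrs.reject { |key, _| key == :products })

    products_attrs.each do |product_attrs|
      product = @factory.create(:product, name: product_attrs.fetch(:name))
      @factory.create(:product_sale, sale: sale, product: product,
                                     retail_price: product_attrs.fetch(:retail_price))
    end

    @factory.create(:order_line_item, order: order, sale: sale)
    order
  end
end

# Hypothetical test double that records what would have been created.
class RecordingFactory
  attr_reader :created

  def initialize
    @created = []
  end

  def create(name, attributes = {})
    record = { factory: name }.merge(attributes)
    @created << record
    record
  end
end

factory = RecordingFactory.new
input = { sale: { name: "Apple Bundle", products: [{ name: "iPad", retail_price: 500 }] } }
order = CreateFullOrder.new(factory).call(line_item: input, user: { name: "Thiago" })
created_names = factory.created.map { |record| record[:factory] }
```

    Injecting the factory is a design choice of this sketch, not a requirement: it makes the creation order easy to inspect and lets the helper itself be tested in isolation.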
    

    This smell does not have a definite answer, but in most cases resorting to a local helper is a good first step. The key idea is to not worry about a grand architecture from the beginning and to do the simplest thing possible to improve your spec, and make sure your intent is clearly expressed. Over time, if you notice other specs repeating a similar arrangement, you can always extract, reuse, and evolve the helper.

    Written by:
    A full-stack oriented Rubyist, Elixirist, and Javascripter who enjoys exchanging knowledge and aims for well-balanced and easy-to-maintain solutions regarding product needs. Frequently blogs at The Miners.