15 Jul 2022 · Software Engineering

    Working Effectively with Data Factories Using FactoryBot

    This tutorial has been updated by Thiago Araújo Silva on 20 April 2018.

    Introduction

    In test-driven development, data is one of the requirements for a successful and thorough test. In order to be able to test all use cases of a given method, object or feature, you need to be able to define multiple sets of data required for the test.

    This is where the data factory pattern comes into test-driven development. A data factory (or factory for short) is a blueprint that allows us to create an object, or a collection of objects, with predefined sets of values.

    You can have multiple sets of predefined values for a single factory. In fact, you should create them for each use case of the factory in order to be able to test all use cases of a given method, object or feature.

    Let’s take a look at a common example — we have an Article factory and we need to have multiple sets of predefined values that represent its state and/or use cases. We might have the following variations:

    • Unpublished article
    • Published article
    • Article scheduled to be published in the future
    • Article published in the past

    These are examples of predefined sets of values that need to be defined for an Article factory. The next section provides the implementation details.
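
    Before bringing in a library, the idea can be sketched in plain Ruby. The following is a hand-rolled illustration, not FactoryBot itself: the Article struct, the ArticleFactory module, and the trait names are assumptions made for this example. Each trait is a named set of predefined values layered over the defaults.

```ruby
# A hand-rolled sketch of the factory idea; not FactoryBot itself.
# Article, ArticleFactory, and the trait names are illustrative assumptions.
Article = Struct.new(:title, :status, :published_at, keyword_init: true)

module ArticleFactory
  DAY = 24 * 60 * 60 # seconds

  # Baseline values shared by every article the factory produces.
  DEFAULTS = { title: "Untitled", status: :unpublished, published_at: nil }.freeze

  # Each trait is a named set of predefined values layered over the defaults.
  # Lambdas receive the reference time so time-based values stay deterministic.
  TRAITS = {
    unpublished:   ->(_now) { { status: :unpublished } },
    published:     ->(_now) { { status: :published } },
    in_the_future: ->(now)  { { published_at: now + 2 * DAY } },
    in_the_past:   ->(now)  { { published_at: now - 2 * DAY } }
  }.freeze

  # Builds an article from the defaults, the given traits, and explicit overrides.
  def self.build(*traits, now: Time.now, **overrides)
    attributes = traits.reduce(DEFAULTS.dup) do |attrs, name|
      attrs.merge(TRAITS.fetch(name).call(now))
    end
    Article.new(**attributes.merge(overrides))
  end
end

now = Time.now
scheduled = ArticleFactory.build(:published, :in_the_future, now: now, title: "Scheduled")
puts scheduled.status             # published
puts scheduled.published_at > now # true
```

    Combining traits this way gives one named state per use case, which is the same shape the FactoryBot examples in the following sections take.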

    Introduction to FactoryBot

    There are several tools you can use to create a factory. In this article, we are using FactoryBot.

    FactoryBot is written in Ruby and works with several frameworks, such as Ruby on Rails, Sinatra, and Padrino. This article covers the implementation of FactoryBot in a Ruby on Rails (i.e. Rails) application.

    Installation

    The following installation is specific to Rails. For frameworks other than Rails, please consult the installation documentation.

    Add the following gem to your Gemfile inside the proper group.

    group :development, :test do
      gem "factory_bot_rails"
    end

    Assuming you are using the RSpec testing framework, add the following code to the spec/support/factory_bot.rb file:

    # spec/support/factory_bot.rb
    RSpec.configure do |config|
      config.include FactoryBot::Syntax::Methods
    end

    Enable the autoloading of the support directory by uncommenting the following line in your spec/rails_helper.rb:

    Dir[Rails.root.join('spec/support/**/*.rb')].each { |f| require f }

    Usage

    Let’s take another look at the example of an Article model introduced above:

    # app/models/article.rb
    class Article < ApplicationRecord
      enum status: [:unpublished, :published]
    end

    You can create the following factory:

    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        trait :published do
          status { :published }
        end
    
        trait :unpublished do
          status { :unpublished }
        end
    
        trait :in_the_future do
          published_at { 2.days.from_now }
        end
    
        trait :in_the_past do
          published_at { 2.days.ago }
        end
      end
    end

    The above factory assumes the Article has the following attributes:

    • status, with the value :published or :unpublished. For ActiveRecord enums, the underlying column must be an integer.
    • published_at with type DateTime

    To use the factory, you can use any of the following statements inside your spec:

    # build creates an Article object without saving
    build :article, :unpublished
    
    # build_stubbed creates an Article object and acts as an already saved Article
    build_stubbed :article, :published
    
    # create creates an Article object and saves it to the database
    create :article, :published, :in_the_future
    create :article, :published, :in_the_past
    
    # create_list creates a collection of objects for a given factory
    # you can also use build_list and build_stubbed_list
    create_list :article, 2

    For a more detailed explanation of FactoryBot usage, you can consult the Getting Started Guide.

    Effective Patterns on Data Factory

    There are several best practices for using data factories that will improve performance and ensure test consistency if applied properly. The patterns below are ordered based on their importance:

    • Factory linting
    • Just enough data
    • Build and build_stubbed over create
    • Explicit data testing
    • Fixed time-based testing

    If you are already familiar with some of these practices, feel free to skip them.

    Factory Linting

    Linting is the process of analyzing code to detect potential errors, and factory linting is the process of detecting potential errors by validating attributes set in the factory.

    Factory linting helps you avoid unexpected bugs caused by false positive test results, which occur when invalid data is tested against a valid use case.
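
    To make the idea concrete, here is a conceptual sketch of linting in plain Ruby. It is not FactoryBot’s implementation: the Author and Article structs, the FACTORIES registry, and the lint_factories helper are all assumptions made for illustration. The sketch builds every factory once and reports the ones that produce invalid objects.

```ruby
# Conceptual sketch of factory linting in plain Ruby; not FactoryBot's implementation.
# Author, Article, FACTORIES, and lint_factories are illustrative assumptions.
Author = Struct.new(:name, keyword_init: true) do
  def errors
    name.to_s.empty? ? ["Name can't be blank"] : []
  end
end

Article = Struct.new(:title, :author, keyword_init: true) do
  def errors
    messages = []
    messages << "Title can't be blank" if title.to_s.empty?
    messages << "Author must exist" if author.nil?
    messages
  end
end

# Each factory is a lambda returning an object built from its default values.
FACTORIES = {
  author:  -> { Author.new(name: "The amazing author") },
  article: -> { Article.new(title: "The amazing article title") } # author is missing
}.freeze

class InvalidFactoryError < StandardError; end

# Builds every factory once and raises if any of them yields an invalid object.
def lint_factories(factories)
  invalid = factories.each_with_object({}) do |(name, builder), failures|
    errors = builder.call.errors
    failures[name] = errors unless errors.empty?
  end
  return if invalid.empty?

  report = invalid.map { |name, errors| "* #{name} - #{errors.join(', ')}" }
  raise InvalidFactoryError, "The following factories are invalid:\n#{report.join("\n")}"
end

begin
  lint_factories(FACTORIES)
rescue InvalidFactoryError => e
  puts e.message
end
```

    The point of the sketch is that the check runs against the factory defaults alone, so an invalid factory is reported before any test has a chance to pass against it by accident.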

    Why Linting

    To gain a good understanding of why factory linting is important, let’s take another look at our example. In the following example, we will enable caching in our test to simulate a common configuration in the production environment.

    The following code will enable caching in the test environment:

    # config/environments/test.rb
    # change the following line to true if not already set
    config.action_controller.perform_caching = true

    Given the following models and factories:

    # app/models/author.rb
    class Author < ApplicationRecord
      has_many :articles
      validates :name, presence: true
    end
    
    # app/models/article.rb
    class Article < ApplicationRecord
      belongs_to :author
      validates :title, presence: true
    end
    
    # spec/factories/authors.rb
    FactoryBot.define do
      factory :author do
        name { 'The amazing author' }
      end
    end
    
    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        title { 'The amazing article title' }
      end
    end

    Now, let’s take a look at the following view and its accompanying test:

    # app/views/articles/_article.html.erb
    <% cache article do %>
      <article>
        <div class="title"><%= article.title %></div>
        <div class="author"><%= article.author.name %></div>
      </article>
    <% end %>
    
    # spec/views/articles/_article.html.erb_spec.rb
    require 'rails_helper'
    
    RSpec.describe "views/articles/_article.html.erb" do
      context "with author" do
        let(:author)  { build :author }
        let(:article) { build :article, title: 'article title', author: author }
    
        it "render title" do
          render article
          expect(rendered).to have_content 'article title'
        end
      end
    end

    The above test seems to be working, and we have a passing test result.

    $ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
    .
    
    Finished in 0.02756 seconds (files took 0.99951 seconds to load)
    1 example, 0 failures

    Now, let’s add another test context for articles without an author:

    # spec/views/articles/_article.html.erb_spec.rb
      ...
    
      context "without author" do
        let(:article) { build :article, title: 'article title' }
    
        it "render title" do
          render article
          expect(rendered).to have_content 'article title'
        end
      end
    
      ...

    Surprisingly, we still have a positive, passing test result.

    $ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
    ..
    
    Finished in 0.04788 seconds (files took 0.98726 seconds to load)
    2 examples, 0 failures

    This is a false positive. The test should fail, because the view calls name on an author that does not exist. If we remove the cache from the article partial, the error becomes visible.

    Remove the cache from the article partial:

    # app/views/articles/_article.html.erb
    <% # you can either remove or comment the cache to disable it %>
    <% # cache article do %>
    
      ...
    
    <% # end %>
    

    And then run the test again to see the error after disabling the cache:

    $ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
    .F
    
    Failures:
    
      1) views/articles/_article.html.erb without author render title
         Failure/Error: <div class="author"><%= article.author.name %></div>
    
         ActionView::Template::Error:
           undefined method `name' for nil:NilClass
         # ./app/views/articles/_article.html.erb:4:in `block in _app_views_articles__article_html_erb___3498982763894348925_70101110542560'
         # ./app/views/articles/_article.html.erb:1:in `_app_views_articles__article_html_erb___3498982763894348925_70101110542560'
         # ./spec/views/articles/_article.html.erb_spec.rb:19:in `block (3 levels) in <top (required)>'
    
    Finished in 0.07124 seconds (files took 1.76 seconds to load)
    2 examples, 1 failure
    
    Failed examples:
    
    rspec ./spec/views/articles/_article.html.erb_spec.rb:18 # views/articles/_article.html.erb without author render title

    Caching a partial is a common practice in real-life code, and removing it can cost performance. To catch this kind of problem without sacrificing your application’s performance, set up factory linting as described in the next section.

    Setting Up Factory Linting

    When setting up factory linting, all required attributes need to be set in the factory. This can be done easily for a new project, but note that this is not an easy task when working on an existing project. You need to consider the cost vs. the benefit of linting the factory of an already running project, and make sure that linting the existing factories and fixing the related tests does not require too much time.

    Before setting up factory linting, you need database_cleaner to clean up your database after the linting process.

    Add the following to your Gemfile:

    gem "database_cleaner", group: :test
    

    Install the gem:

    bundle install
    

    Now we need to create a rake task to perform the linting. Add the following code to lib/tasks/factory_bot.rake:

    namespace :factory_bot do
      desc "Verify that all FactoryBot factories are valid"
      task lint: :environment do
        if Rails.env.test?
          DatabaseCleaner.cleaning do
            FactoryBot.lint
          end
        else
          system("bundle exec rake factory_bot:lint RAILS_ENV='test'")
          fail if $?.exitstatus.nonzero?
        end
      end
    end

    Note that the FactoryBot.lint command can be set to run in the same process as RSpec, but that may negatively impact performance and feedback when running single tests.

    Running the rake task now fails and points at the invalid factory, so we can fix it before it hides errors in our tests:

    $ bundle exec rake factory_bot:lint
    rake aborted!
    FactoryBot::InvalidFactoryError: The following factories are invalid:
    
    * article - Validation failed: Author must exist (ActiveRecord::RecordInvalid)
    /home/hu/sandbox/linting-example/lib/tasks/factory_bot.rake:6:in `block (3 levels) in <top (required)>'
    /home/hu/sandbox/linting-example/lib/tasks/factory_bot.rake:5:in `block (2 levels) in <top (required)>'
    /home/hu/.asdf/installs/ruby/2.5.1/bin/bundle:23:in `load'
    /home/hu/.asdf/installs/ruby/2.5.1/bin/bundle:23:in `<main>'
    Tasks: TOP => factory_bot:lint
    (See full trace by running task with --trace)
    rake aborted!

    You might want to set up the rake task to run in Semaphore CI as a step before the full test suite, and get into the habit of running it frequently during development.

    This is a simple demonstration of why we need factory linting. In this example we should really use create instead of build for the object rendered in the cached view, but the simplified version is sufficient to show that factory linting is the first guard against errors that invalid factories can introduce into our tests.
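
    One plausible reading of why build and a cached view interact badly can be sketched in plain Ruby. The sketch assumes that the fragment is keyed on the record’s cache key, and that unsaved records share a key such as articles/new, as ActiveRecord’s cache_key does for new records. The FragmentCache class, the render_article helper, and this stripped-down Article class are assumptions made for illustration.

```ruby
# Plain-Ruby sketch of why unsaved records can share a cached fragment.
# FragmentCache, render_article, and this Article class are illustrative assumptions;
# only the "articles/new" key for unsaved records mirrors ActiveRecord's cache_key.
Author = Struct.new(:name)

class Article
  attr_reader :id, :title, :author

  def initialize(title:, author: nil, id: nil)
    @title = title
    @author = author
    @id = id
  end

  # Unsaved records have no id, so they all share the same key.
  def cache_key
    id ? "articles/#{id}" : "articles/new"
  end
end

class FragmentCache
  def initialize
    @store = {}
  end

  # Returns the stored fragment for the key, rendering it only on a miss.
  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end
end

def render_article(cache, article)
  cache.fetch(article.cache_key) do
    "<article>#{article.title} by #{article.author.name}</article>"
  end
end

cache = FragmentCache.new
with_author    = Article.new(title: "article title", author: Author.new("Jane"))
without_author = Article.new(title: "article title")

render_article(cache, with_author)         # renders and caches under "articles/new"
puts render_article(cache, without_author) # cache hit: the NoMethodError never surfaces

saved = Article.new(title: "article title", id: 1)
begin
  render_article(cache, saved)             # distinct key, so the template actually runs
rescue NoMethodError => e
  puts "raised #{e.class}"
end
```

    Under these assumptions, a saved record gets its own key and the template runs for it, which is why create exposes the missing author while build does not.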

    Just Enough Data

    Leaving only required data inside your factory is key to having a reliable test. An unexpected bug could be introduced by putting unnecessary data inside your factory.

    Consider the following example of assigning an optional article publish date in the factory.

    # app/models/article.rb
    class Article < ApplicationRecord
      validates :title, presence: true
    end
    
    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        title { "The amazing article title" }
        published_at { DateTime.now }
      end
    end

    The following seemingly innocent view will pass the test:

    # app/views/articles/_article.html.erb
    <article>
      <div class="title"><%= article.title %></div>
      <div class="publish-date"><%= article.published_at.to_date %></div>
    </article>
    
    # spec/views/articles/_article.html.erb_spec.rb
    require 'rails_helper'
    
    RSpec.describe "articles/_article.html.erb" do
      include ActiveSupport::Testing::TimeHelpers
    
      context "with publish date" do
        let(:article) { build :article }
    
        before { travel_to Time.current }
        after  { travel_back }
    
        it "render article title and publish date" do
          render article
          expect(rendered).to have_content article.title
          expect(rendered).to have_content article.published_at.to_date
        end
      end
    end

    However, the above test doesn’t exercise the view correctly: the publish date is optional, yet the factory always sets it. An article without a publish date can still cause errors in production. Therefore, we need to restrict the factory to the required attributes only.

    If you need to set an optional attribute, add it as an alternate state of the factory using a trait. In the example above, you can add a trait that creates an alternate state of the article factory with a publish date. You also need to add a test for the alternate state.

    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        title { "The amazing article title" }
    
        trait :with_publish_date do
          published_at { DateTime.now }
        end
      end
    end

    You can use it as follows:

    # app/views/articles/_article.html.erb
    <article>
      <div class="title"><%= article.title %></div>
      <% if article.published_at %>
        <div class="publish-date"><%= article.published_at.to_date %></div>
      <% end %>
    </article>
    
    # spec/views/articles/_article.html.erb_spec.rb
    require 'rails_helper'
    
    RSpec.describe "articles/_article.html.erb" do
      include ActiveSupport::Testing::TimeHelpers
    
      context "publish date" do
        before { travel_to Time.current }
        after  { travel_back }
    
        context "with publish date" do
          let(:article) { build :article, :with_publish_date }
    
          it "render article title and publish date" do
            render article
            expect(rendered).to have_content article.title
            expect(rendered).to have_content article.published_at.to_date
          end
        end
    
        context "without publish date" do
          let(:article) { build :article }
    
          it "render article title without publish date" do
            render article
            expect(rendered).to have_content article.title
            expect(rendered).not_to have_selector ".publish-date"
          end
        end
      end
    end

    Even better, if you find yourself setting an additional attribute in the factory, it might be a sign that the attribute is in fact required. In that case, change the model to validate the presence of the attribute you are about to add to the factory.

    Explicit Data Testing

    Test expectations should rely on explicit attributes that tell the reader what is being tested. This means that the data you want to test should be set in the test file and should reflect the state being tested. Therefore, you should not rely on factory defaults for data that is relevant to the test at hand.

    Here is a simple demonstration of this:

    # app/models/article.rb
    class Article < ApplicationRecord
      enum status: [:unpublished, :published]
    
      def self.published_in_the_past
        # we expect this method to fail first
        where(nil)
      end
    end
    
    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        status { :unpublished }
    
        trait :published do
          status { :published }
        end
    
        trait :in_the_past do
          published_at { 2.days.ago }
        end
    
        trait :in_the_future do
          published_at { 2.days.from_now }
        end
      end
    end
    
    # spec/models/article_spec.rb
    require 'rails_helper'
    
    RSpec.describe Article do
      describe ".published_in_the_past" do
        let!(:unpublished_article)     { create :article }
        let!(:published_in_the_past)   { create :article, :published, :in_the_past }
        let!(:published_in_the_future) { create :article, :published, :in_the_future }
    
        it { expect(Article.published_in_the_past).to include published_in_the_past }
        it { expect(Article.published_in_the_past).not_to include unpublished_article }
        it { expect(Article.published_in_the_past).not_to include published_in_the_future }
      end
    end

    A failing result from article_spec.rb doesn’t help much, because it dumps all the attributes of each record, and we have to skim for specific attributes, such as status and published_at. Instead, you can give each record a title that reflects the state it represents.

    # spec/models/article_spec.rb
    require 'rails_helper'
    
    RSpec.describe Article do
      describe ".published_in_the_past" do
        let!(:unpublished_article)     { create :article, title: 'unpublished article' }
        let!(:published_in_the_past)   { create :article, :published, :in_the_past, title: 'published in the past' }
        let!(:published_in_the_future) { create :article, :published, :in_the_future, title: 'published in the future' }
    
        it { expect(Article.published_in_the_past).to include published_in_the_past }
        it { expect(Article.published_in_the_past).not_to include unpublished_article }
        it { expect(Article.published_in_the_past).not_to include published_in_the_future }
      end
    end

    With this change, the error message can help you by showing the expected article according to its title:

      2) Article.published_in_the_past should not include #<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">
         Failure/Error: it { expect(Article.published_in_the_past).not_to include published_in_the_future }
    
           expected #<ActiveRecord::Relation [#<Article id: 44, title: "unpublished article", status: "unpublished", publ...5 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">]> not to include #<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">
           Diff:
           @@ -1,2 +1,25 @@
           -[#<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">]
           +[#<Article:0x00007fc927bf04a8
           +  id: 44,
           +  title: "unpublished article",
           +  status: "unpublished",
           +  published_at: nil,
           +  author_id: nil,
           +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
           +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>,
           + #<Article:0x00007fc927bf0318
           +  id: 45,
           +  title: "published in the past",
           +  status: "published",
           +  published_at: Wed, 11 Apr 2018 14:45:55 UTC +00:00,
           +  author_id: nil,
           +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
           +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>,
           + #<Article:0x00007fc927bf0188
           +  id: 46,
           +  title: "published in the future",
           +  status: "published",
           +  published_at: Sun, 15 Apr 2018 14:45:55 UTC +00:00,
           +  author_id: nil,
           +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
           +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>]
    
         # ./spec/models/article_spec.rb:11:in `block (3 levels) in <top (required)>'

    However, there’s still an improvement we can make. Our test is concerned with what articles are returned, not with their attributes. Let’s take a step forward and run our expectations directly against the titles with the help of Array#map:

    require 'rails_helper'
    
    RSpec.describe Article do
      describe ".published_in_the_past" do
        before do
          create :article, title: 'unpublished article'
          create :article, :published, :in_the_past, title: 'published in the past'
          create :article, :published, :in_the_future, title: 'published in the future'
        end
    
        subject(:article_titles) { Article.published_in_the_past.map(&:title) }
    
        it { expect(article_titles).to include 'published in the past' }
        it { expect(article_titles).not_to include 'unpublished article' }
        it { expect(article_titles).not_to include 'published in the future' }
      end
    end

    Note that the subject of our test is now article_titles, and we can use before instead of let! to create the articles, since the corresponding objects are no longer used to run the expectations.

    And this change results in an even better error message, which increases the feedback quality of our test suite:

      1) Article.published_in_the_past should not include "unpublished article"
         Failure/Error: it { expect(article_titles).not_to include 'unpublished article' }
           expected ["unpublished article", "published in the past", "published in the future"] not to include "unpublished article"
         # ./spec/models/article_spec.rb:14:in `block (3 levels) in <top (required)>'
    
      2) Article.published_in_the_past should not include "published in the future"
         Failure/Error: it { expect(article_titles).not_to include 'published in the future' }
           expected ["unpublished article", "published in the past", "published in the future"] not to include "published in the future"
         # ./spec/models/article_spec.rb:15:in `block (3 levels) in <top (required)>'
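
    The rule this spec describes can also be sketched without a database. The following plain-Ruby version is an illustration only: the Article struct and the published_in_the_past helper are assumptions, and the article’s own model method is left as the failing placeholder shown earlier. It shows the filter and the title mapping side by side.

```ruby
# Plain-Ruby sketch of the rule the spec describes; no database involved.
# The Article struct and published_in_the_past helper are illustrative assumptions.
Article = Struct.new(:title, :status, :published_at, keyword_init: true)

DAY = 24 * 60 * 60 # seconds

# Keeps only published articles whose publish time is not after `now`.
def published_in_the_past(articles, now:)
  articles.select do |article|
    article.status == :published && !article.published_at.nil? && article.published_at <= now
  end
end

now = Time.utc(2018, 4, 13, 14, 45, 55)
articles = [
  Article.new(title: "unpublished article", status: :unpublished),
  Article.new(title: "published in the past", status: :published, published_at: now - 2 * DAY),
  Article.new(title: "published in the future", status: :published, published_at: now + 2 * DAY)
]

# Comparing titles keeps any failure message down to a short list of strings.
titles = published_in_the_past(articles, now: now).map(&:title)
p titles # ["published in the past"]
```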

    Build and build_stubbed Over create

    At times, you don’t need the create method. Since it saves to the database, it adds overhead to each test, and overusing it adds up to enough overhead to slow down the whole test suite.

    Use build or build_stubbed for tests that don’t need to be written to the database, tests that don’t do queries, or tests that use stubs for abstracting away the complexity of the queries.

    Consider the following example that uses build over create and leverages stubs for abstracting the internal implementation of the method being tested.

    Here’s an example of a naive implementation of Article.recent:

    # app/models/article.rb
    class Article < ApplicationRecord
      def self.recent
        promoted + latest
      end
    
      def self.promoted
        # Find promoted articles
      end
    
      def self.latest
        # Find latest articles
      end
    end

    Since the implementation uses promoted and latest, you can stub each method and return articles created using build instead of create as follows:

    # spec/models/article_spec.rb
    require 'rails_helper'
    
    RSpec.describe Article do
      describe ".recent" do
        let(:latest)   { build :article, :published, title: :latest  }
        let(:promoted) { build :article, :published, title: :promoted }
    
        before do
          allow(Article).to receive(:latest).and_return([latest])
          allow(Article).to receive(:promoted).and_return([promoted])
        end
    
        it { expect(Article.recent).to include latest }
        it { expect(Article.recent).to include promoted }
      end
    end

    Keep in mind that if you are testing your view and it uses caching, you must use create for your factory, or else you can have an inconsistent test result due to the stubbing done by build_stubbed in FactoryBot.

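    However, there’s something important to keep in mind when using build: depending on your FactoryBot version, it may create the associations declared in your factory. In FactoryBot 4, associations are saved to the database by default even when the parent is only built; from FactoryBot 5 on, associations follow the parent’s strategy by default (FactoryBot.use_parent_strategy is true). Suppose your article factory has an author association: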

    # spec/factories/articles.rb
    FactoryBot.define do
      factory :article do
        title { 'The amazing article' }
        author
      end
    end

    When running build(:article) with associations being created (the FactoryBot 4 default), the articles count won’t increase but the authors count will, which means an author is written to the database. To avoid this surprise, prefer build_stubbed over build. We could rewrite the above example to use build_stubbed:

    # spec/models/article_spec.rb
    require 'rails_helper'
    
    RSpec.describe Article do
      describe ".recent" do
        let(:latest)   { build_stubbed :article, :published, title: :latest  }
        let(:promoted) { build_stubbed :article, :published, title: :promoted }
    
        before do
          allow(Article).to receive(:latest).and_return([latest])
          allow(Article).to receive(:promoted).and_return([promoted])
        end
    
        it { expect(Article.recent).to include latest }
        it { expect(Article.recent).to include promoted }
      end
    end
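
    The difference between the three strategies can be sketched against a fake in-memory database that counts writes. This is a conceptual illustration, not FactoryBot’s implementation: FakeDatabase, the Article struct, and the build, build_stubbed, and create helpers below are assumptions made for this example.

```ruby
# Conceptual sketch of the three strategies against a fake in-memory database.
# FakeDatabase, Article, and the build/build_stubbed/create helpers are
# illustrative assumptions, not FactoryBot's implementation.
class FakeDatabase
  attr_reader :writes

  def initialize
    @writes = 0
    @next_id = 0
  end

  # Counts every write so the cost of each strategy is visible.
  def insert(record)
    @writes += 1
    record.id = (@next_id += 1)
    record
  end
end

Article = Struct.new(:id, :title, :persisted, keyword_init: true) do
  def persisted?
    persisted ? true : false
  end
end

# build: a new, unsaved object.
def build(title:)
  Article.new(title: title, persisted: false)
end

# build_stubbed: looks saved (has an id) but never touches the database.
def build_stubbed(title:)
  Article.new(id: 1001, title: title, persisted: true)
end

# create: the only strategy that pays for a database write.
def create(db, title:)
  article = db.insert(Article.new(title: title))
  article.persisted = true
  article
end

db = FakeDatabase.new
built   = build(title: "latest")
stubbed = build_stubbed(title: "promoted")
created = create(db, title: "saved")

puts [built.persisted?, stubbed.persisted?, created.persisted?].inspect # [false, true, true]
puts db.writes                                                          # 1
```

    Only create touches the database in this sketch, which is the overhead the section recommends avoiding when a test does not query.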

    Fixed Time-based Testing

    Relative time helpers from Rails, such as 2.seconds.ago or 5.minutes.ago, can cause split-second test inconsistencies when used to assert time-related data. To avoid this, specify the time explicitly instead of using a relative helper. Consider the following example:

    create :article, published_at: "2015-04-04T17:30:05+0700"
    

    If you prefer to use the relative time helpers, consider using the ActiveSupport time helpers. You can freeze the time and run the helpers without risking split-second inconsistencies in test results.

    You can include the ActiveSupport::Testing::TimeHelpers module globally in your RSpec configuration or directly in any tests using it. The first alternative can be set up as follows:

    # spec/rails_helper.rb
    RSpec.configure do |config|
      config.include ActiveSupport::Testing::TimeHelpers
    end

    To use it, add the following to your before/after test context:

    before do
      travel_to Time.current
    end
    
    after do
      travel_back
    end
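
    The underlying problem can be shown without Rails. In the plain-Ruby sketch below, the Article struct and the two_days_before helper are assumptions made for illustration. Two values derived from separate clock reads may disagree, while values derived from one fixed instant always agree.

```ruby
# Plain-Ruby sketch of why a fixed reference time keeps assertions stable.
# Article and two_days_before are illustrative assumptions.
require "time"

Article = Struct.new(:title, :published_at, keyword_init: true)

DAY = 24 * 60 * 60 # seconds

def two_days_before(now)
  now - 2 * DAY
end

# Relative to the real clock: the two calls read the clock at different instants,
# so the values may differ by a fraction of a second.
drifting_a = two_days_before(Time.now)
drifting_b = two_days_before(Time.now)
puts drifting_a == drifting_b # not guaranteed to be true

# Relative to one fixed instant: every computation agrees.
fixed_now = Time.new(2015, 4, 4, 17, 30, 5, "+07:00")
article   = Article.new(title: "fixed", published_at: two_days_before(fixed_now))
puts article.published_at == two_days_before(fixed_now) # true
puts article.published_at.iso8601                       # 2015-04-02T17:30:05+07:00
```

    Freezing the clock with travel_to has the same effect inside a Rails test: every relative helper resolves against one instant.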

    Conclusion

    FactoryBot is not the only tool you can use to create a data factory. There are other tools that you can choose as an alternative. You can use Fabrication, or you can check out the full list of alternatives in The Ruby Toolbox.

    To gain a better understanding of how FactoryBot works, you can read its Getting Started guide, or, if you’re curious about how things are implemented, go straight to the source. As a bonus, if you want to understand why factories are a good fit for your application compared to other test data strategies, consider trying an alternative strategy such as Rails fixtures.

    Using the pointers from this tutorial, you should be able to write a better factory that can be used to test your application more effectively.

    Written by:
    Hendra is a software engineer at vidio.com. In his free time, he created whoisfy.com and has also written for SitePoint.