27 Mar 2024 · Software Engineering

    Flaky Tests In React: Detection, Prevention and Tools


    In the context of React, testing is a non-negotiable process to maintain code quality and a smooth user experience.

    However, there’s one frustrating problem that developers commonly face when running tests in React: flaky tests.

    In the simplest terms, flaky tests are tests that pass most of the time but occasionally fail without any changes to the code or the test, seemingly for no reason at all.

    In this guide, we’ll focus on flaky tests in React: their common causes, how to detect them, how to fix them, and the tools that can help.

    Understanding Flaky Tests in React

    Flaky tests, especially in UI testing, are a common pain point for developers. They are almost unavoidable; even Google reports that around 14% of its tests were flaky.

    Here’s a short scenario to further understand flaky tests in React:

    Say you wrote a test for a React component that renders a button which, when clicked, sends a notification to the user. You run the test and it passes: all green. The next day, you run it again and it passes: all green. Then, just to be sure, you run it one last time: red, it fails.

    It shouldn’t have failed. It is the same test and the same component; nothing changed, yet it suddenly failed. So, naturally, you run the test again: it fails. Again: it passes. Again: it fails.

    Now, what exactly causes flaky tests in React?

    Common Causes of Flaky Tests in React

    Let’s clear something up before we proceed: the exact cause of flaky tests in React varies; it could be almost anything. This is mostly due to the dynamic nature of React components and how they interact.

    However, there are some common causes that you can keep an eye out for that could be the culprit. Let’s see them in more detail.

    External Dependencies

    Almost every React application interacts with APIs, databases, or third-party services, and as expected, tests also rely on them.

    Say, for example, you have a test that checks whether a list of products is displayed after fetching data from an API. If the API response is slow or the service is down, the test might fail even though the code is working correctly. This flakiness happens because the test relies on an external factor that is largely out of your control.

    For example, here is a component that gets a list of products from an API and displays them:

    // imports ...
    export function ProductsList() {
      const [products, setProducts] = useState([]);
      useEffect(() => {
        const fetchProducts = async () => {
          try {
            const response = await fetch("https://api.example.com/products");
            const data = await response.json();
            setProducts(data);
          } catch (error) {
            console.log(error);
          }
        };
        fetchProducts();
      }, []);
    
      return (
        <ul>
          {products.map((product) => (
            <li key={product.id}>{product.name}</li>
          ))}
        </ul>
      );
    }
    

    Now the test might look something like this:

    import "@testing-library/jest-dom";
    import { render, screen, waitFor } from "@testing-library/react";
    
    describe("ProductsList", () => {
      test("should render a list of products", async () => {
        render(<ProductsList />);
        await waitFor(() => {
          expect(screen.getByText("Product 1")).toBeInTheDocument();
          expect(screen.getByText("Product 2")).toBeInTheDocument();
        });
      });
    });

    This test might work fine most of the time; however, it is flaky because it relies on an external API, which comes with uncertainties like network delays and server issues. This can be fixed using mocks (more on that later).

    Timing Issues

    A component can take longer than expected to update. A test can depend on a process whose duration is unpredictable; animations and transitions are classic examples.

    If a test checks a specific UI element right before or after an animation runs, a flaky result shouldn’t come as a surprise.
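The idea behind taming animation timing is to control the clock yourself; tools like Jest expose this via jest.useFakeTimers(). The sketch below is plain JavaScript (no library APIs assumed): instead of really waiting for a delay, the test advances a fake clock, making timing fully deterministic.

```javascript
// A minimal fake clock: callbacks scheduled with setTimeout run only when
// the test explicitly advances time, never because real time passed.
function makeFakeClock() {
  let now = 0;
  let timers = [];
  return {
    setTimeout(fn, delay) {
      timers.push({ at: now + delay, fn });
    },
    advance(ms) {
      now += ms;
      // run every timer that is now due, then drop it
      const due = timers.filter((t) => t.at <= now);
      timers = timers.filter((t) => t.at > now);
      due.forEach((t) => t.fn());
    },
  };
}

// Example: a 500ms "animation" finishes only when the clock is advanced.
const clock = makeFakeClock();
let animationDone = false;
clock.setTimeout(() => { animationDone = true; }, 500);
clock.advance(499); // animationDone is still false
clock.advance(1);   // animationDone is now true
```

With a real library you would not write this yourself; the point is that deterministic time is what makes animation-dependent tests stop flaking.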

    Asynchronous Operations

    In React, a lot of tasks don’t happen instantly: waiting for user input, UI updates, or fetching data from servers. If your tests don’t wait for these operations to complete before making assertions, they can fail intermittently.

    We know that in React, when a component’s state changes, the virtual DOM updates and then the actual UI is updated asynchronously. So tests that assert the UI state immediately after a state update can be flaky because the UI hasn’t been updated yet. Here is an example:

    // imports here ...
    function Counter() {
      const [count, setCount] = useState(0);
      const handleClick = () => setCount((count) => count + 1);
      return (
        <>
          <p>{count}</p>
          <button onClick={handleClick}>Increment</button>
        </>
      );
    }
    
    describe("Counter", () => {
      it("should update count after click", () => {
        render(<Counter />);
        fireEvent.click(screen.getByText("Increment"));
        expect(screen.getByText("1")).toBeInTheDocument();
      });
    });

    This test seems straightforward, but flakiness creeps in if the assertion runs before the UI has reflected the change, so it is recommended to use the waitFor utility:

    describe("Counter", () => {
      it("should update count after click", async () => {
        render(<Counter />);
        fireEvent.click(screen.getByText("Increment"));
        // Wait for the UI to update
        await waitFor(() => {
          expect(screen.getByText("1")).toBeInTheDocument();
        });
      });
    });

    Leaky State

    This happens when tests modify global state or have side effects that weren’t accounted for. These changes can then interfere with the next test that runs, leading to unexpected failures.

    This is most common where multiple React components rely on shared state for rendering. Say Test A sets a state variable when a user logs in. If Test B relies on this state and doesn’t reset it before running, it might fail because it expects the user to be logged out.

    Flawed Tests

    Rushing tests, often due to deadlines, isn’t unheard of. Most of us have been there: we want to see green quickly and move on to other things. But tests written in haste often run on assumptions, and that leads to flakiness.

    Consider these components:

    export function TestA() {
      useEffect(() => {
        localStorage.setItem("user", "minato");
      }, []);
      return <p>Test A Sets user in localStorage</p>;
    }
    
    export function TestB() {
      return <p>Test B: Reads user from localStorage</p>;
    }

    Here is TestA’s test file:

    test("TestA sets user in localStorage", () => {
      render(<TestA />);
      expect(localStorage.getItem("user")).toBe("minato");
    });

    Here is TestB’s test file:

    test("TestB reads user from localStorage", () => {
      render(<TestB />);
      expect(localStorage.getItem("user")).toBeNull();
    });

    TestA sets a value in localStorage using a side effect in useEffect. The side effect isn’t cleaned up, potentially interfering with subsequent tests.

    TestB expects localStorage.getItem('user') to be null but might fail because of the leak. You can use beforeEach and afterEach to always clean up such side effects.
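To make the cleanup concrete: in a Jest/jsdom suite the fix is typically a one-liner, afterEach(() => localStorage.clear()). The sketch below models the same idea with an in-memory Map standing in for localStorage so it runs anywhere:

```javascript
// A Map stands in for localStorage so the sketch is self-contained.
const storage = new Map();

// What TestA's useEffect does: a side effect that outlives the test.
function runTestA() {
  storage.set("user", "minato");
}

// What an afterEach(() => localStorage.clear()) hook would do.
function cleanup() {
  storage.clear();
}

runTestA();
cleanup();
// TestB's expectation now holds: no leftover "user" entry.
storage.has("user"); // false
```

The design point is that cleanup lives in a hook that runs unconditionally, so no test can forget it.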

    Impact of Flaky Tests on Development Workflow and Product Quality

    Let’s say you just finished a new search feature in your React application. You ran it through your CI/CD pipeline, and all tests passed. You ran it one more time and all passed, then you merged the code to the main branch, and the build failed! Then you go through the code, nothing seems to be wrong, you run the tests again, and…they fail.

    What just happened in this scenario is how flaky tests can create a false sense of code security. Another known impact is decreased trust in testing. When tests begin to fail at random, it is frustrating, and with time, developers tend to start ignoring flaky tests and tagging them as “expected failures.” This over time leads to buggy UIs.

    Avoiding these impacts is important, and doing so early on comes with some benefits:

    • It saves time and money.
    • It ensures a stable user experience at all times.
    • It allows for a smooth and reliable CI/CD process.

    Detecting Flaky Tests in React

    There are several known ways developers detect flaky tests in React; some are manual, others automatic. Let’s check them out.

    Review Test Code

    Always review the code in your tests, especially tests involving async processes; they are a major source of flakiness in React. Also check whether the tests clean up, either before a new one begins or after the previous one ends.

    Jest provides two hooks suitable for this, beforeEach and afterEach:

    // functions and logic here ...
    
    describe("items in correct group", () => {
      beforeEach(() => getAnimalData());
      afterEach(() => clearAnimalData());
    
      test("animals in right category", () => {
        expect(isAnimalInCategory("cat", "mammals")).toBeTruthy();
      });
    });

    Also ask, when reviewing test code: can the tests run independently, without relying on external state or global variables? If a test relies on external data that can’t be controlled, randomness and unpredictability set in, introducing flakiness.

    Analyzing Error Handling

    A common source of flakiness in React is inadequate error handling within components. Uncaught errors can alter the execution flow, leading to failures that may not even be related to the test currently running.

    For instance, is the React Testing Library implemented thoughtfully? Are potential errors accounted for and handled effectively? All these count because if an error is not accounted for, it could cause tests to fail at random.

    Stable Testing Environment

    When running tests in React, the consistency of the environment is paramount. Say you run some tests locally and they all pass, but when you run them in the CI/CD pipeline, some fail. An environmental difference could be the cause.

    Essentially, all dependencies, tools, and configurations should remain identical for each test run. Even a slight difference in hardware or tool configuration can cause a flaky test to flare up.

    Logging for Insights

    Logging is among the top methods for finding out why anything fails in software development, React included.

    A simple console.log() can do wonders in locating the cause of a particular flaky test. Placing log statements around your test suite gives you a detailed trace of how the test execution flows, which makes it much easier to identify the patterns that lead to a failure.

    To make things easier, the React Testing Library provides a screen.debug method that logs a specific element or the whole rendered document.

    function Card({ title }) {
      return <div>{title}</div>;
    }
    
    describe("Card rendering", () => {
      test("renders title", () => {
        render(<Card title='Flight to Mars' />);
        screen.debug(); // renders the document
        expect(screen.getByText("Flight to Mars")).toBeInTheDocument();
        //renders only the card component
        screen.debug(screen.getByText("Flight to Mars")); 
      });
    });

    The Order of Elements Matters

    Let’s use an example to explain this. A component renders a list of products, and you wrote a test that expects the last item to be a drink. The test might pass for now, but can you be certain the last item will always be a drink? The product data structure may change (e.g. a sorting algorithm update or a refactor). Never assume data will arrive exactly as you expect; instead, use unique IDs to target specific elements within the UI, both when testing and in general.
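The difference can be sketched in plain JavaScript (the product data below is made up for illustration): selecting by position silently depends on ordering, while selecting by a unique id survives any reordering.

```javascript
// Hypothetical product data; the names and ids are made up for illustration.
const products = [
  { id: "p-burger", name: "Burger" },
  { id: "p-cola", name: "Cola" },
];

// Fragile: assumes the drink is always the last item. A sorting change
// or refactor breaks this assumption without any test code changing.
const byPosition = products[products.length - 1];

// Stable: target the item by its unique id instead. In React Testing
// Library the analogous query would be screen.getByTestId("p-cola").
const byId = products.find((p) => p.id === "p-cola");

byPosition.name; // "Cola" today, but only by coincidence of ordering
byId.name;       // "Cola" regardless of order
```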

    The Use of CI/CD Pipelines

    Running tests manually is inefficient. For small test suites it might be no big deal, but it becomes impractical for large React codebases or frequent test runs (say, 93 of them: that’s a lot).

    Now, this is where automation comes in, and CI/CDs are the best at that. You can easily integrate React tests with a CI/CD pipeline to automate the process. Many CI/CD platforms, like Semaphore, have built-in features to easily detect and report flaky tests.

    They automatically run your tests whenever code changes are pushed to a repository and notify developers of flaky tests that occurred during automated testing. These platforms can also rerun failed tests multiple times to confirm the flakiness before marking a test as failed (though the extra runs add to your bill).
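The rerun idea can be sketched in a few lines of plain JavaScript (Jest exposes a similar knob as jest.retryTimes(n)): a test function is retried a few times, and a mix of failing and passing attempts is the signature of flakiness.

```javascript
// A miniature "rerun on failure" step, like the one CI platforms perform.
// Returns the outcome of each attempt, stopping at the first pass.
function runWithRetries(testFn, retries = 2) {
  const results = [];
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      testFn();
      results.push("pass");
      break; // no need to retry once it passes
    } catch {
      results.push("fail");
    }
  }
  return results;
}

// A deliberately flaky test: fails on its first run, passes afterwards.
let calls = 0;
function flakyTest() {
  calls += 1;
  if (calls === 1) throw new Error("intermittent failure");
}

runWithRetries(flakyTest); // ["fail", "pass"]: the rerun exposed flakiness
```

A consistently broken test would fail on every attempt, which is how reruns separate genuine failures from flaky ones.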

    Preventing Flaky Tests in React

    Understand that, more often than not, ignored flaky tests turn out to be hiding real bugs. So preventing or fixing these tests as soon as possible is paramount, at least for UI stability.

    Use CI/CD Early

    This automation is one of the best options out there for preventing and even detecting flaky tests early in the development cycle. You can set up a CI/CD pipeline like Semaphore and configure it to trigger automatic test execution. This gives you detailed reports on test failures, including stack traces and logs of how, when, and why a flaky test occurred.

    Structure Your Tests Well

    This is simple: well-structured tests are easier to maintain, because as a codebase grows, so does the number of tests, and with it, the chance of flaky tests.

    • Tests should be independent, to prevent a chain reaction where one random failure causes the tests after it to fail as well.
    • Tests should have meaningful names.
    • A test should set up its own component instances when it runs.
    • Consider preliminary tests that check whether dependencies and data are available and functioning properly before the core tests run.

    Minimize Fixed Wait Times

    Fixed wait times (e.g. setTimeout) should be used as sparingly as possible because they are unpredictable, especially around UI changes or animations. Instead, use events, async/await, or promises; they are much more reliable.

    Say a test clicks a button to open a modal that appears after 500ms. Instead of a fixed 500ms wait in the test, use waitFor, which keeps retrying the assertion until it passes or times out.

    Here is an example that illustrates this:

    function Modal() {
      const [isOpen, setIsOpen] = useState(false);
      const handleOpen = () => {
        setTimeout(() => setIsOpen(true), 500);
      };
    
      return (
        <>
          {isOpen && <div data-testid='modal'>This is the modal</div>}
          <button onClick={handleOpen}>Open Modal</button>
        </>
      );
    }

    Let’s create a test that would be problematic due to fixed wait times:

    test("Modal opens", async () => {
      render(<Modal />);
      fireEvent.click(screen.getByText("Open Modal"));
    
      setTimeout(() => {
        expect(screen.getByTestId("modal")).toBeInTheDocument();
      }, 500);
    });

    This test would appear to pass; however, the assertion never actually runs before the test ends. We can fix it by using async/await and waitFor:

    test("Modal opens", async () => {
      render(<Modal />);
      fireEvent.click(screen.getByText("Open Modal"));
    
      await waitFor(() => expect(screen.getByTestId("modal")).toBeInTheDocument());
    });

    Now the test would wait for the modal to open before checking its presence through an assertion.

    Be Mindful of Dynamic Data

    Pay attention to random or unpredictable data that can change uncontrollably across test runs. An example is UUIDs: they are handy in React (e.g. as keys), but because they are randomly generated, they differ from one test run to the next, which leads to flaky results. Instead, a predictable pattern can be used, like an incrementing counter (just for testing purposes).
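A deterministic stand-in is easy to sketch. In Jest you might wire it in with jest.mock("uuid", ...); the factory itself is plain JavaScript, and the "test-id" prefix below is an arbitrary choice for illustration:

```javascript
// A counter-based id factory to substitute for random UUIDs in tests,
// so the generated ids are identical on every run.
function makeIdFactory(prefix = "test-id") {
  let n = 0;
  return () => {
    n += 1;
    return `${prefix}-${n}`;
  };
}

const nextId = makeIdFactory();
nextId(); // "test-id-1"
nextId(); // "test-id-2"
```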

    Other dynamic data are user inputs and dates.
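For dates, one option is to make “now” injectable so tests can pin it; Jest alternatively offers jest.useFakeTimers().setSystemTime(...) to freeze the system clock. The formatOrderDate helper below is hypothetical, just to illustrate the pattern:

```javascript
// A hypothetical helper whose notion of "now" can be injected.
// In production it is called with no clock argument; in tests you pass
// a fixed clock so the output never varies between runs.
function formatOrderDate(order, now = () => new Date()) {
  return `${order.id} placed on ${now().toISOString().slice(0, 10)}`;
}

// In a test, pin the clock:
const fixedNow = () => new Date("2024-03-27T00:00:00Z");
formatOrderDate({ id: "A1" }, fixedNow); // "A1 placed on 2024-03-27"
```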

    Use of Mocks

    Mocks are best for replacing external dependencies with more of a dummy-controlled version that the components can use during testing. This gives the test more predictable behavior, and one doesn’t have to deal with the inconsistent nuances of external dependencies.

    Let’s revisit the ProductsList code example we used in External Dependencies as a cause of flaky tests in React. Here is how a mock can help:

    const mockProducts = [
      { id: 1, name: "Product 1" },
      { id: 2, name: "Product 2" },
    ];
    
    describe("ProductsList", () => {
      test("should render a list of products", async () => {
        global.fetch = jest.fn().mockResolvedValue({
          json: jest.fn().mockResolvedValue(mockProducts),
        });
        render(<ProductsList />);
        await waitFor(() => {
          expect(global.fetch).toHaveBeenCalledWith(
            "https://api.example.com/products"
          );
        });
        await waitFor(() => {
          expect(screen.getByText("Product 1")).toBeInTheDocument();
          expect(screen.getByText("Product 2")).toBeInTheDocument();
        });
      });
    });

    In this test, the mockProducts array contains dummy data used in place of the real fetched API data. jest.fn() is then used to mock the global fetch function so we can control its behavior within the test.

    The mock is then configured to be a successful fetch response (mockResolvedValue) that returns the mockProducts array. After which, we run the usual assertions.

    With this in place, we can be certain that the test focuses mainly on its code logic, isolating it from external factors.

    Fix Flaky Tests as Soon as They Show Up

    This is because if a test suddenly shows up as flaky and you tag it with “fix later,” by the time you come back and run the tests, it may pass every time, and the flakiness might not show up.

    That doesn’t mean it is fixed; it could mean that whatever condition triggered the flakiness initially (say, the time of day) isn’t present anymore. Now you’d have to wait for the flaky test to show up again, or risk pushing the code to production and hoping all works out well (not recommended).

    Writing Stable Tests

    Here are some good practices for writing stable tests in React:

    • Each test should focus on the behavior of a specific React component in isolation.
    • Start with smaller tests; they are easier to understand.
    • Use the beforeEach and afterEach hooks to ensure each test starts with a clean slate.
    • waitFor and act are good options for handling async operations.
    • Write synchronous tests unless the functionality explicitly involves asynchronous operations.
    • If your testing tool supports snapshot testing, use it; it makes things easier.
    • Don’t dismiss flaky tests out of hand; flag them and fix them. The sooner you fix them, the better.
    • Document your tests to explain what they are testing and why.

    Conclusion

    React, being a UI library, has its fair share of challenges when it comes to testing. However, we all learn from past mistakes, so if you aren’t yet getting the hang of solving the flaky tests you are facing at the moment, know that it is normal. With time, the more flaky tests you encounter and fix, the less flaky your tests will become and the better they will get overall.
