16 Apr 2024 · Software Engineering

    How to Stop the Flakes Before They Fly

    21 min read

Imagine being back in high school, eagerly preparing for the big end-of-year project that will determine your final grades. You and your classmates are working hard, each responsible for an important part of the project. However, there’s a catch – some students are inconsistent in their contributions (we know them!). They promise to deliver their work on time, but when the deadline approaches, they falter. The project becomes a chaotic mess, with last-minute adjustments and patches to cover the gaps left by unreliable team members.

Now, let’s transpose this scenario into the world of software development, where flaky tests play the role of unreliable classmates. Developers invest time and effort into creating a robust testing suite, only to find that some tests behave inconsistently: they work sometimes, and other times they don’t. These flaky tests can disrupt the smooth flow of development, destroy confidence in the testing process, and ultimately hinder the delivery of a stable program or application. In this article, we will walk through the common issue of flaky tests, their detrimental impact, and, most importantly, proactive strategies to prevent them from arising in the first place.

    Test Instability

The maddening thing about flaky tests is that, without any changes to the code, a test that passed previously may fail on the next run, only to pass again later. Flaky tests, exactly like the inconsistent high school classmates, are a source of frustration for developers.

    Flaky tests are like unpredictable troublemakers in the software world. They don’t consistently do what they’re supposed to, making developers scratch their heads when things go wrong. Picture this: a test fails randomly, and developers are left wondering, “Is there a real problem, or is it just playing tricks on us?” Dealing with these uncertain tests eats up a lot of time and energy, creating confusion and frustration.

In a field where dependability is important, flaky tests shake the foundation of developers’ trust in their testing tools. This lack of confidence creates a domino effect, causing delays in getting things done, reducing how much work can be accomplished, and, in the end, lowering the overall quality of the software being developed. It’s like having a mischievous teammate who throws a wrench into the gears when you least expect it.

    Common Causes of Flakiness

    Dependencies on External Services

One common culprit behind flaky tests is their reliance on external services. Consider a scenario where a test interacts with a remote API: say you have connected the Stripe API to manage payments. If the API experiences downtime or responds sluggishly, the test may fail intermittently. Let’s examine this with a code example using a Nest.js project:

    import { Injectable } from '@nestjs/common';
    import axios from 'axios';
    
    @Injectable()
    export class FlakyService {
      async getDataFromExternalAPI(): Promise<string> {
        try {
          const response = await axios.get('https://stripeapi.example.com/data');
          return response.data;
        } catch (error) {
          // Handle errors appropriately
          return 'error';
        }
      }
    }
    

In this example, the getDataFromExternalAPI method interacts with an external API. However, if the API is unreliable, the test could fail, highlighting the impact of external dependencies. Projects depend heavily on external services, yet the behavior of these services is inherently unpredictable: an API might respond promptly and accurately one moment, only to falter with delays or errors the next. This unpredictability becomes a breeding ground for flaky tests, especially when test success depends on the external service’s responsiveness. In the example above, the inherent unpredictability of the external service is encapsulated in the try…catch block, where we gracefully handle potential errors. However, such error-handling mechanisms, while necessary, do not eliminate the root cause of flakiness.

To address the unpredictability of external services, a good approach involves isolating tests from these dependencies. One effective strategy is to use mocking to create controlled, predictable responses during testing. Let us take a look at the modified version of the example we cited above.

In this modified version, the FlakyService uses the HttpService from Nest.js for making HTTP requests instead of calling axios directly. This is possible thanks to dependency injection.
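
A minimal sketch of such a service, assuming Nest.js v8+ where HttpService is provided by the @nestjs/axios package, might look like this:

import { Injectable } from '@nestjs/common';
import { HttpService } from '@nestjs/axios';
import { firstValueFrom } from 'rxjs';

@Injectable()
export class FlakyService {
  // HttpService is injected, so tests can swap in a mock implementation
  constructor(private readonly httpService: HttpService) {}

  async getDataFromExternalAPI(): Promise<string> {
    const response = await firstValueFrom(
      this.httpService.get('https://stripeapi.example.com/data'),
    );
    return response.data;
  }
}

In the test setup below, the HttpService is replaced with a mock that returns predictable data. This effectively isolates the test from the external service’s variability, creating a stable environment for our tests to thrive.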

import { Test, TestingModule } from '@nestjs/testing';
import { HttpService } from '@nestjs/axios';
import { of } from 'rxjs';
import { FlakyService } from './flaky.service';

describe('FlakyService', () => {
  let service: FlakyService;

  beforeEach(async () => {
    const module: TestingModule = await Test.createTestingModule({
      providers: [
        FlakyService,
        {
          // Replace the real HttpService with a stub that always
          // responds with the same predictable payload
          provide: HttpService,
          useValue: {
            get: () => of({ data: 'mocked data' }),
          },
        },
      ],
    }).compile();

    service = module.get<FlakyService>(FlakyService);
  });

  it('should return mocked data', async () => {
    const result = await service.getDataFromExternalAPI();
    expect(result).toEqual('mocked data');
  });
});

    Race Conditions in Parallelized Tests

Parallelizing tests for speed is a common practice, but it can introduce race conditions, leading to flaky outcomes. Imagine two tests manipulating shared resources concurrently: the order of execution can affect the results, causing intermittent failures. Most developers parallelize their tests to significantly reduce the time and effort of the testing process, leveraging test automation to execute the same tests in multiple environments simultaneously. While this concurrency speeds up the testing process, it introduces the potential for race conditions: situations where the outcome of a test depends on the timing and order of execution.

    Let’s take a look at an example:

// Both tests depend on the same shared state: the application's user store
// (registerUser and loginUser are hypothetical helpers)
test('User registration should create a new user', async () => {
  const user = await registerUser('testuser', 'password');
  expect(user).toBeDefined();
});

test('User login should succeed with valid credentials', async () => {
  // Implicitly assumes the registration test has already created the user
  const session = await loginUser('testuser', 'password');
  expect(session).toBeDefined();
});

Looking at this in a parallelized environment: if the login test is executed before the registration test, the login might fail due to the absence of a registered user. And if it happens to succeed before the registration test, you have actually hidden a bug behind a false-positive result.

To solve the challenges of race conditions, a key strategy is test isolation. Each test should run in isolation, independent of other tests, making sure that the outcome is solely determined by the test itself and not influenced by the order of execution. This way, you can avoid misleading failures and false positives.

Let’s look at a possible solution for the implementation above. In the modified version below, the resetDatabaseToOneUser function ensures a clean state before each test, mitigating the impact of race conditions. By isolating each test, you gain better assurance about your results.

beforeEach(async () => {
  // Reset the database to a known state (one registered user) before
  // every test, so no test depends on another test's side effects
  await resetDatabaseToOneUser();
});

test('User login should succeed with valid credentials', async () => {
  // The user this test needs is guaranteed to exist by the hook above
  const session = await loginUser('testuser', 'password');
  expect(session).toBeDefined();
});

Another way is to make use of asynchronous patterns. When used within your tests, they let you control the flow and timing of operations, which in turn minimizes the chances of unexpected interleaving. See the example below:

test('User login should succeed with valid credentials', async () => {
  // Awaiting the full login flow prevents the test from finishing
  // before the asynchronous operations complete
  const session = await performAsyncLogin('testuser', 'password');
  expect(session).toBeDefined();
});

    Inconsistent Test Environments

Inconsistencies between development and testing environments can also breed flaky tests. While maintaining separate environments is good software practice, differences between them can cause test issues.

Inconsistent test environments arise when there are disparities between the conditions under which tests are written and those in which they are executed. These disparities can take various forms, including differences in data, configuration, or dependencies. Let’s take a look at an example below:

    import { Injectable } from '@nestjs/common';
    @Injectable()
    export class FlakyService {
      getData(): string {
        return process.env.NODE_ENV === 'production' ? 'production-data' : 'development-data';
      }
    }

In this example, the getData method returns different values based on the environment. In a production environment, it might fetch data from a live database, while in a development environment, it could use mocked or sample data.

There are several efficient ways to manage environment consistency. Implement robust configuration management to ensure uniformity across environments, using configuration files or environment variables to control settings like database connections, API endpoints, and feature flags. Containerization tools like Docker provide a consistent environment, reducing the likelihood of discrepancies between development and production setups. Finally, integrate continuous integration into your workflow: CI tools, such as Semaphore, ensure that tests run in an environment mirroring production conditions, which helps catch inconsistencies early in the development process.
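
As a sketch of the configuration-management approach, a Nest.js project can centralize its settings with the @nestjs/config package; the per-environment .env file naming below is an assumption for illustration:

import { Module } from '@nestjs/common';
import { ConfigModule } from '@nestjs/config';

@Module({
  imports: [
    // Load one .env file per environment (.env.development, .env.test, ...)
    // so every environment resolves database URLs, API endpoints, and
    // feature flags the same way
    ConfigModule.forRoot({
      envFilePath: `.env.${process.env.NODE_ENV || 'development'}`,
    }),
  ],
})
export class AppModule {}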

Imagine a scenario where a team member develops a feature on their local machine, running tests against a local database. The tests pass, and the feature is marked as ready for deployment. However, in the staging environment, which mirrors production closely, the absence of a specific database record causes a test to fail. This discrepancy, arising from inconsistent test environments, could have been mitigated by adopting the measures above. Let’s take a look at a code example below, where we modify the test environment to mimic production conditions, ensuring consistency in data and configurations:

    import { Test, TestingModule } from '@nestjs/testing';
    import { FlakyService } from './flaky.service';
    
    describe('FlakyService', () => {
      let service: FlakyService;
    
      beforeEach(async () => {
        process.env.NODE_ENV = 'production'; // Simulating production environment
        const module: TestingModule = await Test.createTestingModule({
          providers: [FlakyService],
        }).compile();
    
        service = module.get<FlakyService>(FlakyService);
      });
    
      it('should return production data', () => {
        const result = service.getData();
        expect(result).toEqual('production-data');
      });
    });

    This way, tests run in a production-like environment during the testing phase, mitigating the risk of flaky tests caused by environmental inconsistencies.

    Proactive Strategies for Robust Tests

Now that we have seen some common causes and how to handle them, let’s look at preventive measures: strategies that ensure flaky tests never get the opportunity to fly.

    Test Isolation

Test isolation involves the separation of tests to ensure that their execution is independent of each other. Each test should operate in isolation, with no reliance on the outcome or state of other tests. This not only makes for cleaner test code but also protects against the propagation of failures, minimizing the chances of flaky tests.

    import { Test, TestingModule } from '@nestjs/testing';
    import { FlakyService } from './flaky.service';
    
    describe('FlakyService', () => {
      let service: FlakyService;
    
  beforeEach(async () => {
    // Compile a fresh testing module so each test gets its own,
    // independent FlakyService instance with a clean counter
    const module: TestingModule = await Test.createTestingModule({
      providers: [FlakyService],
    }).compile();
    
        service = module.get<FlakyService>(FlakyService);
      });
    
      it('should increment the counter', () => {
        service.incrementCounter();
        expect(service.getCounter()).toBe(1);
      });
    
      it('should reset the counter', () => {
        service.resetCounter();
        expect(service.getCounter()).toBe(0);
      });
    });
    

Here, the second test does not rely on the first, and vice versa. This way, we ensure that the success or failure of one test doesn’t impact the execution of others, promoting a stable testing environment.

    Data Management

Data management is equally important. Flaky tests can result from uncontrolled or inconsistent data, which introduces variability. In the example below, the second test doesn’t rely on the first test to create a user, ensuring both test isolation and proper data management. By managing data properly within each test, we ensure that each test is self-contained, reducing dependencies and minimizing the chances of flaky outcomes.

    import { Test, TestingModule } from '@nestjs/testing';
    import { FlakyService } from './flaky.service';
    
    describe('FlakyService', () => {
      let service: FlakyService;
    
      beforeEach(async () => {
        const module: TestingModule = await Test.createTestingModule({
          providers: [FlakyService],
        }).compile();
    
        service = module.get<FlakyService>(FlakyService);
      });
    
      it('should create a user and return the user object', async () => {
        const user = await service.createUser('testuser');
        expect(user).toBeDefined();
      });
    
      it('should delete a user by user object', async () => {
        const user = await service.createUser('testuser');
        await service.deleteUser(user);
        const userCount = await service.getUserCount();
        expect(userCount).toBe(0);
      });
    });

    Test-Driven Principles

    Test-Driven Development (TDD) is more than just a methodology; it’s a philosophy that places testing at the forefront of the development process. TDD principles can help developers proactively shape the design and functionality of their code while simultaneously ensuring the reliability of their tests. Let’s look at a practical application. Imagine we want to build a basic calculator service with addition and subtraction functionalities:

    import { Injectable } from '@nestjs/common';
    
    @Injectable()
    export class CalculatorService {
      add(a: number, b: number): number {
        return a + b;
      }
    
      subtract(a: number, b: number): number {
        return a - b;
      }
    }

    Now, let’s implement tests for these functionalities using TDD:

    import { Test, TestingModule } from '@nestjs/testing';
    import { CalculatorService } from './calculator.service';
    
    describe('CalculatorService', () => {
      let service: CalculatorService;
    
      beforeEach(async () => {
        const module: TestingModule = await Test.createTestingModule({
          providers: [CalculatorService],
        }).compile();
    
        service = module.get<CalculatorService>(CalculatorService);
      });
    
      it('should add two numbers', () => {
        const result = service.add(2, 3);
        expect(result).toBe(5);
      });
    
      it('should subtract two numbers', () => {
        const result = service.subtract(5, 2);
        expect(result).toBe(3);
      });
    });
    

    In this example, we write tests before implementing the actual functionalities. TDD encourages a cycle of “Red-Green-Refactor,” where we start with a failing (Red) test, implement the minimum code to make the test pass (Green), and then refactor the code while keeping the tests passing. This approach minimizes the chances of test flakiness, as tests are designed to reflect the expected behavior of the code.
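
As a concrete illustration of that cycle, suppose we want to add a multiply capability to the service (a hypothetical extension, not part of the code above). Under TDD, the test comes first:

// Red: this test fails at first, because multiply() doesn't exist yet
it('should multiply two numbers', () => {
  const result = service.multiply(4, 3);
  expect(result).toBe(12);
});

The Green step is then the minimal implementation in CalculatorService, such as multiply(a: number, b: number): number { return a * b; }, after which the code can be refactored freely with the passing test as a safety net.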

    Test Environment Consistency – Creating a Stable Foundation

    Test environment consistency is a cornerstone of reliable testing. Inconsistent environments between development, testing, and production can introduce variables that lead to flaky tests. Creating a consistent test environment ensures that tests produce reliable results, irrespective of the context in which they are executed. Let’s take a look at an example:

Consider an ExternalApiService that dynamically selects its API endpoint based on the environment.
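
A minimal sketch of such a service, consistent with the test that follows (the endpoints and the hard-coded return value are illustrative assumptions), could look like this:

import { Injectable } from '@nestjs/common';

@Injectable()
export class ExternalApiService {
  // Select the API endpoint according to the current environment
  private readonly apiUrl =
    process.env.NODE_ENV === 'test'
      ? 'https://test-api.example.com/data'
      : 'https://api.example.com/data';

  fetchData(): string {
    // Simplified for illustration: a real implementation would issue an
    // HTTP request against this.apiUrl and return its payload
    return 'data from external API';
  }
}

Now let’s write tests for this service, ensuring that environment consistency is maintained: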

    import { Test, TestingModule } from '@nestjs/testing';
    import { ExternalApiService } from './external-api.service';
    
    describe('ExternalApiService', () => {
      let service: ExternalApiService;
    
      beforeEach(async () => {
        process.env.NODE_ENV = 'test'; // Set the environment to 'test'
        const module: TestingModule = await Test.createTestingModule({
          providers: [ExternalApiService],
        }).compile();
    
        service = module.get<ExternalApiService>(ExternalApiService);
      });
    
      it('should fetch data from the test API', () => {
        const result = service.fetchData();
        expect(result).toBe('data from external API');
      });
    
      afterEach(() => {
        process.env.NODE_ENV = 'development'; // Reset the environment to 'development' after each test
      });
    });
    

In the example above, we set the environment to ‘test’ before running the tests and reset it afterward. This ensures that the tests consistently use the same environment settings, preventing variability that could lead to flaky outcomes.

    Advanced Techniques for Maximum Stability

These techniques include Property-Based Testing and Chaos Testing; they elevate our testing practices, ensuring that our tests remain steadfast even in the face of complexity and uncertainty.

Property-Based Testing

Property-Based Testing (PBT) takes a fundamentally different approach than traditional unit testing. Instead of specifying expected outcomes for specific inputs, PBT explores the properties or invariants that should hold true across a range of inputs. By generating a diverse set of inputs, PBT uncovers edge cases and potential issues that might go unnoticed in example-based testing. It checks that a function, program, or whatever system is under test abides by a property. Let’s take a look at an example:

    Assume we have a simple function to reverse arrays:

    import { Injectable } from '@nestjs/common';
    
    @Injectable()
    export class ArrayUtilsService {
      reverseArray<T>(arr: T[]): T[] {
        return [...arr].reverse();
      }
    }
    

Now, let’s use PBT to ensure the reversal function maintains a crucial property: reversing an array twice should yield the original array:

    import fc from 'fast-check';
    import { Test, TestingModule } from '@nestjs/testing';
    import { ArrayUtilsService } from './array-utils.service';
    
    describe('ArrayUtilsService', () => {
      let service: ArrayUtilsService;
    
      beforeEach(async () => {
        const module: TestingModule = await Test.createTestingModule({
          providers: [ArrayUtilsService],
        }).compile();
    
        service = module.get<ArrayUtilsService>(ArrayUtilsService);
      });
    
  it('should satisfy the reversal property', () => {
    fc.assert(
      fc.property(fc.array(fc.integer()), (arr) => {
        const reversed = service.reverseArray(arr);

        // Every element should land in the mirrored position
        const isReversed = reversed.every(
          (value, i) => value === arr[arr.length - 1 - i]
        );

        // Reversing twice should yield the original array
        const reversedTwice = service.reverseArray(reversed);
        const roundTrips =
          JSON.stringify(arr) === JSON.stringify(reversedTwice);

        return isReversed && roundTrips;
      })
    );
  });
    });

Here, fc.array(fc.integer()) generates random integer arrays, and fc.property verifies that the reversal properties hold for every generated input. PBT allows us to test a broad range of scenarios, uncovering potential issues that might go unnoticed in example-based testing.

    Chaos Testing

    Chaos Testing is a practice where deliberate, controlled chaos is injected into a system to assess its resilience. By introducing faults and disruptions, developers gain valuable insights into how well their systems can recover from adverse conditions.

Let’s simulate chaos in an HTTP service using the chaos-monkey-engine library (the enable/disable API shown below is illustrative):

import { Injectable } from '@nestjs/common';
// In Nest.js v8+, HttpService and HttpModule live in @nestjs/axios
import { HttpService } from '@nestjs/axios';
import { Observable } from 'rxjs';

@Injectable()
export class HttpServiceService {
  constructor(private readonly httpService: HttpService) {}

  fetchData(): Observable<any> {
    return this.httpService.get('https://api.example.com/data');
  }
}
    

    Now, let’s introduce chaos testing by randomly delaying HTTP requests:

import { Test, TestingModule } from '@nestjs/testing';
import { HttpModule } from '@nestjs/axios';
import { firstValueFrom } from 'rxjs';
import { HttpServiceService } from './http-service.service';
// Note: the ChaosMonkey enable/disable API below is illustrative
import ChaosMonkey from 'chaos-monkey-engine';

describe('HttpServiceService', () => {
  let service: HttpServiceService;

  beforeEach(async () => {
    const module: TestingModule = await Test.createTestingModule({
      // HttpModule supplies the real HttpService the service depends on
      imports: [HttpModule],
      providers: [HttpServiceService],
    }).compile();

    service = module.get<HttpServiceService>(HttpServiceService);
  });

  it('should handle delayed HTTP requests under chaos', async () => {
    const chaosMonkey = new ChaosMonkey();
    chaosMonkey.enable();

    const response = await firstValueFrom(service.fetchData());

    chaosMonkey.disable();

    expect(response).toBeDefined();
  });
});

In this example, Chaos Testing is simulated by enabling the ChaosMonkey to randomly delay HTTP requests during the test. This chaos injection allows us to observe how well our service responds under adverse conditions.

    Tools and Frameworks That Support Property-Based Testing and Chaos Testing

To facilitate Property-Based Testing, tools like fast-check (used in the example above) offer robust support. For Chaos Testing, frameworks such as chaos-monkey-engine or Netflix’s Chaos Monkey provide controlled chaos-injection capabilities.

Why Continuous Integration and Continuous Delivery Are Important

Continuous Integration (CI) and Continuous Delivery (CD) automate the testing and deployment processes, ensuring that changes are continuously validated and seamlessly delivered to production.

1. Early Detection of Flakiness: Continuous Integration (CI) ensures that tests run automatically with each code commit (see the pipeline sketch after this list). This early and frequent testing helps detect flaky tests immediately, allowing developers to address issues before they escalate and ensuring that deviations from expected behavior are identified as early as possible rather than piling up.
    2. Consistent Testing Environments: CI promotes consistent testing environments by automating the setup and configuration of test environments. This minimizes discrepancies between testing environments, which ensures dependable and reproducible test results.
3. Automated Deployment to Staging: Continuous Delivery (CD) extends the benefits of CI by automating the deployment and testing of code changes on staging environments. This automation ensures that tests are validated in environments closely mirroring production, reducing the chances of environment-related flakiness.
    4. Feedback Loop for Developers: CI/CD provides a rapid feedback loop for developers. Quick turnaround times in the testing and deployment processes enable developers to iterate on their code, fix flaky tests efficiently, and maintain the overall stability of the testing pipeline.
5. Scheduled Daily/Weekly Runs: Scheduled runs give the system extra, regular validation. With these additional test passes, developers can identify and address flaky tests that may not be caught during regular CI/CD runs, adding a further layer of assurance against flakiness that appears over time or under specific conditions.
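
As a minimal sketch of the CI side, a Semaphore pipeline that runs the test suite on every commit might look like the following; the machine type, OS image, and commands are placeholder assumptions to adapt to your project:

version: v1.0
name: Run tests on every commit
agent:
  machine:
    type: e1-standard-2
    os_image: ubuntu2004
blocks:
  - name: Tests
    task:
      jobs:
        - name: Unit tests
          commands:
            - checkout
            - npm ci
            - npm test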

    Building a Testing Culture Within Your Team

The foundation of test stability is laid not just in tools and techniques but in the culture of your development team. Pushing for a testing culture involves promoting collaboration, knowledge sharing, and a shared responsibility for test quality. Here are a few tips for cultivating a healthy testing culture in your team:

    1. Knowledge Sharing: Encourage team members to share their testing knowledge through regular meetings, workshops, or documentation. Cross-functional expertise strengthens the team’s overall testing capabilities.
    2. Collaborative Code Reviews: Incorporate testing considerations into code reviews. Reviewing tests alongside the code helps identify potential issues early in the development process.
    3. Pair Programming: Foster collaboration through pair programming sessions, where team members work together on writing tests and implementing features. This collaborative approach enhances testing skills across the team.
    4. Learning Opportunities: Provide learning opportunities for team members to explore advanced testing techniques, tools, and methodologies. This can include workshops, training sessions, or participation in relevant conferences.
    5. Celebrate Testing Achievements: Acknowledge and celebrate achievements related to test stability. Recognizing the efforts of team members in maintaining a reliable test suite reinforces the importance of testing within the team.

    Conclusion

    Tackling flaky tests before they fly is extremely important. In this article, we went over the problems they pose, from uncertainty to eroded confidence. Proactive strategies, like isolating tests and managing data, act as shields against flakiness. Advanced techniques, such as property-based and chaos testing, push testing boundaries. Integrating continuous testing into development workflows not only catches flaky tests early but also cultivates a culture of reliability.

    Having reliable automated tests is like having a trusty guard for our software. It ensures our applications work well and are less likely to run into problems. These tests act as a safety net, giving us the confidence to make updates and improvements without worrying about breaking things. So by minimizing flaky tests, we boost trust in our testing process, saving us valuable time and effort.


Written by:
Oghenevwede Emeni is a software developer with over 6 years of experience. Oghenevwede runs her own ecommerce startup and loves writing during her free time!

Reviewed by:
I picked up most of my soft/hardware troubleshooting skills in the US Army. A decade of Java development drove me to operations, scaling infrastructure to cope with the thundering herd. Engineering coach and CTO of Teleclinic.