11 Jul 2022 · Software Engineering

    Getting Integration Testing Right

    16 min read

    Testing is essential: you cannot be sure that your software project works if it lacks automated tests. There are dozens of excellent CI/CD tools that take testing to a new level (e.g. Semaphore).

    Testing today is not just green and red icons in the IDE. Meaningful charts, descriptive logs, coverage checks, and even results for individual runs are all available to the modern developer. Testing has become an integral part of efficient software development.

    When developers talk about testing, they usually mean unit testing. But what about integration testing? Some developers say integration tests are “too complicated” and try to avoid them. This is a bad practice.

    Tools, libraries, and CI/CD environments can make integration testing as straightforward as unit testing! How is this possible, you ask? Let’s find out.

    Table of contents:

    1. Unit vs Integration Testing
      1. What is Unit Testing
      2. What is Integration Testing
    2. The Benefits of Integration Testing
    3. Integration Testing Pitfalls
    4. Known Patterns
      1. Manual Environment Configuration
      2. Vagrant
      3. Docker-compose
    5. Embrace Testcontainers
    6. Conclusion

    Unit vs Integration Testing

    What is Unit Testing

    The purpose of unit testing is to validate the behaviour of individual components. There is, however, some ambiguity around the word “unit”. Is it a class? A function? Or maybe a whole package?

    It depends on your preferences. First, let’s define Test-Driven Development. TDD is a software engineering approach in which tests are written before the business code. You can read more about TDD here. There are two schools of TDD: the Detroit School (or Classicist) and the London School (or Mockist).

    The Detroit School promotes inside-out design: when we start working on a project, the domain model comes first, and the API layer is the last thing to be developed. The diagram below shows the steps of Detroit TDD.

    Unit tests drive the design. You can compare it to an onion: each unit test wraps the application in a new layer. Acceptance testing is the last step, verifying that the whole software product functions as intended.

    There is another important detail about the Detroit School of TDD. Unit tests can interact with many components (i.e. classes). When you realize that you need a new class, you instantiate it in your test suite. In the Detroit School, “unit” describes the test’s isolation, not a requirement to verify one class at a time.
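
    To make the distinction concrete, here is a minimal Detroit-style sketch. Order, LineItem, and DiscountPolicy are hypothetical domain classes invented for illustration; the test is isolated from the outside world, yet it freely instantiates several real classes:

    class OrderTest {
    
      @Test
      void totalAppliesTheDiscountPolicy() {
        // real collaborators, no mocks: the test exercises several classes at once
        var order = new Order(
            List.of(new LineItem("Clean Architecture", new BigDecimal("30.00"))),
            new DiscountPolicy(10)); // a 10% discount
    
        assertEquals(new BigDecimal("27.00"), order.total());
      }
    
    }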

    The London School of TDD turns this paradigm upside down (i.e. outside-in). It declares that application development should start with the API, while the domain model is developed last. Take a look at the diagram below.

    In the case of the London School, acceptance testing leads the development. Unit tests are completely isolated from each other. A single unit test verifies only one component at a time, and the class’s dependencies should be mocked.
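
    A London-style counterpart, again just a sketch, exercises the Order class alone and replaces its collaborator with a Mockito mock (DiscountPolicy.apply is a hypothetical method that takes the raw total and returns the discounted one):

    @ExtendWith(MockitoExtension.class)
    class OrderLondonStyleTest {
    
      @Mock
      private DiscountPolicy discountPolicy;
    
      @Test
      void totalDelegatesToTheDiscountPolicy() {
        // only Order is verified; its dependency is stubbed
        when(discountPolicy.apply(new BigDecimal("30.00")))
            .thenReturn(new BigDecimal("27.00"));
    
        var order = new Order(
            List.of(new LineItem("Clean Architecture", new BigDecimal("30.00"))),
            discountPolicy);
    
        assertEquals(new BigDecimal("27.00"), order.total());
        verify(discountPolicy).apply(new BigDecimal("30.00"));
      }
    
    }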

    So there are two opposing views of unit testing principles. How do we define unit testing, then? I would state the definition this way.

    A unit test is a kind of automated test that has the following features:

    1. Unit test behaviour does not depend on any state outside the running application.
    2. Unit tests can run in parallel without affecting each other.

    The unit testing approach is visualized in the diagram below.

    We test the components themselves but not their interaction with the external system. To test these interactions, we need integration tests.

    What is Integration Testing

    An integration test is a kind of automated test that has the following features:

    1. An integration test verifies components’ interaction with the external system (database, message queue, etc.).
    2. Integration tests may affect each other when run in parallel.

    The goal of integration testing is to validate an application’s interaction with its external dependencies: not a stub or a mock, but the actual instance.

    If you take a look at the integration testing approach diagram below, you will notice slight but telling differences from the unit testing approach.

    Moreover, integration tests do not need to launch the entire application at once. The diagram distinguishes the database, message queue, and mail service bindings as three separate integration tests. Verifying the whole system’s correctness is the job of End-to-End (E2E) tests.

    You can treat E2E tests as a superset of integration tests.

    The Benefits of Integration Testing

    The same developers who decry integration tests as overcomplicated and unnecessary might be asking: why are they so important? Are you sure that we can’t just get rid of them and focus on unit tests? 

    While it’s true that integration testing brings some obstacles (we’ll discuss them later), the fact of the matter is that sometimes unit tests are not enough.

    Suppose we’re developing an online bookshop. When a user opens a book card, it should also include its average rating. We also want to track every request to view a book’s information, so analysts can have more data to base business decisions on.

    Here is a possible Java implementation with the Spring Boot framework:

    public interface BookRepository extends JpaRepository<Book, Long> {
    
      @Query("""
          SELECT b.id as id, b.name as name, AVG(r.value) as avgRating
              FROM Book b
          LEFT JOIN b.reviews r
          WHERE b.id = :id;""")
      Optional<BookCard> findBookWithAverageRating(@Param("id") long bookId);
    
    }
    
    @Service
    public class BookService {
    
      private final BookRepository bookRepository;
      private final AuditService auditService;
    
      public BookService(BookRepository bookRepository, AuditService auditService) {
        this.bookRepository = bookRepository;
        this.auditService = auditService;
      }
    
      public BookCard getBookById(long bookId) {
        final var book =
            bookRepository.findBookWithAverageRating(bookId)
                .orElseThrow();
        auditService.bookRequested(book);
        return book;
      }
    
    }

    We could write a unit test for the BookService class here, but would that be enough to be sure that the code is correct? The answer is no, because the code is incorrect. There is a little detail that is easy to miss. Have a look at this query line:

    WHERE b.id = :id;

    This trailing semicolon will cause an exception at runtime. So, even with 100% code coverage from unit tests alone, you can still miss things like this.
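
    To illustrate the point, here is a minimal unit test sketch for BookService with Mockito (assuming the constructor shown above). Because the repository is mocked, the broken JPQL is never executed, so the test passes happily:

    @ExtendWith(MockitoExtension.class)
    class BookServiceTest {
    
      @Mock
      private BookRepository bookRepository;
    
      @Mock
      private AuditService auditService;
    
      @Test
      void returnsBookCardAndRecordsAuditEvent() {
        var bookService = new BookService(bookRepository, auditService);
        var bookCard = mock(BookCard.class);
        // the real query is never executed, so the stray semicolon goes unnoticed
        when(bookRepository.findBookWithAverageRating(42L))
            .thenReturn(Optional.of(bookCard));
    
        assertSame(bookCard, bookService.getBookById(42L));
        verify(auditService).bookRequested(bookCard);
      }
    
    }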

    Perhaps this example is not convincing enough. Spring developers (especially Spring Data ones) tend to notice such details. Let’s discuss something more complicated.

    Suppose we need to retrieve a user by ID along with their roles. Let’s write a possible query:

    @Repository
    @Transactional(readOnly = true)
    public class UserRepository {
    
      @PersistenceContext
      private EntityManager em;
    
      public Optional<UserView> findByIdWithRoles(Long userId) {
        List<Tuple> tuples = em.createQuery("""
                SELECT u.id, u.name, r.name FROM User u
                JOIN u.userRoles ur
                JOIN ur.role r
                WHERE u.id = :id""", Tuple.class)
            .setParameter("id", userId)
            .getResultList();
        if (tuples.isEmpty()) {
          return Optional.empty();
        }
        // transform the tuples to a DTO (UserView is assumed to hold id, name, and role names)
        var roleNames = tuples.stream()
            .map(tuple -> tuple.get(2, String.class))
            .toList();
        var first = tuples.get(0);
        var userView = new UserView(first.get(0, Long.class), first.get(1, String.class), roleNames);
        return Optional.of(userView);
      }
    
    }

    The query does not produce runtime exceptions, but its behaviour is not always correct. You see, we used JOIN (an alias for INNER JOIN) instead of LEFT JOIN. This means that we won’t find users that have no roles.
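
    For reference, a corrected version of the query (assuming the same entity mapping) switches both joins to LEFT JOIN, so users without roles are still returned, just with a null role name:

    List<Tuple> tuples = em.createQuery("""
            SELECT u.id, u.name, r.name FROM User u
            LEFT JOIN u.userRoles ur
            LEFT JOIN ur.role r
            WHERE u.id = :id""", Tuple.class)
        .setParameter("id", userId)
        .getResultList();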

    You could test the aforementioned issues manually. Just launch the application locally and send HTTP requests with Postman, right? Well, I would argue that this approach is not good enough.

    The purpose of testing is to automate the validation of business features in order to increase the efficiency of the delivery pipeline. If you know that your code has been fully tested before merging it to the main branch, it’s much easier to deploy new versions. But, if the product is only partly verified, there is a greater chance that such a change might derail production entirely.

    There is even a special term for this problem: Fear-Driven Development. Have you ever been afraid to refactor your code? Have you ever slept poorly because you merged something that was not truly validated the day before? Don’t blame yourself, because you’re not alone. Lack of integration tests causes this phenomenon.

    My point is that unit tests are perfect for verifying business logic because they are decoupled from implementation details. In reality, however, software applications interact with external systems, and we have to check those interactions to test our product. Integration testing is the right approach to solve this problem.

    Integration Testing Pitfalls

    We’ve talked at length about the importance of integration testing, but not about the problems associated with its implementation.

    Implementing integration testing can be tricky from the get-go. How do we start? The first step is setting up the environment, and it’s harder than it seems.

    1. The application may interact with lots of external dependencies (e.g. PostgreSQL, Kafka, MongoDB, and so on).
    2. We also have to make the setup reproducible on any machine, because most projects are developed by teams.
    3. Maintenance issues remain. For example, if one day our product starts to depend on another external service, we have to update the environment accordingly.
    4. Finally, this whole environment has to run during the Continuous Integration process.

    Known Patterns

    Throughout the history of software development, programmers have proposed several patterns to approach integration testing issues.

    Manual Environment Configuration

    The idea is simple. If your application depends on X, install X on your computer. Every time you run the integration tests, supply the connection details of the locally installed services via the configuration file.

    Seems like a natural approach, right? This is what we do to launch the application, after all, so testing should not be any different. Sort of. There are a number of issues:

    1. The developers are completely responsible for maintaining the environment. If somebody upgrades the database version, everyone on the team has to repeat the upgrade. Otherwise, the test results won’t be reliable.
    2. It adds extra configuration difficulties; managing open ports is the least of the problems.
    3. This technique is inapplicable to CI.

    You may disagree with the last point. For example, if the application needs PostgreSQL, we can run an instance on the remote server. Then anybody can connect to it during a pull request build.

    But suppose there are two builds in two different projects running simultaneously. They may require completely different table structures. If both of them connect to the same database, neither will succeed.

    Also, you could try to create and destroy databases dynamically on each CI build, but it would be rather laborious. There are better approaches that we’ll examine later.

    Vagrant

    Vagrant automates the manual approach. You declare your dependencies in a Vagrantfile and execute the vagrant up command. The tool then starts a set of virtual machines, each representing a particular external dependency.

    What are the benefits?

    1. The tool follows the Infrastructure as Code (IaC) methodology. The whole system configuration is a simple text file that developers can change via a pull request.
    2. You don’t have to worry about version upgrades or other infrastructure changes; Vagrant picks them all up.
    3. No manual management is required. Forget about open-port issues or complex configuration.

    Unfortunately, there are some unresolvable issues:

    1. You have to own a powerful machine to use Vagrant, because virtual machines require far more resources than running the services directly.
    2. Initialization is slow, because whole virtual machines have to boot.
    3. CI integration is tricky (and sometimes impossible). Technically, Vagrant is not a testing tool but a development tool: its main purpose is to prepare the environment for local development. Even though you can use it for testing, it does not solve the major problems.

    Docker-Compose

    Docker revolutionized software development. To tell the truth, it didn’t invent anything new: Docker builds on Linux namespaces and cgroups, which have been around for a long time, and similar solutions already existed (e.g. LXC containers). But Docker made container usage transparent and user-friendly. The simplest case requires a single docker run command.

    Docker-Compose is the next evolutionary step. It allows several containers to be run on demand: you create a docker-compose.yml file that defines all the required services declaratively.

    Sounds like a brilliant opportunity to approach the integration testing problem. Let’s have a look at what Docker-Compose offers:

    1. Docker-Compose follows IaC methodology like Vagrant.
    2. Since Docker is a cross-platform tool, the environment is reproducible anywhere.
    3. Zero-configuration. You only need Docker installed, and then to execute the docker-compose up command.

    So, is Docker-Compose the key? Well, almost. The CI integration is possible but rather tricky.

    The problem lies within the nature of Docker-Compose itself. You see, Docker containers are inaccessible from the host operating system by default. You have to specify the ports that should accept packets and transfer them to the container. It is no surprise that those ports should not be in use by any other program. Since docker-compose.yml is a regular text file, it’s necessary to define ports statically. For example, here is a possible way to run a MySQL database:

    version: '3.3'
    services:
      db:
        image: mysql:5.7
        environment:
          MYSQL_DATABASE: 'db'
          MYSQL_USER: 'user'
          MYSQL_PASSWORD: 'password'
          MYSQL_ROOT_PASSWORD: 'password'
        ports:
          - '5555:3306'
        expose:
          - '3306'

    Port 5555 accepts connections. OK, looks good so far. But how do we know that this port is free on the CI node? Well, there is a hack to overcome this restriction.

    1. Put a placeholder (e.g. $MYSQL_PORT) instead of the actual port number.
    2. Run a script that checks the ports and chooses the first free one, then replaces the placeholder with the chosen port.
    3. Run containers.
    4. Run tests.
    5. Stop containers.

    Here is a diagram showing the described algorithm.

    Is that it? The silver bullet? Not quite.

    1. Containers may keep running on build crashes.

    Assume that something went wrong and the OS killed the process that was running the build. What happens to the containers? Nothing. They keep running as usual. If such a scenario happens several times, it leads to unnecessary resource consumption.

    But, this issue is not the end of the world. We can create a job that runs on a schedule and terminates idle containers. It is, however, not the only problem.

    2. The build itself may run inside a Docker container.

    That’s a common approach for many CI providers, since it helps to run different builds in isolation. But it means that the build loses the ability to start Docker containers of its own.

    It’s possible to run Docker containers inside another Docker container. The requirements are:

    • The outer container has to be started in privileged mode (--privileged=true).
    • Docker has to be installed inside the running container.

    The problem with this approach is that you usually can’t control the properties of the container that runs your build. In that case, Docker-Compose is not a workable solution.

    Embrace Testcontainers

    Testcontainers is a Java library that creates the required dependencies as Docker containers when the tests start running, and eventually destroys them when the tests are complete.

    You might point out that this solution is not so different from Docker-Compose: we still have to deal with idle containers after a build crash and with the possibility of the build itself running inside a Docker container. Testcontainers, however, overcomes both obstacles. We’ll see how later.

    Here is a simple Java test that integrates Testcontainers with JUnit 5. The code example is taken from the library documentation.

    @Testcontainers
    class MixedLifecycleTests {
    
      // will be shared between test methods
      @Container
      private static final MySQLContainer MY_SQL_CONTAINER = new MySQLContainer();
    
      // will be started before and stopped after each test method
      @Container
      private PostgreSQLContainer postgresqlContainer = new PostgreSQLContainer()
          .withDatabaseName("foo")
          .withUsername("foo")
          .withPassword("secret");
    
      @Test
      void test() {
        assertTrue(MY_SQL_CONTAINER.isRunning());
        assertTrue(postgresqlContainer.isRunning());
      }
    
    }

    The key difference between the Docker-Compose approach and Testcontainers is dynamic configuration. The containers are described as plain Java code. This gives much more flexibility in configuring the environment.
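
    For instance, a container can be tuned right next to the tests that use it. Here is a sketch; the sql/schema.sql classpath resource and the postgres:14 image tag are assumptions, not part of the original example:

    @Testcontainers
    class TunedPostgresIT {
    
      // programmatic configuration: the database name, an init script from the
      // classpath, and an extra environment variable, all as ordinary Java code
      @Container
      static final PostgreSQLContainer<?> POSTGRES =
          new PostgreSQLContainer<>("postgres:14")
              .withDatabaseName("bookshop")
              .withInitScript("sql/schema.sql")
              .withEnv("TZ", "UTC");
    
      @Test
      void startsWithCustomConfiguration() {
        assertTrue(POSTGRES.isRunning());
      }
    
    }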

    There are also no explicit port mappings. How does the application connect to the instance, you ask? Testcontainers does the job behind the scenes: it maps each declared container port to a random free port on the host, and the test can read the actual address from the container object at runtime.
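
    Here is a sketch of how a Spring Boot test can pick up the dynamically mapped address, assuming Spring’s @DynamicPropertySource is available and the application creates its own schema. With a real database behind the repository, the broken query from the earlier bookshop example would fail here instead of in production:

    @SpringBootTest
    @Testcontainers
    class BookRepositoryIT {
    
      @Container
      static final PostgreSQLContainer<?> POSTGRES = new PostgreSQLContainer<>("postgres:14");
    
      // feed the randomly mapped port (via the JDBC URL) into the Spring context
      @DynamicPropertySource
      static void datasourceProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", POSTGRES::getJdbcUrl);
        registry.add("spring.datasource.username", POSTGRES::getUsername);
        registry.add("spring.datasource.password", POSTGRES::getPassword);
      }
    
      @Autowired
      private BookRepository bookRepository;
    
      @Test
      void queryRunsAgainstARealDatabase() {
        // an unknown id should simply yield an empty result; with the stray
        // semicolon in place, this call fails and flags the bug before the merge
        assertTrue(bookRepository.findBookWithAverageRating(-1L).isEmpty());
      }
    
    }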

    What are the benefits of using Testcontainers?

    1. Simple configuration. You can tune the containers the way you want using the same language as the application’s code.
    2. The environment is easy to reproduce. Even if you cannot install Docker on the machine, it’s not a big deal, because you can configure the library to connect to a Docker service on a remote host.
    3. Dozens of ready-to-go modules exist for popular services, packaged as Docker containers for your use. If you don’t find the one you need, you can always fall back on the GenericContainer (see the sketch after this list).
    4. Though Java is the primary language for Testcontainers, there are plenty of other options: for example, Rust, Python, Go, Scala, or Node.js.
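
    For example, a service without a dedicated module can be started through a GenericContainer. A sketch, using the mailhog/mailhog image purely as a stand-in for a mail service:

    @Testcontainers
    class MailServiceIT {
    
      // any image works; the declared ports are mapped to free host ports
      @Container
      static final GenericContainer<?> MAILHOG =
          new GenericContainer<>(DockerImageName.parse("mailhog/mailhog:v1.0.1"))
              .withExposedPorts(1025, 8025);
    
      @Test
      void smtpPortIsMapped() {
        // Testcontainers picked a free host port and mapped it to the container's 1025
        assertTrue(MAILHOG.getMappedPort(1025) > 0);
      }
    
    }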

    That all sounds promising, but what about the potential issues? We’ve already shown that Docker-Compose might be challenging to integrate into the CI pipeline. Does Testcontainers share the same problems?

    1. Containers may keep running on build crashes.

    Testcontainers did have such an issue. Since the introduction of the Ryuk container, however, it is no longer relevant. The idea is simple: apart from the project’s obligatory dependencies, Testcontainers starts Ryuk and keeps a connection open to it. When the test process dies and the connection drops, Ryuk deletes the containers it was tracking, along with the corresponding images, networks, and volumes.

    2. The build itself may run inside a Docker container.

    The library can detect when the build itself is running inside a Docker container. To overcome this obstacle, you should apply the Docker wormhole pattern.

    Here is the code example:

    docker run -it --rm \
           -v $PWD:$PWD \
           -w $PWD \
           -v /var/run/docker.sock:/var/run/docker.sock \
           maven:3 \
           mvn test

    When you run the build, you mount the current directory as a volume, set it as the working directory, and also mount the docker.sock file. Testcontainers does the rest.

    You probably won’t have to perform this configuration yourself, because most CI/CD tools on the market, e.g. Semaphore, support this pattern by default.

    Conclusion

    In the end, I can say that integration testing is indeed tough (but worth it!). On the other hand, it has never been easier than it is today. Modern technologies (e.g. Docker, Testcontainers, and CI/CD tools) have made it approachable and straightforward. This is certainly the case with Semaphore, which supports running Docker containers as well as the Testcontainers library out of the box, meaning that you don’t have to deal with any complex configuration: Semaphore does it for you!

    I’ve heard it said that high code coverage does not prove the code’s quality. Well, I can say for sure that lack of integration tests is the marker of a buggy product.

    That’s all for now. If you have any questions or suggestions, please, leave your comments down below.

    Thanks for reading!

    Written by:
    Java Team Lead in the Big Data department at MTS Group. Dean of the Java faculty at MTS Teta. Semyon writes about software engineering and testing.