3 Feb 2016 · Software Engineering

    Testing Python-Requests with Betamax


    Introduction

    Requests is one of the most downloaded and widely used Python libraries published on PyPI. Testing Requests, however, is something that most people try to avoid, or do only with mocks and hand-written or hand-copied response data. VCRpy and Betamax are two libraries that avoid mocks and hand-written or hand-copied data, and let the developer be lazy about writing tests by recording and replaying responses for them.

    Betamax is most useful when a developer is attempting to test their project’s integration with a service, hence we will be performing integration testing. By the end of this tutorial, you will have learned how to use Betamax to write integration tests for a library which we will create together that communicates with Semaphore CI’s API.

    Prerequisites

    Before we get started, let’s make sure we have one of the versions of Python listed below and the packages listed:

    • Python 2.7, 3.3, 3.4, or 3.5
    • pip install betamax
    • pip install betamax-serializers
    • pip install pytest
    • pip install requests

    Setting Up Our Project

    You might want to structure your projects as follows:

    .
    ├── semaphoreci
    │   └── __init__.py
    ├── setup.py
    └── tests
        ├── __init__.py
        ├── conftest.py
        └── integration
            ├── __init__.py
            └── cassettes
    

    This ensures that the tests are easily visible and, as such, as visibly important as the project. You will notice that we have a subdirectory of the tests directory named integration. All of the tests that we will write using Betamax will live in the integration test folder. The cassettes directory inside the integration directory will store the interactions with our service that Betamax will record. Finally, we have a conftest.py file for py.test.

    Our setup.py file should look like this:

    # setup.py
    import setuptools
    
    import semaphoreci
    
    packages = setuptools.find_packages(exclude=[
        'tests',
        'tests.integration',
        'tests.unit'
    ])
    requires = [
        'requests'
    ]
    
    setuptools.setup(
        name="semaphoreci",
        version=semaphoreci.__version__,
        description="Python wrapper for the Semaphore CI API",
        author="Ian Cordasco",
        author_email="graffatcolmingov@gmail.com",
        url="https://semaphoreci.readthedocs.org",
        packages=packages,
        install_requires=requires,
        classifiers=[
            'Development Status :: 5 - Production/Stable',
            'Intended Audience :: Developers',
            'Programming Language :: Python',
            'Programming Language :: Python :: 2',
            'Programming Language :: Python :: 2.7',
            'Programming Language :: Python :: 3',
            'Programming Language :: Python :: 3.4',
            'Programming Language :: Python :: 3.5',
            'Programming Language :: Python :: Implementation :: CPython',
        ],
    )

    Our semaphoreci/__init__.py should look like this:

    """Top level module for semaphoreci API library."""
    
    __version__ = '0.1.0.dev'
    __build__ = (0, 1, 0)

    Creating Our API Client

    While you can practice test-driven development with Betamax, it is often easier to have code to test first. Let’s start by creating the submodule that will hold our actual client object:

    # semaphoreci/session.py
    
    class SemaphoreCI(object):
        pass

    All of Semaphore CI’s actions require the user to be authenticated, so our object will need to accept that auth_token and handle it properly for the user. Since the token is passed as a query-string parameter to the API, let’s start using a Session from Requests to manage this for us:

    # semaphoreci/session.py
    import requests
    
    
    class SemaphoreCI(object):
        def __init__(self, auth_token):
            if auth_token is None:
                raise ValueError(
                    "All methods require an authentication token. See "
                    "https://semaphoreci.com/docs/api_authentication.html "
                    "for how to retrieve an authentication token from Semaphore"
                    " CI."
                )
            self.session = requests.Session()
            self.session.params = {}
            self.session.params['auth_token'] = auth_token

    When we now make a request using self.session in a method, it will automatically attach ?auth_token=<our auth token> to the end of the URL, and we do not have to do any extra work. Now, let’s start talking to Semaphore CI’s API by writing a method to retrieve all of the authenticated user’s projects. This should look something like:

    # semaphoreci/session.py
    
    class SemaphoreCI(object):
        # ...
    
        def projects(self):
            """List the authenticated user's projects and their current status.
    
            See also
            https://semaphoreci.com/docs/projects-api.html
    
            :returns:
                list of dictionaries representing projects and their current
                status
            :rtype:
                list
            """
            url = 'https://semaphoreci.com/api/v1/projects'
            response = self.session.get(url)
            return response.json()

    If we test this manually, we can verify that it works. We’re now ready to start working on our test suite.
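
    As an aside on the query-string handling described above, the following is a minimal standard-library sketch (Python 3) of what attaching a default parameter to a URL amounts to. The helper name with_default_params is hypothetical and this is not how Requests implements it internally; it only illustrates the resulting URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_default_params(url, default_params):
    """Return url with default_params merged into its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    # Only fill in defaults that the URL does not already set.
    for key, value in default_params.items():
        query.setdefault(key, value)
    return urlunsplit(parts._replace(query=urlencode(query)))


final_url = with_default_params(
    'https://semaphoreci.com/api/v1/projects',
    {'auth_token': 'frobulate-fizzbuzz'},
)
print(final_url)
```

    The printed URL ends in ?auth_token=frobulate-fizzbuzz, which is the shape of the request the session sends on our behalf.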

    Configuring the Test Suite

    First, let’s think about what we need to run integration tests and what we want:

    • We’ll need to tell Betamax where to save cassettes.
    • We’ll want our cassettes to be fairly easy to read.
    • We’ll need a way to pass a real Semaphore CI token to the tests to use but not to save (we do not want someone else using our API token for malicious purposes).

    One way to safely pass the credentials to our tests is through environment variables, and luckily Betamax has a way to sanitize cassettes.

    Now, let’s write this together line by line:

    # tests/conftest.py
    import betamax
    
    with betamax.Betamax.configure() as config:
        config.cassette_library_dir = 'tests/integration/cassettes'

    This tells Betamax where to look for cassettes or to store new ones, so it satisfies our first requirement. Next, let’s take care of our second requirement:

    # tests/conftest.py
    import betamax
    from betamax_serializers import pretty_json
    
    betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
    
    with betamax.Betamax.configure() as config:
        config.cassette_library_dir = 'tests/integration/cassettes'
        config.default_cassette_options['serialize_with'] = 'prettyjson'

    This imports a custom serializer from the betamax_serializers project we installed earlier and registers that serializer with Betamax. Then, we tell Betamax that we want it to default the serialize_with cassette option to 'prettyjson' which is the name of the serializer we registered. Finally, let’s split the last item into two portions. We’ll retrieve the token first as follows:

    # tests/conftest.py
    import os
    
    api_token = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')

    This will look for an environment variable named SEMAPHORE_TOKEN, and if it doesn’t exist, it will return 'frobulate-fizzbuzz'. Next, we’ll tell Betamax to look for that value and replace it with a placeholder:

    # tests/conftest.py
    with betamax.Betamax.configure() as config:
        # ...
        config.define_cassette_placeholder('<AUTH_TOKEN>', api_token)
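
    Conceptually, a placeholder is a text substitution applied to the recorded interaction before it is written to disk, and reversed when the cassette is replayed. The sketch below illustrates that idea with plain strings; the sanitize and restore functions and the sample token are hypothetical and are not Betamax’s own implementation.

```python
PLACEHOLDER = '<AUTH_TOKEN>'


def sanitize(serialized, secret, placeholder=PLACEHOLDER):
    """Swap the real secret for the placeholder before saving."""
    return serialized.replace(secret, placeholder)


def restore(serialized, secret, placeholder=PLACEHOLDER):
    """Swap the placeholder back for the secret when replaying."""
    return serialized.replace(placeholder, secret)


# A made-up recorded request line containing a made-up token.
recorded = 'GET https://semaphoreci.com/api/v1/projects?auth_token=s3cr3t'
on_disk = sanitize(recorded, 's3cr3t')
print(on_disk)
```

    The stored text contains only <AUTH_TOKEN>, so the cassette can be committed without leaking the real token.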

    Finally, our conftest.py file should look like:

    import os
    
    import betamax
    from betamax_serializers import pretty_json
    
    api_token = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')
    
    betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
    
    with betamax.Betamax.configure() as config:
        config.cassette_library_dir = 'tests/integration/cassettes'
        config.default_cassette_options['serialize_with'] = 'prettyjson'
        config.define_cassette_placeholder('<AUTH_TOKEN>', api_token)

    We’re now all set to write our first test.

    Writing Our First Test

    Again, let’s see what we need and want out of this test:

    • We need to use an API token to interact with the API and record the response.
    • We need to use our SemaphoreCI class to list the authenticated user’s projects.
    • We want to use Betamax to record the request and response interaction.
    • We need to make assertions about what is returned from listing projects.

    Now, let’s start writing our test. First, we’ll retrieve our API token like we do in conftest.py:

    # tests/integration/test_session.py
    import os
    
    API_TOKEN = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')

    Next, let’s write a test class to hold all of our integration tests related to our SemaphoreCI object:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        def test_projects(self):
            pass

    In test_projects, we will need to create our SemaphoreCI object. Let’s go ahead and do that:

    # tests/integration/test_session.py
    import os
    
    from semaphoreci import session
    
    API_TOKEN = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')
    
    
    class TestSemaphoreCI(object):
        def test_projects(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)

    Next, we want to start using Betamax. We know that the session attribute on our SemaphoreCI instance is our Session instance from Requests. Betamax’s main API is through the Betamax object. The Betamax object takes an instance of a Session from Requests (or a subclass thereof). Knowing this, let’s create our Betamax recorder:

    # tests/integration/test_session.py
    import os
    
    import betamax
    
    from semaphoreci import session
    
    API_TOKEN = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')
    
    
    class TestSemaphoreCI(object):
        def test_projects(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)

    The recorder we now have needs to know what cassette it should use.
    To tell it what cassette to use, we call the use_cassette method. This method will return the recorder again, so it can be used as a context manager as follows:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        def test_projects(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_projects'):
                pass  # the API call will go here

    Inside of that context, Betamax is recording and saving all of the HTTP requests made through the session it is wrapping. Since our instance of the SemaphoreCI class uses the session created for that instance, we can now complete our test by calling the projects method and making an assertion (or more than one assertion) about the items returned.

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        def test_projects(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_projects'):
                projects = semaphore_ci.projects()
    
            assert isinstance(projects, list)

    Our complete test file now looks as follows:

    # tests/integration/test_session.py
    import os
    
    import betamax
    
    from semaphoreci import session
    
    API_TOKEN = os.environ.get('SEMAPHORE_TOKEN', 'frobulate-fizzbuzz')
    
    
    class TestSemaphoreCI(object):
        def test_projects(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_projects'):
                projects = semaphore_ci.projects()
    
            assert isinstance(projects, list)

    We can now run our tests:

    $ SEMAPHORE_TOKEN='<our-private-token>' py.test

    After that, your tests/integration directory should look similar to the following:

    tests/integration
    ├── __init__.py
    ├── cassettes
    │   └── SemaphoreCI_projects.json
    └── test_session.py
    

    We can now re-run our tests without the environment variable:

    $ py.test

    The tests will still pass, but they will not talk to Semaphore CI.
    Instead, they will use the response stored in tests/integration/cassettes/SemaphoreCI_projects.json. They will continue to use that file until we do something so that Betamax will have to re-record it.
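
    The record-once-then-replay behavior described above can be pictured with a small standard-library sketch. Everything here (fetch_with_cassette, fake_api_call, the file layout) is a hypothetical stand-in for illustration and not Betamax’s implementation: if the cassette file exists it is replayed, otherwise the real call is made and its result is saved.

```python
import json
import os
import tempfile


def fetch_with_cassette(cassette_path, fetch):
    """Replay from cassette_path if it exists; otherwise call fetch and record."""
    if os.path.exists(cassette_path):
        with open(cassette_path) as fd:
            return json.load(fd)
    data = fetch()
    with open(cassette_path, 'w') as fd:
        json.dump(data, fd, indent=2)
    return data


calls = []


def fake_api_call():
    """Stand-in for a real HTTP request; counts how often it is invoked."""
    calls.append(1)
    return [{'name': 'betamax'}]


cassette = os.path.join(tempfile.mkdtemp(), 'SemaphoreCI_projects.json')
first = fetch_with_cassette(cassette, fake_api_call)   # records
second = fetch_with_cassette(cassette, fake_api_call)  # replays
print(len(calls))
```

    In this sketch, deleting the cassette file is what forces the next run to talk to the service again.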

    Adding More API Methods

    Now that we’ve added tests, let’s add methods to:

    • List the branches on a project and
    • Rebuild the last revision of a branch.

    We want the former method in order to be able to get the appropriate branch information to make the second request. Let’s get started.

    Listing Branches on a Project

    According to Semaphore CI’s documentation about Branches, we need the project’s hash_id. So when we write our new method, it will look mostly like our last method, but it will need to take a parameter. If we take the same steps we took to write our projects method, then we should arrive at something that looks like this:

    # semaphoreci/session.py
    
    class SemaphoreCI(object):
        # ...
    
        def branches(self, project):
            """List branches for a project.
    
            See also
            https://semaphoreci.com/docs/branches-and-builds-api.html#project_branches
    
            :param project:
                the project's hash_id
            :returns:
                list of dictionaries representing branches
            :rtype:
                list
            """
            url = 'https://semaphoreci.com/api/v1/projects/{hash_id}/branches'
            response = self.session.get(url.format(hash_id=project))
            return response.json()

    Now, let’s write a new test for this method. To write this test, we’ll need a project’s hash_id. We can approach this in a couple of ways:

    • We can hard-code the hash_id we choose to use,
    • We can select one at random from the user’s projects listing or
    • We can search the user’s projects for one with a specific name.

    Since this is a simple use case, any of these options is perfectly valid. In other projects you will need to use your best judgment. For this case, however, we’re going to search our own projects for the project named betamax.

    Now, let’s start adding to our TestSemaphoreCI class that we created earlier:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        def test_branches(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_branches'):
                for project in semaphore_ci.projects():
                    if project['name'] == 'betamax':
                        hash_id = project['hash_id']
                        break
                else:
                    hash_id = None

    You’ll note that we’ve duplicated some of the code from our last test.
    We’ll fix that later. You’ll also note that we’re using a different cassette in this test – SemaphoreCI_branches. This is intentional. While Betamax does not require it, it is advisable that each test have its own cassette (or even more than one cassette) and that no two tests rely on the same cassette.
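
    The for/else construct used in the test may be unfamiliar: the else branch runs only when the loop finishes without reaching break. A minimal sketch with made-up project data (the find_hash_id helper and the hash_id values are hypothetical):

```python
def find_hash_id(projects, name):
    """Return the hash_id of the project called name, or None."""
    for project in projects:
        if project['name'] == name:
            hash_id = project['hash_id']
            break  # a match was found, so the else clause is skipped
    else:
        # Runs only when the loop ended without hitting break.
        hash_id = None
    return hash_id


projects = [
    {'name': 'requests', 'hash_id': 'aaa111'},
    {'name': 'betamax', 'hash_id': 'bbb222'},
]
print(find_hash_id(projects, 'betamax'))
```

    Without the break, the else branch would run after every complete loop and overwrite a hash_id that had already been found.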

    Now that we have our project’s hash_id, let’s list its branches:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        def test_branches(self):
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_branches'):
                for project in semaphore_ci.projects():
                    if project['name'] == 'betamax':
                        hash_id = project['hash_id']
                        break
                else:
                    hash_id = None
                branches = semaphore_ci.branches(hash_id)
            assert len(branches) >= 1

    You’ll notice that we’re making a second API request while recording the same cassette. This is perfectly normal usage of Betamax. In the simplest case, a cassette will record one interaction, but one cassette can have multiple interactions.

    If we run our tests now:

    $ SEMAPHORE_TOKEN=<our-token> py.test

    We’ll find a new cassette – tests/integration/cassettes/SemaphoreCI_branches.json.
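
    To check that a cassette really holds more than one interaction, it can be inspected like any other JSON file. The sketch below assumes the cassette stores its interactions under an http_interactions key, each with a request containing method and uri; treat those key names, the summarize_cassette helper, and the sample data as assumptions to verify against a cassette you have actually recorded.

```python
import json


def summarize_cassette(cassette_text):
    """Return (method, uri) for each interaction in a cassette's JSON text."""
    cassette = json.loads(cassette_text)
    return [
        (item['request']['method'], item['request']['uri'])
        for item in cassette.get('http_interactions', [])
    ]


# Hypothetical, heavily trimmed cassette contents for illustration only.
base = 'https://semaphoreci.com/api/v1/projects'
sample = json.dumps({
    'http_interactions': [
        {'request': {'method': 'GET',
                     'uri': base + '?auth_token=<AUTH_TOKEN>'}},
        {'request': {'method': 'GET',
                     'uri': base + '/abc123/branches?auth_token=<AUTH_TOKEN>'}},
    ],
})

for method, uri in summarize_cassette(sample):
    print(method, uri)
```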

    Rebuilding the Latest Revision of a Branch

    Now that we can get our list of projects and the list of a project’s branches, we can write a way to trigger a rebuild of the latest revision of a branch.

    Let’s write our method:

    # semaphoreci/session.py
    
    class SemaphoreCI(object):
        # ...
    
        def rebuild_last_revision(self, project, branch):
            """Rebuild the last revision of a project's branch.
    
            See also
            https://semaphoreci.com/docs/branches-and-builds-api.html#rebuild
    
            :param project:
                the project's hash_id
            :param branch:
                the branch's id
            :returns:
                dictionary containing information about the newly triggered build
            :rtype:
                dict
            """
            url = 'https://semaphoreci.com/api/v1/projects/{hash_id}/{id}/build'
            response = self.session.post(url.format(hash_id=project, id=branch))
            return response.json()

    Now, let’s write our last test in this tutorial:

    class TestSemaphoreCI(object):
        # ...
        def test_rebuild_last_revision(self):
            """Verify we can rebuild the last revision of a project's branch."""
            semaphore_ci = session.SemaphoreCI(API_TOKEN)
            recorder = betamax.Betamax(semaphore_ci.session)
            with recorder.use_cassette('SemaphoreCI_rebuild_last_revision'):
                for project in semaphore_ci.projects():
                    if project['name'] == 'betamax':
                        hash_id = project['hash_id']
                        break
                else:
                    hash_id = None
    
                for branch in semaphore_ci.branches(hash_id):
                    if branch['name'] == 'master':
                        branch_id = branch['id']
                        break
                else:
                    branch_id = None
    
                rebuild = semaphore_ci.rebuild_last_revision(
                    hash_id, branch_id
                )
    
            assert rebuild['project_name'] == 'betamax'
            assert rebuild['result'] is None

    Like our other tests, we build a SemaphoreCI instance and a Betamax instance. This time we name our cassette SemaphoreCI_rebuild_last_revision and, in addition to looking for our project named betamax, we also look for a branch named master. Once we have our project’s hash_id and our branch’s id, we can rebuild the last revision on master for betamax. Semaphore CI’s API responds immediately so that you know the request to rebuild the last revision was successful. As such, there is no result in the response body and it will be returned as null in JSON, which Python translates to None.
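
    The null-to-None translation mentioned above can be seen with the standard json module. The response body here is a made-up, trimmed example rather than a real Semaphore CI response:

```python
import json

# Hypothetical, trimmed response body for a freshly triggered build.
body = '{"project_name": "betamax", "result": null}'
rebuild = json.loads(body)

print(rebuild['result'] is None)
```

    This is why the test asserts rebuild['result'] is None rather than comparing against a string.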

    Writing Less Test Code

    At this point, you have probably noticed that we’re repeating some portions of each of our tests. We can improve this by taking advantage of the fact that all of our tests are methods on a class. Before we start refactoring our tests, let’s enumerate what’s being repeated in each test:

    • We create a SemaphoreCI instance in each test,
    • We create a Betamax instance in each test,
    • We retrieve a project by its name in some tests,
    • We retrieve a branch from a project by its name in some tests.

    The last two items could actually be summarized as “We retrieve objects from the API by their name”.

    Let’s start with the first two items. We will take advantage of py.test’s autouse fixtures and create a setup method on the class to mimic the xUnit style of testing. We will need to add pytest to our list of imports.

    # tests/integration/test_session.py
    import os
    
    import betamax
    import pytest
    
    from semaphoreci import session

    Then, we’ll add our method to our TestSemaphoreCI class.

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        """Integration tests for the SemaphoreCI object."""
    
        @pytest.fixture(autouse=True)
        def setup(self):
            pass

    Remember that we create our SemaphoreCI instance the same way in every test. Let’s do that in setup and store it on self so the tests can all access it:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        @pytest.fixture(autouse=True)
        def setup(self):
            self.semaphore_ci = session.SemaphoreCI(API_TOKEN)

    Now, let’s tackle the way we create our Betamax instances (which we also do in exactly the same way each time).

    # tests/integration/test_session.py
        @pytest.fixture(autouse=True)
        def setup(self):
            """Create SemaphoreCI and Betamax instances."""
            self.semaphore_ci = session.SemaphoreCI(API_TOKEN)
            self.recorder = betamax.Betamax(self.semaphore_ci.session)

    Our test_projects test will now look as follows:

    # tests/integration/test_session.py
        def test_projects(self):
            """Verify we can list an authenticated user's projects."""
            with self.recorder.use_cassette('SemaphoreCI_projects'):
                projects = self.semaphore_ci.projects()
    
            assert isinstance(projects, list)

    Note that we use self.recorder in our context manager instead of recorder and self.semaphore_ci instead of semaphore_ci. If we make similar changes to our other tests and re-run the tests, we can be confident that everything continues to work.

    Next, let’s tackle finding objects in the API by name. Since our pattern is fairly simple, let’s enumerate it to ensure that we will not refactor it incorrectly.

    1. We need to iterate over the items in a collection (or list),
    2. We need to check the 'name' attribute of each object to see if it is the name that we want,
    3. If the current item is the correct one, we need to break the loop and return it,
    4. Otherwise, we want to return None.

    With that in mind, let’s write a generic method to find something by its name:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        @staticmethod
        def _find_by_name(collection, name):
            for item in collection:
                if item['name'] == name:
                    break
            else:
                return None
            return item

    For now, we’ll make this a method on the class, but since it does not need to know about self, we can make it a static method with the staticmethod decorator.
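
    As an alternative sketch of the same lookup, the loop can be collapsed into a single expression with next() and a default value. This is only one possible spelling and not the version used in the rest of this tutorial; the function name and sample data are illustrative.

```python
def find_by_name(collection, name):
    """Return the first item whose 'name' matches, or None if none does."""
    return next((item for item in collection if item['name'] == name), None)


# Made-up branch data for illustration.
branches = [
    {'name': 'develop', 'id': 1},
    {'name': 'master', 'id': 2},
]
print(find_by_name(branches, 'master'))
```

    Both versions stop at the first match and return None when nothing matches, so either can back the helper methods.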

    Now, let’s create methods that will find a project by its name and a branch by its name using _find_by_name:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        def find_project_by_name(self, project_name):
            """Retrieve a project by its name from the projects list."""
            return self._find_by_name(self.semaphore_ci.projects(), project_name)
    
        def find_branch_by_name(self, project, branch_name):
            """Retrieve a branch by its name from the branches list."""
            return self._find_by_name(self.semaphore_ci.branches(project),
                                      branch_name)

    Notice that we do use self in these methods because we use self.semaphore_ci. We can now rewrite our test_rebuild_last_revision method:

    # tests/integration/test_session.py
    class TestSemaphoreCI(object):
        # ...
        def test_rebuild_last_revision(self):
            """Verify we can rebuild the last revision of a project's branch."""
            with self.recorder.use_cassette('SemaphoreCI_rebuild_last_revision'):
                # The helpers return whole dictionaries; pass along only the IDs.
                project = self.find_project_by_name('betamax')
                branch = self.find_branch_by_name(project['hash_id'], 'master')
    
                rebuild = self.semaphore_ci.rebuild_last_revision(
                    project['hash_id'], branch['id']
                )
    
            assert rebuild['project_name'] == 'betamax'
            assert rebuild['result'] is None

    That’s much easier to read now. We can go ahead and refactor our test_branches method also so that it uses find_project_by_name, and then run our tests to make sure they pass.

    Conclusion

    We now have a good, strong base to add support for more of Semaphore CI’s endpoints, while writing integration tests for each method and endpoint. Our test class has methods to help us write very clean and easy-to-read test methods. It allows us to continue to improve our testing of this small library we have created together.

    Now that we’ve laid a good foundation, you can go ahead and add some more methods and tests and start looking into how you might write unit tests for our small library.

    For those of you curious, we did start building a client library. The project can be installed:

    pip install semaphoreci
    

    Or contributed to on GitLab.

    Written by:
    Ian Cordasco is a maintainer, core developer, and creator of many open source libraries. Of note, Ian is a core developer of Requests and a maintainer of Flake8 as well as the founder of the Python Code Quality Authority (PyCQA).