Building and Testing an API Wrapper in Python

Learn how to write and test a custom Python library to interact with an HTTP API.

Brought to you by

Semaphore

Introduction

Most websites we use provide an HTTP API to enable developers to access their data from their own applications. For developers utilizing the API, this usually involves making some HTTP requests to the service, and using the responses in their applications. However, this may get tedious since you have to write HTTP requests for each API endpoint you intend to use. Furthermore, when a part of the API changes, you have to edit all the individual requests you have written.

A better approach would be to use a library in your language of choice that helps you abstract away the API's implementation details. You would access the API through calling regular methods provided by the library, rather than constructing HTTP requests from scratch. These libraries also have the advantage of returning data as familiar data structures provided by the language, hence enabling idiomatic ways to access and manipulate this data.

In this tutorial, we are going to write a Python library to help us communicate with The Movie Database's API from Python code.

By the end of this tutorial, you will learn:

  • How to create and test a custom library which communicates with a third-party API and
  • How to use the custom library in a Python script.

Prerequisites

Before we get started, ensure you have one of the following Python versions installed:

  • Python 2.7, 3.3, 3.4, or 3.5

We will also make use of the Python packages listed below:

  • requests - We will use this to make HTTP requests,
  • vcrpy - This will help us record HTTP responses during tests and replay them on later test runs, and
  • pytest - We will use this as our testing framework.

Project Setup

We will organize our project as follows:

.
├── requirements.txt
├── tests
│   ├── __init__.py
│   ├── test_tmdbwrapper.py
│   └── vcr_cassettes
└── tmdbwrapper
    ├── __init__.py
    └── tv.py

This sets up a folder for our wrapper and one for holding the tests. The vcr_cassettes subdirectory inside tests will store our recorded HTTP interactions with The Movie Database's API.

Our project will be organized around the functionality we expect to provide in our wrapper. For example, methods related to TV functionality will be in the tv.py file under the tmdbwrapper directory.

We need to list our dependencies in the requirements.txt file as follows. At the time of writing, these are the latest versions. Update the version numbers if later versions have been published by the time you are reading this.

requests==2.11.1
vcrpy==1.10.3
pytest==3.0.3

Finally, let's install the requirements and get started:

pip install -r requirements.txt

Test-driven Development

Following the test-driven development practice, we will write the tests for our application first, then implement the functionality to make the tests pass.

For our first test, let's test that our module will be able to fetch a TV show's info from TMDb successfully.

# tests/test_tmdbwrapper.py

from tmdbwrapper import TV

def test_tv_info():
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"

In this initial test, we are demonstrating the behavior we expect our complete module to exhibit. We expect that our tmdbwrapper package will contain a TV class, which we can then instantiate with a TMDb TV ID. Once we have an instance of the class, when we call the info method, it should return a dictionary containing the TMDb TV ID we provided under the 'id' key.

To run the test, execute the py.test command from the root directory. As expected, the test will fail with an error message that should contain something similar to the following snippet:

    ImportError while importing test module '/Users/kevin/code/python/tmdbwrapper/tests/test_tmdbwrapper.py'.
    'cannot import name TV'
    Make sure your test modules/packages have valid Python names.

This is because the tmdbwrapper package is empty right now. From now on, we will write the package as we go, adding new code to fix the failing tests, adding more tests and repeating the process until we have all the functionality we need.

Implementing Functionality in Our API Wrapper

To start with, the minimal functionality we can add at this stage is creating the TV class inside our package.

Let's go ahead and create the class in the tmdbwrapper/tv.py file:

# tmdbwrapper/tv.py

class TV(object):
    pass

Additionally, we need to import the TV class in the tmdbwrapper/__init__.py file, which will enable us to import it directly from the package.

# tmdbwrapper/__init__.py

from .tv import TV

At this point, we should re-run the tests to see if they pass. You should now see the following error message:

    >        tv_instance = TV(1396)
    E       TypeError: object() takes no parameters

We get a TypeError. This is good. We seem to be making some progress. Reading through the error, we can see that it occurs when we try to instantiate the TV class with a number. Therefore, what we need to do next is implement a constructor for the TV class that takes a number. Let's add it as follows:

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        pass

As we just need the minimal viable functionality right now, we will leave the constructor empty, but ensure that it receives self and id as parameters. This id parameter will be the TMDb TV ID that will be passed in.

Now, let's re-run the tests and see if we made any progress. We should see the following error message now:

>       response = tv_instance.info()
E       AttributeError: 'TV' object has no attribute 'info'

This time around, the problem is that we are using the info method from the tv_instance, and this method does not exist. Let's add it.

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        pass

    def info(self):
        pass

After running the tests again, you should see the following failure:

    >       assert isinstance(response, dict)
    E       assert False
    E        +  where False = isinstance(None, dict)

For the first time, it's the actual test failing, and not an error in our code. To make this pass, we need to make the info method return a dictionary. Let's also pre-empt the next failure we expect. Since we know that the returned dictionary should have an id key, we can return a dictionary with an 'id' key whose value will be the TMDb TV ID provided when the class is initialized.

To do this, we have to store the ID as an instance variable, so that we can access it from the info method.

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        self.id = id

    def info(self):
        return {'id': self.id}

If we run the tests again, we will see that they pass.

Writing Foolproof Tests

You may be asking yourself why the tests are passing, since we clearly have not fetched any info from the API. Our tests were not exhaustive enough. We need to actually ensure that the correct info that has been fetched from the API is returned.

If we take a look at the TMDb documentation for the TV info method, we can see that there are many additional fields returned from the TV info response, such as poster_path, popularity, name, overview, and so on.

We can add a test to check that the correct fields are returned in the response, and this would in turn help us ensure that our tests are indeed checking for a correct response object back from the info method.

For this case, we will select a handful of these properties and ensure that they are in the response. We will use pytest fixtures for setting up the list of keys we expect to be included in the response.

Our test will now look as follows:

# tests/test_tmdbwrapper.py

from pytest import fixture
from tmdbwrapper import TV

@fixture
def tv_keys():
    # Responsible only for returning the test data
    return ['id', 'origin_country', 'poster_path', 'name',
              'overview', 'popularity', 'backdrop_path',
              'first_air_date', 'vote_count', 'vote_average']

def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"

Pytest fixtures help us create test data that we can then use in other tests. In this case, we create the tv_keys fixture which returns a list of some of the properties we expect to see in the TV response. The fixture helps us keep our code clean, and explicitly separate the scope of the two functions.

You will notice that the test_tv_info function now takes tv_keys as a parameter. In order to use a fixture in a test, the test has to receive the fixture name as an argument; pytest then passes in the fixture's return value, so we can make assertions using the test data. The new assertion ensures that the keys from our fixture are a subset of the keys actually present in the response.

This makes it a lot harder for us to cheat in our tests in future, as we did before.

Running our tests again should now fail with a constructive error message, because our response does not contain all the expected keys.
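
To see why the extra assertion catches our stubbed implementation, here is a minimal sketch that runs the same subset check outside pytest. The stub_response dictionary mirrors what our placeholder info method returns at this point; the missing variable is only an illustrative addition that shows which keys are absent.

```python
# Keys we expect in a TV info response (same list as the tv_keys fixture).
tv_keys = ['id', 'origin_country', 'poster_path', 'name',
           'overview', 'popularity', 'backdrop_path',
           'first_air_date', 'vote_count', 'vote_average']

# What the stubbed info() method returns at this point in the tutorial.
stub_response = {'id': 1396}

# The check used in the test: every expected key must be in the response.
all_present = set(tv_keys).issubset(stub_response.keys())

# Illustrative extra: work out which expected keys are absent.
missing = sorted(set(tv_keys) - set(stub_response))

print(all_present)
print(missing)
```

Only the 'id' key is present in the stub, so the subset check fails and the test can no longer be satisfied by a hard-coded dictionary.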

Fetching Data from TMDb

To make our tests pass, we will have to construct a dictionary object from the TMDb API response and return that in the info method.

Before we proceed, please ensure you have obtained an API key from TMDb. All methods need an API key, and you can request one after registering your account on TMDb. All the available info provided by the API can be viewed in the API Overview page.

First, we need a requests session that we will use for all HTTP interactions. Since the api_key parameter is required for all requests, we will attach it to this session object so that we don't have to specify it every time we need to make an API call. For simplicity, we will write this in the package's __init__.py file.

# tmdbwrapper/__init__.py

import os
import requests

TMDB_API_KEY = os.environ.get('TMDB_API_KEY', None)

class APIKeyMissingError(Exception):
    pass

if TMDB_API_KEY is None:
    raise APIKeyMissingError(
        "All methods require an API key. See "
        "https://developers.themoviedb.org/3/getting-started/introduction "
        "for how to retrieve an authentication token from "
        "The Movie Database"
    )
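
# Shared session: api_key is sent as a query parameter with every request
session = requests.Session()
session.params = {}
session.params['api_key'] = TMDB_API_KEY

# Imported last, because tv.py imports session from this module
from .tv import TV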

We define a TMDB_API_KEY variable which gets the API key from the TMDB_API_KEY environment variable. Then, we go ahead and initialize a requests session and provide the API key in the params object. This means that it will be appended as a parameter to each request we make with this session object. If the API key is not provided, we will raise a custom APIKeyMissingError with a helpful error message to the user.
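
The lookup-and-fail-early logic can also be tried in isolation. The sketch below wraps it in a hypothetical load_api_key helper (not part of our package) that takes the environment mapping as an argument, so it can be exercised without touching the real environment or the network.

```python
import os


class APIKeyMissingError(Exception):
    """Raised when no TMDb API key is configured."""


def load_api_key(environ=os.environ):
    # Hypothetical helper: mirrors the module-level check in __init__.py,
    # but accepts the environment mapping as a parameter.
    api_key = environ.get('TMDB_API_KEY', None)
    if api_key is None:
        raise APIKeyMissingError("All methods require an API key.")
    return api_key


# With the variable set, the key is returned unchanged.
key = load_api_key({'TMDB_API_KEY': 'dummy-key'})

# Without it, the custom error is raised.
try:
    load_api_key({})
    raised = False
except APIKeyMissingError:
    raised = True
```

Passing the mapping in explicitly is just a convenience for experimenting; the package itself performs the check once, at import time.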

Next, we need to make the actual API request in the info method as follows:

# tmdbwrapper/tv.py

from . import session

class TV(object):

    def __init__(self, id):
        self.id = id

    def info(self):
        path = 'https://api.themoviedb.org/3/tv/{}'.format(self.id)
        response = session.get(path)
        return response.json()

First of all, we import the session object that we defined in the package root. We then need to send a GET request to the TV info URL that returns details about a single TV show, given its ID. The resulting response object is then returned as a dictionary by calling the .json() method on it.
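
To make these two steps concrete without calling the real API, the following sketch builds the same path and decodes a JSON body with the standard library. The sample_body string is made up for illustration and is far smaller than a real TMDb response; json.loads stands in here for what response.json() does with the response body.

```python
import json

tv_id = 1396

# Same string formatting as in the info method.
path = 'https://api.themoviedb.org/3/tv/{}'.format(tv_id)

# Made-up, heavily shortened body; a real response has many more fields.
sample_body = '{"id": 1396, "name": "Example Show"}'

# Decoding the JSON text yields a plain Python dictionary.
data = json.loads(sample_body)

print(path)
print(data['id'])
```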

There's one more thing we need to do before wrapping this up. Since we are now making actual API calls, we need to take into account some API best practices. We don't want to make the API calls to the actual TMDb API every time we run our tests, since this can get you rate limited.

A better way would be to save the HTTP response the first time a request is made, then reuse this saved response on subsequent test runs. This way, we minimize the number of requests we need to make to the API and ensure that our tests still have access to the correct data. To accomplish this, we will use the vcrpy package, which is imported as vcr:

# tests/test_tmdbwrapper.py
import vcr

@vcr.use_cassette('tests/vcr_cassettes/tv-info.yml')
def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"

We just need to instruct vcr where to store the HTTP response for the request that will be made for any specific test. See vcr's docs for detailed usage information.
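
The record-then-replay idea can be sketched with the standard library alone. This is only a conceptual illustration and not how vcr works internally: fetch_with_cassette and fake_fetch are hypothetical names, and fake_fetch stands in for a real HTTP call.

```python
import json
import os
import tempfile

calls = []


def fake_fetch(url):
    # Stand-in for a real HTTP request; records that it was called.
    calls.append(url)
    return {'id': 1396}


def fetch_with_cassette(url, cassette_path, fetch=fake_fetch):
    # Replay the saved response if a recording already exists.
    if os.path.exists(cassette_path):
        with open(cassette_path) as cassette:
            return json.load(cassette)
    # Otherwise make the request once and record the result.
    data = fetch(url)
    with open(cassette_path, 'w') as cassette:
        json.dump(data, cassette)
    return data


cassette_dir = tempfile.mkdtemp()
cassette_path = os.path.join(cassette_dir, 'tv-info.json')
url = 'https://api.themoviedb.org/3/tv/1396'

first = fetch_with_cassette(url, cassette_path)   # recorded
second = fetch_with_cassette(url, cassette_path)  # replayed from disk
```

The second call never reaches fake_fetch, which is the property we rely on to avoid hitting the API on every test run.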

At this point, running our tests requires that we have a TMDB_API_KEY environment variable set, or else we'll get an APIKeyMissingError. One way to do this is by setting it right before running the tests, i.e. TMDB_API_KEY='your-tmdb-api-key' py.test.

Running the tests with a valid API key should have them passing.

Adding More Functions

Now that we have our tests passing, let's add some more functionality to our wrapper. Let's add the ability to return a list of the most popular TV shows on TMDb. We can add the following test:

# tests/test_tmdbwrapper.py

@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml')
def test_tv_popular(tv_keys):
    """Tests an API call to get popular TV shows"""

    response = TV.popular()

    assert isinstance(response, dict)
    assert isinstance(response['results'], list)
    assert isinstance(response['results'][0], dict)
    assert set(tv_keys).issubset(response['results'][0].keys())

Note that we are instructing vcr to save the API response in a different file. Each API response needs its own file.

For the actual test, we need to check that the response is a dictionary and contains a results key, which contains a list of TV show dictionary objects. Then, we check the first item in the results list to ensure it is a valid TV info object, with a test similar to the one we used for the info method.

To make the new test pass, we need to add a popular method to the TV class. It should make a request to the popular TV shows path and return the response parsed into a dictionary. Let's add it as follows:

# tmdbwrapper/tv.py

    @staticmethod
    def popular():
        path = 'https://api.themoviedb.org/3/tv/popular'
        response = session.get(path)
        return response.json()

Also, note that this is a staticmethod, which means the class doesn't need to be instantiated for it to be used. This is because it doesn't use any instance variables, and it's called directly on the class.
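
If the difference between instance methods and static methods is unfamiliar, the toy class below may help. Greeter and its methods are made-up names used purely for illustration; they are not part of our wrapper.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        # Instance method: needs an instance because it reads self.name.
        return 'Hello, {}'.format(self.name)

    @staticmethod
    def default_greeting():
        # Static method: receives neither self nor cls.
        return 'Hello, world'


# Callable directly on the class, with no instance created.
from_class = Greeter.default_greeting()

# Still callable on an instance, but the instance's state is not used.
from_instance = Greeter('Ada').default_greeting()

# The instance method, by contrast, depends on the stored name.
personal = Greeter('Ada').greet()
```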

All our tests should now be passing.

Taking Our API Wrapper for a Spin

Now that we've implemented an API wrapper, let's check if it works by using it in a script. To do this, we will write a program that lists out all the popular TV shows on TMDb along with their popularity rankings. Create a file in the root folder of our project. You can name the file anything you like — ours is called testrun.py.

# testrun.py

from __future__ import print_function
from tmdbwrapper import TV

popular = TV.popular()

for number, show in enumerate(popular['results'], start=1):
    print("{num}. {name} - {pop}".format(num=number,
                                         name=show['name'], pop=show['popularity']))

If everything is working correctly, you should see an ordered list of the current popular TV shows and their popularity rankings on The Movie Database.

Filtering Out the API Key

Since we are saving our HTTP interactions to files on disk, there is a chance we might expose our API key to other people, which is a Very Bad Idea™, since they might use it for malicious purposes. To deal with this, we need to filter out the API key from the recorded interactions. To do this, we add a filter_query_parameters keyword argument to the vcr decorators as follows:

@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml', filter_query_parameters=['api_key'])

This will save the API responses, but it will leave out the API key.
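
Conceptually, the filter removes the named parameter from the request URL before it is written to disk. The sketch below shows that idea with the standard library, assuming Python 3. It is not vcr's own code: strip_query_parameters is a hypothetical helper, and the page parameter in the example URL is only there to show that other parameters are kept.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_parameters(url, names):
    # Hypothetical helper: drop the named query parameters from a URL.
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key not in names]
    return urlunsplit(parts._replace(query=urlencode(kept)))


recorded = 'https://api.themoviedb.org/3/tv/popular?api_key=dummy-key&page=1'
cleaned = strip_query_parameters(recorded, ['api_key'])

print(cleaned)
```

Cassettes recorded before the filter was added still contain the key, so it is worth deleting them and letting the tests record fresh ones.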

Continuous Testing on Semaphore CI

Lastly, let's add continuous testing to our application using Semaphore CI.

We want to ensure that our package works on various platforms and that we don't accidentally break functionality in future versions. We do this through continuous automatic testing.

Ensure you've committed everything to Git, and push your repository to GitHub or Bitbucket, which will enable Semaphore to fetch your code. Next, sign up for a free Semaphore account, if you don't have one already. Once you've confirmed your email, it's time to create a new project.

Follow these steps to add the project to Semaphore:

  1. Once you're logged into Semaphore, navigate to your list of projects and click the "Add New Project" button:

    Add New Project Screen

  2. Next, select the account where you wish to add the new project.

    Select Account Screen

  3. Select the repository that holds the code you'd like to build:

    Select Repository Screen

  4. Configure your project as shown below:

    Project Configuration Screen

Finally, wait for the first build to run.

It should fail since, as we recall, the TMDB_API_KEY environment variable is required for the tests to run.

Navigate to the Project Settings page of your application and add your API key as an environment variable as shown below:

Add environment variable screen

Make sure to check the Encrypt content checkbox when adding the key to ensure the API key will not be publicly visible. Once you've added that and re-run the build, your tests should be passing again.

Conclusion

We have learned how to write a Python wrapper for an HTTP API by writing one ourselves. We have also seen how to test such a library, along with some best practices, such as not exposing our API keys publicly when recording HTTP responses.

Adding more methods and functionality to our API wrapper should be straightforward, since we have set up methods that should guide us if we need to add more. We encourage you to check out the API and implement one or two extra methods to practice. This should be a good starting point for writing a Python wrapper for any API out there.

Please reach out with any questions or feedback that you may have in the comments section below. You can also check out the complete code and contribute on GitHub.

Kevin Ndung'u Gathuku

Kevin is a Full Stack Web Developer specializing in Python, JavaScript and React. He occasionally blogs about his experiences. Find him online under the username @kevgathuku.
