
Reproducible Node Builds With npm ci

npm clean-install (npm ci for short) is less famous than its sibling, npm install, but it makes your CI/CD process more robust. Here’s how to use it.

What Is npm?

Every developer who has worked with anything related to the web has used or heard about Node Package Manager: npm. npm is a command-line utility that ships with Node.js. Its primary function is to install JavaScript modules from the npm registry.

The typical install invocation is:

$ npm install --save MODULE_NAME

This does a number of things:

  1. Searches for the module by name.
  2. Downloads and installs the module and its dependencies.
  3. Updates (or creates) package-lock.json. This file is called the lockfile, and it lists the URL and checksum of every module installed.
  4. Adds the module name and version to package.json. This file is known as the manifest.

The key to reproducibility lies in the lockfile, package-lock.json. The next time we run npm install, the package manager will compare it with the contents of node_modules, the folder that contains every JavaScript module for the current project, and install any missing modules. npm will use package-lock.json to make sure it downloads the same files as it did the first time, even if newer compatible versions were released since.
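
To make the lockfile's role concrete, here is a minimal JavaScript sketch that lists what a lockfile records for each module. It assumes the lockfileVersion 1 layout (a top-level dependencies object whose entries carry version, resolved, and integrity fields) and ignores nested dependencies; newer npm releases also write a packages section that this sketch does not read. The function name listLockedModules, the module name, the URL, and the checksum are all made up for illustration.

```javascript
// Sketch only: assumes the lockfileVersion 1 layout, where the top-level
// "dependencies" object maps each module name to
// { version, resolved, integrity }. Nested dependencies are ignored.
function listLockedModules(lockfile) {
  const deps = lockfile.dependencies || {};
  return Object.keys(deps)
    .sort()
    .map((name) => ({
      name,
      version: deps[name].version,
      url: deps[name].resolved,
      checksum: deps[name].integrity,
    }));
}

// Hypothetical lockfile contents. The URL and checksum are placeholders,
// not real registry values.
const exampleLockfile = {
  name: "my-project",
  lockfileVersion: 1,
  dependencies: {
    "example-module": {
      version: "1.2.3",
      resolved: "https://registry.example.com/example-module-1.2.3.tgz",
      integrity: "sha512-PLACEHOLDER",
    },
  },
};

for (const mod of listLockedModules(exampleLockfile)) {
  console.log(`${mod.name}@${mod.version} <- ${mod.url} (${mod.checksum})`);
}
```

In a real project, the object would come from parsing package-lock.json with JSON.parse.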

So, What’s Wrong With npm install?

If we look closely, npm install has been designed with developer convenience in mind. And it shows: npm is one of my favorite tools and one reason I love working with Node.

The thing is that the install algorithm can be too clever sometimes. See what happens when the package-lock.json and package.json are not in sync.

Suppose I install a new dependency in my Node project:

$ npm install --save axios

+ axios@0.20.0
added 2 packages from 4 contributors and audited 2 packages in 1.269s

Everything looks fine on my machine, so I commit the change:

$ git add mycode.js package.json
$ git commit -m "add axios dependency"
$ git push origin mybranch

Did you see my mistake? That’s right: I forgot to add the lockfile to the commit. Sometime later, when a second developer pulls my branch, npm won’t know the exact version I originally intended. That information was in the lockfile, and I left it out of the commit.
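
A mistake like this can be caught mechanically. The following is a rough sketch, not an existing tool: the function name manifestStagedWithoutLockfile is hypothetical, and it assumes the caller already has the list of staged paths (for example, from the output of git diff --cached --name-only). It is a coarse heuristic that only looks at the project root and also fires on manifest edits that don't touch dependencies.

```javascript
// Hypothetical guard: given the paths staged for a commit, report whether
// the manifest is staged without the lockfile alongside it.
// Only the project root is considered in this sketch.
function manifestStagedWithoutLockfile(stagedPaths) {
  const names = new Set(stagedPaths);
  return names.has("package.json") && !names.has("package-lock.json");
}

// The commit from the example above: the lockfile was left out.
console.log(manifestStagedWithoutLockfile(["mycode.js", "package.json"]));
```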

At this point, you may be saying: “but the manifest does include the module version”. You’re right, the manifest lists it in this form:

"dependencies": {
  "axios": "^0.20.0"
}

However, this doesn’t necessarily correspond to an exact version. npm encourages the use of semantic versioning, and the ^ symbol in my manifest declares a range rather than a single release. For versions 1.0.0 and above, the caret accepts any later minor or patch release within the same major version. For 0.x versions like this one, it is stricter and accepts only later patch releases, so ^0.20.0 means “at least 0.20.0 and below 0.21.0”. Thus, npm may install a newer release published in the interim, which in theory should be compatible, but may not be.
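
The caret rule is easy to misread, so here is a small sketch of it in JavaScript. It follows npm's documented behavior for caret ranges (the left-most non-zero component of the version may not change) but is not npm's own code: it handles only plain MAJOR.MINOR.PATCH versions, ignores pre-release tags, and all function names are invented for this example.

```javascript
// Parse "MAJOR.MINOR.PATCH" into numbers. Pre-release and build tags are
// deliberately not handled in this sketch.
function parseVersion(text) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) throw new Error(`unsupported version: ${text}`);
  return match.slice(1).map(Number);
}

// Compare two parsed versions: negative, zero, or positive.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound of a caret range: bump the left-most non-zero
// component and zero everything to its right.
function caretUpperBound([major, minor, patch]) {
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// True when `version` falls inside the caret range `range` (e.g. "^1.2.3").
function satisfiesCaret(version, range) {
  if (!range.startsWith("^")) {
    throw new Error("only caret ranges are handled here");
  }
  const lower = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return (
    compareVersions(candidate, lower) >= 0 &&
    compareVersions(candidate, caretUpperBound(lower)) < 0
  );
}

console.log(satisfiesCaret("1.9.0", "^1.2.3")); // true: same major version
console.log(satisfiesCaret("0.4.7", "^0.4.0")); // true: patch bump on 0.x
console.log(satisfiesCaret("0.5.0", "^0.4.0")); // false: minor bump on 0.x
```

For real projects, the semver package that npm itself relies on is the better choice; this sketch only shows why a range can match more than one release.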

Two Sources of Truth

The npm install algorithm first checks if package.json and package-lock.json match. If they do, npm follows the lockfile alone. But if they don’t, npm takes the manifest as canonical and updates the lockfile accordingly.

This behavior is by design. Kat Marchán, the developer who wrote package-lock.json and later npm ci, said they did it this way when they realized that people were changing dependencies by hand in package.json.

Most times, when the lockfile and manifest don’t match, npm install does the right thing and gets the version originally intended by the committer, but there are no guarantees. Other developers may end up with slightly different versions, leading to the “works on my machine” syndrome.

What’s worse is that artifacts generated by the CI/CD pipeline will inexorably change over time, contributing to general instability and causing hard-to-diagnose, hard-to-reproduce errors.

npm ci: A Stricter Install

The npm clean-install command (or npm ci for short) is a drop-in replacement for npm install when installing a project’s existing dependencies, with two major differences:

  • It does a clean install: if the node_modules folder exists, npm deletes it and installs a fresh one.
  • It checks for consistency: if package-lock.json doesn’t exist or if it doesn’t match the contents of package.json, npm stops with an error.

Think of npm ci as a stricter version of npm install, one that doesn’t accept inconsistencies of any kind (it would have flagged the mistake I made earlier).
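
The consistency check can be pictured with a short JavaScript sketch. This is an illustration of the idea, not npm's actual implementation: the function name findInconsistencies is hypothetical, the lockfile is assumed to use the lockfileVersion 1 layout with a top-level dependencies object, and range matching is left to a caller-supplied satisfies(version, range) function (the semver package's satisfies would be a natural fit).

```javascript
// Sketch of a manifest/lockfile consistency check, loosely modeled on the
// behavior described above. Not npm's actual implementation.
// `satisfies(version, range)` is supplied by the caller.
function findInconsistencies(manifest, lockfile, satisfies) {
  if (!lockfile) {
    return ["package-lock.json is missing"];
  }
  const problems = [];
  const wanted = { ...manifest.dependencies, ...manifest.devDependencies };
  const locked = lockfile.dependencies || {};
  for (const [name, range] of Object.entries(wanted)) {
    const entry = locked[name];
    if (!entry) {
      problems.push(`${name} is in package.json but not in the lockfile`);
    } else if (!satisfies(entry.version, range)) {
      problems.push(`${name}@${entry.version} does not satisfy ${range}`);
    }
  }
  return problems;
}

// Simplest possible matcher for the demo: exact version pins only.
const exactMatch = (version, range) => version === range;

// The situation from earlier: axios was added to the manifest, but the
// lockfile that reached the repository knows nothing about it.
const manifest = { dependencies: { axios: "0.20.0" } };
const staleLockfile = { lockfileVersion: 1, dependencies: {} };

// One problem is reported: axios is missing from the lockfile.
console.log(findInconsistencies(manifest, staleLockfile, exactMatch));
```

A strict installer would stop with an error whenever the returned list is non-empty, instead of quietly rewriting the lockfile.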

Trying Out npm ci in Semaphore

The good news is that, for installing a project’s dependencies, npm ci and npm install are interchangeable. So you can keep the comfort of npm install on your development machine while switching to npm ci in your continuous integration environment for extra safety.

Let’s try using npm ci in one of Semaphore’s quick fork-and-run demos. To continue, you’ll need a Semaphore account. You can sign up for free by clicking on the Sign up with GitHub button.

Once logged in, create a new project by clicking on +New Project in the top-right corner. Then, choose the JavaScript demo. Alternatively, you can fork the demo repository on GitHub.

New Project

This will clone a new repository on GitHub and configure a sample pipeline:

Repository on GitHub
First run

Now that we know the demo works, we’ll change the pipeline. Click on Edit Workflow to open the workflow builder:

Edit workflow

Click on the Install Dependencies block to show the two jobs inside.

Install Dependencies

One of the first things to realize is that it doesn’t make sense to use Semaphore’s cache to persist node_modules between jobs. npm ci always deletes this folder before installing.

Make the following changes in both jobs:

  1. Completely remove the cache restore … and cache store … lines.
  2. Replace npm install with npm ci.
Remove the cache

Repeat these steps in the rest of the blocks. Then, click on Run the workflow > Start.

Run the workflow

From now on, when someone forgets to commit package-lock.json or package.json, the pipeline will catch the error before it can do any harm.

JavaScript Pipeline

Install vs. Clean Install: Which Is Better?

On the one hand, npm ci’s behavior is safer and saner; it may prevent a lot of trouble down the road. Besides, because its install process is simpler, it runs faster than npm install. On the other hand, using it means we can’t benefit from the cache to speed up the build.

So, which is better? It depends. I can think of three scenarios:

Scenario 1: you don’t need the cache

If you are not using the cache already, or if taking it out barely puts a dent in build time, go for the safest possible level and replace every npm install with npm ci in your pipeline, as we did in the example.

Scenario 2: you absolutely need the cache

If you can’t afford to slow the CI pipeline at all, keep npm install and use the cache as usual. Nevertheless, consider switching to npm ci in the continuous delivery or deployment pipelines. For example, you can use npm ci in the Dockerfiles of your deployment stage. That way, you’ll know for sure which modules are included in the production release.

Scenario 3: you want to use both the cache and npm ci

Here, you’d like to use npm ci, but removing the cache just makes the pipeline a bit too slow. The solution is to replace the first appearance of npm install in your pipeline with npm ci and cache the node_modules folder right away. The ensuing jobs would use cached modules that you know are consistent. This option sits in between the two previous scenarios and balances speed and consistency.
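
One way to keep such a cache honest is to tie its key to the lockfile, so a cached node_modules folder is reused only while the lockfile is unchanged. The sketch below shows that idea in JavaScript with Node's built-in crypto module. The helper name nodeModulesCacheKey, the key prefix, and the 16-character truncation are assumptions made for this example, not part of Semaphore or npm.

```javascript
const crypto = require("crypto");

// Hypothetical helper: derive a cache key from the lockfile's contents so
// that any change to the lockfile produces a different key.
function nodeModulesCacheKey(lockfileContents, prefix = "node-modules") {
  const digest = crypto
    .createHash("sha256")
    .update(lockfileContents)
    .digest("hex");
  // A shortened digest keeps the key readable; 16 hex chars is arbitrary.
  return `${prefix}-${digest.slice(0, 16)}`;
}

// Two lockfiles that differ in any way map to different keys.
const before = nodeModulesCacheKey('{"lockfileVersion": 1}');
const after = nodeModulesCacheKey('{"lockfileVersion": 1, "dependencies": {}}');
console.log(before === after); // false
```

In a pipeline, the contents would come from reading package-lock.json, and the resulting key would name the cache entry that the first job stores and later jobs restore.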

Conclusion

Any change that saves us from making a mistake—no matter how small—is welcome. I hope that this post helps you find the best trade-off between speed, convenience, and reliability for your JavaScript projects.
