Semaphore Blog

News and updates from your friendly continuous integration and deployment service.

Lightweight Docker Images in 5 Steps

Lightweight Docker Images Speed Up Deployment

Deploying your services packaged in lightweight Docker images has many practical benefits. In a container, your service usually comes with all the dependencies it needs to run, it’s isolated from the rest of the system, and deployment is as simple as running a docker run command on the target system.

However, most of the benefits of dockerized services can be negated if your Docker images are several gigabytes in size and/or they take several minutes to boot up. Caching Docker layers can help, but ideally you want small and fast containers that can be deployed and booted in a matter of minutes, or even seconds.

The first time we used Docker at Rendered Text to package one of Semaphore’s services, we made many mistakes that resulted in a huge Docker image that was painful to deploy and maintain. However, we didn’t give up, and, step by step, we improved our images.

We’ve managed to make a lot of improvements since our first encounter with Docker, and we’ve successfully reduced the footprint of our images from several gigabytes to around 20 megabytes for our latest microservices, with boot times that are always under 3 seconds.

Our First Docker Service

You might be wondering how a Docker image can possibly be larger than a gigabyte. When you take a standard Rails application — with gems, assets, background workers and cron jobs — and package it using a base image that comes with everything but the kitchen sink preinstalled, you will surely cross the 1 GB threshold.

We started our Docker journey with a service that used Capistrano for deployment. To make our transition easy, we started out with a base Docker image that resembled our old workflow. The phusion/passenger-full image was a great candidate, and we managed to package up our application very quickly.

A big downside of using passenger-full is that the image alone is around 300 MB in size. When you add all of your application’s dependency gems, which can easily take up another 300 MB, you are already starting at around 600 MB.

The deployment of that image took around 20 minutes, which is an unacceptable time frame if you want to be happy with your continuous delivery pipeline. However, this was a good first step.

We knew that we could do better.

Step 1: Use Fewer Layers

One of the first things you learn when building your Docker images is that you should squash multiple Docker layers into one big layer.

Let’s take a look at the following Dockerfile, and demonstrate why it’s better to use fewer layers in a Docker image:

FROM ubuntu:14.04

RUN apt-get update -y

# Install packages
RUN apt-get install -y curl
RUN apt-get install -y postgresql
RUN apt-get install -y postgresql-client

# Remove apt cache to make the image smaller
RUN rm -rf /var/lib/apt/lists/*

CMD bash

When we build the image with docker build -t my-image ., we get an image that is 279 MB in size. With docker history my-image we can list the layers of our Docker image:

$ docker history my-image

IMAGE               CREATED             CREATED BY                                      SIZE
47f6bd778b89        7 minutes ago       /bin/sh -c #(nop)  CMD ["/bin/sh" "-c" "bash"   0 B
3650b449ca91        7 minutes ago       /bin/sh -c rm -rf /var/lib/apt/lists/*          0 B
0c43b2bf2d13        7 minutes ago       /bin/sh -c apt-get install -y postgresql-client 1.101 MB
ce8e5465213b        7 minutes ago       /bin/sh -c apt-get install -y postgresql        56.72 MB
b3061ed9d53a        7 minutes ago       /bin/sh -c apt-get install -y curl              11.38 MB
ee62ceeafb06        8 minutes ago       /bin/sh -c apt-get update -y                    22.16 MB
ff6011336327        3 weeks ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
<missing>           3 weeks ago         /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
<missing>           3 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0 B
<missing>           3 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /u   194.6 kB
<missing>           3 weeks ago         /bin/sh -c #(nop) ADD file:4f5a660d3f5141588d   187.8 MB

There are several things to note in the output above:

  1. Every RUN command creates a new Docker layer
  2. The apt-get update command increases the image size by around 22 MB
  3. The rm -rf /var/lib/apt/lists/* command doesn’t reduce the size of the image

When working with Docker, we need to keep in mind that any layer added to the image is never removed. In other words, it’s smarter to update the apt cache, install some packages, and remove the cache in a single Docker RUN command.

Let’s see if we can reduce the size of our image with this technique:

FROM ubuntu:14.04

RUN apt-get update -y && \
    apt-get install -y curl postgresql postgresql-client && \
    rm -rf /var/lib/apt/lists/*

CMD bash

Hooray! After a successful build, the size of our image dropped to 250 megabytes. We shaved off 29 MB simply by joining the installation commands in our Dockerfile.

Step 2: Make Container Boot Time Predictable

This step describes an anti-pattern that you should avoid in your deployment pipeline.

When working on a Rails-based application, the biggest portion of your Docker image will be gems and assets. To work around this, you might be tempted to keep your gems outside of the container.

For example, you can run the Docker image while mounting a directory from the host machine, and cache the gems between two subsequent runs of your Docker image.

FROM ruby

WORKDIR /home/app
ADD . /home/app

CMD bundle install --path vendor/bundle && bundle exec rails server

Let’s build such an image. Notice that we use the CMD keyword, which means that our gems will be installed every time we run our Docker image. The build step only adds the source code to the image.

docker build -t rails-image .

When we start our image this time around, it will first install the gems, and then start our Rails server.

docker run -tdi rails-image

Now, let’s use a volume to cache the gems between each run of our image. We will achieve this by mounting an external folder into our Docker container with the -v /tmp/gems:/home/app/vendor/bundle option.

docker run -v /tmp/gems:/home/app/vendor/bundle -tdi rails-image

Hooray! Or is it?

The technique above looks promising, but in practice, it turns out to be a bad idea. Here are some reasons why:

  1. Your Docker images are not stateless. If you run the image twice, you can experience different behaviour. This is not ideal because it makes your deployment cycle more exciting than it should be.

  2. Your boot time can differ vastly depending on the content of your cache directory. For bigger Rails projects, the boot time can range from several seconds up to 20 minutes.

We have tried to build our images with this technique, but we ultimately had to drop this idea because of the above drawbacks. As a rule of thumb, predictable boot time and immutability of your images outweigh any speed improvement you may gain by extracting dependencies from your containers.

Step 3: Understand and Use Docker Cache Effectively

When creating your first Docker image, the most obvious choice is to use the same commands you would use in your development environment.

For example, if you’re working on a Rails project, you would probably want to use the following:

FROM ruby

WORKDIR /home/app
ADD . /home/app

RUN bundle install --path vendor/bundle
RUN bundle exec rake assets:precompile

CMD bundle exec rails server

However, by doing this, you will effectively invalidate the layer cache and start from scratch on every build.

New Docker layers are created for every ADD, RUN and COPY command. When you build a new image, Docker first checks if a layer with the same content and history exists on your machine. If it already exists, Docker reuses it. If it doesn’t exist, Docker needs to create a new layer.

In the above example, ADD . /home/app creates a new layer even if you make the smallest change in your source code. Then, the next command RUN bundle install --path vendor/bundle always needs to do a fresh install of every gem because the history of your cached layers has changed.

To avoid this, it’s better to just add the Gemfile first, since it changes rarely compared to the source code. Then, you should install all the gems, and add your source code on top of it.

FROM ruby

WORKDIR /tmp/gems
ADD Gemfile /tmp/gems/Gemfile
RUN bundle install --path vendor/bundle

WORKDIR /home/app
ADD . /home/app

RUN mkdir -p vendor && \
    mv /tmp/gems/vendor/bundle vendor/bundle && \
    mv /tmp/gems/.bundle .bundle

RUN bundle exec rake assets:precompile

CMD bundle exec rails server

With the above technique, you can shorten the build time of your image and reduce the number of layers that need to be uploaded on every deploy.

Step 4: Use a Small Base Image

Big base images are great when you’re starting out with Docker, but you’ll eventually want to move on to smaller images that contain only the packages that are essential for your application.

For example, if you start with phusion/passenger-full, a logical next step would be to try out phusion/baseimage-docker and enable only the packages that are necessary. We followed this path too, and we successfully reduced the size of our Docker images by 200 megabytes.

But why stop there? You can also try basing your image on a plain ubuntu image. Then, as a next step, try out debian, which is only around 80 MB in size.

You will notice that every time you reduce the image size, some of the dependencies will be missing, and you will probably need to spend some time on figuring out how to install them manually. However, this is only a one-time issue, and once you’ve resolved it, you can most likely enjoy faster deployment for several months to follow.
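
As an illustration, here’s roughly how a Dockerfile based on debian might start. The exact package list is an assumption and depends entirely on what your application needs:

FROM debian:jessie

# Install only the packages the application actually needs; this list is illustrative
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends ca-certificates curl libpq5 && \
    rm -rf /var/lib/apt/lists/*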

We used ubuntu for several months too. However, as we moved away from Ruby as our primary language and started using Elixir, we knew that we could go even lighter.

Being a compiled language, Elixir has a nice property that the resulting compiled binaries are self-contained, and can pretty much run on any Linux distribution. This is when the alpine image becomes an awesome candidate.

The base image is only 5 MB, and if you compile your Elixir application, you can achieve images that are only around 25 MB in size. This is awesome compared to the 1.5 GB beast from the beginning of our Docker journey.
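
A minimal sketch of such a Dockerfile is shown below. The application name and release path are illustrative, and we assume the release has already been compiled (for example with exrm or distillery) before the image is built:

FROM alpine:3.4

# A compiled Erlang/Elixir release still needs a few shared libraries
RUN apk add --no-cache ncurses-libs openssl

WORKDIR /opt/app
COPY rel/my_app /opt/app

CMD ["/opt/app/bin/my_app", "foreground"]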

With Go, which we also use occasionally, we could go even further and build an image FROM scratch, and achieve 5 MB-sized images.
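
A sketch of the scratch approach could look like this, assuming the binary (the name is illustrative) was statically linked, e.g. built with CGO_ENABLED=0:

FROM scratch

# The image contains nothing but the statically compiled binary
COPY my-service /my-service

CMD ["/my-service"]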

Step 5: Build Your Own Base Image

Using lightweight images can be a great way to improve build and deployment performance, but bootstrapping a new microservice can be painful if you need to remember to install a bunch of packages before you can use it.

If you create new services frequently, building your own customized base Docker image can bring great improvements. By building a custom Docker image and publishing it on DockerHub, you can have small images that are also easy to use.
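
For example, a shared base image can be just a few lines on top of alpine. The image name and package list below are purely illustrative:

FROM alpine:3.4

# Packages that every one of our microservices needs; adjust to your stack
RUN apk add --no-cache bash curl ca-certificates

Once built and tagged, it can be pushed to DockerHub and reused as the FROM line in every new service:

docker build -t myorg/base:1.0 .
docker push myorg/base:1.0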

Keep Learning, Docker is Great

Switching to a Docker-based development and deployment environment can be tricky at first. You can even get frustrated and think that Docker is not for you. However, if you persist and do your best to learn some good practices including how to make and keep your Docker images lightweight, the Docker ecosystem will reward you with speed, stability and reliability you’ve never experienced before.

P.S. If you’re looking for a continuous integration and deployment solution that works great with Docker — including fully-featured toolchain support, registry integrations, image caching, and fast image builds — try Semaphore for free.

Activity Monitor: Visualizing Your Team's Running CI/CD Processes

Growth always has its challenges. With more developers, there are more ongoing projects, branches and CI builds. And sometimes while you work, you find yourself wanting to know what else is in the pipeline on Semaphore.

For example, you’ve just merged a pull request, which created a new commit on master, and Semaphore launched a new build to run your tests. You’ve already configured continuous deployment, so you just need to wait a little bit until that change is automatically delivered to your users.

Your excitement drops a bit though, as you realize that your build is probably being blocked by another already running continuous deployment process. So you want to find out if that is the case, and if so, when can you expect to have your turn.

Even though a project’s page on Semaphore shows you a detailed status of ongoing activities, getting a big picture across all projects is not easy. What seems like a simple question becomes more complex when we consider that builds can be made of many parallel jobs, and that Semaphore is running builds and deploys sequentially. So knowing your current work’s place in the grand scheme of CI processing order can be tricky.

This is why we’ve created the Activity Monitor.

CI/CD Activity Monitor on Semaphore

Much like the process monitor on your desktop computer, Activity Monitor displays all CI/CD processes that are currently running on Semaphore: builds, deploys, and active SSH sessions.

Let’s go through the screenshot above:

  1. In the “Upcoming” section, we have one build which has not yet started, since there is another one on the same branch that is still running.

  2. There are two running CI builds with 35 parallel jobs, where one (first on the list) is partially started, with 2 jobs still waiting for capacity. The total time the last of the waiting jobs has been waiting to run is shown in the “Waiting Time” column.

  3. Deploys always have priority over builds on Semaphore, so a deploy to production is labeled accordingly.

  4. Another big build has been partially completed, with 8 jobs already done.

  5. A yellow stripe highlights your work — builds or deploys which contain your commits.

  6. The current account’s capacity is shown with a colorful stripe at the bottom of the screen (the more Boxes your plan has, the more you can run in parallel).

Since Activity Monitor is most useful when your work is partially blocked by other processes, you will see a link to it on the pages of affected builds and branches. You can also always access Activity Monitor from the left sidebar on your organization’s profile page.

We hope that Activity Monitor will provide you with more insights into how your work flows through Semaphore. If you have any questions or feedback, please get in touch by sending us an email or posting a comment below.

PHP Versions Used in Commercial Projects, 2016 Edition

As in earlier reports on other languages (see Ruby, Node.js, Python), today we’re sharing data from Semaphore about usage distribution across versions of PHP in active private projects.

PHP version usage for commercial projects on Semaphore

More than 80% of commercial PHP projects are on 5.x, with 5.6, which was released two years ago, being the leading choice.

PHP 7.0 was released less than a year ago, in December 2015. And if we look at just the projects which have been added on Semaphore in 2016 — and assume that most of them are newly started — we see that a third of new projects are choosing PHP 7.0.

PHP versions used for new commercial projects in 2016 on Semaphore

It’s worth noting that versions 5.3, 5.4 and 5.5 have reached end of life and are no longer receiving any security updates. So if you’re on one of those versions, we strongly recommend that you move to 5.6, which will be maintained until the end of 2018. For more details on PHP release plans, refer to the official supported versions page.

Another factor to consider is version support on hosting platforms. PHPVersions.info provides an extensive overview.

Semaphore CI/CD continues to provide all versions mentioned in this report preinstalled on its default platform.

This is our first annual report on PHP; next year we’ll be able to observe changes over time.

What are your thoughts on working with and choosing among different PHP versions? Post your comments below.

To make testing in PHP easier for you, we regularly publish tutorials on TDD, BDD, deployment, and using Docker with PHP. Read our tutorials and subscribe here.

How to Capture All Errors Returned by a Function Call in Elixir

If there is an Elixir library function that needs to be called, how can we be sure that all possible errors coming from it will be captured?

Elixir/Erlang vs. Mainstream Languages

Elixir is an unusual language because it functions as a kind of wrapper around another language. It utilizes Erlang and its rock-solid libraries to build new concepts on top of them. Erlang is also different from what one may call usual or mainstream languages, e.g. Java, C++, Python, Ruby, etc., in that it’s a functional programming language, designed with distributed computing in mind.

Process and Concurrency

The languages we are used to working with (at least here at Rendered Text) are imperative programming languages, and they’re quite different from Erlang and Elixir. None of them provides any significant abstraction over the operating system’s process model. In contrast, Erlang implements threads of execution (i.e. units of scheduling) as lightweight user-space processes.

Also, these mainstream programming languages do not support concurrency themselves; they support it through libraries, which are usually based on OS capabilities. There is usually either no native support for concurrency in the language at all, or only minimal support that is backward compatible with the language’s initial sequential model. Consequently, mainstream languages have no error handling mechanisms designed for concurrent or distributed processing.

The Consequence

When a new language is introduced, we search for familiar concepts. In the context of error handling, we look for exceptions, and try to use them the way we are used to. And, with Elixir — we fail. Miserably.

Error Capturing

Why is error capturing a challenge? Isn’t it trivial? Well, in Erlang/Elixir, errors can be propagated using different mechanisms. When a library function is called, it’s sometimes unclear which mechanism it uses for error propagation. Also, an error may be generated in some other process (created by an invoked function) and propagated to the invoking function/process using one of the language mechanisms.

Let’s consider the foo/0 library function. All we know is that it returns a numeric value on successful execution. It can also fail for different reasons and notify the caller in non-obvious ways. Here’s a trivial example of foo/0:

defmodule Library do
  def foo, do: 1..4 |> Enum.random |> choice

  defp choice(1), do: 1/3
  defp choice(2), do: 1/0
  defp choice(3), do: Process.exit(self, :unknown_error)
  defp choice(4), do: throw :overflow
end

What can we do to capture all possible errors generated by foo/0?

Error Propagation

In mainstream languages, there is only one way to interrupt processing and propagate errors up the call stack — exceptions. Nothing else. Exceptions operate within the boundaries of a single operating system thread. They cannot reach outside the thread scope, because the language does not recognize anything beyond that scope.

In Elixir, error handling works differently. There are multiple mechanisms for error notification, and this can be quite confusing to novice users.

The error condition can be propagated as an exception, or as an exit signal. There are two mutually exclusive flavors of exceptions: raised and thrown. When it comes to signals, a process can send an exit signal either to itself or to other processes. The reaction to receiving an exit signal differs based on the state of the receiving process and the signal value. Also, a process can choose to terminate itself because of an error by calling the exit/1 function.

Exceptions

There are two mechanisms to create an exception, and two mechanisms to handle them. As previously mentioned, these mechanisms are mutually exclusive!

A raised exception can only be rescued, and a thrown exception can only be caught. So, the following exceptions will be captured:

try do raise "error notification" rescue e -> e end
try do throw :error_notification  catch  e -> e end

But these won’t:

try do raise "error notification" catch  e -> e end
try do throw :error_notification  rescue e -> e end

This will do the job:

try do raise "error notification" rescue e -> e catch e -> e end
try do throw :error_notification  rescue e -> e catch e -> e end

However, this is still not good enough because neither rescue nor catch will handle the exit signal sent from this or any other process:

try do Process.exit self, :error_notification rescue e -> e catch e -> e end

From a single-process (and try block mechanism) perspective, there are just too many moving parts to get it right. And at the end of the day, we cannot cover all possible scenarios anyway.

Something is obviously wrong with this approach. Let’s look for a different solution.

The Erlang Way

Now, let’s move one step back and look at Erlang again. It’s an intrinsically concurrent language. Everything in it is designed to support distributed computing. Not Google scale distributed, but still distributed. Elixir is built on top of that.

An Elixir application, no matter how simple, should not be perceived as a single entity. It’s a distributed system on its own, consisting of tens, and often even hundreds of processes.

The try block is useful in the scope of a single process, and that’s where it should be used: to capture errors generated in the same process. However, if we need to handle all errors that might affect a process while a particular function is being executed (possibly originating in some other process), we’ll need to use some other mechanism. The try block cannot take care of that. This is a mindset change we’ll have to accept.

When in Rome, Do as the Romans Do

The Erlang philosophy is “fail fast”. In theory, this is a sound fault-tolerance approach. It basically means that you shouldn’t try to fix the unexpected! This makes much more sense than the alternative, since the unexpected is difficult to test. Instead, you should let the process or the entire process group die, and start over, from a known state. This can be easily tested.

So, what happens when an error notification is propagated above a process’s initial function? The process is terminated, and a notification is sent to all interested parties — all the processes that need to be notified. This is done consistently for all processes, and for all termination reasons, including a normal exit of the initial function.

If you want to capture all errors, you will need to engage an interprocess notification mechanism. This cannot be done using an intraprocess mechanism like the try block, at least not in Elixir.

Now, let’s discuss some approaches to capturing errors.

Approach 1: Exit Signals

Erlang’s “fail fast” mechanism is exit signals combined with Erlang messages. When a process terminates for any reason (whether it’s a normal exit or an error), it sends an exit signal to all processes it is linked with.

When a process receives an exit signal, it usually dies, unless it’s trapping exit signals. In that case, the signal is transformed into a message and delivered to the process’s mailbox.
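
Here’s a quick iex illustration of trapping an exit signal from a linked process (PIDs are illustrative and the stacktrace is abridged):

iex> Process.flag(:trap_exit, true)
false
iex> spawn_link(fn -> raise "boom" end)
#PID<0.88.0>
iex> flush
{:EXIT, #PID<0.88.0>, {%RuntimeError{message: "boom"}, [...]}}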

So, to capture all errors from a function, we can:

  • enable exit signal trapping in the calling process,
  • execute the function in a separate, linked process, and
  • wait for the exit signal message and determine whether the process/function finished successfully or failed, and if it failed, for what reason.

def capture_link(callback) do
  Process.flag(:trap_exit, true)
  pid = spawn_link(callback)
  receive do
    {:EXIT, ^pid, :normal} -> :ok
    {:EXIT, ^pid, reason}   -> {:error, reason}
  end
end

This approach is acceptable, but it’s a little intrusive, since capture_link/1 changes the invoking process state by calling the Process.flag/2 function. A non-intrusive approach (with no side effects involving the running process) is preferable.

Approach 2: Process Monitoring

Instead of linking (and possibly dying) with the process whose lifecycle is to be monitored, a process can be simply monitored. The process that requested monitoring will be informed when the monitored process terminates for any reason. The algorithm becomes as follows:

  • execute the function in a separate process that is monitored, but not linked to,
  • wait for the process termination message delivered by the monitor, and determine if the process/function has successfully completed or failed, and if it has failed, what is the reason behind the failure.

Here’s an example of a successfully completed monitored process:

iex> spawn_monitor fn -> :a end
{#PID<0.88.0>, #Reference<0.0.2.114>}
iex> flush
{:DOWN, #Reference<0.0.2.114>, :process, #PID<0.88.0>, :normal}

When a monitored process terminates, the process that requested monitoring receives a message in the following form: {:DOWN, MonitorRef, Type, Object, Info}.

Here’s a non-intrusive example of capturing all errors:

def capture_monitor do
  {pid, monitor} = spawn_monitor(&Library.foo/0)
  receive do
    {:DOWN, ^monitor, :process, ^pid, :normal} -> :ok
    {:DOWN, ^monitor, :process, ^pid, reason}  -> {:error, reason}
  end
end

Let’s take a look at an example implementation of the described capturing mechanism that can:

  • invoke any function and capture whatever output the invoked function generates (a return value or the reason behind the error) and
  • transfer it to the caller in a uniform way:
    • {:ok, state} or
    • {:error, reason}

The example implementation is as follows:

def capture(callback, timeout_ms) do
  {pid, monitor} = callback |> propagate_return_value_wrapper |> spawn_monitor
  receive do
    {:DOWN, ^monitor, :process, ^pid, :normal} ->
      receive do
        {__MODULE__, :response, response} -> {:ok, response}
      end
    {:DOWN, ^monitor, :process, ^pid, reason}  ->
      Logger.error "#{__MODULE__}: Error in handled function: #{inspect reason}"
      {:error, reason}
  after timeout_ms ->
    pid |> Process.exit(:kill)
    Logger.error "#{__MODULE__}: Timeout..."
    {:error, {:timeout, timeout_ms}}
  end
end

defp propagate_return_value_wrapper(callback) do
  caller_pid = self
  fn -> caller_pid |> send({__MODULE__, :response, callback.()}) end
end

Approach 3: The Wormhole

We’ve covered some possible approaches to ensuring that all errors coming from an Elixir function are captured. To simplify error capturing, we created the Wormhole module, a production-ready callback wrapper. You can find it here; feel free to use it!

In Wormhole, we used Task.Supervisor to monitor the callback lifecycle. Here is the most important part of the code:

def capture(callback, timeout_ms) do
  Task.Supervisor.start_link
  |> callback_exec_and_response(callback, timeout_ms)
end

defp callback_exec_and_response({:ok, sup}, callback, timeout_ms) do
  Task.Supervisor.async_nolink(sup, callback)
  |> Task.yield(timeout_ms)
  |> supervisor_stop(sup)
  |> response_format(timeout_ms)
end
defp callback_exec_and_response(start_link_response, _callback, _timeout_ms) do
  {:error, {:failed_to_start_supervisor, start_link_response}}
end

defp supervisor_stop(response, sup) do
  Process.unlink(sup)
  Process.exit(sup, :kill)

  response
end

defp response_format({:ok,   state},  _),          do: {:ok,    state}
defp response_format({:exit, reason}, _),          do: {:error, reason}
defp response_format(nil,             timeout_ms), do: {:error, {:timeout, timeout_ms}}

Wormhole.capture starts a Task.Supervisor, executes the callback under it, waits at most timeout_ms milliseconds for the response, stops the supervisor, and returns a response in the :ok/:error tuple form.
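
For illustration, here’s roughly what wrapping the Library.foo/0 function from the beginning of this post looks like, assuming the capture/2 signature shown above (the exact result depends on which branch foo/0 happens to hit):

iex> Wormhole.capture(&Library.foo/0, 5_000)
{:ok, 0.3333333333333333}

iex> Wormhole.capture(&Library.foo/0, 5_000)
{:error, :unknown_error}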

Takeaways

Elixir is inherently a concurrent language designed for developing highly distributed, fault-tolerant applications. Elixir provides multiple mechanisms for error handling. A user needs to be precise about what kinds of errors are to be handled and where they are coming from. If our intention is to handle errors originating from the same process they are being handled in, we can use the common mechanisms found in mainstream, sequential languages, like the try block.

When capturing errors originating from a nontrivial logical unit (involving multiple processes), well-known sequential mechanisms will not be appropriate. In these situations, process monitoring mechanisms and a supervisor-like approach are in order.

A logical unit entry function (callback) needs to be executed in a separate process, in which it can succeed or fail without affecting the function-invoking process. In such a scenario, the function-invoking process spawns a supervisor. Then, it engages the language mechanism to transport the pass or fail information from the callback-executing process to the supervisor. All of this can be achieved without making any changes in the code from which errors are being captured, which makes this approach generally applicable.

Semaphore is described by Elixir developers as the only CI which supports Elixir out of the box. To make testing in Elixir even easier, we regularly publish tutorials on TDD, BDD, and using Docker with Elixir. Read our tutorials and subscribe here.

Platform Update on November 22nd

The upcoming platform update is scheduled for November 22nd, 2016.

Chromedriver gets an update with version 2.25.

Git has been updated to version 2.10.2.

Go receives a version update with 1.7.3.

Java 8 has been updated to version 8u111.

JRuby receives an update with version 9.1.6.0.

Node.js gets a version update with 4.6.2.

PHP receives two version updates with 5.6.28 and 7.0.13.

Pip has been updated to version 9.0.1, including a number of deprecations and possible breaking changes.

New things

The following additions will be available after switching to the release candidate platform.

Node.js version 7.1.0 is now part of the platform. To use it in your builds, add nvm use 7.1 to your build commands. This platform update also includes yarn, pre-installed for all the supported Node.js versions (>= 4.x) and a new default Node.js version, which is set to the latest 4.6 release.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1611 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until November 22nd, 2016.

Changes in the final release

The Docker-enabled platform gets an update with docker-engine version 1.12.3.

Ruby 2.3.3 is now part of the platform.

A full list of changes is available in the platform changelog.

Python Versions Used in Commercial Projects, 2016 Edition

In October we shared data on which versions of Ruby and Node.js developers use to get the job done. Next up is a report on Python, for which the main question is binary — is it Python 2 or 3 that developers use at work today?

Python version usage for commercial projects on Semaphore

Python 3.0 final was released almost 8 years ago, in December 2008. And while some predicted that it would take 5 years for Python 3 to become the default choice, that didn’t really happen: over 70% of ongoing private projects are based on Python 2.7, the last 2.x version, released in July 2010.

If we zoom in on the projects which have been added on Semaphore in 2016 — in practice most of them are newly started — the data is almost the same, showing that 2.7 is the default choice of the majority.

Python versions used for new commercial projects in 2016 on Semaphore

If we compare this to other language communities, where new versions are adopted much faster, it seems that for most people the language improvements in Python 3 did not outweigh the inconvenience of incompatibility. So it’s “if it ain’t broke, don’t fix it”.

Semaphore CI/CD provides Python versions 2.6, 2.7, 3.3, 3.4 and 3.5 preinstalled on the platform. Most new projects on 3.x are based on the latest version, 3.5:

Python 3 versions used for new commercial projects in 2016 on Semaphore

This is our first annual report on Python, so next year we’ll be able to observe changes over time.

What are your thoughts on working with different Python versions? Post your comments below.

Sending ECS Service Logs to the ELK Stack

Having comprehensive logs is a massive life-saver when debugging or investigating issues. Still, logging too much data can have the adverse effect of hiding the actual information you are looking for. Because of these issues, storing your log data in a structured format becomes very helpful. Also, being able to track the number of occurrences and monitor their rate of change can be quite indicative of the underlying causes. This is why we decided to include the ELK stack in our architecture.

I’ll introduce you to one possible solution for sending logs from various separate applications to a central ELK stack and storing the data in a structured way. A big chunk of Semaphore’s architecture is made up of microservices living inside Docker containers and hosted by Amazon Web Services (AWS), so ELK joins this system as the point to which all of these separate services send their logs, which it then processes and visualizes.

In this article, I’ll cover how to configure the client side (the microservices) and the server side (the ELK stack).

Client-side Configuration

Before developing anything, the first decision we needed to make was to pick a logging format. We decided on syslog, which is a widely accepted logging standard. It allows for a client-server architecture for log collection, with a central log server receiving logs from various client machines. This is the structure that we’re looking for.

Our clients are applications sitting inside Docker containers, which themselves are parts of AWS ECS services. In order to connect them to ELK, we started by setting up the ELK stack locally, and then redirecting the logs from a Docker container located on the same machine. This setup was useful for both development and debugging. All we needed to do on the client side was start the Docker container as follows:

docker run --log-driver=syslog --log-opt syslog-address=udp://localhost:2233 <image_id>

We’ll assume here that Logstash is listening to UDP traffic on port 2233.

Once we were done developing locally, we moved on to updating our ECS services. For this, all we needed to do was update our task definition by changing the log configuration of the container:

{
  "containerDefinitions": [
    {
      "logConfiguration": {
        "logDriver": "syslog",
        "options": {
          "syslog-address": "udp://<logstash_url>:2233"
        }
      }
    },
    ...
  ],
  ...
}

This started our Docker containers with the same settings we previously used locally.

Server-side Configuration

On the server side, we started with the Dockerized version of the ELK stack. We decided to modify it so that it accepts syslog messages, and enable it to read the custom attributes embedded inside our messages (more on that in the ‘Processing’ section below). For both of these, we needed to configure Logstash. In order to do that, we needed to look into the config/logstash.conf file.

A Logstash pipeline consists of input, filter, and output sections. Inputs and outputs describe the means for Logstash to receive and send data, whereas filters describe the data transformations that Logstash performs. This is the basic structure of a logstash.conf file:

# logstash/config/logstash.conf

input {
  tcp {
    port => 5000
  }
}

## Add your filters / logstash plugins configuration here

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}

Receiving Messages

In order to receive syslog messages, we expanded the input section:

input {
  udp {
    port => 2233
    type => inline_attributes
  }
}

This allowed Logstash to listen for UDP packets on the specified port. type is just a tag that we added to the received input in order to be able to recognize it later on.
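
That tag can later be used to scope filters to just these messages, for example with a conditional like this minimal sketch:

filter {
  if [type] == "inline_attributes" {
    # filters applied only to messages received on this UDP input
  }
}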

Now, since our ELK stack components are sitting inside Docker containers, we needed to make the required port accessible. In order to do that, we modified our docker-compose.yml by adding port “2233:2233/udp” to Logstash:

services:
  logstash:
    ports:
      - "2233:2233/udp"
      ...
    ...
  ...
...

Since we’re hosting our ELK stack on AWS, we also needed to update our task definition to open the required port. We added the following to the portMappings section of our containerDefinition:

{
  "containerDefinitions": [
    {
      "portMappings": [
        {
          "hostPort": 2233,
          "containerPort": 2233,
          "protocol": "udp"
        },
        ...
      ]
    },
    ...
  ],
  ...
}

Processing

For processing, we decided to add the ability to extract key=value pairs from our message strings and add them as attributes to the structure that is produced by processing of a message. For example, the message ... environment=staging ... would produce a structure containing the key environment with the value staging.

We implemented this by adding the following piece of code into the filters section:

  ruby {
    code => "
      return unless event.get('message')
      event.get('message').split(' ').each do |token|
        key, value = token.strip.split('=', 2)
        next if !key || !value || key == ''
        event.set(key, value)
      end
    "
  }

The Ruby plugin allowed us to embed the Ruby code inside the configuration file, which came in very handy. Another useful thing we did at this point was to enable the outputting of processed messages to the console by adding the following to the output section:

stdout { codec => rubydebug }
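
With that line added, the output section from the basic configuration above looks like this:

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
  stdout { codec => rubydebug }
}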

Finally, we logged the following string in a client application:

"Some text service=service1 stage=stage1"

This produced the following event in the console debug, showing us the structure that gets persisted once the message has been processed:

{
     "message" => "<30>Nov  2 15:01:42  [998]: 14:01:42.671 [info] Some text service=service1 stage=stage1",
    "@version" => "1,
  "@timestamp" => "2016-11-02T14:01:42.672Z",
        "type" => "inline_attributes",
        "host" => "172.18.0.1",
     "service" => "service1",
       "stage" => "stage1"
}

Note that service=service1 and stage=stage1 were added as attributes to the final structure. This final structure is then available for searching and inspection through Kibana’s GUI.

Wrap-up

This sums up the logging setup. The result is a centralized logging system that can include new service logs with minimal setup on the client side. This allows us to analyse logs quickly and effortlessly, as well as visualise the logs of various separate services through Kibana’s GUI.

This is the first post on our brand new engineering blog. We hope that you’ll find it useful in setting up your logging architecture. Speaking of useful, we also hope that you’ll trust Semaphore to run the tests and deploy your applications for you.

Happy building!

How BDD and Continuous Delivery Help Developers Maintain Flow

Programming is cognitive work, and programmers perform their best work under intense concentration. While there are external factors that can affect this, such as having a quiet office, controllable ways of communication, etc., there are also some internal factors that need to be taken into account. In fact, the way we work can influence the quality of the outcome the most.

The state in which programmers are 100% focused while pure magic is coming out of their keyboards is often referred to as the zone, or flow.

Flow is not easy to get into: one interruption is enough to break it, and, once lost, it’s usually difficult to regain. Flow is also a crucial ingredient of getting meaningful creative work done. In Creativity: Flow and the Psychology of Discovery and Invention, psychologist Mihaly Csikszentmihalyi identified nine elements that can get a person into a state of flow:

  1. There are clear goals every step of the way.
  2. There is immediate feedback to one’s actions.
  3. There is a balance between challenges and skills.
  4. Action and awareness are merged.
  5. Distractions are excluded from consciousness.
  6. There is no worry of failure.
  7. Self-consciousness disappears.
  8. The sense of time becomes distorted.
  9. The activity becomes autotelic.

Behavior-driven development (BDD), continuous integration and continuous deployment can help programmers amplify several of these elements. In this blog post, I’ll try to explain how.

There are Clear Goals Every Step of the Way

Before I started practicing BDD, I often felt at a loss when facing a big new feature to develop. It’s not that I couldn’t do it, it’s just that the way I worked was often inefficient: I’d procrastinate while trying to decide where to start, and once I reached the middle, I’d often meander between different parts of the incomplete system. After the feature was shipped, much too often I’d realize that I had over-engineered the thing, while also failing to implement at least one crucial part of it.

In BDD, you always start by writing a high-level user scenario of a feature that doesn’t yet exist. It usually isn’t immediately obvious what that feature is, so taking time to think this through and discuss it with your team, product manager or client before you write any code is time well spent. Once you have the outlined feature in front of you (written in a DSL such as Gherkin, for example), you’ll have a clear goal towards which you can work. Your next sole objective should be to implement that test scenario. You’ll know that you’ve succeeded in achieving this objective once the test scenario starts passing.

As we dive into the lower levels of implementation — for web and mobile apps this is usually the view, then controller, model, possibly a deeper business logic layer — we should always keep defining our next goal by writing a test case for it. Each goal is derived from the implementation we’ve just completed, while keeping in mind the high-level goal represented by the initial scenario. We can re-run the high-level scenario (sometimes also called acceptance test) whenever we need a hint on how to proceed.

There is Immediate Feedback to One’s Actions

Fast feedback loops are essential in everything we do if we want to be good at it. Feedback tells us how we’re doing, as well as where we are relative to our goal. When debugging an issue that involves network roundtrips and multiple files, and spans a potentially wide area of code, your primary goal should be to shorten the loop of reproducing it as much as possible. This will help you spend as little time as possible typing and/or clicking in between attempts to reproduce the issue in order to see if you’ve resolved it.

BDD and continuous integration (CI) are all about feedback. When programming, the tests you’re writing provide you with feedback on your ongoing work. You write a test, program a little, run the test and observe the result, go back to refactor a little if it’s green, or work some more on making it pass in case it’s red.

Making this process fast is crucial, as you can lose focus if you need to wait for a single test result for longer than a few seconds. It’s also helpful to make switching between tests and code, and running the related test(s), part of your muscle memory, so that it happens almost subconsciously. For example, at Rendered Text we all program in Vim and use a small plugin to run the test or test file under the cursor with a handy shortcut.

One of the biggest benefits of continuous deployment (CD) is that it enables rapid feedback for the entire company. Releasing on a daily basis ensures that developers can fine-tune the implementation, product managers can validate solutions and get feedback from users, and the business as a whole can innovate, learn, and course-correct quickly.

There is a Balance Between Challenges and Skills

If there’s a fundamental mismatch between a task at hand and one’s skills, no process can help with that issue. However, BDD plays an important role as an aid in the general process of breaking up a big task into many smaller, manageable ones. It helps us work in small increments. Assuming we’re working in a familiar domain, there might be one high-level scenario that’s far from complete, but there are always at least some tests which we can get to pass in about an hour of programming.

In a way, CI and CD help make conquering big challenges easier too. While our final goal may be a functionality which requires multiple weeks of teamwork, continuous delivery — complete with feature flags, gradual rollout, and incremental product development informed by data and feedback — helps us ensure that our units of work are always of manageable size. This also means that in case it turns out that what we are building is not useful, we can discard it soon enough without doing pointless work.

There is No Worry of Failure

Worry of failure often stems from the social setting; however, there’s a tremendous benefit in having a comprehensive test suite that acts as a safety net for the entire team. In the process of continuous integration, we minimize the risk that a fatal bug will go unnoticed and be deployed to production by automating the process of running tests, along with performing coding style, security and other checks. This helps developers work without unnecessary stress. Of course, this assumes that every contributor follows the basic rule of not submitting a pull request without adequate test coverage.

If your team is practicing peer code review (which we highly recommend!), then having a build status indicator right in the pull request, as provided by Semaphore via its GitHub and Bitbucket integrations, helps the reviewer know that she can focus on more useful things than whether the code will work.

At the end of the day, let’s keep in mind that the primary benefit of working in a state of flow is our deeply personal sense of achievement and self-worth that comes with it. That’s the ultimate measure of value of any process or tool.

Platform Update on October 25th

The upcoming platform update is scheduled for October 25th, 2016.

Cassandra has been updated to version 2.2.8.

Erlang gets an update with version 19.1.

Elixir receives a version update with 1.3.4.

Git gets an update with version 2.10.1.

Gradle has been updated to version 3.1.

MySQL receives an update with version 5.6.34.

PHP gets two updates with versions 5.6.27 and 7.0.12.

New things

The following additions will be available after switching to the release candidate platform.

Node.js 6.8.1 and 4.6.0 have been added to the platform. To use them in your builds, add nvm use 6.8 or nvm use 4.6 to your build commands. These versions will be selectable in Project Settings after the release candidate period.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1610 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until October 25th, 2016.

Changes in the final release

Node.js gets several security updates, including 0.10.48 and 0.12.17. The previously announced versions (6.8.1 and 4.6.0) are also replaced with more up-to-date versions, namely 6.9.1 and 4.6.1.

The Docker-enabled platform gets an update with docker-engine version 1.12.2. Our tool for caching Docker images (docker-cache) has been updated as well, featuring full layer caching for tagged Docker images.

A full list of changes is available in the platform changelog.

Node.js Versions Used in Commercial Projects, 2016 Edition

Following up on our last week’s post for Ruby, we’re presenting the second annual report of Node.js version usage in commercial JavaScript projects on Semaphore (see 2015 numbers here). We think it’s always interesting to see what people are using to get the job done, and projects actively using continuous integration can be a solid sample.

Node.js version usage for commercial JavaScript projects on Semaphore

Naturally, when you compare year over year, there’s always a trend towards newer versions. A third of projects are still using Node 0.10 (down from 55% last year), whose long-term support (LTS) maintenance period ends on October 31. If you’re not familiar with the Node.js Foundation’s release and maintenance schedule, you can see it here.

Node.js version adoption for private JavaScript projects over the years

At the moment, Node version 4 is the LTS release and Node 6 the “current” release. It’s also notable that AWS introduced support for Node 4.3 on its increasingly popular Lambda platform in April 2016 (previously only 0.10 was available).

Node.js version fragmentation

Let’s zoom in on what the picture would look like if we focused only on the projects started in 2016. Note that Semaphore knows only when a project’s CI/CD was set up, so it’s an approximation.

Node.js versions used for new commercial projects in 2016 on Semaphore

The current LTS and latest versions dominate, although it’s still very much a mix of everything.

What’s your team’s approach to working with Node.js version(s)? Share your thoughts in the comments below.

P.S. Looking for a CI/CD solution that’s easy to use, fast, and supports all major Node.js versions out of the box? Try Semaphore for free.
