Semaphore Blog

News and updates from your friendly continuous integration and deployment service.

Sending ECS Service Logs to the ELK Stack

Having comprehensive logs is a massive life-saver when debugging or investigating issues. Still, logging too much data can have the adverse effect of hiding the information you're actually looking for. Storing log data in a structured format helps with this, and being able to track the number of occurrences of an event and monitor its rate of change can be quite indicative of the underlying causes. This is why we decided to include the ELK stack in our architecture.

I’ll introduce you to one possible solution for sending logs from various separate applications to a central ELK stack and storing the data in a structured way. A big chunk of Semaphore's architecture is made up of microservices living inside Docker containers and hosted on Amazon Web Services (AWS). ELK joins this system as the central point to which all of these separate services send their logs, which it then processes and visualizes.

In this article, I’ll cover how to configure the client side (the microservices) and the server side (the ELK stack).

Client-side Configuration

Before developing anything, the first decision we needed to make was to pick a logging format. We decided on syslog, which is a widely accepted logging standard. It allows for a client-server architecture for log collection, with a central log server receiving logs from various client machines. This is the structure that we’re looking for.

Our clients are applications sitting inside Docker containers, which themselves are parts of AWS ECS services. In order to connect them to ELK, we started by setting up the ELK stack locally, and then redirecting the logs from a Docker container located on the same machine. This setup was useful for both development and debugging. All we needed to do on the client side was start the Docker container as follows:

docker run --log-driver=syslog --log-opt syslog-address=udp://localhost:2233 <image_id>

We’ll assume here that Logstash is listening to UDP traffic on port 2233.

Once we were done developing locally, we moved on to updating our ECS services. For this, all we needed to do was update our task definition by changing the log configuration of the container:

{
  "containerDefinitions": [
    {
      "logConfiguration": {
        "logDriver": "syslog",
        "options": {
          "syslog-address": "udp://<logstash_url>:2233"
        }
      }
    },
    ...
  ],
  ...
}

This started our Docker containers with the same settings we previously used locally.

Server-side Configuration

On the server side, we started with the Dockerized version of the ELK stack. We modified it to accept syslog messages and to read the custom attributes embedded inside our messages (more on that in the ‘Processing’ section below). Both changes required configuring Logstash, which meant editing the config/logstash.conf file.

A Logstash pipeline consists of input, filter, and output sections. Inputs and outputs describe the means for Logstash to receive and send data, whereas filters describe the data transformations that Logstash performs. This is the basic structure of a logstash.conf file:

# logstash/config/logstash.conf

input {
  tcp {
    port => 5000
  }
}

## Add your filters / logstash plugins configuration here

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}

Receiving Messages

In order to receive syslog messages, we expanded the input section:

input {
  udp {
    port => 2233
    type => "inline_attributes"
  }
}

This allowed Logstash to listen for UDP packets on the specified port. The type setting is just a tag that we added to the received input so that we could recognize it later on.

Now, since our ELK stack components are sitting inside Docker containers, we needed to make the required port accessible. In order to do that, we modified our docker-compose.yml by adding the port mapping "2233:2233/udp" to Logstash:

services:
  logstash:
    ports:
      - "2233:2233/udp"
      ...
    ...
  ...
...

Since we’re hosting our ELK stack on AWS, we also needed to update our task definition to open the required port. We added the following to the portMappings section of our containerDefinition:

{
  "containerDefinitions": [
    {
      "portMappings": [
        {
          "hostPort": 2233,
          "containerPort": 2233,
          "protocol": "udp"
        },
        ...
      ]
    },
    ...
  ],
  ...
}

Processing

For processing, we decided to add the ability to extract key=value pairs from our message strings and add them as attributes to the structure produced when a message is processed. For example, the message ... environment=staging ... would produce a structure containing the key environment with the value staging.

We implemented this by adding the following piece of code to the filter section:

  ruby {
    code => "
      return unless event.get('message')
      event.get('message').split(' ').each do |token|
        key, value = token.strip.split('=', 2)
        next if !key || !value || key == ''
        event.set(key, value)
      end
    "
  }

The Ruby plugin allowed us to embed Ruby code inside the configuration file, which came in very handy. Another useful thing we did at this point was to enable the outputting of processed messages to the console by adding the following to the output section:

stdout { codec => rubydebug }

Finally, we logged the following string in a client application:

"Some text service=service1 stage=stage1"

This produced the following event in the console debug, showing us the structure that gets persisted once the message has been processed:

{
     "message" => "<30>Nov  2 15:01:42  [998]: 14:01:42.671 [info] Some text service=service1 stage=stage1",
    "@version" => "1",
  "@timestamp" => "2016-11-02T14:01:42.672Z",
        "type" => "inline_attributes",
        "host" => "172.18.0.1",
     "service" => "service1",
       "stage" => "stage1"
}

Note that service=service1 and stage=stage1 were added as attributes to the final structure, which is then available for searching and inspection through Kibana’s GUI.

Wrap-up

This sums up the logging setup. The result is a centralized logging system that can include new service logs with minimal setup on the client side. This allows us to analyze logs quickly and effortlessly, as well as visualize the logs of various separate services through Kibana’s GUI.

This is the first post on our brand new engineering blog. We hope that you’ll find it useful in setting up your logging architecture. Speaking of useful, we also hope that you’ll trust Semaphore to run the tests and deploy your applications for you.

Happy building!

How BDD and Continuous Delivery Help Developers Maintain Flow

Programming is cognitive work, and programmers perform their best work under intense concentration. While there are external factors that can affect this, such as having a quiet office, controllable ways of communication, etc., there are also some internal factors that need to be taken into account. In fact, the way we work can influence the quality of the outcome the most.

The state in which programmers are 100% focused while pure magic is coming out of their keyboards is often referred to as the zone, or flow.

Flow is not easy to get into: one interruption is enough to break it, and, once lost, it’s usually difficult to regain. Flow is also a crucial ingredient for getting meaningful creative work done. In Creativity: Flow and the Psychology of Discovery and Invention, psychologist Mihaly Csikszentmihalyi identified nine elements that can get a person into a state of flow:

  1. There are clear goals every step of the way.
  2. There is immediate feedback to one’s actions.
  3. There is a balance between challenges and skills.
  4. Action and awareness are merged.
  5. Distractions are excluded from consciousness.
  6. There is no worry of failure.
  7. Self-consciousness disappears.
  8. The sense of time becomes distorted.
  9. The activity becomes autotelic.

Behavior-driven development (BDD), continuous integration and continuous deployment can help programmers amplify several of these elements. In this blog post, I’ll try to explain how.

There are Clear Goals Every Step of the Way

Before I started practicing BDD, I often felt at a loss when facing a big new feature to develop. It’s not that I couldn’t do it; it’s just that the way I worked was often inefficient: I’d procrastinate while trying to decide where to start, and once I’d reached the middle, I’d often meander between different parts of the incomplete system. After the feature was shipped, much too often I’d realize that I had over-engineered the thing while also failing to implement at least one crucial part of it.

In BDD, you always start by writing a high-level user scenario of a feature that doesn’t yet exist. It usually isn’t immediately obvious what that feature is, so taking time to think this through and discuss it with your team, product manager or client before you write any code is time well spent. Once you have the outlined feature in front of you (written in a DSL such as Gherkin, for example), you’ll have a clear goal towards which you can work. Your next sole objective should be to implement that test scenario. You’ll know that you’ve succeeded in achieving this objective once the test scenario starts passing.
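For example, a high-level scenario for a hypothetical project-search feature might look like this in Gherkin. The feature and step wording here are ours, invented purely for illustration:

```gherkin
Feature: Project search
  As a logged-in user
  I want to search my projects by name
  So that I can quickly find the one I need

  Scenario: Finding a project by name
    Given I am logged in
    And I have a project named "billing-service"
    When I search for "billing"
    Then I should see "billing-service" in the results
```

Until this scenario passes, it serves as the single clear goal you are working towards.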

As we dive into the lower levels of implementation — for web and mobile apps this is usually the view, then controller, model, possibly a deeper business logic layer — we should always keep defining our next goal by writing a test case for it. Each goal is derived from the implementation we’ve just completed, while keeping in mind the high-level goal represented by the initial scenario. We can re-run the high-level scenario (sometimes also called acceptance test) whenever we need a hint on how to proceed.

There is Immediate Feedback to One’s Actions

Fast feedback loops are essential in everything we do if we want to be good at it. Feedback tells us how we’re doing, as well as where we are relative to our goal. When debugging an issue that involves network roundtrips, spans multiple files, and covers a potentially wide area of code, your primary goal should be to shorten the reproduction loop as much as possible, so that you spend as little time as possible typing and clicking between attempts to see whether you’ve resolved it.

BDD and continuous integration (CI) are all about feedback. When programming, the tests you’re writing provide you with feedback on your ongoing work. You write a test, program a little, run the test and observe the result, go back to refactor a little if it’s green, or work some more on making it pass in case it’s red.

Making this process fast is crucial, as you can lose focus if you need to wait longer than a few seconds for a single test result. It’s also helpful to practice switching between tests and code until running the related test(s) becomes part of your muscle memory, almost subconscious. For example, at Rendered Text we all program in Vim and use a small plugin to run the test or test file under the cursor with a handy shortcut.

One of the biggest benefits of continuous deployment (CD) is that it enables rapid feedback for the entire company. Releasing on a daily basis ensures that developers can fine-tune the implementation, product managers can validate solutions and get feedback from users, and the business as a whole can innovate, learn, and course-correct quickly.

There is a Balance Between Challenges and Skills

If there’s a fundamental mismatch between a task at hand and one’s skills, no process can help with that issue. However, BDD plays an important role as an aid in the general process of breaking up a big task into many smaller, manageable ones. It helps us work in small increments. Assuming we’re working in a familiar domain, there might be one high-level scenario that’s far from complete, but there are always at least some tests which we can get to pass in about an hour of programming.

In a way, CI and CD help make conquering big challenges easier too. While our final goal may be a functionality which requires multiple weeks of teamwork, continuous delivery — complete with feature flags, gradual rollout, and incremental product development informed by data and feedback — helps us ensure that our units of work are always of manageable size. This also means that in case it turns out that what we are building is not useful, we can discard it soon enough without doing pointless work.

There is No Worry of Failure

Worry of failure often stems from the social setting; however, there’s a tremendous benefit in having a comprehensive test suite that acts as a safety net for the entire team. In the process of continuous integration, we minimize the risk that a fatal bug will go unnoticed and be deployed to production by automating the process of running tests, along with performing coding style, security, and other checks. This helps developers work without unnecessary stress. Of course, this assumes that every contributor follows the basic rule of not submitting a pull request without adequate test coverage.

If your team is practicing peer code review (which we highly recommend!), then having a build status indicator right in the pull request, as provided by Semaphore via its GitHub and Bitbucket integrations, helps the reviewer know that she can focus on more useful things than whether the code will work.

At the end of the day, let’s keep in mind that the primary benefit of working in a state of flow is our deeply personal sense of achievement and self-worth that comes with it. That’s the ultimate measure of value of any process or tool.

Platform Update on October 25th

The upcoming platform update is scheduled for October 25th, 2016.

Cassandra has been updated to version 2.2.8.

Erlang gets an update with version 19.1.

Elixir receives a version update with 1.3.4.

Git gets an update with version 2.10.1.

Gradle has been updated to version 3.1.

MySQL receives an update with version 5.6.34.

PHP gets two updates with versions 5.6.27 and 7.0.12.

New things

The following additions will be available after switching to the release candidate platform.

Node.js 6.8.1 and 4.6.0 have been added to the platform. To use them in your builds, add nvm use 6.8 or nvm use 4.6 to your build commands. These versions will be selectable in Project Settings after the release candidate period.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1610 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until October 25th, 2016.

Changes in the final release

Node.js gets several security updates, including 0.10.48 and 0.12.17. The previously announced versions (6.8.1 and 4.6.0) are also replaced with more up-to-date versions, namely 6.9.1 and 4.6.1.

The Docker-enabled platform gets an update with docker-engine version 1.12.2. Our tool for caching Docker images (docker-cache) has been updated as well, featuring full layer caching for tagged Docker images.

A full list of changes is available in the platform changelog.

Node.js Versions Used in Commercial Projects, 2016 Edition

Following up on last week’s post about Ruby, we’re presenting the second annual report of Node.js version usage in commercial JavaScript projects on Semaphore (see the 2015 numbers here). We think it’s always interesting to see what people are using to get the job done, and projects actively using continuous integration can be a solid sample.

Node.js version usage for commercial JavaScript projects on Semaphore

Naturally, when you compare year over year, there’s always a trend towards newer versions. A third of projects are still using Node 0.10 (down from 55% last year), whose long-term support (LTS) maintenance period ends on October 31. If you are not familiar with the Node.js Foundation’s release and maintenance schedule, you can see it here.

Node.js version adoption for private JavaScript projects over the years

At the moment, Node version 4 is the LTS and Node 6 the “current” release. It is also notable that AWS introduced support for Node 4.3 on its increasingly popular Lambda platform in April 2016 (previously only 0.10 was available).

Node.js version fragmentation

Let’s zoom in on what the picture would look like if we focused only on the projects started in 2016. Note that Semaphore knows only when a project’s CI/CD was set up, so it’s an approximation.

Node.js versions used for new commercial projects in 2016 on Semaphore

The current LTS and latest versions dominate, although it’s still very much a mix of everything.

What’s your team’s approach to working with Node.js version(s)? Share your thoughts in the comments below.

P.S. Looking for a CI/CD solution that’s easy to use, fast, and supports all major Node.js versions out of the box? Try Semaphore for free.

Ruby Versions Used in Commercial Projects, 2016 Edition

Which versions of Ruby do people use when building apps at work? This is the question we’ve been answering for fun for four years now, based on data about private projects that are tested and deployed on Semaphore.

Ruby version usage for commercial projects on Semaphore

Since our last year’s report, Ruby 2.3 has been released, and the trend towards moving to newer versions has continued. Nearly 85% of all commercial projects are now using some version of Ruby 2, up from 79% last year.

Ruby version adoption for private projects over the years

In practice, teams seem to treat minor versions as “major”, and if we present the data that way, the trend towards increasing overall fragmentation continues:

Ruby version fragmentation

The charts above take into account all active projects. What would the picture look like if we focused only on the projects started in 2016? Well, Semaphore knows only when a project’s CI/CD was set up, so we can take that as an approximation:

Ruby version usage for commercial projects on Semaphore

Most people are starting with the latest version(s). That’s great!

What’s your team’s approach to keeping up with new Ruby releases? Feel free to discuss in the comments below.

P.S. Looking for a CI/CD solution that’s easy to use, fast, and supports all Ruby versions out of the box? Try Semaphore for free.

Video Tutorials on Setting up Continuous Integration and Deployment with Semaphore

We at Semaphore are all about continuous learning and sharing knowledge — our team spends a lot of time learning from various online and offline sources, as well as sharing our knowledge by writing and editing tutorials on TDD and BDD best practices, which we publish in the Semaphore Community.

One of the YouTube channels our developers regularly follow is Will Stern’s LearnCode.academy, where he posts useful video tutorials covering a wide range of topics, including React.js, Node.js, Angular.js, Docker, DevOps, and deployment strategies.

Continuous Integration and Deployment with Semaphore

We’ve recently had the pleasure of working with Will, who wrote an excellent tutorial on unit testing for React applications for our Community. He also made video tutorials on using Semaphore for testing a Node.js application, and on continuous deployment with Semaphore.

Continuous Integration for JavaScript Applications

The first video will help you get continuous integration up and running for a JavaScript project:


You can also read Will’s tutorial on getting started with unit testing for a React and MobX application using Enzyme and Jest in our Community.

Continuous Deployment with Semaphore

In the second video, Will covers how to improve development workflow with continuous deployment:


If you’re just starting out with continuous integration and deployment, or you just haven’t tried Semaphore yet, you can use Semaphore to test your open source projects for free, or start a 30-day free trial for your private projects.

Happy building!

Minor platform update on September 29th

We received a number of reports that the newly updated Bundler (version 1.13.0) wasn’t correctly resolving dependencies due to a bug in its code. Sadly, this wasn’t detected during the release candidate period of the 1609 platform, and it slipped into production. We’ll try our best to prevent situations like this from happening in future platform updates.

The 1609.1 update reverts Bundler to version 1.12.5.

A full list of changes is available in the platform changelog.

Platform Update on September 27th

The upcoming platform update is scheduled for September 27th, 2016.

Bundler has been updated to version 1.13.0.

ChromeDriver gets an update with version 2.24.

Firefox ESR receives an update with version 45.3.0.

Go gets an update with version 1.7.1.

JRuby gets two version updates, namely 1.7.26 and 9.1.5.0.

MySQL gets a version update with 5.6.33.

PHP receives two updates with versions 5.6.26 and 7.0.11.

RethinkDB gets an update with version 2.3.5.

wkhtmltopdf has been updated to version 0.12.3.

New things

The following additions will be available after switching to the release candidate platform.

Amazon ECS CLI version 0.4.4 is now part of the platform.

Node.js 6.6.0 and 4.5.0 have been added to the platform. To use them in your builds, add nvm use 6.6 or nvm use 4.5 to your build commands. These versions will be selectable in Project Settings after the release candidate period.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1609 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until September 27th, 2016.

Changes in the final release

The Docker-enabled platform gets an update with docker-engine version 1.12.1.

A full list of changes is available in the platform changelog.

Platform Update on August 23rd

The upcoming platform update is scheduled for August 23rd, 2016.

ChromeDriver gets an update with version 2.23.

Git has been updated to version 2.9.3.

MySQL receives an update with version 5.6.32.

Node.js gets an update with version 6.3.1.

Oracle JDK 8 has been updated to version 8u101.

PHP gets three version updates with 7.0.9, 5.6.24, and 5.5.38.

RabbitMQ has been updated to version 3.6.5.

New things

The following additions will be available after switching to the release candidate platform.

Go 1.7 is now part of the platform. To use it, add change-go-version 1.7 to your setup commands. This version will be selectable in Project Settings after the release candidate period ends.

A new tool called install-package has been added to the platform. Its main purpose is to install and cache packages from APT. Instead of using sudo apt-get install <pkg1> <pkg2>, use install-package <pkg1> <pkg2>. It will preserve the installed packages and reuse them in subsequent builds, avoiding the need to fetch them from an APT mirror every time. Please keep in mind that this tool is still under development, so do let us know if you run into any issues.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1608 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until August 23rd, 2016.

Changes in the final release

Java 7 gets an update with version 7u111.

The Docker-enabled platform gets two updates with docker-engine 1.12.0 and docker-compose 1.8.0.

A full list of changes is available in the platform changelog.

Platform Update on July 26th

The upcoming platform update is scheduled for July 26th, 2016.

Bundler gets an update with version 1.12.5.

Cassandra is updated to version 2.2.7.

Firefox receives two updates with versions 45.2.0 and 38.8.0.

Git has been updated to version 2.9.2.

Go gets an update with version 1.6.3.

Java gets an update with version 7u101.

JRuby receives an update with version 9.1.2.0.

Node.js gets several updates with versions 6.2.2, 4.4.7, 0.12.15, and 0.10.46.

PHP receives three version updates with 7.0.8, 5.6.23, and 5.5.37.

RabbitMQ is updated with version 3.6.3.

New things

The following additions will be available after switching to the release candidate platform.

Elixir 1.3.2 has been added to the platform. This brings a lot of improvements to the language and its tooling, including a new Calendar module, new mix tasks, and several ExUnit enhancements. To use it, add kiex use 1.3 to your setup commands.

Erlang 19.0 is now part of the platform. The highlights of this release include the overhauled garbage collector and improvements to erts, as well as several other language modules. You can switch to this version by adding source /home/runner/.kerl/installs/19.0 to your setup commands. The rebar3 build tool is also part of this update.

Node.js 6.3.0 is now included in the platform. This version can be activated by adding nvm use 6.3 to your setup commands.

All the mentioned additions will be selectable in Project Settings, after the release candidate period.

Trying the new platform

To ensure that the updates are compatible with your current setup, please switch to the Ubuntu 14.04 LTS v1607 (release candidate) platform in Project Settings > Platform. We’re looking forward to hearing your feedback and requirements, which will help us to fix the potential issues and tailor the platform to better suit your needs. The release candidate period will last until July 26th, 2016.

A full list of changes is available in the platform changelog.
