Semaphore Blog

News and updates from your friendly continuous integration and deployment service.

Measure and Improve Your CI Speed with Semaphore

Measuring CI speed on Semaphore

We’re happy to announce a new Semaphore feature that will help you monitor and improve your CI speed over time. We’re calling it CI Speed Insights, and it’s available to all Semaphore users as of today.

Here’s how it works. On the project page, below all of your recently active branches, you’ll find a chart. In it, you’ll be able to see the runtime duration of recent builds and an indicator of your average CI speed, in minutes:

Fast CI speed indicator on Semaphore

The chart also includes a line marking the 10-minute threshold, in case your build runs close to or longer than that.

Slow continuous integration speed indicator on Semaphore

Why 10 minutes?

We’re convinced that having a build slower than 10 minutes is not proper continuous integration. When a build takes longer than 10 minutes, we waste too much precious time and energy waiting, or context switching back and forth. We merge rarely, making every deploy more risky. Refactoring is hard to do well. You can read our full rationale about this in our recent blog post — What is Proper Continuous Integration?.

There’s more to CI Speed Insights if you click to view the details.

Continuous Integration Speed Insights on Semaphore

Measuring CI speed

We’ve defined CI speed as the average time that elapses from pushing new code to getting the build status report. In many cases this is how long it takes for your build setup and tests to run on Semaphore. However, another possibility is that your build is blocked waiting for available resources, which are being used by other builds. When this happens, you’ll see hints on Semaphore that point you to the Activity Monitor.

So, if your build runs for 4 minutes on average, but is waiting for other builds to finish for 5 minutes on average, then the CI speed for that project is reported as 9 minutes. This is what the screenshot above illustrates.
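
The arithmetic above can be sketched in a few lines of Ruby. The `Build` struct and its fields are hypothetical names used for illustration, not Semaphore's data model, and the sketch assumes CI speed is the mean of runtime plus waiting time per build.

```ruby
# Hypothetical record of one build: minutes spent running and minutes spent waiting.
Build = Struct.new(:runtime_min, :wait_min)

# CI speed: average time from push to build status, i.e. runtime plus waiting time.
def ci_speed(builds)
  return 0.0 if builds.empty?

  total = builds.sum { |b| b.runtime_min + b.wait_min }
  total.to_f / builds.size
end

builds = [Build.new(4, 5), Build.new(3, 6), Build.new(5, 4)]
puts ci_speed(builds) # prints 9.0
```

A project whose builds run quickly can still report a slow CI speed if the waiting time dominates, which is why both numbers are worth tracking.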

At the bottom of the screen, you’ll find an interactive chart of your recent builds that shows both the build runtime and your waiting time. You can click on each one to view full build details.

Viewing build details on CI Speed Insights chart

If your build lasts too long, Semaphore will recommend running tests in parallel. You can set this up by manually configuring parallel jobs. This is a great way to cut down your build time.
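
One way to picture manual parallelization is splitting test files across a fixed number of jobs. The sketch below uses a greedy, longest-first assignment based on past durations. The file names, durations, and the `split_into_jobs` helper are made up for illustration and do not describe how Semaphore distributes work.

```ruby
# Assign each test file to the currently lightest job, longest files first.
def split_into_jobs(durations, job_count)
  jobs = Array.new(job_count) { { files: [], total: 0.0 } }

  durations.sort_by { |_, secs| -secs }.each do |file, secs|
    lightest = jobs.min_by { |job| job[:total] }
    lightest[:files] << file
    lightest[:total] += secs
  end

  jobs
end

# Hypothetical durations in seconds, e.g. taken from a previous build.
durations = {
  "spec/models/build_spec.rb" => 120.0,
  "spec/models/project_spec.rb" => 90.0,
  "spec/features/signup_spec.rb" => 60.0,
  "spec/features/deploy_spec.rb" => 30.0
}

split_into_jobs(durations, 2).each_with_index do |job, i|
  puts "job #{i + 1}: #{job[:total]}s #{job[:files].join(', ')}"
end
```

With these sample numbers, both jobs end up with 150 seconds of tests, so the build would wait for 150 seconds of test time instead of 300.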

Automatic test parallelisation

Still, wouldn’t it be great if Semaphore could automatically parallelise all your tests for you, without any effort needed on your side? We’ve been hard at work on making that possible. A recent beta customer managed to reduce build time from 2 hours to 8 minutes(!).

The initial release of automatic parallelisation will support Ruby — RSpec and Cucumber to be precise. If you’re using these tools and are interested in setting up automatic parallelisation in your project, please get in touch with us to schedule a personal demo. We’re excited to show you how Semaphore can optimize the way your team runs tests.

Happy building!

Faster Rails: How to Check if a Record Exists

Ruby and Rails are slow — this argument is often used to downplay the worth of the language and the framework. This statement in itself is not false. Generally speaking, Ruby is slower than its direct competitors such as Node.js and Python. Yet, many businesses from small startups to platforms with millions of users use it as the backbone of their operations. How can we explain these contradictions?

What makes your application slow?

While there can be many reasons why your application is slow, database queries usually play the biggest role in an application’s performance footprint. Loading too much data into memory, N+1 queries, lack of cached values, and the lack of proper database indexes are the biggest culprits that cause slow requests.

There are some legitimate domains where Ruby is simply too slow. However, most of the slow responses in our applications usually boil down to unoptimized database calls and the lack of proper caching.

Even if your application is blazing fast today, it can become much slower in only several months. API calls that worked just fine can suddenly start killing your service with a dreaded HTTP 502 response. After all, working with a database table with several hundred records is very different from working with a table that has millions of records.

Existence checks in Rails

Existence checks are probably the most common calls that you send to your database. Every request handler in your application probably starts with a lookup, followed by a policy check that uses multiple dependent lookups in the database.

However, there are multiple ways to check the existence of a database record in Rails. We have present?, empty?, any?, exists?, and various other counting-based approaches, and they all have vastly different performance implications.

In general, when working on Semaphore, I always prefer to use .exists?.

I’ll use our production database to illustrate why I prefer .exists? over the alternatives. We will try to look up if there has been a passed build in the last 7 days.

Let’s observe the database calls produced by our calls.

Build.where(:created_at => 7.days.ago..1.day.ago).passed.present?

# SELECT "builds".* FROM "builds" WHERE ("builds"."created_at" BETWEEN
# '2017-02-22 21:22:27.133402' AND '2017-02-28 21:22:27.133529') AND
# "builds"."result" = $1  [["result", "passed"]]


Build.where(:created_at => 7.days.ago..1.day.ago).passed.any?

# SELECT COUNT(*) FROM "builds" WHERE ("builds"."created_at" BETWEEN
# '2017-02-22 21:22:16.885942' AND '2017-02-28 21:22:16.886077') AND
# "builds"."result" = $1  [["result", "passed"]]


Build.where(:created_at => 7.days.ago..1.day.ago).passed.empty?

# SELECT COUNT(*) FROM "builds" WHERE ("builds"."created_at" BETWEEN
# '2017-02-22 21:22:16.885942' AND '2017-02-28 21:22:16.886077') AND
# "builds"."result" = $1  [["result", "passed"]]


Build.where(:created_at => 7.days.ago..1.day.ago).passed.exists?

# SELECT 1 AS one FROM "builds" WHERE ("builds"."created_at" BETWEEN
# '2017-02-22 21:23:04.066301' AND '2017-02-28 21:23:04.066443') AND
# "builds"."result" = $1 LIMIT 1  [["result", "passed"]]

The first call that uses .present? is very inefficient. It loads all the matching records from the database into memory, constructs the Active Record objects, and then checks whether the resulting array is empty. In a huge database table, this can wreak havoc: it can load millions of records and even cause downtime in your service.

The second and third approaches, any? and empty?, are optimized in Rails and load only the result of COUNT(*) into memory. COUNT(*) queries are usually efficient, and you can use them even on semi-large tables without any dangerous side effects.

The fourth approach, exists?, is even more optimized, and it should be your first choice when checking the existence of a record. It uses the SELECT 1 ... LIMIT 1 approach, which is very fast.

Here are some numbers from our production database for the above queries:

present? =>  2892.7 ms
any?     =>   400.9 ms
empty?   =>   403.9 ms
exists?  =>     1.1 ms

In these measurements, exists? was roughly 360 times faster than any? and empty?, and more than 2,600 times faster than present?.

If you take into account that 200ms is considered the upper limit for an acceptable response time, you will realize that this tweak can spell the difference between a good user experience and a sluggish or bad one.
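
A plain-Ruby analogy, with no database involved, can make the gap between the three query shapes easier to see. This is only an illustration under simplifying assumptions: the `ROWS` array stands in for a table, the method names are hypothetical, and the row counts merely approximate the work done. Real query cost depends on indexes and the query planner.

```ruby
# 100,000 fake rows; every tenth one has passed.
ROWS = Array.new(100_000) { |i| { id: i, result: (i % 10).zero? ? "passed" : "failed" } }

# Like present?: materialize every matching row, then test for emptiness.
def exists_by_loading(rows)
  loaded = rows.select { |r| r[:result] == "passed" }
  [!loaded.empty?, loaded.size] # answer, rows held in memory
end

# Like COUNT(*): nothing is kept, but every row is still examined.
def exists_by_counting(rows)
  examined = 0
  count = rows.count { |r| examined += 1; r[:result] == "passed" }
  [count > 0, examined] # answer, rows examined
end

# Like SELECT 1 ... LIMIT 1: stop at the first match.
def exists_by_first_match(rows)
  examined = 0
  found = rows.any? { |r| examined += 1; r[:result] == "passed" }
  [found, examined] # answer, rows examined
end

p exists_by_loading(ROWS)     # prints [true, 10000]
p exists_by_counting(ROWS)    # prints [true, 100000]
p exists_by_first_match(ROWS) # prints [true, 1]
```

All three return the same answer; they differ only in how much work they do to get it.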

Should I always use exists?

I consider exists? a good sane default that usually has the best performance footprint. However, there are some exceptions.

For example, if we are checking for the existence of an association record without any scope, any? and empty? will also produce a very optimized query that uses the SELECT 1 FROM ... LIMIT 1 form, but any? will not hit the database again if the records are already loaded into memory.

This makes any? faster by one whole database call when the records are already loaded into memory:

project = Project.find_by_name("semaphore")

project.builds.load    # eager loads all the builds into the association cache

project.builds.any?    # no database hit
project.builds.exists? # hits the database

# if we bust the association cache
project.builds(true).any?    # hits the database
project.builds(true).exists? # hits the database
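
The difference can also be modeled without Rails. The class below is a toy stand-in, not Active Record's implementation: `FakeAssociation` and its query counter are hypothetical and only mimic the behavior described above, where any? answers from the cache once the records are loaded and exists? always queries.

```ruby
# Toy stand-in for an association that counts simulated database calls.
class FakeAssociation
  attr_reader :queries

  def initialize(table)
    @table = table # pretend this lives in the database
    @loaded = nil  # association cache, empty until load is called
    @queries = 0
  end

  # Simulates loading all records into the association cache.
  def load
    @queries += 1
    @loaded = @table.dup
    self
  end

  # Answers from the cache when loaded, otherwise falls back to a query.
  def any?
    return !@loaded.empty? if @loaded

    exists?
  end

  # Always asks the simulated database.
  def exists?
    @queries += 1
    !@table.empty?
  end
end

builds = FakeAssociation.new([:build_1, :build_2])
builds.load    # 1 query
builds.any?    # still 1 query, answered from the cache
builds.exists? # 2 queries
puts builds.queries # prints 2
```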

In conclusion, my general advice is to default to exists? and improve the code based on metrics.

At Semaphore, we’re all about speed. We’re on a mission to make continuous integration fast and easy. Driven by numerous conversations with our customers and our own experiences, we’ve built a new CI feature that can automatically parallelize any test suite and cut its runtime to just a few minutes - Semaphore Boosters. Learn more and try it out.

Celebrate Continuous Delivery in Slack with New Semaphore Notifications

By popular demand, we’re happy to announce that we’ve made some tweaks to Slack notifications coming from Semaphore.

Here’s what our build and deployment notifications now look like:

We experimented with different approaches, ranging from minimal to multi-line, rich format messages, and used them for our projects for several weeks. In the end, we settled on a minimal, yet informative single-line format that includes all the key information (what, where) and links to relevant resources.

Deployment is everyone’s success

If you’re using Slack, and you haven’t had the chance to set up your Semaphore notifications, now is a great time to do that. All it takes is a few clicks, and it can help keep everyone in the team informed about what everyone else is working on. Learn how to set up Semaphore Slack notifications here.

We recommend setting up deployment notifications for production to a common Slack channel, such as #engineering or #general. Regardless of whether you ship weekly or continuously, every time you deliver something to your users, it’s a moment of accomplishment that’s worth sharing with the entire team.

Our internal settings are currently set to send notifications when the build status of the master branch changes (when it fails and recovers) and after each deployment to production. Watch this quick video to see what our notifications look like in action:

If your team is not yet practicing continuous deployment, here’s an overview of how it works, and if you’re ready you can look into options for setting up deployment on Semaphore. Even just starting with a central manual trigger and history of deployment within your CI process is a great first step towards more complete automation.

Happy building!

Making a Mailing Microservice with Elixir and RabbitMQ

At Rendered Text, we like to decompose our applications into microservices. These days, when we have an idea, we think of its implementation as a composition of multiple small, self-sustaining services, rather than an addition to a big monolith.

In a recent Semaphore product experiment, we wanted to create a service that gathers information from multiple sources, and then emails a report that summarizes the data in a useful way to our customers. It’s a good use case for a microservice, so let’s dive into how we did it.

Our microservices stack includes Elixir, RabbitMQ for communication, and Apache Thrift for message serialization.

We designed a mailing system that consists of three parts:

  1. the main user-facing application that contains the data,
  2. a data-processing microservice, and
  3. a service for composing and sending the emails.

Asynchronous messaging with RabbitMQ

For asynchronous messaging, we decided to use RabbitMQ, a platform for sending and receiving messages.

First, we run the cron job inside our main application which gathers the data. Then, we encode that data and send it to the Elixir mailing microservice using RabbitMQ.

Every RabbitMQ communication pipeline consists of a producer and a consumer. For publishing messages from our main application, we have a Ruby producer that sends a message to a predefined channel and queue.

def self.publish(message)
  options = {
    :exchange => "exchange_name",
    :routing_key => "routing_key",
    :url => "amqp_url"
  }
  Tackle.publish(message, options)
end

In the last line above, we are using our open source tool called Tackle to publish a message. Tackle tackles the problem of processing asynchronous jobs in a reliable manner by relying on RabbitMQ. It serves as an abstraction around RabbitMQ’s API.

The consumer is a microservice written in Elixir, so we use ex-tackle, which is an Elixir port of Tackle:

defmodule MailingService.Consumer do
  use Tackle.Consumer,
    url: "amqp_url",
    exchange: "exchange_name",
    routing_key: "routing_key",
    service: "service_name",
    retry_limit: 10,
    retry_delay: 10

  def handle_message(message) do
    message
    |> Poison.decode!
    |> Mail.compose
  end
end

We connect to the specified exchange and wait for encoded messages to arrive. Options like retry_limit and retry_delay are there to allow us to specify how many times we want to retry message handling before the message is sent to the dead letter queue. The delay is there to set the timespan between each retry. This ensures the stability and reliability of our publish-subscribe system.
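
The retry behavior described above can be sketched in plain Ruby. This is not how Tackle or ex-tackle is implemented: `process_with_retries`, the `dead_letters` array, and the injectable `sleeper` are hypothetical names used only to show the idea of a retry limit, a delay between attempts, and a dead letter queue. The sketch also assumes that the limit counts retries after the first attempt.

```ruby
# Run the handler block; on failure, retry up to retry_limit times with
# retry_delay seconds between attempts. If every attempt fails, park the
# message in dead_letters instead of losing it.
def process_with_retries(message, retry_limit:, retry_delay:, dead_letters:, sleeper: method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield message
    :ok
  rescue StandardError
    if attempts <= retry_limit
      sleeper.call(retry_delay)
      retry
    end
    dead_letters << message
    :dead_lettered
  end
end

dead_letters = []
calls = 0
result = process_with_retries("report", retry_limit: 2, retry_delay: 0, dead_letters: dead_letters) do |_msg|
  calls += 1
  raise "mail server unavailable" if calls < 3
end
puts result # prints ok, because the third attempt succeeds
```

Passing the sleeper in as an argument keeps the sketch easy to test, since a test can replace the real delay with a no-op.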

In our case, we use the decoded message to request data from the data processing microservice, and later use that response to send a message to the mailing service.

HTML template rendering in Elixir with EEx

Once our data is received and successfully decoded, we start composing the email by inserting the received data into the email templates. Similar to Ruby’s ERB and Java’s JSPs, Elixir has EEx, or Embedded Elixir. EEx allows us to embed and evaluate Elixir inside strings.

EEx offers three main ways to work with a template: evaluation, definition, and compilation. By convention, our HTML templates use the .html.eex file extension.

Our email consists of multiple sections, all of which are using different datasets. Because of this, we divided our HTML email into partials for easier composing and improved code readability.

In order to evaluate the data inside a partial, we call the eval_file function and pass the data to partials:

<div>
  <%= EEx.eval_file "data/_introduction.html.eex",
                    [data1: template_data.data1,
                     data2: template_data.data2] %>

  <%= EEx.eval_file "data/_information.html.eex",
                    [data2: template_data.data2,
                     data3: template_data.data3] %>
</div>

Once we have all partials in place, we can combine them by evaluating them inside an entry point HTML template.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <%= EEx.eval_file "data/_header.html.eex" %>
 <%= EEx.eval_file "data/_content.html.eex", [template_data: template_data] %>
</html>
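
Since EEx is similar to Ruby’s ERB, the same partial-composition idea can be sketched with ERB from Ruby’s standard library. The template strings, the `render` and `render_email` helpers, and the sample data are hypothetical. In the service described here, the templates live in .html.eex files and are evaluated with EEx.eval_file.

```ruby
require "erb"

# Hypothetical partials, kept as strings so the sketch is self-contained.
PARTIALS = {
  "_introduction" => "<h1>Hello, <%= name %></h1>",
  "_information" => "<p><%= passed %> of <%= total %> builds passed.</p>"
}

# Render one partial, exposing each key of `locals` as a local variable.
def render(partial, locals)
  ERB.new(PARTIALS.fetch(partial)).result_with_hash(locals)
end

# Entry-point template: each partial receives only the data it needs.
LAYOUT = <<~HTML
  <div>
    <%= render("_introduction", name: data[:name]) %>
    <%= render("_information", passed: data[:passed], total: data[:total]) %>
  </div>
HTML

def render_email(data)
  ERB.new(LAYOUT).result_with_hash(data: data)
end

puts render_email(name: "Ada", passed: 9, total: 10)
```

As in the EEx version, the entry-point template stays short, and each section of the email can be changed without touching the others.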

Sending email with the SparkPost API

For email delivery, we rely on SparkPost. It provides email delivery services for apps, along with useful email analytics. In our case, we used the Elixir SparkPost API client, wrapped in a Mailer module that makes sending email straightforward.

defmodule MailingService.Mailer do
  alias SparkPost.{Content, Recipient, Transmission}

  @return_path "semaphore+notifications@renderedtext.com"

  def send_message(recipient_email, content) do
    Transmission.send(%Transmission{
                      recipients: [ recipient_email ],
                      return_path: @return_path,
                      content: content,
                      campaign_id: "Campaign Name"
                    })
  end
end

Once we’ve defined this module, we can easily use it anywhere as long as we pass the correct data structure. For example, we have a function that creates a data structure for the email template and passes it to the send_message function with desired recipients.

def compose_and_deliver(data1, data2) do
  mail = %Content.Inline{
    subject: "[Semaphore] #{TimeFormatter.today} Mail subject",
    from: "Semaphore <semaphore+notifications@renderedtext.com>",
    text: template_data(data1, data2) |> text,
    html: template_data(data1, data2) |> html
  }
  @mailer.send_message(data2["email"], mail)
end

SparkPost also enables us to send to multiple recipients at the same time, as well as send both HTML and plain text versions of an email. Bear in mind that you also need to provide .txt templates in order to send a plain text email.

As a final step in this iteration, we have developed a preview email that service owners receive a few hours before the production reports go out to customers.

Wrapping up

We now have a mailing microservice, made using Elixir, SparkPost, and RabbitMQ. Combining these three has allowed us to create a microservice that takes less than 4 seconds to gather data, send it, receive it on the other end, compose the emails, and dispatch them to customers.

What is Proper Continuous Integration?

Standard continuous integration time

Continuous integration (CI) is confusing. As with all ideas, everybody does their own version of it in practice.

CI is a solution to the problems we face while writing, testing and delivering software to end users. Its core promise is reliability.

A prerequisite for continuous integration is having an automated test suite. This is not a light requirement. Learning to write automated tests and mastering test-driven development takes years of practice. And yet, in a growing app, the tests we’ve developed can become an impediment to our productivity.

Are We Doing CI?

Let’s take two development teams, both writing tests, as an example. The first one’s CI build runs for about 3 minutes. The second team clocks at 45 minutes. They both use a CI server or a hosted CI service like Semaphore that runs tests on feature branches. They both release reliable software in predictable cycles. But are they both doing proper continuous integration?

Martin Fowler recently shared a description of an informal CI certification process performed by Jez Humble:

He usually begins the certification process by asking his [conference] audience to raise their hands if they do Continuous Integration. Usually most of the audience raise their hands.

He then asks them to keep their hands up if everyone on their team commits and pushes to a shared mainline (usually shared master in git) at least daily.

Over half the hands go down.

He then asks them to keep their hands up if each such commit causes an automated build and test. Half the remaining hands are lowered.

Finally he asks if, when the build fails, it’s usually back to green within ten minutes.

With that last question only a few hands remain. Those are the people who pass his certification test.

Software Development or a Sword Fight?

If a CI build takes long enough for us to have time to go practice swordsmanship while we wait, we approach our work defensively. We tend to keep branches on the local computer longer, and thus every developer’s code is in a significantly different state. Merges are rarer, and they become big and risky events. Refactoring becomes hard to do on the scale that the system needs to stay healthy.

With a slow build, every “git push” sends us to Limbo. We either wait, or look for something else to do to avoid being completely idle. And if we context-switch to something else, we know that we’ll need to switch back again when the build is finished. The catch is that every task switch in programming is hard and it sucks up our energy.

The point of continuous in continuous integration is speed. Speed drives high productivity: we want feedback as soon as possible. Fast feedback loops keep us in a state of flow, which is the source of our happiness at work.

So, it’s helpful to establish criteria for what proper continuous integration really means and how it’s done.

The 10 Minutes Test

It’s simple: does it take you less than 10 minutes from pushing new code to getting results? If so, congratulations. Your team is equipped for high performance. If not, your workflow only has elements of a CI process, for lack of a better term. This slowness breeds bad habits and hurts the productivity of every developer on the team, which ultimately inhibits the performance of the company as a whole.

Nobody sets out to build an unproductive delivery pipeline. Yet, we’re busy enough writing code until we feel like a boiling frog — we don’t notice the change until we accept it as the way things just are. Of course our build takes long, we have over 10,000 lines of code!

The Light at the End of the Tunnel

But things don’t have to be this way. Regardless of how big your test suite is, parallelizing tests can cut waiting time down to just a couple of minutes or less. A fast hosted CI service that allows you to easily configure jobs to run in parallel and run as many jobs as you need can be a good solution. By parallelizing tests, you’ll reduce the time you spend deciding what to do while you wait, and keep your team in a state of flow.

We are building Semaphore, which has been proven to be the fastest hosted CI service on the market. We’re on a mission to make CI fast and easy. Driven by numerous conversations with our customers and our own experiences, we’ve built a new CI feature that can automatically parallelize any test suite and cut its runtime to just a few minutes - Semaphore Boosters. Learn more and try it out.

Moving Platform Updates to Changelog

Quick heads up that we’ll be publishing future platform updates on the Semaphore changelog.

This will help us focus the blog on important product updates and helpful engineering posts, while the changelog will contain a more complete timeline of evolutionary changes that we’re making to the product.

We’ll be tweeting every new post from the changelog, and you can subscribe to the changelog’s RSS feed here.

Migrating from Snap CI to Semaphore

In the last week, news has spread throughout the CI/CD community about the announced shutdown of Snap CI, one of our highly-respected competitors. If you’re a Snap CI user looking for a new hosted CI/CD solution, here’s a comparison of Semaphore’s and Snap CI’s offering to help you decide if Semaphore is the right choice for you.

What makes Semaphore a good choice?

Everything about Semaphore is engineered for speed and functionality. Our user experience has been praised as simple and straightforward. Setting up a build and deploy pipeline can be done in a couple of minutes, and the performance of our machines promises as little time spent waiting for CI as possible. Scaling up is effortless, offering easy parallelization of your builds.

With all of this included, CI/CD via Semaphore should take up a minimum of your time, enabling you to use CI/CD the way it is meant to be used.

What makes Semaphore similar to Snap CI?

Semaphore tracks your GitHub or Bitbucket projects, running automatic builds when pushes or merges come, and keeps the history of builds and deploys of your projects. Builds and deploys themselves are described primarily through plain Bash, and can be edited through our UI.

Semaphore also matches Snap CI when it comes to the support of major programming languages. It also provides native Docker support, and offers deployment integrations which include Heroku, Capistrano, Cloud 66, as well as AWS Elastic Beanstalk, S3 and Lambda.

What makes it different?

The main difference between Semaphore and Snap CI is the implementation of pipelines. Semaphore’s build/deploy pipeline is split into a build stage and a deploy stage. Builds are run after each push or merge, and deploys can then be run either automatically or manually. Artifacts are not shared between these two stages.

One more notable difference is debugging. Instead of Snap’s snap-shell, Semaphore offers you the ability to directly SSH into the build servers, giving you the full power to debug your builds and deploys.

How to migrate?

Migration from Snap CI to Semaphore should be as simple as signing up, adding your projects to Semaphore, copying your build commands, and setting up our build/deploy model to best match your Snap CI pipelines.

For more details on pricing, check out our pricing page.

If you need any help in this process, or any additional information, feel free to contact us, and we will be more than happy to help you start building again.

Platform Update on January 24th

The upcoming platform update is scheduled for January 24th, 2017.

ChromeDriver has been updated to version 2.27.

Erlang receives an update with version 19.2.

Firefox ESR has been updated to version 45.6.0.

Git receives an update with version 2.11.

Go gets two version updates with 1.6.4 and 1.7.4.

Java 7 has been updated with version 7u121.

MongoDB receives an upgrade to version 3.4.1. This is a major update, including some breaking changes. Refer to the MongoDB compatibility pages for versions 3.0 and 3.4 for more details. A downgrade script is available here, if you wish to revert to the previous, 2.6.12 version.

MySQL gets an update with version 5.6.35.

Node.js has been updated with versions 0.12.18, 4.7.2, and 6.9.4.

PHP gets two updates with versions 5.6.29 and 7.0.14.

RabbitMQ has been updated to version 3.6.6.

New things

The following additions will be available after switching to the release candidate platform.

Elixir 1.4.0 has been added to the platform.

Python 3.6 is now part of the platform.

Scala 2.12.1 is included in the platform.

Node.js 7.4.0 has been added to the platform.

Changes in the final release

The Docker-enabled platform gets two updates with docker-engine version 1.12.6 and docker-compose version 1.9.0.

Java 8 gets an update with version 8u121.

A full list of changes is available in the platform changelog.

Keeping Logs Available When Rebuilding a Build

Flaky tests can be hard to track down. When you are working on a new feature, your builds can fail because of one or a few unrelated flaky tests. In this case a rebuild can help to make your build pass.

Until now, a rebuild of a failed build replaced that build. As a result, both the logs and the exact reason behind the failure became inaccessible for further inspection.

Rebuild button now creates new builds

To make debugging easier, we decided to change the behaviour of the Rebuild button on Semaphore to ensure that developers still have access to information on failed builds after a rebuild.

Instead of deleting the failed build and starting the build process from scratch, Semaphore now creates a new build with the same commits without affecting the original build. So, if you need to rebuild a revision because of a flaky test, you will now still have access to logs on previous failures, which you can use for later debugging.

With the old rebuild flow, a failed build:

Failed Build

was replaced with a new build:

Old Rebuild Flow

In the new rebuild flow, both the failed build and the new one are accessible:

New Rebuild Flow

The new rebuild flow ensures that all your builds are preserved and accessible if necessary. We hope that this change will make debugging easier for you and your team.

Happy building!

Ruby 2.4.0 Available In a Minor Platform Update

We just released an incremental platform update, versioned 1611.1, adding several new language versions to the platform.

New things

Node.js versions 7.3.0 and 4.7.0 are now part of the platform.

PHP 7.1.0 is now included in the platform, adding several new features to the language.

Ruby 2.4.0 is now part of the platform. This long awaited update brings a lot of new features and performance improvements.

A full list of changes is available in the platform changelog.
