27 Oct 2022 · Software Engineering

    Efficiency With Elixir


    The work I am doing at my current company, Vody, is quite similar in principle to the work I was doing in my previous role. Both startups build large-scale data pipelines (i.e. ETLs). But at my current gig, we are doing more with only a fraction of the developers. I credit Elixir with a large part of that efficiency. This article will examine why a small number of Elixir engineers outperforms a much larger team.

    Monoliths vs. Microservices

    There is a microservice epidemic sweeping the nation. Like OxyContin or Fentanyl, they can help alleviate certain ailments, but there can be serious side-effects. What I saw in my previous role reinforced observations gathered from other gigs: the microservice architecture can cause as many problems as it solves.

    • More repositories mean more infrastructure, more PRs, more code reviews, more CI actions, more tagging, and so on.
    • More boilerplate code, and ironically, there are more barriers to its reuse.
    • Integration tests can become practically impossible.
    • Thinking about the system as a whole becomes disincentivized.

    At this other company, “the monolith” became a whipping boy that took the fall for everything from poorly-defined business requirements to developer negligence. When microservices were presented as an escape hatch, the tough conversations about various domain problems simply did not happen. Efficiency suffered, and the code bases became another demonstration of Conway’s Law: the code and its conventions became as varied and isolated as the teams who authored it.

    By building a single Elixir app instead of a conglomerate of Node and Python microservices, my current team needs less code and has better control over its organization and quality. Our tests can make thorough and meaningful assertions about every aspect of the app's functionality. And because we can run the whole test suite locally, bugs that emerge only in production are rare. Elixir apps execute code inside isolated processes, so a crash in one process does not take down the others, and they can scale horizontally across multiple nodes. That makes them both fault tolerant and scalable. In other words, Elixir apps inherently offer some of the same benefits as microservices, but without the extra work.
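    To make the point about isolated processes concrete, here is a minimal sketch, not our production code. The module name `IsolationSketch` and the sample jobs are hypothetical; the `Task.Supervisor` calls come from Elixir's standard library. Each job runs in its own process, and a job that raises is reported as an error instead of crashing the caller or its siblings.

```elixir
defmodule IsolationSketch do
  # Hypothetical helper: runs each zero-arity function in its own
  # supervised process and collects a result per job.
  def run(jobs) do
    {:ok, sup} = Task.Supervisor.start_link()

    jobs
    # async_nolink: a crashing task does not crash the calling process.
    |> Enum.map(&Task.Supervisor.async_nolink(sup, &1))
    |> Enum.map(fn task ->
      # Wait up to 5 seconds; shut the task down if it is still running.
      case Task.yield(task, 5_000) || Task.shutdown(task) do
        {:ok, value} -> {:ok, value}
        {:exit, reason} -> {:error, reason}
        nil -> {:error, :timeout}
      end
    end)
  end
end

results =
  IsolationSketch.run([
    fn -> 1 + 1 end,
    fn -> raise "bad record" end,
    fn -> :ok end
  ])

# The second job fails on its own; the first and third still succeed.
IO.inspect(results)
```

    The failed job is logged by the runtime, and the caller gets an `{:error, reason}` tuple in its place, so the rest of the batch carries on.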

    Infrastructure: Less is More

    Serverless is one of the most popular frameworks used to wrangle microservice infrastructure. We need some “infrastructure as code” (IaC) solution, but let’s not kid ourselves: the learning curve can be steep. The irony is that I have never had to think as much about servers and infrastructure as when using Serverless. When you chop up your app and manage its complex interactions by hacking on a temperamental YAML file, you now have two problems.

    When an application is dissected into microservice components, it becomes harder and harder to perceive the system as a whole. The duplication or shirking of responsibilities can go unnoticed. A kidney here, a severed foot there — is this really part of the same creature? Who would notice if your app had two heads or was missing a spleen if its body parts were strewn across the lab in so many jars?

    The infrastructure simplifications available to an Elixir app can go beyond those available to any monolith. Instead of needing Redis, for example, we have been able to rely on ETS, the in-memory term storage built into the Erlang VM, or on Erlang’s distributed Mnesia database. Instead of Amazon SQS, our app relies on the VM’s native message passing between processes. No third-party drivers or services required! So our Elixir app requires even less code and less infrastructure than a typical monolith, and we didn’t have to get bogged down with an IaC framework. Deployment is as simple as shipping a single self-contained release.
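    As a rough illustration of both substitutions, the sketch below uses an ETS table as an in-memory cache and plain `send`/`receive` as a stand-in for a queue. The module name `CacheSketch`, the table name, and the sample keys and payloads are all hypothetical; the `:ets` functions and the messaging primitives ship with the runtime.

```elixir
defmodule CacheSketch do
  # Hypothetical table name, chosen for this example only.
  @table :cache_sketch

  # Creates a named, public table that any process in the app can use.
  def start do
    :ets.new(@table, [:set, :public, :named_table, read_concurrency: true])
    :ok
  end

  def put(key, value) do
    true = :ets.insert(@table, {key, value})
    :ok
  end

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :miss
    end
  end
end

# Cache usage: no external service or driver involved.
CacheSketch.start()
CacheSketch.put("title-123", 42)
cached = CacheSketch.get("title-123")
missing = CacheSketch.get("title-999")

# Message passing: a worker process waits for one job and replies.
parent = self()

worker =
  spawn(fn ->
    receive do
      {:job, payload} -> send(parent, {:done, String.upcase(payload)})
    end
  end)

send(worker, {:job, "title-123"})

result =
  receive do
    {:done, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect({cached, missing, result})
```

    A real pipeline would usually wrap the worker in a GenServer or a supervised task, but the underlying mechanism is the same mailbox-based messaging shown here.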

    Smaller Teams

    One of my children’s books retells a Zen koan named “The Taste of Banzo’s Sword”. It recounts a boy’s efforts to master swordsmanship by studying under the great master Banzo. The crux is that the harder and faster the boy promises to work, the more it delays the development of his skills. The reason, as Master Banzo points out, is that “one in such a hurry seldom achieves good results.” When management is so driven by speed and growth, they can end up hiring more developers than can reasonably work together. Be careful of too many cooks in the kitchen!

    Code repositories are like a clean room or a crime scene: each person in there represents a possible vector for contamination. Each developer might be adding debt or bugs to the code base. My previous company had thirty-some developers. Getting to know them took months, never mind the communication challenges that arose when trying to correct simple mistakes.

    In measurable ways, having an app that has fewer lines of code and fewer dependencies means that it requires fewer developers to maintain. In other words, the savings extend beyond the AWS bill and into payroll. Planning, debate, and course corrections all happen much faster on a smaller team. Maintaining proper patterns and conventions is easier too. Elixir is rarely a developer’s first language, so a typical Elixir developer has experience with other technologies. The result is that an Elixir team tends to be highly skilled.

    Performance and Concurrency

    When it comes to optimizing for performance, it is important to ask what exactly needs to be optimized. If you need to wrangle statistics, you might reach for R; if you need to do CPU-intensive calculations, you might reach for Rust. Elixir is quite performant for many tasks, but its real strength lies in its fault tolerance and concurrency. There is a reason why Elixir’s concurrency model is discussed specifically in Paul Butcher’s Seven Concurrency Models in Seven Weeks whereas Python and PHP are not. If you need to move lots of data through a network of computations, Elixir should be an obvious choice.

    It can be tempting to visualize the performance characteristics of different languages, but these graphical comparisons can be misleading or incomplete. For me, one of the simplest indicators of a solution’s efficiency is the monthly AWS bill. At Vody, our data ingestion infrastructure costs a tenth of what my previous company paid — think hundreds of dollars instead of thousands, to say nothing of the headcount.

    Ostensibly, at my previous company we were concerned with scraping and parsing data. But in reality, much of our time was spent cobbling together tools to manage concurrency at some kind of scale. The fact that my previous company was reinventing wheels should have been a red flag. In contrast, Elixir runs on the Erlang VM (the BEAM), which was designed from the very start to be fault tolerant and to support massive concurrency across distributed nodes. The platform was built to solve problems like this; it’s in its DNA. So when we use Elixir, we are tapping into Erlang libraries that have been in production use around the world for decades. Instead of feeling like we are pounding square pegs into round holes, the problem and the solution are well matched. We spend minimal effort getting the data so we can maximize the time we spend working with it.
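    For a sense of how little tooling the concurrency itself requires, here is a small sketch using `Task.async_stream` from the standard library. The module name `PipelineSketch`, the concurrency limit, and the squaring function that stands in for a fetch-and-parse step are assumptions made for illustration, not a description of our pipeline.

```elixir
defmodule PipelineSketch do
  # Hypothetical helper: applies `work_fun` to every item, running at
  # most 10 items at a time, each in its own process.
  def run(items, work_fun) do
    items
    |> Task.async_stream(work_fun,
      max_concurrency: 10,
      timeout: 5_000,
      # Kill an item that exceeds the timeout instead of exiting the caller.
      on_timeout: :kill_task
    )
    |> Enum.map(fn
      {:ok, value} -> value
      {:exit, reason} -> {:error, reason}
    end)
  end
end

# Stand-in workload; a real pipeline would fetch and parse here.
squares = PipelineSketch.run(1..5, fn n -> n * n end)
IO.inspect(squares)
```

    Results come back in input order by default. Note that `Task.async_stream` links its tasks to the caller, so this sketch handles timeouts but not exceptions raised inside `work_fun`; `Task.Supervisor.async_stream_nolink` is the standard-library variant for that case.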

    Conclusion

    There is no such thing as a “perfect solution” for anything with software. But I have been quite pleased with how elegantly Elixir has helped us move massive amounts of data through our pipelines. Elixir has allowed us to capitalize on efficiencies that wouldn’t be possible in many other stacks. The discussion of microservices, when it has come up, has been about enhancing the architecture rather than escaping from it. And we have been able to pursue these goals with a small team of skilled developers.

    Systems are subtle, and adapting them to solve specific problems takes time and experience. There is rarely a one-size-fits-all solution. Elixir has offered us a refreshing alternative to other options, and in the case of large-scale data scraping, it punches far above its weight.


    Written by:
    Everett is a software engineer with over twenty years of experience writing code. He has taught full-stack and data visualization bootcamps at the University of Southern California and UC Berkeley. His hobbies include marathon running and mushroom foraging.
    Reviewed by:
    I picked up most of my soft/hardware troubleshooting skills in the US Army. A decade of Java development drove me to operations, scaling infrastructure to cope with the thundering herd. Engineering coach and CTO of Teleclinic.