20 Oct 2022 · Software Engineering

    The Secret Art of Storytelling in Programming


    It’s quite common for developers to spend more time reading code than writing it. Therefore, code readability significantly impacts the efficiency of a development team. Indeed, when code is easy to read, it takes developers less time to understand the intent. This means that they can spend more time refactoring and writing new code. Moreover, when code flows like a story, developers have more fun at work. Code is not only a means for machines to execute programs, but also a means of communication between team members. There is a provocative quote about the importance of code readability in Structure and Interpretation of Computer Programs:

    “Programs must be written for people to read, and only incidentally for machines to execute” (SICP)

    In this article, I present a concept from psychology called brain spans. I use brain spans to explain what makes traditional code hard to read. Finally, I suggest simple guidelines to write code that is easy to read and flows like a story.

    Brain spans

    Some tasks are harder for the human brain to perform than others. Psychologists have measured the average ability of the human brain in various capacities, and have come up with a measurement of sorts that they call spans. The term span comes from an ancient unit of measure that corresponds to the breadth of a splayed human hand, from the tip of the thumb to the tip of the little finger (approximately 9 inches or 23 cm).

    Many tools that human beings have developed over the years take the span into account. For instance, the length of many handles (doors, pans, forks, knives, etc.) is approximately 20 cm. If the tools were longer than the span, it would be very inconvenient for us to use them. Imagine trying to use a spoon with a 50cm handle. You could do it, but it would take some concentration.

    So it is with the brain as well. While the brain can certainly operate in difficult conditions, it doesn’t like to, and doing so takes more energy that we could use for other things.

    The thesis of this article is that when we write code that respects our brain spans, it is easy to read.

    Code that respects brain spans is easy to read

    In the context of programming, three brain spans are relevant:

    1. Memory span.
    2. Attention span.
    3. Structure span.

    Memory Span

    The memory span relates to the limitations of our short-term memory. It consists of the longest list of items that a person can repeat back in the correct order from memory. The average is roughly seven. As you can see for yourself, trying to remember more than 7 items requires a lot of effort.

    Attention Span

    The attention span relates to the amount of time one can easily stay focused on a task before becoming distracted.

    When we perform a task, like learning for an exam, our level of attention varies:

    1. Zone of Primacy: our attention level is high (first 20 minutes)
    2. Zone of Huh: our attention level is low (unlimited)
    3. Zone of Recency: our attention level is high again (last 5 minutes)

    The Pomodoro Technique takes our attention span into consideration and advises breaking tasks into work intervals of 25 minutes and break intervals of 5 minutes. According to the Pomodoro Technique, it is easier to stay focused for eight 30-minute sessions than to stay focused for 4 hours straight.

    The term Pomodoro refers to a kitchen timer that looks like a tomato (Pomodoro means tomato in Italian).

    Structure Span

    The structure span (a term I invented) relates to how information usually flows within an organization. We tend to structure information in such a way that there is a clear distinction between the what and the how.

    Imagine that the CEO of a company needs to increase the company’s top line.

    Here’s how the information could flow from the CEO down to the sales people:

    1. The CEO tells the VP of sales they have to increase sales. (What)
    2. Then the VP of sales thinks about how to increase sales. They decide that what makes the most sense is to increase sales in Europe. (How)
    3. The VP of sales tells the Europe sales manager that they have to increase sales in their region. (What)
    4. Then the Europe sales manager decides to: (How)
      1. Tell Bob to call more customers each day.
      2. Ask Bryan to revise his sales playbook.
      3. Suggest that Bruce be more aggressive in his sales pitch.
      4. Hire a new salesperson.

    It doesn’t make sense for the CEO to deal with the details of the revision of the sales playbook! There is a clear distinction between the what and the how. This is how our minds tend to structure information.

    In the next section, we will see how the brain spans help us to understand the brain burden involved in reading traditional code.

    Brain burden when reading code

    We are going to explain the brain burden involved in reading traditional code. We will use code from a JavaScript function that handles a search query over a book catalog. The function is called handleSearchQuery and it receives 3 arguments:

    • catalog: a map with the following fields:
      • books: an array of books
      • authors: a map with author IDs as keys and authors as values
    • query: a query string
    • options: a map with the following fields:
      • query: a map with wholeWord set to either true or false
      • format: a map with the following fields:
        • fields: an array of field names to include in the book results
        • sort: a map with the following fields:
          • fields: the field names to use for sorting the book results
          • order: either “desc” (for sorting in descending order) or “asc” (for sorting in ascending order)

    Here’s an example of a catalog with two books:

    var catalog = {
        books: [
            {
                title: "Watchmen",
                publicationYear: 1987,
                authorID: "alan-moore",
            },
            {
                title: "Water for Elephants",
                publicationYear: 2011,
                authorID: "sara-gruen"
            }
        ],
        authors: {
            "alan-moore": {
                firstName: "Alan",
                lastName: "Moore",
            },
            "sara-gruen": {
                firstName: "Sara",
                lastName: "Gruen"
            }
        }
    };

    Here’s an example of an options map:

    var options = {
        query: { wholeWord: false },
        format: {
            fields: ["title", "author"],
            sort: {
                fields: ["publicationYear"],
                order: "desc"
            }
        }
    };

    And an example of how to use handleSearchQuery:

    handleSearchQuery(catalog, "Wat", options);

    It returns the following book results:

    [
        {
            "author": "Sara Gruen",
            "title": "Water for Elephants"
        },
        {
            "author": "Alan Moore",
            "title": "Watchmen"
        }
    ]

    A traditional implementation of handleSearchQuery() using the JavaScript Lodash utility library could look like this:

    import _ from 'lodash';
    
    function handleSearchQuery(catalog, query, options) {
        // Finding books that match the query
        var prefixRegExp = new RegExp("\\b" + query + "\\b");
        var books = _.filter(catalog.books, function (book) {
            if (options.query.wholeWord) {
                return book.title.match(prefixRegExp) != null;
            }
            return book.title.includes(query);
        });
        // Enriching books
        var enrichedBooks = _.map(books, function (book) {
            var author = catalog.authors[book.authorID];
            return _.set(book, "author",
                _.capitalize(author.firstName) +
                " " + _.capitalize(author.lastName));
        });
        // Sorting books according to options
        var sortingOptions = options.format.sort;
        var sortedBooks = _.sortBy(enrichedBooks, sortingOptions.fields);
        if (sortingOptions.order == "desc") {
            sortedBooks = _.reverse(sortedBooks);
        }
        // Selecting fields 
        var fields = options.format.fields;
        var formattedBooks = _.map(sortedBooks, function (book) {
            var book = _.pick(book, fields);
            return _.omitBy(book, _.isUndefined);
        });
        return formattedBooks;
    }

    In the following sections, I will refer to our three brain spans to explain what makes this type of code hard to read. After that, I will suggest simple guidelines for writing code that is easier to read. The main assumption is that when we write code that respects our brain spans, it is easy to read.

    Long functions

    When a function has too many lines of code, it’s a challenge both for our attention span and our memory span. I haven’t found any research on the topic, but I assume that the limit is around 7 lines of code. The handleSearchQuery() function has 30 lines of code, and that’s too much!

    Long functions challenge our attention span: we have to keep our attention focused for a long period. The brain doesn’t like this.

    Long functions also challenge our memory span: we have to keep a large amount of information in our short-term memory. The brain doesn’t like this either.

    Mixing code and comments

    There are two kinds of comments that programmers write in their code:

    1. Comments that describe what the code does.
    2. Comments that explain why the code is written in a particular way.

    Comments of the first type should be avoided. We need them only when the code doesn’t describe itself. This sort of comment disturbs the reading flow: it forces the reader to switch between reading code and reading plain English.

    We cannot avoid comments that explain why the code is written in a particular way. Those comments are necessary and make the reading experience smoother.

    The comments in the body of handleSearchQuery() are of the first type: they hurt code readability.
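
    To make the distinction concrete, here is a minimal sketch that shows both kinds of comment side by side. The escapeRegExp() helper, and the idea of escaping the query at all, are my own illustration and not part of the implementation above; the point is only the difference between the two comments.

```javascript
// Hypothetical helper, introduced only for this illustration.
function escapeRegExp(text) {
    // "Why" comment: the query comes from the user, so characters such as
    // "+" or "(" must be treated literally rather than as regular expression
    // syntax. Without this step, a query like "C++" throws a SyntaxError.
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function includesWholeWord(sentence, word) {
    // "What" comment: build a regular expression and test the sentence.
    // (It only repeats what the next two lines already say.)
    var wholeWordRegExp = new RegExp("\\b" + escapeRegExp(word) + "\\b");
    return wholeWordRegExp.test(sentence);
}
```

    The first comment tells the reader something the code cannot say by itself. The second one could be deleted without any loss, because the function name and the variable name already carry the same information.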

    Mixing what and how

    Let’s take a look at the first part of handleSearchQuery(), where we implement the book matching. We have to deal with two cases:

    1. Whole word match
    2. Partial match

    We implement whole-word matching using the regular expression word boundary \b, and partial matching via the .includes() string method.

    We use _.filter() to retrieve the books that match the condition.

    // Finding books that match the query
    var prefixRegExp = new RegExp("\\b" + query + "\\b");
    var books = _.filter(catalog.books, function (book) {
        if (options.query.wholeWord) {
            return book.title.match(prefixRegExp) != null;
        }
        return book.title.includes(query);
    });

    The problem with this code is that it mixes low-level concerns like regular expressions and string matching with high-level concerns like filtering the elements of an array. There is no clear separation in the code between low-level and high-level concerns. When a reader wants to understand what the code does, they are forced to go through all the implementation details. This type of code doesn’t respect our structure span.

    Unclear Code Flow

    The flow of the code in handleSearchQuery is quite simple. There are 4 steps:

    1. Search: find the books that match the query.
    2. Enrich: enrich the books with author information.
    3. Sort: sort the books according to options.
    4. Project: keep the fields according to options.

    The problem is that the only way for a reader to see this flow is to read all of the code. In other words, the flow of the code is not explicitly reflected in its structure. This kind of code doesn’t respect the structure span.

    Unclear Scope

    Let’s take a look at the part of handleSearchQuery that implements book sorting:

    // Sorting books according to options
    var sortingOptions = options.format.sort;
    var sortedBooks = _.sortBy(enrichedBooks, sortingOptions.fields);
    if (sortingOptions.order == "desc") {
        sortedBooks = _.reverse(sortedBooks);
    }

    A nice property of this code is that it is well-scoped.

    • The code’s behavior is only influenced by enrichedBooks and options.format.sort
    • The code’s impact is limited to storing a result into sortedBooks

    Unfortunately, the structure of the code doesn’t reflect the well-scoped property of the code. The only way for a reader to discover the scope of this part of the code is to read all of the code. Again, the structure span is not respected.

    After having explored what makes code hard to read, let’s see what it takes to make code easy to read.

    Guidelines for writing code that flows like a story

    It appears that it’s quite simple to refactor our code so that it respects our brain spans. We need to follow 3 simple guidelines:

    1. Keep your functions small.
    2. Use a single level of abstraction.
    3. Give descriptive names to your functions.

    Let’s illustrate each of these guidelines by refactoring the code of handleSearchQuery().

    Keep your functions small

    We consider a function to be small when it has fewer than 7 lines of code. To refactor handleSearchQuery into a small function, we split the code into helper functions, like this:

    function searchBooks(books, query, queryOptions) { ... }
    function enrichBooks(books, authors) { ... }
    function sortBooks(books, sortingOptions) { ... }
    function selectBookFields(books, fields) { ... }
    
    function handleSearchQuery(catalog, query, options) {
        var books = searchBooks(catalog.books, query, options.query);
        var enrichedBooks = enrichBooks(books, catalog.authors);
        var sortedBooks = sortBooks(enrichedBooks, options.format.sort);
        return selectBookFields(sortedBooks, options.format.fields);
    }

    When a function is small, both the attention span and the memory span of the reader are respected. A reader can easily read the code of the function and focus their attention on what interests them.

    Use a single level of abstraction

    In the original implementation of the book matching code, low-level concerns (like string matching) are mixed with high-level concerns (like array filtering). We can improve the readability of the code by wrapping low-level concerns with utility functions.

    For instance, we create includesWholeWord() that wraps the regular expression matching and includesPartialWord() that wraps the string matching:

    function includesWholeWord(sentence, word) {
        var wholeWordRegExp = new RegExp("\\b" + word + "\\b");
        return sentence.match(wholeWordRegExp) != null;
    }
    
    function includesPartialWord(sentence, word) {
        return sentence.includes(word);
    }

    Then, we create a matchQuery() function that calls either includesWholeWord() or includesPartialWord() according to the query options:

    function matchQuery(book, query, queryOptions) {
        if (queryOptions.wholeWord) {
            return includesWholeWord(book.title, query);
        }
        return includesPartialWord(book.title, query);
    }

    Finally, we implement searchBooks() by filtering books for which matchQuery() returns true:

    function searchBooks(books, query, queryOptions) {
        return _.filter(books, function (book) {
            return matchQuery(book, query, queryOptions);
        });
    }

    Reading the code of searchBooks is now easier, not only because it is short but also because all the pieces of the code belong to the same level of abstraction within each function. This respects the structure span. 
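
    As a quick check of how these pieces behave together, here is a standalone sketch. It assumes plain Array.prototype.filter in place of _.filter so that it runs without Lodash, and the two-book array is made up for the example.

```javascript
function includesWholeWord(sentence, word) {
    var wholeWordRegExp = new RegExp("\\b" + word + "\\b");
    return wholeWordRegExp.test(sentence);
}

function includesPartialWord(sentence, word) {
    return sentence.includes(word);
}

function matchQuery(book, query, queryOptions) {
    if (queryOptions.wholeWord) {
        return includesWholeWord(book.title, query);
    }
    return includesPartialWord(book.title, query);
}

// Same shape as searchBooks() above, with the built-in filter
// standing in for _.filter (an assumption made for this sketch).
function searchBooks(books, query, queryOptions) {
    return books.filter(function (book) {
        return matchQuery(book, query, queryOptions);
    });
}

function titles(books) {
    return books.map(function (book) {
        return book.title;
    });
}

var sampleBooks = [
    { title: "Watchmen" },
    { title: "Water for Elephants" }
];

// Partial match: "Wat" appears in both titles.
console.log(titles(searchBooks(sampleBooks, "Wat", { wholeWord: false })));
// → [ 'Watchmen', 'Water for Elephants' ]

// Whole-word match: only one title contains "Water" as a complete word.
console.log(titles(searchBooks(sampleBooks, "Water", { wholeWord: true })));
// → [ 'Water for Elephants' ]
```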

    Give descriptive names to your functions

    The simplest way to avoid comments that describe what a piece of code does is to give descriptive names to your functions. Here is how we can use descriptive names in the code of sortBooks:

    function shouldReverse(sortingOptions) {
        return sortingOptions.order == "desc";
    }
    
    function maybeReverse(items, sortingOptions) {
        if (shouldReverse(sortingOptions)) {
            return _.reverse(items);
        }
        return items;
    }
    
    function sortByFields(items, sortingOptions) {
        return _.sortBy(items, sortingOptions.fields);
    }
    
    function sortBooks(books, sortingOptions) {
        var sortedBooks = sortByFields(books, sortingOptions);
        return maybeReverse(sortedBooks, sortingOptions);
    }

    When you give descriptive names to your functions, the code describes itself. As a consequence, the writer doesn’t have to include descriptive comments and the reader doesn’t have to read the code of a function to understand what the code does. For instance, it is clear from its name that maybeReverse() maybe reverses the book results.

    Good function names give the reader information about what the function does. The reader needs to read the code only when they want to discover how the function does what its name says it does (this is how the structure span likes things to be).
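
    To see the sorting helpers in action, here is a standalone sketch. It is an approximation under stated assumptions: compareByFields() is a hypothetical helper that stands in for _.sortBy, and slice() is used to copy arrays so that the caller's data is left untouched.

```javascript
// Hypothetical comparator standing in for Lodash's _.sortBy:
// compares two items field by field, in ascending order.
function compareByFields(a, b, fields) {
    for (var i = 0; i < fields.length; i++) {
        var field = fields[i];
        if (a[field] < b[field]) return -1;
        if (a[field] > b[field]) return 1;
    }
    return 0;
}

function sortByFields(items, sortingOptions) {
    // slice() copies the array, so the input is not modified.
    return items.slice().sort(function (a, b) {
        return compareByFields(a, b, sortingOptions.fields);
    });
}

function shouldReverse(sortingOptions) {
    return sortingOptions.order === "desc";
}

function maybeReverse(items, sortingOptions) {
    if (shouldReverse(sortingOptions)) {
        return items.slice().reverse();
    }
    return items;
}

function sortBooks(books, sortingOptions) {
    var sortedBooks = sortByFields(books, sortingOptions);
    return maybeReverse(sortedBooks, sortingOptions);
}

var unsortedBooks = [
    { title: "Watchmen", publicationYear: 1987 },
    { title: "Water for Elephants", publicationYear: 2011 }
];

var newestFirst = sortBooks(unsortedBooks, {
    fields: ["publicationYear"],
    order: "desc"
});

console.log(newestFirst.map(function (book) { return book.title; }));
// → [ 'Water for Elephants', 'Watchmen' ]
```

    Each function can be understood from its name and its arguments alone, which is the property the guideline is after.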

    When we apply the guidelines to all the parts of handleSearchQuery(), we get code that looks like this:

    function includesWholeWord(sentence, word) {
        var wholeWordRegExp = new RegExp("\\b" + word + "\\b");
        return sentence.match(wholeWordRegExp) != null;
    }
    
    function includesPartialWord(sentence, word) {
        return sentence.includes(word);
    }
    
    function matchQuery(book, query, queryOptions) {
        if (queryOptions.wholeWord) {
            return includesWholeWord(book.title, query);
        }
        return includesPartialWord(book.title, query);
    }
    
    function searchBooks(books, query, queryOptions) {
        return _.filter(books, function (book) {
            return matchQuery(book, query, queryOptions);
        });
    }
    
    function fullName(person) {
        return _.capitalize(person.firstName) + " "
            + _.capitalize(person.lastName);
    }
    
    function authorName(authors, authorID) {
        return fullName(authors[authorID]);
    }
    
    function setAuthorName(authors, book) {
        return _.set(book, "author", authorName(authors, book.authorID));
    }
    
    function enrichBooks(books, authors) {
        return _.map(books, function (book) {
            return setAuthorName(authors, book);
        });
    }
    
    function shouldReverse(sortingOptions) {
        return sortingOptions.order == "desc";
    }
    
    function maybeReverse(items, sortingOptions) {
        if (shouldReverse(sortingOptions)) {
            return _.reverse(items);
        }
        return items;
    }
    
    function sortByFields(items, sortingOptions) {
        return _.sortBy(items, sortingOptions.fields);
    }
    
    function sortBooks(books, sortingOptions) {
        var sortedBooks = sortByFields(books, sortingOptions);
        return maybeReverse(sortedBooks, sortingOptions);
    }
    
    function selectFields(book, fields) {
        var selectedFields = _.pick(book, fields);
        return _.omitBy(selectedFields, _.isUndefined);
    }
    
    function selectBookFields(books, fields) {
        return _.map(books, function (book) {
            return selectFields(book, fields);
        });
    }
    
    function formatBooks(books, formattingOptions) {
        var sortedBooks = sortBooks(books, formattingOptions.sort);
        return selectBookFields(sortedBooks, formattingOptions.fields);
    }
    
    function handleSearchQuery(catalog, query, options) {
        var books = searchBooks(catalog.books, query, options.query);
        var enrichedBooks = enrichBooks(books, catalog.authors);
        var sortedBooks = sortBooks(enrichedBooks, options.format.sort);
        return selectBookFields(sortedBooks, options.format.fields);
    }

    We can visualize the code of handleSearchQuery() as a tree of function calls, where each node is a function with the following properties:

    1. It is small.
    2. It has a descriptive name.
    3. It has a single level of abstraction.

    An additional benefit of writing code that flows like a story is that it encourages the development team to write narrowly scoped, independent unit tests. For instance, we can write independent unit tests for searchBooks(), enrichBooks(), sortBooks() and selectBookFields(). Writing a test for selectBookFields() doesn’t require the creation of a book catalog or a full option map: we only need to pass an array of books and a list of fields.
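
    Such a test could look roughly like the sketch below. It makes a few assumptions: selectFields() is the name used here for the per-book helper, plain JavaScript replaces _.pick and _.omitBy so the snippet runs by itself, and checkEqual() is a hypothetical stand-in for whatever assertion library the team uses.

```javascript
// Plain-JavaScript equivalent of picking fields and dropping undefined values.
function selectFields(book, fields) {
    var selected = {};
    fields.forEach(function (field) {
        if (book[field] !== undefined) {
            selected[field] = book[field];
        }
    });
    return selected;
}

function selectBookFields(books, fields) {
    return books.map(function (book) {
        return selectFields(book, fields);
    });
}

// Hypothetical assertion helper; a real project would use its test framework.
function checkEqual(actual, expected, message) {
    if (JSON.stringify(actual) !== JSON.stringify(expected)) {
        throw new Error(message);
    }
}

function testSelectBookFields() {
    // No catalog and no full options map: an array of books
    // and a list of fields are all the function needs.
    var books = [
        { title: "Watchmen", publicationYear: 1987, author: "Alan Moore" },
        { title: "Water for Elephants", publicationYear: 2011 }
    ];
    var expected = [
        { title: "Watchmen", author: "Alan Moore" },
        { title: "Water for Elephants" }
    ];
    checkEqual(selectBookFields(books, ["title", "author"]), expected,
        "selectBookFields should keep only the requested fields");
    return true;
}

testSelectBookFields();
```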

    Before wrapping up this article, I would like to add a few words about moderation regarding the aforementioned guidelines.

    Moderation

    We should take care with the guidelines for writing code that is easy to read. If you use them without moderation, you might create a new set of challenges for your readers. You can, in this case, have too much of a good thing.

    Depth

    Creating too many small functions might cause a problem of function call depth. In a situation like this, the reader has to navigate through many function definitions before they find the meat of the code. This is bad for all three spans.

    Naming

    Sometimes, wrapping low-level concerns into a function to use a single level of abstraction might cause a naming problem. When you cannot come up with a clear name for one of your functions, you should ask yourself whether you need to create this function at all. Sometimes it’s better to break the “Single level of abstraction” guideline than to create functions with unclear names. The structure span likes things to be well-defined.

    Wrapping up

    In this article, we have presented three simple guidelines for writing code that is easy to read:

    1. Keep your functions small.
    2. Use a single level of abstraction.
    3. Give descriptive names to your functions.

    These guidelines come from the following hypothesis: what makes code hard to read is that it doesn’t respect our three brain spans:

    1. Memory span.
    2. Attention span.
    3. Structure span.


    When code is easier to read, developer teams are more efficient as they spend less time and energy understanding code written by their teammates. Moreover, it makes it easier to share knowledge when a new developer joins a team. Overall, it improves the quality of life of developers at work, and makes for better code!

    2 thoughts on “The Secret Art of Storytelling in Programming”

    1. it is really made me to re-check the code and way of organizing. What the tools/extensions like beautify do the same?

    2. Looks like the intent was to rename a function, but it turns out that there are two functions with the same selectBookFields name (i.e. the same name is used to define two separate functions – a bug)

    Written by:
    Yehonathan Sharvit has been working as a software engineer since 2000, programming with C++, Java, Ruby, JavaScript, Clojure, and ClojureScript. He currently works as a software architect at Cycognito, building software infrastructures for high-scale data pipelines. He is the author of Data-Oriented programming, published by Manning.
    Reviewed by:
    I picked up most of my soft/hardware troubleshooting skills in the US Army. A decade of Java development drove me to operations, scaling infrastructure to cope with the thundering herd. Engineering coach and CTO of Teleclinic.