August 14, 2014 Posted in programming

The importance of a good-looking history

"Study the past if you would define the future." ~Confucius

Since the dawn of civilization, common sense has taught us that the way forward starts by knowing how we got here in the first place. While this powerful principle applies to practically all aspects of life, it's especially true when developing software.

For us programmers, the rear-view mirror through which we look at the history of a code base before we go on to shape its future is version control. Among all the information captured by a version control tool, the most critical piece is the commit message.

Git's view of history

When we're trying to understand how a piece of software has evolved over time, the first thing we tend to do is look at the trails of messages left by the programmers who came before us. Those sentences hold the key to understanding the choices that molded the software into what it is today.

In other words, what you write in these messages is crucial, and you should put extra effort into making them as loud and clear as possible.

This is true regardless of what version control system you happen to be using. However, it is especially true for Git. Why? Because Git simply holds the history of your code to a higher standard.

As Linus Torvalds explained in his excellent Tech Talk at Google back in 2007, Git evolved out of the need to manage the development of the Linux kernel, a humongous open source project with a 20-year history and hundreds of contributors from all around the world.

If there's one project where source code history has played a critical role, it's the Linux kernel.

Torvalds' attention to history is also reflected in the features he built into his own distributed version control tool. To put it in his own words:

I want clean history, but that really means (a) clean and (b) history.

Regarding the "clean" part, he goes on to elaborate:

Keep your own history readable.

Some people do this by just working things out in their head first, and not making mistakes. But that's very rare, and for the rest of us, we use "git rebase" etc. while we work on our problems.

Don't expose your crap.

When it comes to "history", he says:

People can (and probably should) rebase their private trees (their own work). That's a cleanup. But never other people's code. That's a "destroy history"

You see, Git grants you all the tools you need to go back in time and rewrite your own commits (for example, by changing their order, contents and messages) because having a clear history of the code matters. It matters to the sanity of whoever is working on it, present or future.

A legacy of e-mails

Having talked about the importance of keeping your history clean, let's take the concept one step further.

When you use Git, you should pay attention not only to the contents of your commit messages, but also to how they're formatted.

There's a reason for that. As Torvalds himself stated in his Google talk, for a long period of time the history of the Linux kernel was captured in e-mail threads with patches attached:

"For the first 10 years of kernel maintenance we literally used tarballs and patches." ~Linus Torvalds

Even in the early days of Git, e-mail was still used as a way to send patches among collaborators of the Linux project.

If you look closely, you'll notice that the concept of e-mail is pretty pervasive throughout Git. Here's some evidence off the top of my head, with a few of the commands demonstrated right after the list:

  • Every user has to have an e-mail address, which is always part of a commit's metadata
  • The git format-patch and git am commands are specifically designed to convert commits into e-mails with patches as attachments, and to apply such e-mails back as commits
  • Both git blame and git shortlog have special options to display the committers' e-mail addresses instead of their names
  • The git log command has dedicated placeholders to indicate a commit message's subject and body
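
A few of these commands in action (the file and patch names here are purely illustrative):

git shortlog -se              # commit counts per author, with e-mail addresses
git blame -e README.md        # annotate each line with the author's e-mail address
git format-patch -1 HEAD      # turn the latest commit into an e-mail-formatted patch
git am 0001-example.patch     # apply an e-mail-formatted patch as a commit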

The last item on that list is particularly interesting. Git seems to assume that a commit message is divided into two parts:

  1. A short one-sentence summary
  2. An optional longer description, set apart in its own paragraph and separated from the summary by an empty line

A "well-formed" Git commit message would then look like this:

A short summary, possibly under 50 characters.

A longer description of the change and the reasoning
behind it for the future generations to know.
Even better if it's wrapped at 72 characters so that
it will look good in the console.

If you follow this simple convention, Git will reward you by going out of its way to show you your history in the prettiest way possible. And that's a good thing.

Formatting matters

Once you fall into the habit of keeping your commit messages under 50 characters and relegating any longer description to a separate paragraph, you can start pretty-printing your history in almost any way you like.

For example, you could choose to only display the commits' summaries by using the %s placeholder in the --format option of git log:

Simple example of pretty-printing the commit history
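
In its simplest form, that's a one-liner:

git log --format="%s"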

Or you could go crazy with all kinds of colors and indentation:

Gorgeous-looking commit history

The format string I used in this particular example can be broken down as:

"%C(cyan)%s%Creset %C(dim white)(%ar)%Creset%n%w(72,4,4)%b"

where:

  • %C(cyan) colors the following text in cyan
  • %s shows the commit summary
  • %Creset restores the default color for the text
  • %C(dim white) colors the following text in grey
  • %ar shows the time of the commit relative to now
  • %n adds a newline character
  • %w(72,4,4) wraps the following text at 72 characters, indenting both the first line and the remaining ones with 4 spaces
  • %b shows the long description of the commit, if any
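
Putting it all together, the full invocation looks like this:

git log --format="%C(cyan)%s%Creset %C(dim white)(%ar)%Creset%n%w(72,4,4)%b"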

GitHub itself follows this convention when showing the commit history of a project. In fact, they will only show you the summary of each commit by default. If there's a longer description available, they allow you to expand it with the press of a button.

Pretty-printed commit message on GitHub

Enforcing the rule

Of course, this all works best if everyone on the project agrees to follow the convention.

But how do you ensure that the team sticks to the golden rule of pretty commits™?

Well, you give your peers a gentle nudge at exactly the right moment: just when they're about to make a commit. This is what Jeff Atwood calls the "Just In Time" theory:

You do it by showing them:

  • the minimum helpful reminder
  • at exactly the right time

GitHub does this already, both on the Web:

Commit message being validated in the GitHub web UI

and in its desktop clients:

Commit message being validated in GitHub for Mac...

...and in GitHub for Windows

But what if you prefer to use Git from the command line, the way it should be?

Easy. You write a shell script that gets triggered by Git's client-side hooks every time you're about to make a commit. In that script, you make sure the message is formatted according to the rules.

Here's my version of it:

#!/bin/bash
#
# A hook script that checks the length of the commit message.
#
# Called by "git commit" with one argument, the name of the file
# that has the commit message. The hook should exit with non-zero
# status after issuing an appropriate message if it wants to stop the
# commit. The hook is allowed to edit the commit message file.

DEFAULT="\033[0m"
YELLOW="\033[1;33m"

function printWarning {
    message=$1
    printf >&2 "${YELLOW}%s${DEFAULT}\n" "$message"
}

function printNewline {
    printf "\n"
}

function captureUserInput {
    # Assigns stdin to the keyboard
    exec < /dev/tty
}

function confirm {
    question=$1
    read -p "$question [y/n]"$'\n' -n 1 -r
}

messageFilePath=$1
message=$(cat "$messageFilePath")
firstLine=$(printf '%s\n' "$message" | sed -n 1p)
firstLineLength=${#firstLine}

test $firstLineLength -lt 51 || {
    printWarning "Tip: the first line of the commit message shouldn't be longer than 50 characters and yours was $firstLineLength."
    captureUserInput
    confirm "Do you want to modify the message in your editor or just commit it?"

    if [[ $REPLY =~ ^[Yy]$ ]]; then
        $EDITOR "$messageFilePath"
    fi

    printNewline
    exit 0
}

In order to use it in your local repo, you'll have to manually copy the script file into the .git/hooks directory and call it commit-msg. Finally, you'll have to grant execute rights to the file in order to make it runnable:

cp commit-msg somerepo/.git/hooks
chmod +x somerepo/.git/hooks/commit-msg

From that point forward, every time you attempt to create a commit that doesn't follow the rules you'll get a chance to do the right thing:

The commit-msg shell script in action

If you choose to press y, the commit message will open up in your default text editor from which you can rewrite it properly. Pressing n, on the other hand, will override the rule altogether and commit the message as it is.

Not that you'd ever want to do that.


June 25, 2014 Posted in gaming

Leveraging the cloud for fun and games

The story of a group of programmers who one day decided to have a Counter-Strike: Global Offensive deathmatch, but didn't have the hardware to host their own game. So they took it to the cloud, learning a few lessons along the way.

Friday night, February 21, 2014. That's when the tretton37 Counter-Strike: Global Offensive fragfest was set to start. Avid gamers looking to shed virtual blood together were eager to join in from our offices in Lund and Stockholm. A few more would play over the Internet.

CS:GO icon by griddark

The time and place were set. Pizzas were ordered. Everything was ready to go. Except for one thing:

We didn't have a dedicated Counter-Strike server to host the game on.

Finding a spare machine to dedicate to that one night wasn't an easy task, given our requirements:

  • The host should be reachable from the Internet
  • The machine should have enough hardware to handle a CS:GO game with 15+ players
  • The machine should be able to scale up as more players join the game
  • The whole thing should be a breeze to set up

For days I pondered my options when, suddenly, it hit me:

Where's the place to find commodity hardware that's available for rent, is on the Internet and can scale at will?

The cloud, of course! This realization fell on my head like the proverbial apple from the tree.

Step 1: Getting a machine in the cloud

Valve puts out their Source Dedicated Server software for both Windows and Linux. The Windows version has a GUI and is generally what you'd call "user friendly". The Linux version, on the other hand, is lean & mean and is managed entirely from the command line. Programmers being programmers, I decided to go for the Linux version.

Now, having established that I needed a Linux box, the next question was: which of the available clouds was I going to entrust with our gaming night? Since tretton37 is mainly a Microsoft shop, it felt natural to go for Microsoft Azure. However, I didn't hold high hopes that they would allow me to install Linux on one of their virtual machines.

As it turned out, I had to eat my hat on that one. Azure does, in fact, offer pre-installed Linux virtual machines ready to go. To me, this is proof that the cloud division at Microsoft totally gets how things are supposed to work in the 21st century. Kudos to them.

Creating a Linux VM on Azure

After literally 2 minutes, I had an Ubuntu Server machine with root access via SSH running in the cloud.

Creating a Linux VM on Azure

If I hadn't already eaten my hat, I would take it off for Azure.

Step 2: Installing the Steam Console Client

Hosting a CS:GO server implies setting up a so-called Source Dedicated Server, also known as SRCDS. That's Valve's server software used to run all their games that are based on the Source Engine. The list includes Half-Life 2, Team Fortress, Counter-Strike and so on.

A SRCDS is easily installed through the Steam Console Client, or SteamCMD. The easiest way to get it on Linux is to download it and unpack it from a tarball. But first things first.

It's probably a good idea to run the Source server with a dedicated user account that doesn't have root privileges. So I went ahead and created a steam user, switched to it and headed to its home directory:

adduser steam
su steam
cd ~

Next, I needed to install a few libraries that SteamCMD depends on, like the GCC runtime library and friends. That's where I hit the first roadblock.

steamcmd: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

Uh? A quick search on the Internet revealed that SteamCMD doesn't like to run on a 64-bit OS. In fact:

SteamCMD is a 32-bit binary, so it needs 32-bit libraries.

On the other hand:

The prepackaged Linux VMs available in Azure come in 64-bit flavors only.

Ouch. Luckily, the issue was easily solved by installing the 32-bit version of the GCC runtime library:

apt-get install lib32gcc1

Finally, I was ready to download the SteamCMD binaries and unpack them:

wget http://media.steampowered.com/installer/steamcmd_linux.tar.gz
tar xzvf steamcmd_linux.tar.gz

The client itself was kicked off by a Bash script:

cd ./steamcmd
./steamcmd.sh

That brought down the necessary updates to the client tools and started an interactive prompt from which I could install any of Valve's Source game servers.

At this point, I could have continued down the same route and installed the CS:GO Dedicated Server (CSGO DS) using SteamCMD.
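
For reference, that route would have looked roughly like this from the SteamCMD prompt (740 being the app ID of the CS:GO Dedicated Server; the install directory and exact sequence are illustrative):

Steam> force_install_dir ./csgo_ds
Steam> login anonymous
Steam> app_update 740 validate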

However, a few intricate problems would be waiting further down the road. So, I decided to back out and find a better solution.

Steam> quit

Step 3: Installing the CS:GO Dedicated Server

Remember that thing about SteamCMD being a 32-bit binary and the Linux VM on Azure being only available in 64-bit?

Well, that turned out to be a bigger issue than I thought. Even after having successfully installed the CS:GO server, getting it to run became a nightmare. The server was constantly complaining about the wrong version of some obscure libraries. Files and directories were missing. Everything was a mess.

Salvation came in the form of a meticulously crafted script, designed to take care of those nitty-gritty details for me.

Thanks to Daniel Gibbs' hard work, I could use his fabulous csgoserver script to install, configure and, above all, manage our CS:GO Dedicated Server without pain.

You can find a detailed description of how to use csgoserver up on his site, so I'm just gonna report how I configured it to suit our deathmatch needs.
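
For the record, a minimal session with the script went something like this (the command names are taken from the script's documentation at the time; treat them as illustrative):

./csgoserver install
./csgoserver start
./csgoserver details

install downloads SteamCMD along with the server files, start boots the server in the background, and details prints out its current status.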

The status of the CS:GO Dedicated server as reported by csgoserver

Step 4: Configuration

The CS:GO server can be configured in a few different ways, and it's all done in the server.cfg configuration file. In it, you can set up things like the game mode (Arms Race, Classic and Competitive, to name a few), the maximum number of players and so on.

Here's how I configured it for the tretton37 deathmatch:

sv_password "secret" # Requires a password to join the server
sv_cheats 0 # Disables hacks and cheat codes
sv_lan 0 # Disables LAN mode
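
After editing server.cfg, restarting the server makes it pick up the changes; with csgoserver that was a one-liner (same caveat about the exact command name):

./csgoserver restart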

Step 5: Gold plating

The final touch was to provide an appropriate Message of the Day (or MOTD) for the occasion. That would be the screen that greets the players as they join the game, setting the right tone.

Once again, the whole thing was done by simply editing a text file. In this case, the file contained some HTML markup and a few stylesheets and was located in /home/steam/csgo/motd.txt.

Here's how it looked in action:

The tretton37 Message of the Day in action

Step 6: Deathmatch!

This article is primarily meant as a reference on how to configure a dedicated CS:GO server on a Linux box hosted on Microsoft Azure. Nonetheless, I figured it would be interesting to follow up with some information on how the server itself held up during that glorious game night.

The CS:GO Dedicated server stats while running on Azure during game night

Here are a few stats, taken both from the Azure Dashboard and from the operating system itself. Note that the server was running on a Large VM sporting a quad-core 1.6 GHz CPU and 7 GB of RAM:

  • Number of simultaneous players: 16
  • Average CPU load: 15 %
  • Memory usage: 2.8 GB
  • Total outbound network traffic served: 1.18 GB

In retrospect, that configuration was probably a little overkill for the job. A Medium VM with a dual-core 1.6 GHz CPU and 3.5 GB of RAM would probably have sufficed. But hey, elastic scaling is exactly what the cloud is for.

One final thought

Oddly enough, this experience opened up my eyes to the great potential of cloud computing.

The CS:GO server was only intended to run for the duration of the event, which would last for a few hours. During that short period of time, I needed it to be as fast and responsive as possible. Hence, I went all out on the hardware.

As soon as the game night was over, I immediately shut down the virtual machine. The total cost for borrowing that awesome hardware for a few hours? Literally peanuts.

Amazing.


May 09, 2014 Posted in mechanical-keyboards

A gentle introduction to mechanical keyboards

The past few years have seen a renaissance for mechanical keyboards, especially among two large groups of computer enthusiasts: programmers and gamers. I, too, share a passion for these wonderful pieces of hardware, one that goes way back. A quick search on the Internet reveals a wealth of web sites entirely dedicated to the subject. Unfortunately, most of the information in them is too "nerdy" for someone approaching it for the first time. In this post I'd like to give a gentle introduction to the world of mechanical keyboards for the curious, but uninitiated.

I love typing. I still remember how, as a kid, I used to borrow my mother's Diplomat typewriter from 1964 and hammer away on it, pretending to type some long text:

The legendary Diplomat typewriter

"It sure sounds like you're a fast typist!"

she would say when she heard the racket coming from my room. Little did she know that what came out on the sheet of paper was actually gibberish. Or maybe she was just being kind.

Anyway, I enjoyed the feeling of pressing those stiff plastic buttons and hearing the sound made by the metallic arm every time it punched a letter.

Today, I continue to enjoy a similar experience on my computer by using a mechanical keyboard.

So, what's a mechanical keyboard?

First and foremost, let's get the terminology straight by answering that burning question.

Wikipedia's definition gives us a good starting point, although it's still pretty vague:

Mechanical-switch keyboards use real switches underneath every key. Depending on the construction of the switch, such keyboards have varying response and travel times.

That leaves us wondering: what do they mean by real switches? Well, in this case real means physical, in the sense that under each key there's a metal spring that compresses every time you press a key. The spring is housed inside a plastic device that makes sure the key is registered by the computer at a specific point in time while the key is being pressed. That device is called a switch.

A mechanical switch

So, let's reformulate the definition of a mechanical keyboard:

A keyboard is considered mechanical if it uses metal springs underneath each key.

It's still true that there are different types of mechanical switches, each with their own special characteristics. The kind of switch used in a mechanical keyboard determines the overall feeling you'll get when you type on it.

A little bit of history

At this point you might rightfully wonder:

Wait a minute. Aren't all keyboards mechanical?

No, they are not. At least not anymore. The keyboards sold with the first personal computers back in the late 70's, however, were indeed mechanical.

Back then, computer keyboards were designed to mimic the feeling and behavior of typewriters. These were robust pieces of hardware, built with quality in mind. They used solid materials, like metal plates and thick ABS plastic frames. The keys were stiff enough to allow the typist to rest her fingers on the home row without accidentally pressing any key.

If you're in your 30's, chances are you remember how it felt to type on one of these keyboards, maybe in front of a Macintosh or a Commodore 64. All throughout the 80's and early 90's, every keyboard was mechanical. And heavy.

A Commodore 64 computer. Photo by Shane Doucette on Flickr

But then things started to change. When PCs became a mass-market product, the additional cost of producing quality mechanical keyboards to go along with them was no longer justifiable. Consumers didn't seem to care, either. As graphical user interfaces (GUI) grew in popularity during the mid 80's, the keyboard, which had been the primary way to interact with computers up to that point, had to make room for another kind of input device: the mouse.

Keyboards, like computers, became a commodity sold in large volumes. In an effort to continuously try to bring the production costs down, new designs and materials became mainstream. Focus shifted from quality and durability to cost effectiveness.

Consequently, the heavy metal plates disappeared in favor of cheaper silicone membranes laid on top of electrical matrices. Metal springs gave way to air pads under each key. This keyboard technology is called rubber dome or membrane, and it is to this day the most common type of keyboard sold with desktop computers.

Rubber dome switches. Source: codinghorror.com

For laptops, a new kind of low-profile rubber dome switch began to appear. These were made out of thin plastic parts and were called scissor switches because of their collapsible X shape.

The scissor switch

While today's mainstream keyboards certainly serve their purpose, they have a limited lifetime.

With time and usage, the rubber bubbles eventually wear out, which means you have to press harder on the keys to activate them. The letters printed on the keys fade or peel off. The cheap plastic parts loosen up or bend. And by the time you're finally ready to throw those suckers in the bin, they've managed to drive you nuts.

Mechanical keyboards, on the other hand, are built to last.

As a testament to their extreme durability, I can tell you that the oldest one in my collection is an Apple Extended Keyboard II dated all the way back to 1989, and it still works like a charm.

My glorious Apple Extended Keyboard II from 1989

That's a 25-year-old piece of hardware still in perfectly good condition. So much so, in fact, that I use it daily as my primary keyboard at home.

It's all about the feeling

However, believe it or not, the biggest problem with rubber dome keyboards isn't their fragility. It's the fact that they're an ergonomic nightmare.

We said that the switches are responsible for telling the computer which key is being pressed. The precise moment when that happens is called the actuation point. The distance between a key's rest position and its fully pressed position is called key travel.

Now, in rubber dome keyboards, the only way to actuate a key is to press it all the way to the bottom (also known as bottoming out). That's when the silicon air bubble beneath it gets squashed, thus sending an electrical signal to the computer. This means that your fingers have to travel a longer distance when typing, which puts a strain on your hands and wrists.

Mechanical keys, on the other hand, are actuated before they bottom out. For most types of switches, that happens about halfway through the key travel.

What does this mean in practice? Well, it means that you don't have to press as hard on your keyboard when you type. Your fingers don't have to travel as much and, consequently, you don't get as fatigued. It may even help you type faster.

To each his own switch

There are at least a dozen different kinds of mechanical switches available on the market today. That number doubles if you count the vintage ones.

Each and every one of them has its own way of physically actuating a key, which gives it its own distinctive feel.

The buckling spring switch patented by IBM in 1971

The buckling spring switch patented by IBM in 1971, for example, actuates a key by having the internal spring buckle outwards as it collapses. That activates a plastic hammer located beneath the spring, which, upon hitting a contact, closes an electrical circuit. As the spring buckles, it hits the inner wall of its plastic housing producing a loud "click" sound. That feedback signals the typist that the key has been registered.

The Cherry MX Blue switch

The modern Cherry MX Blue switch, on the other hand, uses a leaf spring pushed by a plastic slider (the grey one in the picture) as it passes by on its way down. When the slider is halfway through its travel, the leaf spring encounters no more resistance from the slider, thus closing a circuit that signals the computer about the key press. As the grey slider hits the bottom of the switch, it emits a high-pitched "click" sound similar to that of a mouse button.

Don't worry, I won't get into the nuts and bolts of all the different switches. The Internet will happily provide you with more information than you can possibly ask for in that regard.

The important thing to remember is that the ergonomic properties of a mechanical keyboard are determined by three fundamental characteristics of its switches:

  • Tactility
  • Force
  • Sound

Let's look at each of them briefly.

Tactility

A switch is called tactile if it gives some kind of feedback about when a key is actuated. This is usually done by significantly dropping the key's resistance the instant it gets registered. Some switches even accompany that with a "click" sound and are therefore referred to as "clicky".

Force

The force indicates how hard you have to press down a key in order to actuate it. This is measured in centinewtons (cN) or, more commonly, grams-force (gf). In practice they're interchangeable, since 1 cN is roughly equivalent to 1 gf (more precisely, 1 cN = 1.02 gf).

Switches get progressively stiffer the further down you press a key, which, again, helps you avoid bottoming out. A soft switch requires about 45 cN to actuate. A stiff one is about twice that, around 80 cN.

The force graph of the stiff tactile buckling spring switch
The force graph of a soft tactile Cherry switch

These are called force graphs and indicate how much force in cN (Y axis) is required to press a key during its travel in mm (X axis) from rest position to bottom. Did you notice the sudden drop in force about half way through the travel? That's how tactile switches announce that the key has been actuated, as shown by the little black dot.

Sound

As I mentioned earlier, certain switches emit a "click" sound when they get actuated. This extra piece of feedback can either be pleasurable (even nostalgic if you will) or totally annoying. I strongly advise you to investigate how the people who will be sitting next to you feel about clicky keyboards before buying one. And telling them that all keyboards sounded like that back in the 80's no longer cuts it.

Wrapping it up

I know it sounds crazy, but I barely scratched the surface of what there is to know about mechanical keyboards. However, the purpose of this post was to give you an idea about what they are and where they come from, and I hope I've succeeded in that.

If you're eager to know more, fear not. I intend to write more about this topic. Next time, I'll start reviewing some of my favorite mechanical keyboards. In the meantime, take a look at the official guide to mechanical keyboards put together by this growing community of keyboard enthusiasts.


April 24, 2014 Posted in speaking  |  programming

BDD all the way down

When I started doing TDD a few years ago, I often felt an inexplicable gap between the functionality described in the requirements and the tests I was writing to drive my implementation. BDD turned out to be the answer.

TDD makes sense and improves the quality of your code. I don't think anybody could argue against this simple fact. Is TDD flawless? Well, that's a whole different discussion.

Turtles all the way down

But let's back up for a moment.

Picture this: you've just joined a team tasked with developing some kind of software which you know nothing about. The only thing you know is that it's an implementation of Conway's Game of Life as a web API. Your first assignment is to develop a feature in the system:

In order to display the current state of the Universe
As a Game of Life client
I want to get the next generation from underpopulated cells

You want to implement this feature by following the principles of TDD, so you know that the first thing you should do is start writing a failing test. But what should you test? How would you express this requirement in code? What even is a cell?

If you're looking at TDD for some guidance, well, I'm sorry to tell you that you'll find none. TDD has one rule. Do you remember what the first rule of TDD is? (No, it's not you don't talk about TDD):

Thou shalt not write a single line of production code without a failing test.

And that's basically it. TDD doesn't tell you what to test, how you should name your tests, or even how to understand why they fail in the first place.

As a programmer, the only thing you can think of, at this point, is writing a test that checks for nulls. Arguably, that's the equivalent of trying to start a car by emptying the ashtrays.

Let the behavior be your guide

What if we stopped worrying about writing tests for the sake of, well, writing tests, and instead focused on verifying what the system is supposed to do in a certain situation? That's called behavior and a lot of good things may come out of letting it be the driving force when developing software. Dan North noticed this during his research, which ultimately led him to the formalization of Behavior-driven Development:

I started using the word “behavior” in place of “test” in my dealings with TDD and found that not only did it seem to fit but also that a whole category of coaching questions magically dissolved. I now had answers to some of those TDD questions. What to call your test is easy – it’s a sentence describing the next behavior in which you are interested. How much to test becomes moot – you can only describe so much behavior in a single sentence. When a test fails, simply work through the process described above – either you introduced a bug, the behavior moved, or the test is no longer relevant.

Focusing on behavior has other advantages, too. It forces you to understand the context in which the feature you're developing is going to work. It makes you see the value that it brings to the users. And, last but not least, it forces you to ask questions about the concepts mentioned in the requirements.

So, what's the first thing you should do? Well, let's start by understanding what a generation of cells is and then we can move on to the concept of underpopulation:

Any live cell with fewer than two live neighbors dies, as if caused by underpopulation.

Acceptance tests

At this point we're ready to write our first test. But, since there's still no code for the feature, what exactly are we supposed to test? The answer to that question is simpler than you might expect: the system itself.

We're implementing a requirement for our Game of Life web API. The user is supposed to make an HTTP request to a certain URL, sending a list of cells formatted as JSON, and get back a response containing the same list of cells after the rule of underpopulation has been applied. We'll know we've fulfilled the requirement when the system does exactly that. It is, in other words, the requirement's acceptance criterion. That sure sounds like a good place to start writing a test.

Let's express it the way formalized by Dan North:

Scenario: Death by underpopulation
    Given a live cell has fewer than 2 live neighbors
    When I ask for the next generation of cells
    Then I should get back a new generation
    And it should have the same number of cells
    And the cell should be dead

Here, we call the acceptance criterion a scenario and use the Given-When-Then syntax to express its premises, action and expected outcome. Having a common language like this for expressing software requirements is one of the greatest innovations brought by BDD.

So, we said we were going to test the system itself. In practice, that means we must put ourselves in the user's shoes and let the test interact with the system at its outermost boundaries. In the case of a web API, that translates into sending and receiving HTTP requests.

In order to turn our scenario into an executable test, we need some kind of framework that can map the Given-When-Then sentences to methods. In the realm of .NET, that framework is called SpecFlow. Here's how we could use it together with C# to implement our test:

IEnumerable<dynamic> generation;
HttpResponseMessage response;
IEnumerable<dynamic> nextGeneration;

[Given]
public void Given_a_live_cell_has_fewer_than_COUNT_live_neighbors(int count)
{
    generation = new[]
                 {
                     new { Alive = true, Neighbors = --count }
                 };
}

[When]
public void When_I_ask_for_the_next_generation_of_cells()
{
    response = WebClient.PostAsJson("api/generation", generation);
}

[Then]
public void Then_I_should_get_back_a_new_generation()
{
    response.ShouldBeSuccessful();
    nextGeneration = ParseGenerationFromResponse();
    nextGeneration.ShouldNot(Be.Null);
}

[Then]
public void Then_it_should_have_the_same_number_of_cells()
{
    nextGeneration.Should(Have.Count.EqualTo(1));
}

[Then]
public void Then_the_cell_should_be_dead()
{
    var isAlive = (bool)nextGeneration.Single().Alive;
    isAlive.ShouldBeFalse();
}

IEnumerable<dynamic> ParseGenerationFromResponse()
{
    return response.ReadContentAs<IEnumerable<dynamic>>();
}

As you can see, each portion of the scenario is mapped directly to a method whose name matches the words in the sentence, separated by underscores.

At this point, we can finally run our acceptance test and watch it fail:

Given a live cell has fewer than 2 live neighbors
-> done.
When I ask for the next generation of cells
-> done.
Then I should get back a new generation
-> error: Expected: in range (200 OK, 299)
          But was:  404 NotFound

As expected, the test fails on the first assertion with an HTTP 404 response. We know this is correct, since there's nothing listening at that URL on the other end. Now that we understand why the test fails, we can officially start implementing the feature.

The TDD cycle

We're about to get our hands dirty (albeit keeping the code clean) and dive into the system. At this point we follow the normal rules of Test-driven development with its Red-Green-Refactor cycle. However, we're faced with the exact same problem we had at the boundaries of the system. What should be our first test? Even in this case, we'll let the behavior be our guiding light.

If you have experience writing unit tests, my guess is that you're used to writing one test class for every production code class. For example, given SomeClass you'd write your tests in SomeClassTests. This is fine, but I'm here to tell you that there's a better way. If we focus on how a class should behave in a given situation, wouldn't it be more natural to have one test class per scenario?

Consider this:

public class When_getting_the_next_generation_with_underpopulation
    : ForSubject<GenerationController>
{
}

This class will contain all the tests related to how the GenerationController class behaves when asked to get the next generation of cells given that one of them is underpopulated.

But, wait a minute. Wouldn't that create a lot of tiny test classes?

Yes. One per scenario to be precise. That way, you'll know exactly which class to look at when you're working with a certain feature. Besides, having many small cohesive classes is better than having giant ones with all kinds of tests in them, don't you think?

Let's get back to our test. Since we're testing one single scenario, we can structure it much in the same way as we would define an acceptance test:

For each scenario there's exactly one context to set up, one action and one or more assertions on the outcome.

As always, readability is king, so we'd like to express our test in a way that gets us as close as possible to human language. In other words, our tests should read like specifications:

public class When_getting_the_next_generation_with_underpopulation
    : ForSubject<GenerationController>
{
    static Cell solitaryCell;
    static IEnumerable<Cell> currentGen;
    static IEnumerable<Cell> nextGen;

    Establish context = () =>
    {
        solitaryCell = new Cell { Alive = true, Neighbors = 1 };
        currentGen = AFewCellsIncluding(solitaryCell);
    };

    Because of = () =>
        nextGen = Subject.GetNextGeneration(currentGen);

    It should_return_the_next_generation_of_cells = () =>
        nextGen.ShouldNotBeEmpty();

    It should_include_the_solitary_cell_in_the_next_generation = () =>
        nextGen.ShouldContain(solitaryCell);

    It should_only_include_the_original_cells_in_the_next_generation = () =>
        nextGen.ShouldContainOnly(currentGen);

    It should_mark_the_solitary_cell_as_dead = () =>
        solitaryCell.Alive.ShouldBeFalse();
}

In this case I'm using a test framework for .NET called Machine.Specifications, or MSpec. MSpec belongs to a category of frameworks called BDD-style testing frameworks. There are at least a few of them for almost every language known to man (including Haskell). What they all have in common is a strong focus on allowing you to express your tests in a way that resembles requirements.

Speaking of readability, see all those underscores, static variables and lambdas? Those are just the tricks MSpec has to pull on the C# compiler, in order to give us a domain-specific language to express requirements while still producing runnable code. Other frameworks have different techniques to get as close as possible to human language without angering the compiler. Which one you choose is largely a matter of preference.

Wrapping it up

I'll leave the implementation of the GenerationController class as an exercise for the reader, since it's outside the scope of this article. If you like, you can find mine over at GitHub.

What's important here is that, after a few rounds of Red-Green-Refactor, we'll finally be able to run our initial acceptance test and see it pass. At that point, we'll know with certainty that we've successfully implemented our feature.

Let's recap our entire process with a picture:

The Outside-in development cycle

This approach to developing software is called Outside-in development and is described beautifully in Steve Freeman and Nat Pryce's excellent book Growing Object-Oriented Software, Guided by Tests.

In our little exercise, we grew a feature in our Game of Life web API from the outside-in, following the principles of Behavior-driven Development.

Presentation

You can see a complete recording of the talk I gave at Foo Café last year about this topic. The presentation pretty much covers the material described in this article and expands a bit on the programming aspect of BDD. I hope you'll find it useful. If you have any questions, please feel free to contact me directly or write in the comments.

Here's the abstract:

In this session I’ll show how to apply the Behavior Driven Development (BDD) cycle when developing a feature from top to bottom in a fictitious .NET web application. Building on the concepts of Test-driven Development, I’ll show you how BDD helps you produce tests that read like specifications, by forcing you to focus on what the system is supposed to do, rather than how it’s implemented.

Starting from the acceptance tests seen from the user’s perspective, we’ll work our way through the system implementing the necessary components at each tier, guided by unit tests. In the process, I’ll cover how to write BDD-style tests both in plain English with SpecFlow and in C# with MSpec. In other words, it’ll be BDD all the way down.


April 16, 2013 Posted in programming  |  autofixture

General-purpose customizations with AutoFixture

If you’ve been using AutoFixture in your tests for a while, chances are you’ve already come across the concept of customizations. If you’re not familiar with it, let me give you a quick introduction:

A customization is a group of settings that, when applied to a given Fixture object, control the way AutoFixture will create instances for the types requested through that Fixture.

At this point you might find yourself feeling an irresistible urge to know everything there is to know about customizations. If that’s the case, don’t worry. There are a few resources online where you can learn more about them. For example, I wrote about how to take advantage of customizations to group together test data related to specific scenarios.

In this post I’m going to talk about something different which, in a sense, is quite the opposite of that: how to write general-purpose customizations.

A (user) story about cooking

It’s hard to talk about test data without a bit of context. So, for the sake of this post, I thought we would pretend to be working on a somewhat realistic project. The system we’re going to build is an online catalogue of food recipes. The domain, at the very basic level, consists of three concepts:

  • Cookbook
  • Recipes
  • Ingredients

Basic domain model for a recipe catalogue.

Now, let’s imagine that in our backlog of requirements we have one where the user wishes to be able to search for recipes that contain a specific set of ingredients. Or, in other words:

As a foodie, I want to know which recipes I can prepare with the ingredients I have, so that I can get the best value for my groceries.

From the tests…

As usual, we start out by translating the requirement at hand into a set of acceptance tests. In order to do that, we need to tell AutoFixture how we’d like the test data for our domain model to be generated.

For this particular scenario, we need every Ingredient created in the test fixture to be randomly chosen from a fixed pool of objects. That way we can ensure that all recipes in the cookbook will be made up of the same set of ingredients.

Here’s how such a customization would look:

public class RandomIngredientsFromFixedSequence : ICustomization
{
    private readonly Random randomizer = new Random();
    private IEnumerable<Ingredient> sequence;

    public void Customize(IFixture fixture)
    {
        InitializeIngredientSequence(fixture);
        fixture.Register(PickRandomIngredientFromSequence);
    }

    private void InitializeIngredientSequence(IFixture fixture)
    {
        this.sequence = fixture.CreateMany<Ingredient>();
    }

    private Ingredient PickRandomIngredientFromSequence()
    {
        var randomIndex = this.randomizer.Next(0, sequence.Count());
        return sequence.ElementAt(randomIndex);
    }
}

Here we’re creating a pool of ingredients and telling AutoFixture to randomly pick one of those every time it needs to create an Ingredient object by using the Fixture.Register method.

Since we’ll be using xUnit as our test runner, we can take advantage of the AutoFixture Data Theories to keep our tests succinct by using AutoFixture in a declarative fashion. In order to do so, we need to write an xUnit Data Theory attribute that tells AutoFixture to use our new customization:

public class CookbookAutoDataAttribute : AutoDataAttribute
{
    public CookbookAutoDataAttribute()
        : base(new Fixture().Customize(
                   new RandomIngredientsFromFixedSequence()))
    {
    }
}

If you prefer to use AutoFixture directly in your tests, the imperative equivalent of the above is:

var fixture = new Fixture();
fixture.Customize(new RandomIngredientsFromFixedSequence());

At this point, we can finally start writing the acceptance tests to satisfy our original requirement:

public class When_searching_for_recipies_by_ingredients
{
    [Theory, CookbookAutoData]
    public void Should_only_return_recipes_with_a_specific_ingredient(
        Cookbook sut,
        Ingredient ingredient)
    {
        // When
        var recipes = sut.FindRecipies(ingredient);
        // Then
        Assert.True(recipes.All(r => r.Ingredients.Contains(ingredient)));
    }

    [Theory, CookbookAutoData]
    public void Should_include_new_recipes_with_a_specific_ingredient(
        Cookbook sut,
        Ingredient ingredient,
        Recipe recipeWithIngredient)
    {
        // Given
        sut.AddRecipe(recipeWithIngredient);
        // When
        var recipes = sut.FindRecipies(ingredient);
        // Then
        Assert.Contains(recipeWithIngredient, recipes);
    }
}

Notice that during these tests AutoFixture will have to create Ingredient objects in a couple of different ways:

  • indirectly when constructing Recipe objects associated to a Cookbook
  • directly when providing arguments for the test parameters

As far as AutoFixture is concerned, it doesn’t really matter which code path leads to the creation of ingredients. The algorithm provided by the RandomIngredientsFromFixedSequence customization will apply in all situations.

…to the implementation

After a couple of Red-Green-Refactor cycles spawned from the above tests, it’s not completely unlikely that we might end up with some production code similar to this:

// Cookbook.cs
public class Cookbook
{
    private readonly ICollection<Recipe> recipes;

    public Cookbook(IEnumerable<Recipe> recipes)
    {
        this.recipes = new List<Recipe>(recipes);
    }

    public IEnumerable<Recipe> FindRecipies(params Ingredient[] ingredients)
    {
        return recipes.Where(r => r.Ingredients.Intersect(ingredients).Any());
    }

    public void AddRecipe(Recipe recipe)
    {
        this.recipes.Add(recipe);
    }
}

// Recipe.cs
public class Recipe
{
    public readonly IEnumerable<Ingredient> Ingredients;

    public Recipe(IEnumerable<Ingredient> ingredients)
    {
        this.Ingredients = ingredients;
    }
}

// Ingredient.cs
public class Ingredient
{
    public readonly string Name;

    public Ingredient(string name)
    {
        this.Name = name;
    }
}

Nice and simple. But let’s not stop here. It’s time to take it a bit further.

An opportunity for generalization

Given the fact that we started working from a very concrete requirement, it’s only natural that the RandomIngredientsFromFixedSequence customization we came up with encapsulates a behavior that is specific to the scenario at hand. However, if we take a closer look we might notice the following:

The only part of the algorithm that is specific to the original scenario is the type of the objects being created. The rest can easily be applied whenever you want to create objects that are picked at random from a predefined pool.

An opportunity for writing a general-purpose customization has just presented itself. We can’t let it slip.

Let’s see what happens if we extract the Ingredient type into a generic argument and remove all references to the word “ingredient”:

public class RandomFromFixedSequence<T> : ICustomization
{
    private readonly Random randomizer = new Random();
    private IEnumerable<T> sequence;

    public void Customize(IFixture fixture)
    {
        InitializeSequence(fixture);
        fixture.Register(PickRandomItemFromSequence);
    }

    private void InitializeSequence(IFixture fixture)
    {
        this.sequence = fixture.CreateMany<T>();
    }

    private T PickRandomItemFromSequence()
    {
        var randomIndex = this.randomizer.Next(0, sequence.Count());
        return sequence.ElementAt(randomIndex);
    }
}

Voilà. We just turned our scenario-specific customization into a pluggable algorithm that changes the way objects of any type are going to be generated by AutoFixture. In this case the algorithm will create items by picking them at random from a fixed sequence of T.

The CookbookAutoDataAttribute can easily be changed to use the general-purpose version of the customization by closing the generic argument with the Ingredient type:

public class CookbookAutoDataAttribute : AutoDataAttribute
{
    public CookbookAutoDataAttribute()
        : base(new Fixture().Customize(
                   new RandomFromFixedSequence<Ingredient>()))
    {
    }
}

The same is true if you’re using AutoFixture imperatively:

var fixture = new Fixture();
fixture.Customize(new RandomFromFixedSequence<Ingredient>());

Wrapping up

As I said before, customizations are a great way to set up test data for a specific scenario. Sometimes these configurations turn out to be useful in more than just one situation.

When such opportunity arises, it’s often a good idea to separate out the parts that are specific to a particular context and turn them into parameters. This allows the customization to become a reusable strategy for controlling AutoFixture’s behavior across entire test suites.