If you’ve been using AutoFixture in your tests for a while, chances are you’ve already come across the concept of customizations. If you’re not familiar with it, let me give you a quick introduction:

A customization is a group of settings that, when applied to a given Fixture object, control the way AutoFixture will create instances for the types requested through that Fixture.

At this point you might find yourself feeling an irresistible urge to know everything there is to know about customizations. If that’s the case, don’t worry. There are a few resources online where you can learn more about them. For example, I wrote about how to take advantage of customizations to group together test data related to specific scenarios.

In this post I’m going to talk about something different which, in a sense, is quite the opposite of that: how to write general-purpose customizations.

A (user) story about cooking

It’s hard to talk about test data without a bit of context. So, for the sake of this post, I thought we would pretend to be working on a somewhat realistic project. The system we’re going to build is an online catalogue of food recipes. The domain, at the very basic level, consists of three concepts:

  • Cookbook
  • Recipes
  • Ingredients

Basic domain model for a recipe catalogue.

Now, let’s imagine that in our backlog of requirements we have one where the user wishes to be able to search for recipes that contain a specific set of ingredients. Or in other words:

As a foodie, I want to know which recipes I can prepare with the ingredients I have,
so that I can get the best value for my groceries.

From the tests…

As usual, we start out by translating the requirement at hand into a set of acceptance tests. In order to do that, we need to tell AutoFixture how we’d like the test data for our domain model to be generated.

For this particular scenario, we need every Ingredient created in the test fixture to be randomly chosen from a fixed pool of objects. That way we can ensure that all recipes in the cookbook will be made up of the same set of ingredients.

Here’s what such a customization would look like:

public class RandomIngredientsFromFixedSequence : ICustomization
{
    private readonly Random randomizer = new Random();
    private IEnumerable<Ingredient> sequence;

    public void Customize(IFixture fixture)
    {
        InitializeIngredientSequence(fixture);
        fixture.Register(PickRandomIngredientFromSequence);
    }

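    private void InitializeIngredientSequence(IFixture fixture)
    {
        // Materialize the pool once so every pick comes from the same
        // instances, even if CreateMany returns a lazily evaluated sequence.
        this.sequence = fixture.CreateMany<Ingredient>().ToList();
    }

    private Ingredient PickRandomIngredientFromSequence()
    {
        // The upper bound of Random.Next is exclusive, so passing Count()
        // allows the last ingredient in the pool to be picked as well.
        var randomIndex = this.randomizer.Next(0, sequence.Count());
        return sequence.ElementAt(randomIndex);
    }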
}

Here we’re creating a pool of ingredients and using the Fixture.Register<T> method to tell AutoFixture to pick one of them at random every time it needs to create an Ingredient object.
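One detail worth spelling out is how the random index gets chosen. In .NET, the upper bound passed to Random.Next(minValue, maxValue) is exclusive, so the bound should be the number of items in the pool rather than the index of the last one. Here’s a minimal stand-alone sketch of the “pick at random from a fixed pool” idea, without any test framework involved. The FixedPool<T> class and its PickRandom method are names I made up for the illustration; they are not part of AutoFixture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of AutoFixture: picks items at random
// from a pool that is materialized once and never changes afterwards.
public sealed class FixedPool<T>
{
    private readonly Random randomizer;
    private readonly IReadOnlyList<T> items;

    public FixedPool(IEnumerable<T> items, Random randomizer = null)
    {
        // ToList() forces a lazy sequence to be evaluated exactly once.
        this.items = items.ToList();
        if (this.items.Count == 0)
        {
            throw new ArgumentException(
                "The pool must contain at least one item.", nameof(items));
        }

        this.randomizer = randomizer ?? new Random();
    }

    public T PickRandom()
    {
        // The upper bound of Random.Next is exclusive, so passing Count
        // yields an index between 0 and Count - 1, last item included.
        return this.items[this.randomizer.Next(0, this.items.Count)];
    }
}
```

Passing a seeded Random into the constructor makes the picks repeatable, which can be handy when you’re trying to reproduce a failing test.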

Since we’ll be using xUnit as our test runner, we can take advantage of the AutoFixture Data Theories to keep our tests succinct by using AutoFixture in a declarative fashion. In order to do so, we need to write an xUnit Data Theory attribute that tells AutoFixture to use our new customization:

public class CookbookAutoDataAttribute : AutoDataAttribute
{
    public CookbookAutoDataAttribute()
        : base(new Fixture().Customize(
                   new RandomIngredientsFromFixedSequence()))
    {
    }
}

If you prefer to use AutoFixture directly in your tests, the imperative equivalent of the above is:

var fixture = new Fixture();
fixture.Customize(new RandomIngredientsFromFixedSequence());

At this point, we can finally start writing the acceptance tests to satisfy our original requirement:

public class When_searching_for_recipies_by_ingredients
{
    [Theory, CookbookAutoData]
    public void Should_only_return_recipes_with_a_specific_ingredient(
        Cookbook sut,
        Ingredient ingredient)
    {
        // When
        var recipes = sut.FindRecipies(ingredient);
        // Then
        Assert.True(recipes.All(r => r.Ingredients.Contains(ingredient)));
    }

    [Theory, CookbookAutoData]
    public void Should_include_new_recipes_with_a_specific_ingredient(
        Cookbook sut,
        Ingredient ingredient,
        Recipe recipeWithIngredient)
    {
        // Given
        sut.AddRecipe(recipeWithIngredient);
        // When
        var recipes = sut.FindRecipies(ingredient);
        // Then
        Assert.Contains(recipeWithIngredient, recipes);
    }
}

Notice that during these tests AutoFixture will have to create Ingredient objects in a couple of different ways:

  • indirectly when constructing Recipe objects associated to a Cookbook
  • directly when providing arguments for the test parameters

As far as AutoFixture is concerned, it doesn’t really matter which code path leads to the creation of ingredients. The algorithm provided by the RandomIngredientsFromFixedSequence customization will apply in all situations.

…to the implementation

After a couple of Red-Green-Refactor cycles spawned from the above tests, we might well end up with production code similar to this:

// Cookbook.cs
public class Cookbook
{
    private readonly ICollection<Recipe> recipes;

    public Cookbook(IEnumerable<Recipe> recipes)
    {
        this.recipes = new List<Recipe>(recipes);
    }

    public IEnumerable<Recipe> FindRecipies(params Ingredient[] ingredients)
    {
        return recipes.Where(r => r.Ingredients.Intersect(ingredients).Any());
    }

    public void AddRecipe(Recipe recipe)
    {
        this.recipes.Add(recipe);
    }
}

// Recipe.cs
public class Recipe
{
    public readonly IEnumerable<Ingredient> Ingredients;

    public Recipe(IEnumerable<Ingredient> ingredients)
    {
        this.Ingredients = ingredients;
    }
}

// Ingredient.cs
public class Ingredient
{
    public readonly string Name;

    public Ingredient(string name)
    {
        this.Name = name;
    }
}

Nice and simple. But let’s not stop here. It’s time to take it a bit further.

An opportunity for generalization

Given the fact that we started working from a very concrete requirement, it’s only natural that the RandomIngredientsFromFixedSequence customization we came up with encapsulates a behavior that is specific to the scenario at hand. However, if we take a closer look we might notice the following:

The only part of the algorithm that is specific to the original scenario is the type of the objects being created. The rest can easily be applied whenever you want to create objects that are picked at random from a predefined pool.

An opportunity for writing a general-purpose customization has just presented itself. We can’t let it slip.

Let’s see what happens if we extract the Ingredient type into a generic argument and remove all references to the word “ingredient”:

public class RandomFromFixedSequence<T> : ICustomization
{
    private readonly Random randomizer = new Random();
    private IEnumerable<T> sequence;

    public void Customize(IFixture fixture)
    {
        InitializeSequence(fixture);
        fixture.Register(PickRandomItemFromSequence);
    }

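    private void InitializeSequence(IFixture fixture)
    {
        // Materialize the pool once so every pick comes from the same
        // instances, even if CreateMany returns a lazily evaluated sequence.
        this.sequence = fixture.CreateMany<T>().ToList();
    }

    private T PickRandomItemFromSequence()
    {
        // The upper bound of Random.Next is exclusive, so passing Count()
        // allows the last item in the pool to be picked as well.
        var randomIndex = this.randomizer.Next(0, sequence.Count());
        return sequence.ElementAt(randomIndex);
    }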
}

Voilà. We just turned our scenario-specific customization into a pluggable algorithm that changes the way objects of any type are going to be generated by AutoFixture. In this case the algorithm will create items by picking them at random from a fixed sequence of T.

The CookbookAutoDataAttribute can easily be changed to use the general-purpose version of the customization by closing the generic argument with the Ingredient type:

public class CookbookAutoDataAttribute : AutoDataAttribute
{
    public CookbookAutoDataAttribute()
        : base(new Fixture().Customize(
                   new RandomFromFixedSequence<Ingredient>()))
    {
    }
}

The same is true if you’re using AutoFixture imperatively:

var fixture = new Fixture();
fixture.Customize(new RandomFromFixedSequence<Ingredient>());

Wrapping up

As I said before, customizations are a great way to set up test data for a specific scenario. Sometimes these configurations turn out to be useful in more than just one situation.

When such an opportunity arises, it’s often a good idea to separate out the parts that are specific to a particular context and turn them into parameters. This allows the customization to become a reusable strategy for controlling AutoFixture’s behavior across entire test suites.

When I first started getting into Git a couple of years ago, one of the things I found most frustrating about the learning experience was the complete lack of guidance on how to interpret the myriad of commands and switches found in the documentation. On second thought, calling it frustrating is actually an understatement. Utterly painful would be a better way to describe it.
What I was looking for was a way to represent the state of a Git repository in some sort of graphical format. In my mind, if only I could have visualized how the different combinations of commands and switches impacted my repo, I would have had a much better shot at actually understanding their meaning.

After a bit of research on the Git forums, I noticed that many people were using a simple text-based notation to describe the state of their repo. The actual symbols varied a bit, but they all essentially came down to something like this:


               C4--C5 (feature)
              / 
C1--C2--C3--C4'--C5'--C6 (master)
                       ^

where the symbols mean:

  • Cn represents a single commit
  • Cn’ represents a commit that has been moved from another location in history, i.e. it has been rebased
  • (branch) represents a branch name
  • ^ indicates the commit referenced by HEAD

This form of graphical DSL proved itself to be extremely useful not only as a learning tool but also as a universal Git language, useful for documentation as well as for communication during problem solving.

Now, keeping this idea in mind, imagine having a tool that is able to draw a similar diagram automatically. Sounds interesting? Well, let me introduce SeeGit.

SeeGit is a Windows application that, given the path to a Git repository on disk, will generate a diagram of its commits and references. Once done, it will keep watching that directory for changes and automatically update the diagram accordingly.

This is where the idea for my Grokking Git by seeing it session came from. The goal is to illustrate the meaning behind different Git operations by going through a series of demos, while having the command line running on one half of the screen and SeeGit on the other. As I type away in the console you can see the Git history unfold in front of you, giving you an insight into how things work under the covers.

In other words, something like this:


SeeGit session in progress.

So, this is just to give you a little background. Here you’ll find the session’s abstract, slides and demos. There’s also a recording from when I presented this talk at LeetSpeak in Malmö, Sweden back in October 2012. I hope you find it useful.

Abstract

In this session I’ll teach you the Git zen from the inside out. Working out of real world scenarios, I’ll walk you through Git’s fundamental building blocks and common use cases, working our way up to more advanced features. And I’ll do it by showing you graphically what happens under the covers, as we fire different Git commands.

You may already have been using Git for a while to collaborate on some open source project or even at work. You know how to commit files, create branches and merge your work with others. If that’s the case, believe me, you’ve only scratched the surface. I firmly believe that a deep understanding of Git’s inner workings is the key to unlocking its true power, allowing you, as a developer, to take full control of your codebase’s history.

Recording from LeetSpeak

Resources

I know I’ve said it before, but I love the command line. And being a command line junkie, I’m naturally attracted to all kinds of tools that involve a bright blinking square on a black canvas. Historically, I’ve always been a huge fan of the mighty Bash. PowerShell, however, came to change that.

Since PowerShell made its first appearance under the codename “Monad” back in 2005, it has presented itself as more than just a regular command prompt. It brought, in fact, something completely new to the table: it combined the flexibility of a Unix-style console, such as Bash, with the richness of the .NET Framework and an object-oriented pipeline, which in itself was totally unheard of.
With such a compelling story, it soon became apparent that PowerShell was aiming to become the official command line tool for Windows, replacing both the ancient Command Prompt and the often criticized Windows Script Host. And so it has been.

Seven years have passed since “Monad” was officially released as PowerShell, and its presence is as pervasive as ever. Nowadays you can expect to find PowerShell in just about all of Microsoft’s major server products, from Exchange to SQL Server. It’s even become part of Visual Studio through the NuGet Package Manager Console. Not only that, but tools such as posh-git make PowerShell a very nice, and arguably more natural, alternative to Git Bash when using Git on Windows.

Following up on my interest in PowerShell, I’ve found myself talking a fair deal about it both at conferences and user groups. In particular, during the last year or so, I’ve been giving a presentation about how to integrate PowerShell into your own applications.

The idea is to leverage the PowerShell programming model to provide a rich set of administrative tools that will (hopefully) improve the often stormy relationship between devs and admins.

Since I’m often asked about where to get the slides and the code samples from the talk, I thought I would make them all available here in one place for future reference.

So here it goes, I hope you find it useful.

Abstract

Have you ever been in a software project where the IT staff who would run the system in production were accounted for right from the start? My guess is not very often. In fact, it’s far too rare to see functionality being built into software systems specifically to make the job of the IT administrator easier. It’s a pity, because with tools like PowerShell, pulling that off doesn’t require as much time and effort as you might think.

In this session I’ll show you how to enhance an existing .NET web application with a set of administrative tools, built using the PowerShell programming model. Once that is in place, I’ll demonstrate how common maintenance tasks can either be performed manually using a traditional GUI or be fully automated through PowerShell scripts using the same code base.

For the last few years, Microsoft itself has been committed to making all of its major server products fully administrable through both traditional GUI-based tools and PowerShell. If you’re building a server application on the .NET platform, you will soon be expected to do the same.

Resources

I love working with the command line. In fact, I love it so much that I even use it as my primary way of interacting with the source control repositories of all the projects I’m involved in. It’s a matter of personal taste, admittedly, but there’s also a practical reason for that.

Depending on what I’m working on, I regularly have to switch among several different source control systems. To give you an example, in the last six months alone I’ve been using Mercurial, Git, Subversion and TFS on a weekly basis. Instead of having to learn and get used to different UIs (whether it be standalone clients or IDE plugins), I find that I can be more productive by sticking to the uniform experience of the command line based tools.

To reinforce my point, let me show you how to check in some code in the source control systems I mentioned above:

  • Mercurial: hg commit -m "Awesome feature"
  • Git: git commit -m "Awesome feature"
  • Subversion: svn commit -m "Awesome feature"
  • TFS: tf checkin /comment:"Awesome feature"

As you can see, it looks pretty much the same across the board.

Of course, you need to be aware of the fundamental differences in how Distributed Version Control Systems (DVCS) such as Mercurial and Git behave compared to traditional centralized Version Control Systems (VCS) like Subversion and TFS. In addition to that, each system tries to distinguish itself by having its own set of features or by solving a common problem (like branching) in a unique way.
However, these aspects must be taken into consideration regardless of your client of choice.
What I’m saying is that the command line interface at least offers a single point of entry into those systems, which in the end makes me more productive.

Unified DIFFs

One of the most basic features of any source control system is the ability to compare two versions of the same file to see what’s changed. The output of such comparison, or DIFF, is commonly represented in text using the Unified DIFF format, which looks something like this:

--- a/QuoteBookTests/Classes/Models/QuoteTest.h
+++ b/QuoteBookTests/Classes/Models/QuoteTest.h
@@ -6,12 +6,10 @@
 //  Copyright 2011 Thoughtology. All rights reserved.
 //

-#import <SenTestingKit/SenTestingKit.h>
-#import <UIKit/UIKit.h>
-
 @interface QuoteTest : SenTestCase {    
 }

 - (void)testQuoteForInsert_ReturnsNotNull;
+- (void)testQuoteForInsert_ReturnsPersistedQuote;

 @end

In the Unified DIFF format changes are displayed at the line level through a set of well-known prefixes. The rule is simple:

A line can either be added, in which case it will be preceded by a + sign, or removed, in which case it will be preceded by a - sign. Unchanged lines are preceded by a single space.

In addition to that, each modified section, referred to as a hunk, is preceded by a header that indicates the position and size of the section in the original and modified file respectively. For example this hunk header:

@@ -6,12 +6,10 @@

means that in the original file the modified lines start at line 6 and continue for 12 lines. In the new file, instead, that same change starts at line 6 and includes a total of 10 lines.
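To make the structure of the header concrete, here’s a small sketch that pulls the four numbers out of a hunk header with a regular expression. The HunkHeader class and its TryParse method are names invented for this illustration. The sketch also assumes the usual Unified DIFF convention that a missing line count means 1.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative helper; the names are invented for this sketch.
public static class HunkHeader
{
    // Matches headers such as "@@ -6,12 +6,10 @@". The line counts are
    // optional and are treated as 1 when omitted.
    private static readonly Regex Pattern =
        new Regex(@"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@");

    public static bool TryParse(
        string line,
        out int oldStart, out int oldCount,
        out int newStart, out int newCount)
    {
        oldStart = oldCount = newStart = newCount = 0;

        var match = Pattern.Match(line ?? string.Empty);
        if (!match.Success)
        {
            return false;
        }

        oldStart = int.Parse(match.Groups[1].Value);
        oldCount = match.Groups[2].Success ? int.Parse(match.Groups[2].Value) : 1;
        newStart = int.Parse(match.Groups[3].Value);
        newCount = match.Groups[4].Success ? int.Parse(match.Groups[4].Value) : 1;
        return true;
    }
}
```

For the header shown above, TryParse reports a start of 6 and a count of 12 for the original file, and a start of 6 and a count of 10 for the new one.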

True Colors

At this point, you may wonder what all of this has to do with PowerShell, and rightly so. Remember when I said that I prefer to work with source control from the command line? Well, it turns out that scrolling through gobs of text in a console window isn’t always the best way to figure out what has changed between two change sets.

Fortunately, since PowerShell allows you to print text in the console window using different colors, it only took a chain of conditionals and a handful of regular expressions to turn that wall of text into something more readable. That’s how the Out-Diff cmdlet was born:

function Out-Diff {
<#
.Synopsis
    Redirects a Unified DIFF encoded text from the pipeline to the host using colors to highlight the differences.
.Description
    Helper function to highlight the differences in a Unified DIFF text using color coding.
.Parameter InputObject
    The text to display as Unified DIFF.
#>
[CmdletBinding()]
param(
    [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
    [PSObject]$InputObject
)
    Process {
        $contentLine = $InputObject | Out-String
        if ($contentLine -match "^Index:") {
            Write-Host $contentLine -ForegroundColor Cyan -NoNewline
        } elseif ($contentLine -match "^(\+|\-|\=){3}") {
            Write-Host $contentLine -ForegroundColor Gray -NoNewline
        } elseif ($contentLine -match "^\@{2}") {
            Write-Host $contentLine -ForegroundColor Gray -NoNewline
        } elseif ($contentLine -match "^\+") {
            Write-Host $contentLine -ForegroundColor Green -NoNewline
        } elseif ($contentLine -match "^\-") {
            Write-Host $contentLine -ForegroundColor Red -NoNewline
        } else {
            Write-Host $contentLine -NoNewline
        }
    }
}

Let’s break this function down into logical steps:

  1. Take whatever input comes from the PowerShell pipeline and convert it to a string.
  2. Match that string against a set of regular expressions to determine whether it’s part of the Unified DIFF format.
  3. Print the string to the console with the appropriate color: green for added, red for removed and gray for the headers.
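One thing that is easy to miss in step 2 is that the order of the checks matters: the file headers (+++ and ---) have to be tested before the single + and - prefixes, otherwise they would be colored as added or removed lines. Here’s the same decision logic sketched in C# as a pure function, which makes it easy to test in isolation. DiffColors and ForLine are names I made up for this illustration, and the patterns are my transcription of the ones used in the script.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative translation of the decision logic in Out-Diff.
public static class DiffColors
{
    // Returns the color for a single line of diff output,
    // or null when the line should keep the console's current color.
    public static ConsoleColor? ForLine(string line)
    {
        if (Regex.IsMatch(line, @"^Index:")) return ConsoleColor.Cyan;

        // File headers come first: "+++" would otherwise match the "+" rule.
        if (Regex.IsMatch(line, @"^[+\-=]{3}")) return ConsoleColor.Gray;

        // Hunk headers such as "@@ -6,12 +6,10 @@".
        if (Regex.IsMatch(line, @"^@{2}")) return ConsoleColor.Gray;

        if (Regex.IsMatch(line, @"^\+")) return ConsoleColor.Green; // added
        if (Regex.IsMatch(line, @"^-")) return ConsoleColor.Red;    // removed

        return null; // unchanged context line
    }
}
```

Keeping the classification separate from the printing is a design choice of the sketch, not something the script does: Out-Diff decides and prints in a single step.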

Pretty simple. And using it is even simpler: just load the script into your PowerShell session using dot sourcing or by adding it to your profile and redirect the output of a ‘diff’ command to the Out-Diff cmdlet through piping to start enjoying colorized DIFFs. For example the following commands:

. .\Out-Diff.ps1
git diff | Out-Diff

will generate this output in PowerShell:

The Out-Diff cmdlet in action

One thing I’d like to point out is that even if the output of git diff consists of many lines of text, PowerShell will redirect them to the Out-Diff function one line at a time. This is called a streaming pipeline and it allows PowerShell to be responsive and consume less memory even when processing large amounts of data, which is neat.
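If you’re more at home in C#, iterator methods offer a comparable way of thinking about streaming. This is only an analogy, since the PowerShell pipeline is implemented differently, but the effect is similar: each line is handed to the consumer as soon as it’s available, and nothing is read ahead of time. LineStream and ReadLines are hypothetical names used just for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative analogy; not how PowerShell implements its pipeline.
public static class LineStream
{
    // Yields one line at a time. Nothing is read from the reader
    // until the caller asks for the next item.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

A consumer that stops after the first few lines never causes the rest of the input to be read, which is what keeps memory usage flat on large inputs.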

Wrapping up

PowerShell is an extremely versatile console. In this case, it allowed me to enhance a traditional command line tool (diff) through a simple script. Other projects, like Posh-Git and Posh-Hg, take it even further and leverage PowerShell’s rich programming model to provide a better experience on top of existing console based source control tools. If you enjoy working with the command line, I seriously encourage you to check them out.

Download Out-Diff.ps1 from GitHub

When I first incorporated AutoFixture as part of my daily unit testing workflow, I noticed how a consistent usage pattern had started to emerge.
This pattern can be roughly summarized in three steps:

  1. Initialize an instance of the Fixture class.
  2. Configure the way different types of objects involved in the test should be created by using the Build<T> method.
  3. Create the actual objects with the CreateAnonymous<T> or CreateMany<T> methods.

As a result, my unit tests had started to look a lot like this:

[Test]
public void WhenGettingAListOfPublishedPostsThenItShouldOnlyIncludeThose()
{
    // Step 1: Initialize the Fixture
    var fixture = new Fixture();

    // Step 2: Configure the object creation
    var draft = fixture.Build<Post>()
        .With(a => a.IsDraft, true)
        .CreateAnonymous();
    var publishedPost = fixture.Build<Post>()
        .With(a => a.IsDraft, false)
        .CreateAnonymous();
    fixture.Register<IEnumerable<Post>>(() => new[] { draft, publishedPost });

    // Step 3: Create the anonymous objects
    var posts = fixture.CreateMany<Post>();

    // Act and Assert...
}

In this particular configuration, AutoFixture will satisfy all requests for IEnumerable<Post> types by returning the same array with exactly two Post objects: one with the IsDraft property set to True and one with the same property set to False.

At that point I felt pretty satisfied with the way things were shaping up: I had managed to replace entire blocks of boring object initialization code with a couple of calls to the AutoFixture API, my unit tests were getting smaller and all was good.

Duplication creeps in

After a while though, the configuration lines created in Step 2 started to repeat themselves across multiple unit tests. This was naturally due to the fact that different unit tests sometimes shared a common set of object states in their test scenario. Things weren’t so DRY anymore and suddenly it wasn’t uncommon to find code like this in the test suite:

[Test]
public void WhenGettingAListOfPublishedPostsThenItShouldOnlyIncludeThose()
{
    var fixture = new Fixture();
    var draft = fixture.Build<Post>()
        .With(a => a.IsDraft, true)
        .CreateAnonymous();
    var publishedPost = fixture.Build<Post>()
        .With(a => a.IsDraft, false)
        .CreateAnonymous();
    fixture.Register<IEnumerable<Post>>(() => new[] { draft, publishedPost });
    var posts = fixture.CreateMany<Post>();

    // Act and Assert...
}

[Test]
public void WhenGettingAListOfDraftsThenItShouldOnlyIncludeThose()
{
    var fixture = new Fixture();
    var draft = fixture.Build<Post>()
        .With(a => a.IsDraft, true)
        .CreateAnonymous();
    var publishedPost = fixture.Build<Post>()
        .With(a => a.IsDraft, false)
        .CreateAnonymous();
    fixture.Register<IEnumerable<Post>>(() => new[] { draft, publishedPost });
    var posts = fixture.CreateMany<Post>();

    // Different Act and Assert...
}

See how these two tests share the same initial state even though they verify completely different behaviors? Such blatant duplication in the test code is a problem, since it inhibits the ability to make changes.
Luckily a solution was just around the corner as I discovered customizations.

Customizing your way out

A customization is a pretty general term. However, put in the context of AutoFixture it assumes a specific definition:

A customization is a group of settings that, when applied to a given Fixture, control the way AutoFixture will create anonymous instances of the types requested through that Fixture.

What that means is that I could take all the boilerplate configuration code produced during Step 2 and move it out of my unit tests into a single place: a customization. That allowed me to specify only once how the different objects needed to be created for a given scenario, and to reuse that across multiple tests.

public class MixedDraftsAndPublishedPostsCustomization : ICustomization
{
    public void Customize(IFixture fixture)
    {
        var draft = fixture.Build<Post>()
            .With(a => a.IsDraft, true)
            .CreateAnonymous();
        var publishedPost = fixture.Build<Post>()
            .With(a => a.IsDraft, false)
            .CreateAnonymous();
        fixture.Register<IEnumerable<Post>>(() => new[] { draft, publishedPost });
    }
}

As you can see, ICustomization is nothing more than a role interface that describes how a Fixture should be set up. In order to apply a customization to a specific Fixture instance, you simply have to call the Fixture.Customize(ICustomization) method, as shown in the example below.
This newly won encapsulation allowed me to rewrite my unit tests in a much more terse way:

[Test]
public void WhenGettingAListOfDraftsThenItShouldOnlyIncludeThose()
{
    // Step 1: Initialize the Fixture
    var fixture = new Fixture();

    // Step 2: Apply the customization for the test scenario
    fixture.Customize(new MixedDraftsAndPublishedPostsCustomization());

    // Step 3: Create the anonymous objects
    var posts = fixture.CreateMany<Post>();

    // Act and Assert...
}

The configuration logic now exists only in one place, namely a class whose name clearly describes the kind of test data it will produce.
If applied consistently, this approach will in time build up a library of customizations, each representative of a given situation or scenario. Assuming that they are created at the proper level of granularity, these customizations could even be composed to form more complex scenarios.

Conclusion

Customizations in AutoFixture are a pretty powerful concept in and of themselves, but they become even more effective when mapped directly to test scenarios. In fact, they represent a natural place to specify which objects are involved in a given scenario and the state they are supposed to be in. You can use them to remove duplication in your test code and, in time, build up a library of self-documenting modules, which describe the different contexts in which the system’s behavior is being verified.

Now that AutoFixture 2.2 is approaching on the horizon, it’s a good time to start talking about some of the changes that were made to the underlying behavior of some existing APIs. I’ll start off this series of posts by focusing on the new generation strategy for anonymous numbers.

The good old fashioned way

Before I jump into the details of what exactly has changed and how, allow me to set the stage a little:

A key part of AutoFixture’s mission statement is to make the process of authoring unit tests faster by providing an easy way of creating test values (or “specimens”) for the variables involved in the test. The goal of providing values that are as neutral as possible to the test scenario at hand is achieved by employing “constrained non-deterministic” generation algorithms.

Put in simple terms, this essentially means that AutoFixture will come up with test values at run time that can be considered “random” within some predefined bounds. These bounds are imposed at the lowest level by the variable’s own data type: a string is a string, a number is a number and so on. More constraints, however, can be added at a higher level, based on any semantics the variable may have in the specific test scenario. For example a string can’t be longer than 20 characters or a number must be between 1 and 100.

AutoFixture comes with a set of built-in generation algorithms that can produce test values for all the primitive types included in the .NET Framework. The algorithm for numeric types has historically been based on individually incremented sequences, one for each numeric data type. Let’s look at an example that illustrates this:

var fixture = new Fixture();
Console.WriteLine("Byte specimen is {0}, {1}",
    fixture.CreateAnonymous<byte>(),
    fixture.CreateAnonymous<byte>());
Console.WriteLine("Int32 specimen is {0}, {1}",
    fixture.CreateAnonymous<int>(),
    fixture.CreateAnonymous<int>());
Console.WriteLine("Single specimen is {0}, {1}",
    fixture.CreateAnonymous<float>(),
    fixture.CreateAnonymous<float>());

// The output will be:
// Byte specimen is 1, 2
// Int32 specimen is 1, 2
// Single specimen is 1, 2

The key point here is that AutoFixture will only guarantee unique numeric specimens within the scope of a specific data type. Now, you may wonder how this would be a problem. Well, it certainly isn’t in itself, but if you asked AutoFixture to give you an anonymous instance of a class with multiple properties of different numeric types, you would get something like this:

public class NumericBag
{
    public byte ByteValue { get; set; }
    public int Int32Value { get; set; }
    public float SingleValue { get; set; }
}

var fixture = new Fixture();
var specimen = fixture.CreateAnonymous<NumericBag>();
Console.WriteLine("ByteValue property is {0}", specimen.ByteValue);
Console.WriteLine("Int32Value property is {0}", specimen.Int32Value);
Console.WriteLine("SingleValue property is {0}", specimen.SingleValue);

// The output will be:
// ByteValue property is 1
// Int32Value property is 1
// SingleValue property is 1

We can agree that the end result doesn’t exactly live up to the expectation of anonymous values being “random”. Starting from version 2.2, however, this behavior is due to change.

The fresh new way

AutoFixture has taken a different approach to numeric specimen generation and will now by default return unique values across all numeric types. Running our first example in AutoFixture 2.2 will therefore yield a very different result:

var fixture = new Fixture();
Console.WriteLine("Byte specimen is {0}, {1}",
    fixture.CreateAnonymous<byte>(),
    fixture.CreateAnonymous<byte>());
Console.WriteLine("Int32 specimen is {0}, {1}",
    fixture.CreateAnonymous<int>(),
    fixture.CreateAnonymous<int>());
Console.WriteLine("Single specimen is {0}, {1}",
    fixture.CreateAnonymous<float>(),
    fixture.CreateAnonymous<float>());

// The output will be:
// Byte specimen is 1, 2
// Int32 specimen is 3, 4
// Single specimen is 5, 6

In other words, AutoFixture is being a little more “non-deterministic” when it comes to numeric test values. Take for example the following scenario:

public class NumericBag
{
    public byte ByteValue { get; set; }
    public int Int32Value { get; set; }
    public float SingleValue { get; set; }
}

var fixture = new Fixture();
var specimen = fixture.CreateAnonymous<NumericBag>();
Console.WriteLine("ByteValue property is {0}", specimen.ByteValue);
Console.WriteLine("Int32Value property is {0}", specimen.Int32Value);
Console.WriteLine("SingleValue property is {0}", specimen.SingleValue);

// The output will be:
// ByteValue property is 1
// Int32Value property is 2
// SingleValue property is 3

See how all the numeric properties on the generated object have different values? That’s what I’m talking about.
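To make the difference between the two strategies concrete, here is a minimal sketch of how they could be modelled. The types below (INumberSource, PerTypeSequence, SharedSequence) are hypothetical names made up for this illustration and are not part of AutoFixture; they only mimic the observable behavior described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; these are not AutoFixture types.
public interface INumberSource
{
    T Next<T>() where T : struct;
}

// Old behavior: one independent counter per requested numeric type.
public sealed class PerTypeSequence : INumberSource
{
    private readonly Dictionary<Type, long> counters = new Dictionary<Type, long>();

    public T Next<T>() where T : struct
    {
        long current;
        counters.TryGetValue(typeof(T), out current); // stays 0 for a new type
        counters[typeof(T)] = ++current;
        return (T)Convert.ChangeType(current, typeof(T));
    }
}

// New behavior: a single counter shared by every numeric type.
public sealed class SharedSequence : INumberSource
{
    private long counter;

    public T Next<T>() where T : struct
    {
        return (T)Convert.ChangeType(++counter, typeof(T));
    }
}
```

Asking a PerTypeSequence for a byte, an int and a float yields the first value of three separate sequences, while a SharedSequence hands out consecutive values regardless of the type requested.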

Now, in theory, this shouldn’t be considered a breaking change. I say this because AutoFixture is all about anonymous variables, which, by definition, can’t be expected to have specific values during a test run. So, as long as you’ve played by this rule, the new behavior shouldn’t impact any of your existing tests.

However, if this does turn out to be a problem or you simply prefer the old way of doing things, you shouldn’t feel left out in the cold. The previous behavior is still in the box, packaged up in a nice customization unambiguously named NumericSequencePerTypeCustomization. The simple act of adding it to a Fixture instance will restore things the way they were:

var fixture = new Fixture();
fixture.Customize(new NumericSequencePerTypeCustomization());

If you wish to try this out today, I encourage you to go ahead and grab the latest build from AutoFixture’s project page on TeamCity. Enjoy.

I’m excited to announce that AutoFixture now officially supports delegates in the main trunk up on CodePlex.

If you aren’t familiar with AutoFixture, let me give you the pitch:

AutoFixture is an open source framework for .NET designed to minimize the ‘Arrange’ phase of your unit tests. Its primary goal is to allow developers to focus on what is being tested rather than how to setup the test scenario, by making it easier to create object graphs containing test data.

Does this sound interesting to you? In that case head over to the AutoFixture CodePlex site and find out more. You’ll be glad you did.

For those of you already familiar with AutoFixture, the newly added support for delegates means that every time AutoFixture is asked to create an anonymous instance of a delegate type (or more precisely a delegate specimen), it will actually return one, instead of throwing an exception.

So, you’ll be able to say things like:

public delegate void MyDelegate();

var fixture = new Fixture();
var delegateSpecimen = fixture.CreateAnonymous<MyDelegate>();

and get back a delegate pointing to a dynamically generated method, whose signature matches the one of the requested delegate type. In other words AutoFixture will satisfy the requests for delegates by providing a method specimen.

That’s cool, but it may leave you wondering: what on Earth does a method specimen do when it gets invoked? Well, in order to answer that question, we need to look at the signature of the delegate that was requested in the first place. The rule basically says:

  • If the signature of the requested delegate has a return value (i.e. it’s a function), the method specimen will always return an anonymous value of the return type.
  • If the signature of the requested delegate doesn’t have a return value (i.e. it’s an action) the returned method specimen will have an empty body.

This principle is best illustrated by examples. Consider the following code snippet:

var fixture = new Fixture();
var funcSpecimen = fixture.CreateAnonymous<Func<string>>();
var result = funcSpecimen();

// result = "fd95320f-0a37-42be-bd49-3afbbe089d9d"

In this example, since the signature of the requested delegate has a return value of type String, the result variable will contain an anonymous string value, which in AutoFixture usually translates into a GUID.
On the other hand, if the requested delegate didn’t have a return value, invoking the anonymous delegate would do just about nothing:

var fixture = new Fixture();
var actionSpecimen = fixture.CreateAnonymous<Action<string>>();
actionSpecimen("whatever"); // no-op

Note that in both cases any input arguments passed to the anonymous delegate will be ignored, since they don’t have any impact on the generated method specimen.
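If you are curious how a method specimen following these two rules could be produced, here is a rough sketch built on expression trees. It is not the code of the DelegateGenerator class; MethodSpecimenSketch and the createValue callback are names assumed for this illustration. One deliberate simplification: the return value is created once, when the delegate is built, so every invocation returns that same value.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper, not AutoFixture's DelegateGenerator.
public static class MethodSpecimenSketch
{
    public static Delegate Create(Type delegateType, Func<Type, object> createValue)
    {
        if (delegateType == null) throw new ArgumentNullException("delegateType");
        if (createValue == null) throw new ArgumentNullException("createValue");
        if (!typeof(Delegate).IsAssignableFrom(delegateType))
            throw new ArgumentException("A delegate type is required.", "delegateType");

        // Every delegate type exposes its signature through its Invoke method.
        MethodInfo invoke = delegateType.GetMethod("Invoke");

        // Declare matching parameters; the generated body never reads them.
        ParameterExpression[] parameters = invoke.GetParameters()
            .Select(p => Expression.Parameter(p.ParameterType, p.Name))
            .ToArray();

        Expression body;
        if (invoke.ReturnType == typeof(void))
        {
            // Action-like delegate: empty body.
            body = Expression.Empty();
        }
        else
        {
            // Function-like delegate: return a value of the return type.
            // Simplification: the value is created once, right here.
            body = Expression.Constant(createValue(invoke.ReturnType), invoke.ReturnType);
        }

        return Expression.Lambda(delegateType, body, parameters).Compile();
    }
}
```

A caller would cast the result back to the requested type, for example (Func<string>)MethodSpecimenSketch.Create(typeof(Func<string>), t => Guid.NewGuid().ToString()).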

Now, if you’re using AutoFixture from its NuGet package (which, by the way, you should) you’ll have to wait until the next release to get this feature. However, taking advantage of it with the current version of AutoFixture requires a minimal amount of effort. Just grab the DelegateGenerator.cs class from AutoFixture’s main trunk on CodePlex and include it in your project. You’ll then be able to add support for delegates to your Fixture instance by simply saying:

var fixture = new Fixture();
fixture.Customizations.Add(new DelegateGenerator());

You can even wrap that up in a Customization to make it more centralized and keep your test library DRY:

public class DelegateCustomization : ICustomization
{
    public void Customize(IFixture fixture)
    {
        if (fixture == null)
        {
            throw new ArgumentNullException("fixture");
        }

        fixture.Customizations.Add(new DelegateGenerator());
    }
}

Before finishing this off, let me give you a more concrete example that shows how this is useful in a real world scenario. Keeping in mind that delegates offer a pretty terse way to implement the Strategy Design Pattern in .NET, consider this implementation of the IEqualityComparer<T> interface:

public class EqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> equalityStrategy;
    private readonly Func<T, int> hashCodeStrategy;

    public EqualityComparer(Func<T, T, bool> equalityStrategy, Func<T, int> hashCodeStrategy)
    {
        if (equalityStrategy == null)
        {
            throw new ArgumentNullException("equalityStrategy");
        }

        if (hashCodeStrategy == null)
        {
            throw new ArgumentNullException("hashCodeStrategy");
        }

        this.equalityStrategy = equalityStrategy;
        this.hashCodeStrategy = hashCodeStrategy;
    }

    public bool Equals(T x, T y)
    {
        return equalityStrategy(x, y);
    }

    public int GetHashCode(T obj)
    {
        return hashCodeStrategy(obj);
    }
}

That’s a nice, flexible class that, by letting you specify the comparison logic in the form of delegates, is suitable for different scenarios. Before the support for delegates was added, however, having AutoFixture play along with this class in the context of unit testing would be quite problematic. The tests would, in fact, fail consistently with a NotSupportedException, since the constructor of the EqualityComparer class requires the creation of two delegates.
Luckily, this is not a problem anymore.

There was a time when all my energies and effort went into building web applications. In the beginning the platform I was on was Microsoft ASP 2.0, but since 2002 it became all about ASP.NET Web Forms.

I still clearly remember the excitement around the programming model and architecture brought by Web Forms. The new code-behind style finally made it possible to separate a web page’s layout from its code logic. The server controls programming model was built so that developers could build web pages pretending they were Windows Forms.

The promise of Web Forms was that web developers would never have to touch HTML and JavaScript ever again. They would simply have to add a bunch of .NET controls to a class, set a couple of properties, and web pages would magically appear in the browser.

Although ASP.NET Web Forms was designed with a noble goal in mind, it turned out the Forms/Controls metaphor never completely worked for the web.

Sure, the Web Forms model boosted productivity compared to previous technologies like ASP. However, it never succeeded in shielding developers from having to deal with HTML, CSS and JavaScript. That essentially meant ignoring the very basic elements of the web.

This is a story of how I was painfully reminded of this reality.

ASP.NET and the Ajax Control Toolkit

At some point in time the Web became all about rich user interaction. One way to achieve this was through the power of asynchronous HTTP requests made through JavaScript, which returned XML data. This combination is commonly referred to as Ajax.
When Ajax gained widespread popularity, Microsoft built upon the Web Forms model to enable developers to leverage this new programming paradigm, once more with the promise of never having to touch a line of JavaScript.

This effort culminated in a new infrastructure made available in the .NET Framework 3.5 and a collection of Ajax-enabled server controls. Once dragged into a Web Forms page, these controls would instantly deliver rich functionality by emitting the required JavaScript code to make it happen.

Contrary to what had been done in the past, the new Ajax controls were not made part of an official version of the .NET Framework; only the underlying framework support they needed in order to work was. The controls themselves were released as an open-source project called the ASP.NET Ajax Control Toolkit, hosted on the Microsoft CodePlex site.

The illusion

After having been away from web development for almost three years, I’ve lately been involved in a project to build a web application. Of course the customer expected a modern and interactive web application, which meant we were going to be using Ajax on the frontend to some extent.

After having brought myself up to speed on the latest innovations around Ajax in ASP.NET 3.5, I was excited at the idea of being able to deliver that kind of functionality on a web page without having to handcraft (and debug) gobs of JavaScript. Or at least, so I thought.

Facing reality

I have to admit that the Ajax support in ASP.NET 3.5 held up to my high expectations quite well. Up until the Web Forms metaphor leaked again and I was roughly brought back to earth.

It turns out the ComboBox control contained in the Ajax Control Toolkit has a nasty bug that manifested itself for me when I used it inside a TabPanel control (also part of the same library).

Here is what happens: the first screenshot shows a ComboBox control inside a TabPanel that is visible when the page loads for the first time. Below is another ComboBox control this time hosted in a second TabPanel that is initially hidden.

[Screenshot: ASP.NET Ajax ComboBox control working]
[Screenshot: ASP.NET Ajax ComboBox control broken]

The second definitely doesn’t look right. After a quick check of the documentation available online, I couldn’t find anything I was doing wrong when using the controls. The only possible explanation was that there must be a bug in the JavaScript generated by the ComboBox. Let me just check. Yes, here it is.

Apparently there is currently no plan from Microsoft to fix this issue anytime soon. That could mean only one thing: I had to dig in and debug the JavaScript myself. The Web Forms’ bubble had burst once again.

The “pragmatic” workaround

After having downloaded the AJAX Control Toolkit source code off CodePlex, I started to look around among the project files. I quickly identified that the JavaScript code for the ComboBox control is all contained in a single file found in /AjaxControlToolkit/ComboBox/ComboBox.js (actually the ComboBox.debug.js file contains the original source code while its ComboBox.js counterpart contains the minified JavaScript optimized for production).

The general design of the client-side Ajax framework and controls built by Microsoft makes a lot of sense and the source code is well organized. This allowed me to quickly arrive at the root of the problem, which is:

The ComboBox control calculates its size (width and height) during initialization relative to the size of its parent container. If the parent container is hidden when it gets measured, the returned size will be zero. That means the ComboBox has nothing to calculate its own size against, and it ends up looking the way it does.

Without having to dig too much into the inner workings of the ComboBox, I came up with the simplest possible solution to the problem:

We need to make sure that the ComboBox’s parent container is visible during the control’s initialization phase. That way the ComboBox’s size can correctly be calculated and assigned. Afterwards we can restore the parent container to its original state.

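In order to achieve this I added the following code to the ComboBox.debug.js file: the two blocks surrounding the initialization calls in the initialize function, and the _findHiddenParent helper function.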

AjaxControlToolkit.ComboBox.prototype = {

    initialize: function() {

        AjaxControlToolkit.ComboBox.callBaseMethod(this, 'initialize');

        // Workaround for issue #24251
        // http://ajaxcontroltoolkit.codeplex.com/WorkItem/View.aspx?WorkItemId=24251
        var hiddenParent = this._findHiddenParent(this.get_comboTableControl());
        var hiddenParentDisplay;

        if (hiddenParent != null) {
            hiddenParentDisplay = hiddenParent.style.display;
            hiddenParent.style.visibility = "visible";
            hiddenParent.style.display = "block";
        }

        this.createDelegates();
        this.initializeTextBox();
        this.initializeButton();
        this.initializeOptionList();
        this.addHandlers();

        if (hiddenParent != null) {
            hiddenParent.style.visibility = "hidden";
            hiddenParent.style.display = hiddenParentDisplay;
        }

    },
    _findHiddenParent: function(element) {

        var parent = element.parentElement;

        if (parent == null || parent.style.visibility == "hidden") {
            return parent;
        }

        return this._findHiddenParent(parent);

    }

Yes I know this isn’t the most elegant solution, but it works. After all, I said it was going to be pragmatic.

Once I made sure the patch worked correctly, I used the freely available Microsoft Ajax Minifier to produce a new ultra-compact (or minified) version of the ComboBox.js file.

Integrating the workaround into the solution

The workaround itself may not be a piece of art. However the way it got integrated into the existing ASP.NET web application is quite elegant in my opinion. Let me give a quick background first.

With ASP.NET Web Forms 3.5 Microsoft introduced a new mechanism for delivering JavaScript content into web pages. This is done by a specialized server control called the ScriptManager.

All controls that need some piece of JavaScript code in order to work have to register the required scripts with the ScriptManager. Its responsibility is to make sure that the links to the appropriate resources are ultimately included in the page output.

The ScriptManager obviously plays a central role in the ASP.NET Ajax infrastructure. It also has some great features of its own. In this case I’m referring to the possibility of substituting a JavaScript resource required by a server control with a local resource. Scott Hanselman wrote a great article explaining how to take advantage of this feature, which served me well in this case.

All JavaScript files contained in the Ajax Control Toolkit are statically compiled into the AjaxControlToolkit.dll assembly. The only way to replace the original ComboBox.js file with the patched one, without having to deploy a recompiled version of the library, was therefore to substitute the original reference within the ScriptManager and have it point to a local version of the file.

Here is how it was done:

<asp:scriptmanager id="scmScriptManager" runat="server">
    <scripts>
        <asp:scriptreference path="~/UI/Scripts/ComboBox.js" name="AjaxControlToolkit.ComboBox.ComboBox.js" assembly="AjaxControlToolkit, Version=3.0.30512.20315, Culture=neutral, PublicKeyToken=28f01b0e84b6d53e" />
    </scripts>
</asp:scriptmanager>

What about the original ComboBox.debug.js file? Well, the ScriptManager is smart enough to deliver the appropriate version of the file whenever debugging is enabled in the web application’s configuration file. This will work automatically as long as both files are located in the same folder on the server and are named according to the following convention:

  • Original: filename.debug.js
  • Optimized: filename.js

You can download the modified JavaScript files from the link at the end of this page. Note that they are based on and will work with the Ajax Control Toolkit release 3.0.30512.

Conclusions

The Ajax support built into ASP.NET 3.5 Web Forms together with the controls freely available in the Ajax Control Toolkit is a powerful combination. When used wisely it will allow you to get quite far in creating rich and interactive web pages without having to worry about JavaScript.

However we all know that software abstractions are leaky, and this is especially true for the one that is ASP.NET Web Forms. That means that sooner or later you will have to take control of what’s being sent down to the browser, whether it be the HTML markup, CSS stylesheets or JavaScript code. And when that time comes, you’d better be prepared.

Download: Ajax Control Toolkit ComboBox JavaScript files

/Enrico

It’s interesting how a lot of the work I’ve been doing lately has in some way involved a kind of performance tuning. Previously I’ve talked about how to increase the performance of .NET applications by using delegates instead of reflection in code that runs frequently.

This time it is all about performance in database processing.

The scenario

Imagine an application that manages a wait list. Users of this application put themselves in line and wait for their turn to gain access to some kind of shared resource. Here are the basic rules of this system:

  • The same user can appear in the wait list multiple times, once for every resource she is queuing for.
  • The users’ position in the wait list at any given time is decided by a score.
  • This score is calculated based on the number of credits each user has in the system compared to the amount required by the resource they wish to access.

Let’s say that this wait list is modeled in a Microsoft SQL Server database with the following schema:

[Figure: WaitList database schema]

The position of the different users in the wait list is periodically updated by a Stored Procedure that calculates the current score for each and every row in the WaitList table.

So far so good. Now, imagine that this WaitList table contains somewhere around 30 million rows, and the Stored Procedure that updates all of the scores takes about 9 hours to complete. And now we have a problem.

The imperative SQL approach

Before going into all kinds of general database optimization techniques, let’s start off by looking at how that Stored Procedure is implemented.
Here is a slightly simplified version of it:

CREATE PROCEDURE CalculateWaitListScores_Imperative
AS
BEGIN

DECLARE @rowsToCalculate INT

SELECT @rowsToCalculate = COUNT(*)
FROM WaitList
WHERE Score IS NULL

WHILE ( @rowsToCalculate > 0 )
BEGIN

  DECLARE @userID INT
  DECLARE @resourceID INT
  DECLARE @score INT

  SELECT TOP 1 @userID = UserID, @resourceID = ResourceID
  FROM WaitList
  WHERE Score IS NULL

  -- The actual calculation of the score is omitted for clarity.
  -- Let's just say that it involves a SELECT query that joins
  -- the [WaitList] table with the [User] and [Resource] tables
  -- and applies a formula that associates the values
  -- of the [Credit] columns in each of them.
  -- For the sake of this example we just set it to a constant value
  SET @score = 150

  UPDATE WaitList
  SET Score = @score
  WHERE UserID = @userID
  AND ResourceID = @resourceID
  AND Score IS NULL

  SELECT @rowsToCalculate = COUNT(*)
  FROM WaitList
  WHERE Score IS NULL

END

END

If you aren’t into the Transact-SQL language syntax, let me spell out the algorithm for you:

  1. Get the number of rows in the WaitList table where the score has never been calculated
  2. If there are any such rows, get the user and the resource IDs for the first row in the WaitList table where the score has never been calculated
  3. Calculate the score for that user and resource
  4. Update the score with the newly calculated value
  5. Go to Step 1

In the worst case, this set of operations will be repeated 30 million times, that is once for every row in the WaitList table. Think about it for a moment.

While looking at this code, I immediately imagined this dialogue taking place between SQL Server and the developer(s) who wrote the Stored Procedure:

Developer: Listen up, SQL Server. I want you to calculate a new score and update all of those 30 million rows, but do it one row at a time.

SQL Server: That’s easy enough, but I’m pretty sure I can find a faster way to do this, if you’ll let me.

Developer: No, no. I want you to do exactly what I said. That way it’s easier for me to understand what’s going on and debug if any problem occurs.

SQL Server: Alright, you’re the boss.

Jokes aside, the bottom line here is this:

By implementing database operations in an imperative manner, you effectively tie up the hands of the query execution engine, thus preventing it from performing a number of optimizations at runtime in order to speed things up.

And that basically means trading performance and scalability for more fine-grained control.

The declarative SQL approach

Let’s see if we can make this Stored Procedure run any faster, by changing our approach to the problem altogether.

This time, we’ll tell the database what we want done in a declarative manner, and we’ll let the query execution engine figure out the best way to get the job done.

Here is a rewritten version of the original Stored Procedure:

CREATE PROCEDURE CalculateWaitListScores_Declarative
AS
BEGIN

UPDATE WaitList
SET Score = dbo.CalculateScore(UserID, ResourceID)
WHERE Score IS NULL
	
END

What we did was basically remove the explicit loop and merge all operations into a single UPDATE statement executed on the WaitList table, which invokes a custom scalar function (CalculateScore) to calculate the score from the values of the current row.

Now, let’s look at some performance comparison:

[Chart: execution time of the imperative and declarative Stored Procedures]

That’s a pretty significant leap in speed. How is that possible? A look at the CPU usage on the database server while running the two versions of the Stored Procedure pretty much explains it all:

CPU usage while executing CalculateWaitListScores_Imperative:

[Screenshot: CPU usage with imperative SQL]

CPU usage while executing CalculateWaitListScores_Declarative:

[Screenshot: CPU usage with declarative SQL]

As you see, in the first picture the CPU is steadily at 9-10% and is basically using only one out of four available cores. This is because SQL Server is forced to do its work sequentially and has to wait until the score for the current row has been calculated and updated before proceeding to the next.

In the second picture, we are simply telling SQL Server our intent, rather than dictating exactly how it should be done. This allows SQL Server to parallelize the workload, which can now be executed on multiple CPUs/cores at once, leveraging the full power of the hardware.

Lessons learned

Here are a few takeaways from this exercise:

  1. SQL is a declarative language at its core, designed to work with sets of rows. That’s what it does best and that’s how you should use it.
  2. Whenever possible, try to avoid applying an imperative programming mindset when implementing database operations, even if the constructs available in SQL-derived languages like T-SQL make it easy to do so.
  3. Don’t be afraid to give up some control over what happens at runtime when your database code runs. Let the database find out the best way to do things, and get ready to experience some great performance improvements.

Hope this helps.

/Enrico

Lately I have been involved in the performance profiling of a Windows client application, which customers had complained was way too slow for their taste.

The application was originally developed a couple of years ago on top of the .NET Framework 2.0. Its user interface is built using Windows Forms and it retrieves its data from a remote server through ASMX Web Services.

Everything worked just fine from a functionality standpoint. However customers complained over long delays as data was being retrieved from the Web Services and slowly populated the widgets on the screen.
Something had to be done to speed things up.

Reflection is a bottleneck

A look with a .NET profiler tool such as JetBrains DotTrace revealed that a lot of time was spent sorting large collections of objects by the value of one of their properties. This would typically be done before binding them to various list controls in the UI.
The code would typically look like this and was spread out all over the code base:

// Retrieves the entire list of customers from the DAL
List<Customer> customerList = CustomerDAO.GetAll();

// Sorts the list of 'Customer' objects
// by the value of the 'FullName' property
customerList.Sort(new PropertyComparer<Customer>("FullName"));

// Binds the list to a ComboBox control for display
cmbCustomers.DataSource = customerList;

Apparently the call to Sort was the one that took forever to execute. Now, since the sorting algorithm used in the List<T>.Sort method can’t be changed from outside the class, the weak link here must be the PropertyComparer. But what is it doing? Well, here it is:

using System;
using System.Collections.Generic;
using System.Reflection;

namespace Thoughtology.Samples.Collections
{
    public class PropertyComparer<T> : IComparer<T>
    {
        private string propertyName;
    
        public PropertyComparer(string propertyName)
        {
            this.propertyName = propertyName;
        }
    
        public int Compare(T x, T y)
        {
            Type targetType = x.GetType();

            PropertyInfo targetProperty = targetType.GetProperty(propertyName);
    
            string xValueText = targetProperty.GetValue(x, null).ToString();
            string yValueText = targetProperty.GetValue(y, null).ToString();
     
            int xValueNumeric = Int32.Parse(xValueText);
            int yValueNumeric = Int32.Parse(yValueText);
    
            if (xValueNumeric < yValueNumeric)
            {
                return -1;
            }
            else if (xValueNumeric == yValueNumeric)
            {
                return 0;
            }
            else
            {
                return 1;
            }
        }
    }
}

Likely not the prettiest code you have ever seen. However, it’s pretty easy to see what it’s doing:

  1. Extracts the value of the specified property from the input objects using reflection.
  2. Converts that value to a String.
  3. Parses the converted value to an Integer.
  4. Compares the numeric values to decide which one is bigger.

That seems like a lot of extra work for a simple value comparison to me.

I’m sure the method was built that way for a reason. This IComparer class is designed to be “generic” and work on any type of value on any object. However my guess is that it won’t work with anything but primitive types (numbers, strings and booleans). In fact the default implementation of the Object.ToString() method (which Compare calls on both property values) returns the fully qualified name of the class, and that usually isn’t much of a sorting criterion.

The real performance bottleneck here is caused by the use of reflection inside of a method that is called hundreds if not thousands of times from all over the application.
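If you want to get a feel for the difference on your own machine, a small timing harness along these lines can help. This is only a sketch: the Customer class is a stand-in with a single property, and PropertyAccessTiming and its methods are names invented for the example. The numbers will vary from machine to machine, so none are shown here.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical stand-in for the application's Customer class.
public class Customer
{
    public string FullName { get; set; }
}

public static class PropertyAccessTiming
{
    // Looks the property up by name on every call, like the original comparer.
    public static string ReadWithReflection(Customer customer, string propertyName)
    {
        PropertyInfo property = customer.GetType().GetProperty(propertyName);
        return (string)property.GetValue(customer, null);
    }

    // Lets the caller supply the code that reads the value.
    public static string ReadWithDelegate(Customer customer, Func<Customer, string> selector)
    {
        return selector(customer);
    }

    // Runs the action the given number of times and returns the elapsed time.
    public static TimeSpan Measure(Action action, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling Measure once with a lambda that uses ReadWithReflection and once with one that uses ReadWithDelegate, over something like a million iterations, should make the cost of the repeated lookups visible.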

Use delegates instead

At this point it is clear that we need to refactor this class to improve its performance and still retain its original functionality, that is to provide a generic way to compare objects by the value of one of their properties.

The key is to find a better way to retrieve the value of a property from any type of object without having to use reflection.

Well, since we do know the type of the objects we are comparing through the generic parameter T, we could let the caller specify which value to compare the objects by. This can be done by having the caller pass a reference to a method, which would return that value when invoked inside of the Compare method. Let’s try it and see how it works.

Implementing the solution in .NET 2.0

Since the application was on .NET 2.0, we need to define our own delegate type that will allow callers to pass the reference to a method returning the comparable value. Here is the complete implementation of the refactored PropertyComparer class:

using System;
using System.Collections.Generic;

namespace Thoughtology.Samples.Collections
{
    public class PropertyComparer<T> : IComparer<T>
    {
        public delegate IComparable ComparableValue(T arg);

        public PropertyComparer(ComparableValue propertySelector)
        {
            this.PropertySelector = propertySelector;
        }

        public ComparableValue PropertySelector { get; set; }

        public int Compare(T x, T y)
        {
            if (this.PropertySelector == null)
            {
                throw new InvalidOperationException("PropertySelector cannot be null");
            }

            IComparable firstValue = this.PropertySelector(x);
            IComparable secondValue = this.PropertySelector(y);

            return firstValue.CompareTo(secondValue);
        }
    }
}

Our delegate, called ComparableValue, takes an object of the generic type T as input and returns a value to compare that object by.

The comparison itself is then performed by the returned value, by invoking the IComparable.CompareTo method on it (the last statement of the Compare method).

All primitive types in .NET implement the IComparable interface. Custom objects can easily be compared in a semantically meaningful way by manually implementing that same interface.
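As an illustration of that last point, here is a minimal sketch of a custom type that implements IComparable by hand. The Money class is purely hypothetical and doesn’t come from the application discussed in this post.

```csharp
using System;

// Hypothetical type used only for illustration.
public sealed class Money : IComparable
{
    private readonly decimal amount;

    public Money(decimal amount)
    {
        this.amount = amount;
    }

    public decimal Amount
    {
        get { return amount; }
    }

    public int CompareTo(object obj)
    {
        // By convention, any instance sorts after null.
        if (obj == null)
        {
            return 1;
        }

        Money other = obj as Money;
        if (other == null)
        {
            throw new ArgumentException("Object must be of type Money.", "obj");
        }

        // Two Money values are ordered by their amounts.
        return amount.CompareTo(other.amount);
    }
}
```

A property of this type could then be returned from the selector delegate and compared just like a string or a number.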

The caller can now invoke the Sort method, specifying the property to compare the items by with an anonymous delegate:

customerList.Sort(new PropertyComparer<Customer>(delegate(Customer c)
    {
        return c.FullName;
    }));

Notice how the property name is no longer passed as a string. Instead the property is actually invoked on the object, which gives us compile-time type checking.

Alternative implementation in .NET 3.5

This same solution can be implemented slightly differently in .NET 3.5 by taking advantage of the built in Func<T,TResult> delegate type:

using System;
using System.Collections.Generic;

namespace Thoughtology.Samples.Collections
{
    public class PropertyComparer<T> : IComparer<T>
    {
        public PropertyComparer(Func<T, IComparable> propertySelector)
        {
            this.PropertySelector = propertySelector;
        }

        public Func<T, IComparable> PropertySelector { get; set; }

        public int Compare(T x, T y)
        {
            if (this.PropertySelector == null)
            {
                throw new InvalidOperationException("PropertySelector cannot be null");
            }

            IComparable firstValue = this.PropertySelector(x);
            IComparable secondValue = this.PropertySelector(y);

            return firstValue.CompareTo(secondValue);
        }
    }
}

Great, this saved us exactly one line of code.
Don’t worry, things get much nicer on the caller’s side where the anonymous delegate is substituted by a much more compact lambda expression:

customerList.Sort(new PropertyComparer<Customer>(c => c.FullName));

The results

Now that we put reflection out of the picture, it is a good time to run a simple test harness to see how the new comparison strategy performs. For this purpose we will sort an increasingly large collection of objects with the two PropertyComparer implementations and compare how long it takes to complete the operation. Here are the results in a graph:

[Chart: sorting time with reflection and with delegates]

As you can see, with delegates the sorting time grows slowly as the collection gets bigger. With reflection it climbs much more steeply: a sort typically performs on the order of n log n comparisons, and each of them now pays the cost c of the reflection calls, so the total time grows like c · n log n.
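To see why the cost of a single comparison matters so much, it helps to count how many times the comparer is actually invoked during a sort. The sketch below wraps any IComparer<T> in a counting decorator; CountingComparer and ComparisonCounter are names made up for this example and are not part of the downloadable test harness.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical decorator that counts how often Compare is called.
public sealed class CountingComparer<T> : IComparer<T>
{
    private readonly IComparer<T> inner;

    public CountingComparer(IComparer<T> inner)
    {
        if (inner == null)
        {
            throw new ArgumentNullException("inner");
        }

        this.inner = inner;
    }

    public int Comparisons { get; private set; }

    public int Compare(T x, T y)
    {
        Comparisons++;
        return inner.Compare(x, y);
    }
}

public static class ComparisonCounter
{
    // Sorts a list of pseudo-random integers and reports the number of comparisons.
    public static int CountComparisons(int itemCount)
    {
        Random random = new Random(42); // fixed seed, so runs are repeatable
        List<int> items = new List<int>(itemCount);
        for (int i = 0; i < itemCount; i++)
        {
            items.Add(random.Next());
        }

        CountingComparer<int> comparer = new CountingComparer<int>(Comparer<int>.Default);
        items.Sort(comparer);
        return comparer.Comparisons;
    }
}
```

Running CountComparisons with growing collection sizes shows the number of comparisons rising faster than the number of items, and every one of those calls is where the reflection-based comparer repeats its lookups.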

Lessons learned

This exercise teaches three general guidelines that can be applied when programming in .NET:

  • Reflection is expensive. Use it sparingly and avoid it whenever possible in code that is executed very often, such as loops.
  • Generic delegates let you build flexible code in a fast and strongly-typed fashion. This can be achieved by letting callers “inject” custom code into an algorithm by passing a delegate as an argument to a method. The code referred to by the delegate will then be executed at the appropriate stage in the algorithm inside the method.
  • When reflection is used to dynamically invoke members on a class, the same thing can be achieved by using generic delegates instead, as demonstrated in this article. This technique is widely used by modern isolation frameworks such as Rhino Mocks, Moq and TypeMock Isolator.

Download: Sort Test Harness Sample

/Enrico
