Archive for the 'Programming' Category

Refactoring To Functional Code

In my previous post I explained that some problems are better suited to a functional approach than traditional, imperative code. I showed an example of a problem that suits a functional approach and demonstrated turning several lines of imperative code with nested for-each loops and if-statements into a single line of functional code. Paul Harrington posted a comment asking what individual steps I took to refactor the code. Here’s my explanation of the refactoring:

First, here is the original, imperative method.

public int[] Translate(string word)
{
    ArrayList numbers = new ArrayList();

    foreach (char character in word)
    {
        foreach(KeyValuePair<int, char[]> key in keys)
        {
            foreach(char c in key.Value)
            {
                if(c == character)
                {
                    numbers.Add(key.Key);
                }
            }
        }
    }

    return (int[]) numbers.ToArray(typeof (int));
}

As you can see, it takes some effort to determine what’s really going on. From this code we can determine that:

For each character in a word, select the first key in the list of keys that contains the character and add it to an array of numbers to return.

Now, let’s express the statement above as functional code:

To select each character in a word, we treat the string as a sequence of characters. A string already implements IEnumerable<char>, so LINQ methods can be called on it directly. The LINQ Select method projects each element of a sequence into a new form. Here the projection simply returns each character unchanged:

IEnumerable<char> characters = word.Select(c => c);

For each character, we need to find the corresponding key that contains the character. The First method returns the first element of a sequence:

int key = keys.First().Key;

In our case we need the first key in the list of keys that contains the character. The First method has an overload that returns the first element in a sequence that satisfies a specified condition. To specify our condition we use the Contains method, which determines whether a sequence contains a specified element:

char c = 'a';
int number = keys.First(k => k.Value.Contains(c)).Key;

When we combine the code that selects each character with the code that gets the first key containing the character, we end up with this:

IEnumerable<int> numbers = word.Select(c =>
        keys.First(k => k.Value.Contains(c)).Key);

Finally, we use the ToArray method to return an array of type int. Now we have our final, refactored method:

public int[] Translate(string word)
{
    return word.Select(c =>
        keys.First(k => k.Value.Contains(c)).Key).ToArray();
}
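
One caveat when comparing the two versions: they are not quite equivalent for input the keypad does not cover. The nested loops silently skip a character that has no key (a digit, punctuation, or an upper-case letter, since the dictionary holds lower-case letters only), whereas First throws an InvalidOperationException when nothing matches. The sketch below shows one possible way to keep the skipping behaviour in a functional style. The class name KeypadTranslator and the method names TranslateStrict and TranslateLenient are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeypadTranslator
{
    // Same keypad layout as the keys dictionary used in these posts.
    private static readonly Dictionary<int, char[]> Keys =
        new Dictionary<int, char[]>
        {
            {1, new char[] {}},
            {2, new[] {'a', 'b', 'c'}},
            {3, new[] {'d', 'e', 'f'}},
            {4, new[] {'g', 'h', 'i'}},
            {5, new[] {'j', 'k', 'l'}},
            {6, new[] {'m', 'n', 'o'}},
            {7, new[] {'p', 'q', 'r', 's'}},
            {8, new[] {'t', 'u', 'v'}},
            {9, new[] {'w', 'x', 'y', 'z'}},
            {0, new[] {' '}},
        };

    // The refactored method from above: First throws
    // InvalidOperationException if no key contains the character.
    public static int[] TranslateStrict(string word)
    {
        return word.Select(c =>
            Keys.First(k => k.Value.Contains(c)).Key).ToArray();
    }

    // Hypothetical variant: characters with no matching key are skipped,
    // which mirrors what the nested for-each loops do.
    public static int[] TranslateLenient(string word)
    {
        return word.SelectMany(c =>
            Keys.Where(k => k.Value.Contains(c)).Select(k => k.Key)).ToArray();
    }
}
```

Which behaviour is preferable depends on the caller: throwing surfaces bad input early, while skipping matches the original method exactly.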

I hope this helps to explain the steps I took to refactor imperative code to functional code. You can get really clever with functional code, but remember, readability is what’s most important. Sometimes it’s best to stick with good old-fashioned for-loops and if-statements, but for some problems, like above, a functional approach can lead to more readable, clean and concise code.

An Example of Functional vs Imperative Programming in C#

Last week I went to a talk presented by Mike Wagg and Mark Needham from ThoughtWorks on Mixing Functional and Object-Oriented Approaches to Programming in C#. Mike and Mark discussed using a functional approach with LINQ to solve problems in C#.

I have come to realise lately that some problems are much better suited to a functional approach than traditional imperative programming. Many problems that involve selecting, filtering or performing actions on a list of items are best suited to functional programming, which can significantly reduce the amount of code required to solve the problem.

Here is a simple example of a problem that is best solved with a functional rather than an imperative approach.

Some businesses advertise their phone number as a word, phrase or combination of numbers and alpha characters. This is easier for people to remember than a number. You simply dial the numbers on the keypad that correspond to the characters. For example, “1-800 FLOWERS” translates to 1-800 3569377.

We will write a simple program that translates a word into a list of corresponding numbers.

[Image: phone keypad]

First, let’s start with a dictionary that contains each number and the corresponding characters:

private readonly Dictionary<int, char[]> keys =
            new Dictionary<int, char[]>()
                {
                    {1, new char[] {}},
                    {2, new[] {'a', 'b', 'c'}},
                    {3, new[] {'d', 'e', 'f'}},
                    {4, new[] {'g', 'h', 'i'}},
                    {5, new[] {'j', 'k', 'l'}},
                    {6, new[] {'m', 'n', 'o'}},
                    {7, new[] {'p', 'q', 'r', 's'}},
                    {8, new[] {'t', 'u', 'v'}},
                    {9, new[] {'w', 'x', 'y', 'z'}},
                    {0, new[] {' '}},
                };

Next, we create a Translate method that takes a word and returns an array of corresponding numbers.

With a traditional, imperative approach, we would use for-each loops and if-statements to iterate through characters and populate an array of matching numbers:

public int[] Translate(string word)
{
    ArrayList numbers = new ArrayList();

    foreach (char character in word)
    {
        foreach(KeyValuePair<int, char[]> key in keys)
        {
            foreach(char c in key.Value)
            {
                if(c == character)
                {
                    numbers.Add(key.Key);
                }
            }
        }
    }

    return (int[]) numbers.ToArray(typeof (int));
}

Alternatively, with a functional approach we can use LINQ to select elements from the dictionary and transform the output to an array of matching numbers:

public int[] Translate(string word)
{
    return word.Select(c => 
        keys.First(k => k.Value.Contains(c)).Key).ToArray();
}

And there you have it. Several lines of nested for-each loops replaced with a single line of succinct functional code. Much nicer!

Internal And External Collaborators

The Single Responsibility Principle (SRP) states that every object should have a single responsibility, and that all its services should be aligned with that responsibility. By separating object responsibilities we are able to achieve a clear Separation of Concerns. Objects need to collaborate with each other in order to perform a behaviour. I find I use two distinct styles of collaboration with other objects, which I have called internal and external collaborators.

The principles of object collaboration are nothing new, but I have found that defining these roles has helped me to better understand how to design and test the behaviour of objects.

External Collaborators

External collaborators are objects that provide a service or resource to an object but are not directly controlled by the object. They are passed to an object by dependency injection or through a service locator. An object makes calls to its external collaborators to perform actions or retrieve data. When testing an object, any external collaborators are stubbed-out. We can then write tests that perform an action, then determine if the right calls were made to the external object.

Examples of external collaborator objects include: services, repositories, presenters, framework classes, email senders, loggers and file system wrappers.

We are interested in testing how we interact with the external collaborator and not how it affects the behaviour of our object. For example, if we are testing a controller that retrieves a list of customers from a repository, we want to know that we have asked the repository for a list of customers, but we are not concerned that the repository returns the correct customers (this is a test for the repository itself). Of course, we might need some particular customer objects returned by the stub for the purpose of testing the behaviour of the controller. These customer objects then become internal collaborators, which we’ll come to next.

External collaborators can be registered with an IoC container to manage the creation and lifecycle of the object, and to provide an instance to dependent objects.
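
As a rough illustration of the controller and repository example, here is a minimal sketch using a hand-rolled stub rather than a mocking framework. All of the type names (Customer, ICustomerRepository, CustomerController, StubCustomerRepository, CustomerControllerTests) are invented for this example and are not taken from any real project.

```csharp
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; } = string.Empty;
}

// External collaborator: injected into the controller, not created by it.
public interface ICustomerRepository
{
    IList<Customer> GetAll();
}

public class CustomerController
{
    private readonly ICustomerRepository repository;

    public CustomerController(ICustomerRepository repository)
    {
        this.repository = repository;
    }

    public IList<Customer> List()
    {
        return repository.GetAll();
    }
}

// Hand-rolled stub: records how it was called and returns canned data.
public class StubCustomerRepository : ICustomerRepository
{
    public int GetAllCalls { get; private set; }

    public IList<Customer> GetAll()
    {
        GetAllCalls++;
        return new List<Customer> { new Customer { Name = "Ada" } };
    }
}

public static class CustomerControllerTests
{
    // Returns true when the controller asked the repository exactly once.
    public static bool ListAsksRepositoryForCustomers()
    {
        var stub = new StubCustomerRepository();
        var controller = new CustomerController(stub);

        controller.List();

        return stub.GetAllCalls == 1;
    }
}
```

The test only checks the interaction with the repository. Whether a real repository returns the right customers is a matter for the repository's own tests.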

Internal Collaborators

Internal collaborators have a smaller scope than external collaborators. They are used in the context of the local object to provide functions or hold state. An object and its internal collaborators work together closely and should be treated as a single unit of behaviour.

Examples of internal collaborators include: DTOs, domain entities, view-models, utilities, system types and extension methods.

When testing an object with internal collaborators, we are interested in the effect on behaviour, not the interaction with the object. Therefore we shouldn’t stub-out internal collaborators. We don’t care how we interact with them, just that the correct behaviour occurs.

These objects are not affected by external influences, such as a database, email server, or file system. They are also not volatile or susceptible to environmental changes, such as a web request context. Therefore, they should not require any special context setup before testing.

We aren’t passed an instance of an internal collaborator through dependency injection. Instead, it may be handed to us by an external collaborator (e.g. a repository returning an entity), or we create an instance within our own object when we need it (such as a DTO).
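
To make the contrast concrete, here is a small sketch in which the object under test works with two internal collaborators: an entity passed in and a DTO it creates itself. The names OrderLine, OrderSummary and OrderSummariser are hypothetical and exist only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Internal collaborator: a plain entity with no external dependencies.
public class OrderLine
{
    public OrderLine(decimal unitPrice, int quantity)
    {
        UnitPrice = unitPrice;
        Quantity = quantity;
    }

    public decimal UnitPrice { get; private set; }
    public int Quantity { get; private set; }
}

// Internal collaborator: a DTO created by the object under test.
public class OrderSummary
{
    public int LineCount { get; set; }
    public decimal Total { get; set; }
}

public class OrderSummariser
{
    public OrderSummary Summarise(IEnumerable<OrderLine> lines)
    {
        var list = lines.ToList();

        return new OrderSummary
        {
            LineCount = list.Count,
            Total = list.Sum(l => l.UnitPrice * l.Quantity)
        };
    }
}
```

A test for OrderSummariser would build real OrderLine objects and assert on the LineCount and Total of the returned summary. Nothing is stubbed, because only the resulting behaviour matters.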

By understanding the roles and responsibilities of collaboration between objects, our design becomes clearer and tests are more focused and easier to maintain.

[Image: object collaborators]

Becoming a Polyglot Programmer

We .NET developers have always had the security blanket of a general purpose language like C# or VB.NET that we’ve been able to use for pretty much anything we need. However, it is becoming increasingly important for a developer to know several languages covering different paradigms and to have the ability to choose the best language for the problem at hand.

The Pragmatic Programmer recommends learning a new language every year. I’m currently taking that to the extreme and have several languages on the go.

C

I never learned C. I went straight from a basic understanding of Java to ASP/VBScript (ah, those were so not the days). So now I’m playing catch-up, learning about exciting things like pointers and memory allocation – all things that I have heard about many times, but never fully understood.

I’m currently reading The C Programming Language, which is a very concise (only 274 pages) and superbly written book. If you know C#, then most of what’s covered will not be new. In fact, C is nowhere near as intimidating as I thought it would be, and it’s really made me appreciate the things I took for granted in modern C-based languages.

Haskell

Functional languages are hitting the mainstream, big time. I learned basic functional language principles using Haskell at Uni over 10 years ago. Back then it was pretty much an academic language that complemented a mathematics paper I was doing. These days, functional languages offer a promising solution to concurrency problems with software running on multi-core processors. Haskell is a purely functional language, which means you cannot sneak in a for-each loop when the going gets tough. It’s a great way to learn how to solve a problem in a functional manner, using a functional mindset.

It’s easy to get started with Haskell. I’m currently making my way through this excellent tutorial: Learn You a Haskell for Great Good.

JavaScript

I already know JavaScript. Or at least I thought I did. JavaScript has made a big comeback recently with the increasing popularity of client-side web programming for creating rich websites. The emergence of excellent frameworks like jQuery takes the pain out of cross-browser support and DOM manipulation, which have always been the bane of JavaScript development. On the surface, JavaScript looks like a weak language with some bad features. But strip away the bad parts and there’s a really neat, powerful and dynamic language that runs pretty much anywhere!

With web applications becoming increasingly reliant on rich-client capabilities, it is important for any web developer to have a solid understanding of JavaScript. If you are already familiar with JavaScript, I would recommend the book JavaScript: The Good Parts to take you to the next level.

Ruby

I previously posted on my adventures in Ruby and mentioned some good resources for getting started. Ruby is a very powerful, dynamic and portable language that can be used for many tasks big or small, from building full-scale web applications using Rails to writing build scripts using Rake. Many .NET developers are replacing the XML-heavy NAnt build files with Rake scripts, which allow much greater flexibility by using the expressiveness of Ruby to coordinate the build tasks. The potent combination of RSpec and Cucumber for writing BDD specs makes good use of Ruby’s dynamic and readable syntax. With the release of IronRuby on the horizon, .NET developers will be able to get the benefits of this great language running natively on the .NET Framework via the Dynamic Language Runtime.

F#

F# is a multi-paradigm language for the .NET Framework and encompasses both functional and imperative elements. I’ve barely scratched the surface of F#, but it looks very promising indeed. The more we require functional elements in our day-to-day programming, the more important it is to have a language that fully supports functional programming. C# 3.0 has some great functional features, such as LINQ and lambda expressions, but F# looks to offer a much more powerful set of functional features.

Conclusion

So, I have a lot to learn! I don’t expect to become an expert in all these languages at once. But I think having a basic understanding of each has already helped me to improve my general programming skills and to approach problems in different ways.

If you are interested in learning a new language, I would recommend you know at least one dynamic and one functional language. If you’re feeling particularly up for a challenge, try LISP 🙂

The War On Nulls

As .NET developers, we’ve all seen this exception hundreds of times: “System.NullReferenceException – Object reference not set to an instance of an object”. In .NET, this exception occurs when trying to access a reference variable with a null value. A null value means the variable does not hold a reference to any object on the heap. It is one of the most frustrating and prolific errors that we programmers encounter. But it needn’t be this way! We can prevent this error by following a few simple rules. But first, a little history…

The null reference was invented by Tony Hoare, inventor of QuickSort, one of the world’s most widely used sorting algorithms. In this introduction to his talk at QCon 2009, Tony describes the impact the null reference has had on software:

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.

So obviously null references have caused quite a lot of damage. But neither Tony nor null references are to blame. It’s the careless use of null references that has made them as damaging and prolific as they are.

I can’t think of a single reason why you would need to use null references as part of your system design. Here are some tips for preventing null references in your system.

Never use null references as part of the design

Business logic should not be based around testing for null references. If an object requires an empty state, be explicit about it by creating an empty representation of the object. You can then check if the object is in an empty state by comparing the current instance to an empty instance.

Here is an example of some code that uses a generic interface called ICanBeEmpty to support an empty representation of a Customer object. An extension method called HasValue() allows us to check if an object represents an empty instance.

public class Customer : ICanBeEmpty<Customer>
{
    private int id;
    private string name = string.Empty;
    //...
 
    public bool Equals(Customer other)
    {
        return this.id == other.id;
    }
 
    public static Customer Empty
    {
        get { return new Customer(); }
    }
 
    Customer ICanBeEmpty<Customer>.Empty
    {
        get { return Empty; }
    }
}
 
public interface ICanBeEmpty<T> : IEquatable<T>
{
    T Empty { get; }
}
 
public static class Extensions
{
    public static bool HasValue<T>(this ICanBeEmpty<T> obj)
    {
        // An object has a value when it differs from its empty representation.
        return !obj.Equals(obj.Empty);
    }
}

Don’t accept null references as parameters

Guard statements are often used to check for null references in methods. If you design your system not to pass nulls, you won’t need guards to check for null in your methods. But when you can’t guarantee input to your public methods, then you need to be defensive about null references.
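
For a public method whose callers you do not control, a guard clause might look like the following sketch. The Customer and CustomerService classes and the Greet method are placeholders invented for this illustration.

```csharp
using System;

public class Customer
{
    public string Name { get; set; } = string.Empty;
}

public class CustomerService
{
    public string Greet(Customer customer)
    {
        // Guard: fail immediately with a descriptive exception instead of
        // letting a NullReferenceException surface further down.
        if (customer == null)
        {
            throw new ArgumentNullException("customer");
        }

        return "Hello, " + customer.Name;
    }
}
```

The ArgumentNullException names the offending parameter, which is far easier to diagnose than a NullReferenceException thrown somewhere deeper in the call stack.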

Don’t return null references

A call to a method or property should never return a null reference. Instead, return an empty representation of an object, or throw an exception if a non-empty value is expected.

Fail fast if a null reference is detected

Design-by-contract technologies, such as Spec#, have declarations that can check for null references at compile time. You can also use an aspect-oriented programming (AOP) solution, such as PostSharp, to create custom attributes that ensure an exception is thrown at runtime if a null reference is passed in to, or returned by, a method. By throwing an exception as soon as a null reference is detected, we can avoid hunting through code to find the source of a null reference.

public class CustomerRepository
{
    [DoesNotReturnNull]
    public Customer GetCustomer(int id)
    {
        //...
        return Customer.Empty;
    }
 
    [DoesNotAcceptNull]
    public void SaveCustomer(Customer customer)
    {
        if (customer.HasValue())
        {
            //...
        }
    }
}

Wrap potential sources of null references

If you are using a third-party service or component where you might receive a null reference, then wrap the call in a method that handles any null references to ensure they don’t leak into the rest of the system.
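
A wrapper of this kind could be sketched as below. LegacyDirectory is a stand-in written for this example to play the part of a third-party component that returns null for unknown entries. It is not a real library, and DirectoryWrapper is likewise a made-up name.

```csharp
using System.Collections.Generic;

// Stand-in for a third-party component that may return null.
public class LegacyDirectory
{
    private readonly Dictionary<int, string> phones =
        new Dictionary<int, string> { {1, "555-0100"} };

    public string FindPhoneNumber(int customerId)
    {
        string phone;
        return phones.TryGetValue(customerId, out phone) ? phone : null;
    }
}

// Our wrapper: the only place in the system that has to deal with the null.
public class DirectoryWrapper
{
    private readonly LegacyDirectory directory;

    public DirectoryWrapper(LegacyDirectory directory)
    {
        this.directory = directory;
    }

    // Never returns null: a missing entry becomes string.Empty.
    public string FindPhoneNumber(int customerId)
    {
        return directory.FindPhoneNumber(customerId) ?? string.Empty;
    }
}
```

The rest of the system talks only to DirectoryWrapper, so the null is handled in exactly one place.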

Always ensure object members are properly instantiated

All object members should be instantiated when an object is created. Be careful with strings in C#, as these are actually reference types. Always set string variables to a default value, such as string.Empty.

Nullable value types are ok

The nullable value types introduced in C# 2.0, such as int? and DateTime?, are a safer way to represent a missing value, because you have to convert them explicitly to a non-nullable value before using them. Be careful with using the Value property on a nullable type without first checking if the variable has a non-null value using the HasValue property. You can use GetValueOrDefault to return a default value if the variable is null.
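
The HasValue, Value and GetValueOrDefault members mentioned above can be seen together in this short sketch. The DiscountCalculator class and its methods are invented purely for illustration.

```csharp
public static class DiscountCalculator
{
    // discountPercent is null when no discount has been set.
    public static decimal Apply(decimal price, int? discountPercent)
    {
        // GetValueOrDefault() yields 0 for a null int?, so no discount applies.
        int percent = discountPercent.GetValueOrDefault();
        return price - (price * percent / 100m);
    }

    public static string Describe(int? discountPercent)
    {
        // Check HasValue before touching Value: Value throws
        // InvalidOperationException when the variable is null.
        if (discountPercent.HasValue)
        {
            return discountPercent.Value + "% off";
        }

        return "no discount";
    }
}
```

For example, Apply(200m, 10) gives 180, while Apply(200m, null) leaves the price at 200.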

By limiting the use of null references and not letting them leak into other parts of the system, we can prevent the troublesome NullReferenceException from ruining our day.

Adventures In Ruby

I’ve recently been learning a bit about the Ruby programming language. What really struck a chord with me was watching Dave Thomas’ series of screencasts. These provide an excellent introduction to object-oriented programming in Ruby. There is a small fee per episode, but if you’re interested in a good introduction to Ruby, then I’d highly recommend trying at least the first three episodes.

I’m really excited about what Ruby can offer in terms of dynamic functionality. It’s a bit of a mind-shift coming from the more class-oriented perspective of C# and Java, into a truly dynamic object-oriented environment. But it’s this dynamism that lets you be more expressive without being constrained by existing classes, or the type system of the language.

I have yet to look into IronRuby, which is the upcoming implementation of Ruby in .NET. But it’s exciting to think that we can use features of .NET with the expressiveness of Ruby.

For more information, check out this Alt.NET podcast.

And of course Dave Thomas’ screencasts mentioned above!

Enjoy.