Sunday 25 April 2010

Algebra of Functions, Part 2: Monads

My Motivation

As you might have guessed now, this is the second in a series of articles attempting gently to introduce C# users to a few functional programming concepts (note: Part 1 has been updated & extended since the original post).

One great motivation for functional techniques is that they facilitate stateless programming, which in turn is good for parallelisation - for example, sharing out the work among many processor cores. That's because the removal of state solves the synchronisation problem beautifully. If there's no state, then there's no state to become inconsistent while you're waiting for all your threads to complete!

Myriad patterns and other techniques exist to handle the awkward manipulations and side effects imposed on your code by a seriously stateful world. The Monad is such a pattern. Prominent in the established functional languages such as Haskell and Scheme, it is by now almost a first-class citizen of the C# world too - it's just a little hidden.

Service Tunnels

Monads are great whenever the work involves concatenating pure functions into a pipeline, then feeding your raw ingredients in at one end, and finally collecting your perfectly prepared hot meals as they cascade out of the other. Among their many other uses, one thing at which monads truly excel is: everything else except your basic vanilla process. "Exceptions", if you will, though in a much more general sense than "error conditions". They work by attaching secondary pipeline systems, one per monad, to divert the impurities from the main flow and manage them independently.

This means that each atomic function, each section in the main pipeline, has to be capable of returning slightly more information than the succeeding one strictly requires. In general therefore, a monad is associated with functions accepting inputs of certain types, and returning so-called amplified types.

The Maybe monad is the archetypal Haskell example. Its job is to handle data values that might be "missing".

Maybe?

Suppose that we want to compose functions accepting and returning value type data. Occasionally a function might encounter a problem, and be unable to return a value. In C# 2.0 and up, we can use a Nullable value type such as int? or double? as the return type. So, this will be our first example of an amplified type. Can we compose such functions?
static Func<T, V?> After<T, U, V>(this Func<U, V?> f, Func<T, U?> g)
where U : struct
where V : struct
{
return x => f(g(x));
}
No, composition fails, because g outputs an amplified U?, whereas f expects just a U.

The monad trick is to combine or "bind" the two functions in a slightly different way, so that all our atomic fs and gs don't have to be rewritten to accept nullable inputs, and to worry about propagating these correctly. Instead, that worry is delegated to the new Bind operation:
static Func<T, V?> After<T, U, V>(this Func<U, V?> f, Func<T, U?> g)
where U : struct
where V : struct
{
return x => g(x).Bind(f);
}

static V? Bind<U, V>(this U? u, Func<U, V?> f)
where U : struct
where V : struct
{
return u == null ? null : f((U) u);
}
Presto! We can once more compose functions, using After. This is despite each function expecting some value type, of which the preceding function actually returns rather the "amplified", Nullable variant.

The example below contains three ad hoc functions from int to int?. Function up returns its input plus one, if such is in int range; otherwise null. Similarly, down returns either its input minus one, or null. Composite function climb calls up once, then down twice. If any constituent call fails (returns null), then so does the composite. But see how such failures are actually routed through the Bind operation, and not propagated through the rest of the main pipeline, which is therefore free not to accept null inputs:
static void Main()
{
Func<int, int?>
up = x =>
{
if (x < int.MaxValue) return x + 1;
return null;
},
down = x =>
{
if (x > int.MinValue) return x - 1;
return null;
},
climb = down.After(down.After(up));
Console.WriteLine(" '{0}' '{1}' '{2}' ",
climb(0), climb(int.MaxValue), climb(int.MinValue));
// Displays '-1' '' ''
Console.ReadKey();
}
Obviously each atomic function is still responsible for detecting and reporting its own exceptional conditions, but no longer for relaying those of its pipeline predecessors. This decoupling pattern is absolutely key to containment or efficient management of emergent complexity.

Complexity Containment

Statelessness and parallelism aside, monads are the key to the management of complexity, because they offer the maximal generalisation of functional composition. The best example available to us C# programmers is LINQ itself; because IEnumerable is a monad, and its Bind operation is SelectMany. Check its signature and you'll see! Ever wondered how LINQ managed to deliver such incredible power over a range of domains including queries, objects, XML etc? Because it was designed to the strict algebraic constraints of a monadic architecture!

Next time: Comprehensions.

No comments:

Post a Comment