Hacking async/await - Monads [C#] | discipline equals freedom

Introduction

In the previous article, we implemented an Optional<T> type with its own custom async method builder. This allowed us to write methods with async/await that return Optional<T> instead of the built-in Task<T> or ValueTask<T> types.

[AsyncMethodBuilder(typeof(OptionalMethodBuilder<>))]
public partial class Optional<T>
{
    public bool HasValue { get; }
    public T Value { get; }
    public Optional() => (HasValue, Value) = (false, default);
    public Optional(T value) => (HasValue, Value) = (true, value);

    public static Optional<T> None { get; } = new Optional<T>();
}

public class OptionalMethodBuilder<T> { ... }

class OptionalSamples
{
    Optional<int> TryParseInt(string text) =>
        int.TryParse(text, out var result) ? new Optional<int>(result) : Optional<int>.None;

    async Optional<string> ProcessText2(string text1, string text2)
    {
        int number1 = await TryParseInt(text1);
        int number2 = await TryParseInt(text2);

        return (number1 + number2).ToString();
    }
}

The implementation of OptionalMethodBuilder<T> was neither easy to write nor easy to understand, because its API must match what the compiler-generated state machine expects. Unfortunately, we had to deal with many details of how those two objects interact.

When we look at different programming languages, especially functional languages like F# or Haskell, we see some similarities. The C# language has async methods with the async/await keywords designed mainly for writing asynchronous code. The creators of the C# language admit that they were inspired by the F# feature, computation expressions. In the Haskell language, Monads and “do notation” serve a similar purpose. In C#, async/await solves a specific problem related to asynchronous programming. In contrast, in F# and Haskell, general-purpose language features can be used for async programming or many other things. This article explores how far we can go with C# extension points to achieve capabilities similar to F# and Haskell. But first, we have to introduce Monads.

Monad

We can think of Monad as a design pattern. Design patterns are named solutions for commonly occurring problems. I like to compare Monad to the Iterator pattern.

Let’s say we have different collections, such as an array, an array list, a dictionary, or a linked list. Each has a different internal structure and API, so even a simple iteration over all items would have to be implemented differently for each collection. The Iterator design pattern introduces a simple API enabling walking through all items from the beginning to the end. All concrete collection types implement the same standard Iterator API.

That gives key benefits such as:

The code responsible for iteration looks the same regardless of the collection.
A programming language can introduce special keywords to simplify and shorten necessary code. foreach is a while loop using IEnumerable<T> and IEnumerator<T> objects behind the scenes.

var items = new[] { 5, 10, 15 };
foreach (var item in items)
{
    Console.WriteLine(item);
}

// code above is translated into code below
IEnumerable<int> iterable = items;
using IEnumerator<int> iterator = iterable.GetEnumerator();
while (iterator.MoveNext())
{
    var item = iterator.Current;
    Console.WriteLine(item);
}

Instead of implementing the same functionality for each collection, we can implement it once for an abstracted interface. The Enumerable class provides over 50 operators working with IEnumerable<T>, such as Where, Select, etc.
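As a minimal sketch of that idea, the helper below is written once against IEnumerable<T> and then reused with three different collections. CountWhere is a hypothetical name chosen for illustration; it is not one of the Enumerable operators.

```csharp
using System;
using System.Collections.Generic;

public static class IterationDemo
{
    // Implemented once against the abstraction, not per collection type.
    public static int CountWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var count = 0;
        foreach (var item in source)
        {
            if (predicate(item)) count++;
        }
        return count;
    }

    public static void Main()
    {
        var array = new[] { 5, 10, 15 };
        var list = new List<int> { 5, 10, 15 };
        var linked = new LinkedList<int>(new[] { 5, 10, 15 });

        // The same call works for every collection that implements IEnumerable<T>.
        Console.WriteLine(array.CountWhere(x => x > 5));   // 2
        Console.WriteLine(list.CountWhere(x => x > 5));    // 2
        Console.WriteLine(linked.CountWhere(x => x > 5));  // 2
    }
}
```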

What exactly does the Monad abstraction capture? Someone noticed that types like Nullable<T>, Optional<T>, Task<T>, T[], IEnumerable<T>, IObservable<T>, List<T>, … are “the same”. Our first guess would be that all of these types are generic with one generic parameter T. That’s a correct observation, but it is not enough. There are many such generic types, and that property alone does not let us do much with them.

Every Monad type Monad<T> must implement two functions:

Monad<T> Return<T>(T value)
Monad<R> SelectMany<T, R>(Monad<T> m, Func<T, Monad<R>> f)

Monad is a mathematical concept from category theory, but it is also used in programming. Haskell is a functional language famous for its heavy usage of Monads. In Haskell and other languages supporting Monads, Return is called return or pure, and SelectMany is called bind. I used names that are more familiar to C# programmers. Category theory defines many other concepts besides Monad. One of them, closely related to Monad, is Functor.

Every Functor type Functor<T> must implement two functions:

Functor<T> Return<T>(T value)
Functor<R> Select<T, R>(Functor<T> m, Func<T, R> f)

In Haskell, the Select function is called fmap (or map for lists). As you can see, the Return functions are the same, and Select and SelectMany are very similar. The only difference is that the type of the second parameter is Func<T, R> instead of Func<T, Monad<R>>. (Strictly speaking, a Haskell Functor requires only the mapping function; Return is listed here so that the two definitions line up.) Each Monad type is also a Functor because the Select function can be easily implemented using Return and SelectMany. In an object-oriented world, we could say that Monad inherits from Functor. From this article’s point of view, only Monads are crucial. But Monads and Functors are often presented together, so we will implement both.

It’s worth mentioning that everything I said about Monad and Functor is very simplified. There is a lot of theory behind those concepts. I am not an expert in those topics, and even the additional things I know (like the rules that those functions need to obey) would not change much in the context of this article. The goal is to show that the async/await feature goes much deeper than we think. It is not only about asynchronous programming. We will see that the async/await keywords relate to Monads the same way foreach relates to Iterator: it is just a language feature utilizing the API underneath.

Optional

Let’s start with the most straightforward Monad implementation, the Optional<T> type.

public static class Monad
{
    public static Optional<T> ReturnO<T>(this T value)
        => new(value);

    public static Optional<R> Select<T, R>(this Optional<T> optional, Func<T, R> f)
        => optional.HasValue ? new (f(optional.Value)) : Optional<R>.None;

    public static Optional<R> SelectMany<T, R>(this Optional<T> optional, Func<T, Optional<R>> f)
        => optional.HasValue ? f(optional.Value) : Optional<R>.None;
}

The code is intuitive and makes sense. The Optional<T> type represents a value of type T that may be absent. The Select function maps the value of type T to type R using the logic defined by the function parameter f, but only when the value is present. In the case of the SelectMany function, the mapping logic returns Optional<R> instead of R.
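To make the difference between the two functions concrete, here is a small self-contained sketch. It repeats a simplified Optional<T> (without the async method builder attribute, and with a ToString override added only for printing). The Half helper is hypothetical and exists only to have a mapping that can itself fail.

```csharp
using System;

// Simplified copy of the article's Optional<T>, without the async method builder.
public class Optional<T>
{
    public bool HasValue { get; }
    public T Value { get; }
    public Optional() { HasValue = false; Value = default!; }
    public Optional(T value) { HasValue = true; Value = value; }
    public static Optional<T> None { get; } = new Optional<T>();
    // Added for this sketch only, to make the results printable.
    public override string ToString() => HasValue ? $"Some({Value})" : "None";
}

public static class Monad
{
    public static Optional<R> Select<T, R>(this Optional<T> optional, Func<T, R> f)
        => optional.HasValue ? new Optional<R>(f(optional.Value)) : Optional<R>.None;

    public static Optional<R> SelectMany<T, R>(this Optional<T> optional, Func<T, Optional<R>> f)
        => optional.HasValue ? f(optional.Value) : Optional<R>.None;
}

public static class OptionalDemo
{
    public static Optional<int> TryParseInt(string text) =>
        int.TryParse(text, out var result) ? new Optional<int>(result) : Optional<int>.None;

    // Hypothetical helper: halving succeeds only for even numbers.
    public static Optional<int> Half(int n) =>
        n % 2 == 0 ? new Optional<int>(n / 2) : Optional<int>.None;

    public static void Main()
    {
        // Select: the mapping itself cannot fail.
        Console.WriteLine(TryParseInt("20").Select(n => n + 1));        // Some(21)
        Console.WriteLine(TryParseInt("abc").Select(n => n + 1));       // None

        // SelectMany: the mapping may produce None as well.
        Console.WriteLine(TryParseInt("20").SelectMany(n => Half(n)));  // Some(10)
        Console.WriteLine(TryParseInt("7").SelectMany(n => Half(n)));   // None
    }
}
```

Using Select with Half would produce a nested Optional<Optional<int>>; SelectMany is what flattens the two levels into one.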

I mentioned before that Select can be implemented using Return and SelectMany.

public static class Monad
{
    public static Optional<R> Select<T, R>(this Optional<T> optional, Func<T, R> f)
        => optional.SelectMany(v => ReturnO(f(v)));    
}

Now, let’s look at the two implementations of the ProcessText methods side by side.

static async Optional<string> ProcessText2(string text1, string text2)
{
    int number1 = await TryParseInt(text1);
    int number2 = await TryParseInt(text2);

    return (number1 + number2).ToString();
}

static Optional<string> ProcessText3(string text1, string text2)
    => TryParseInt(text1).SelectMany(number1 =>
        TryParseInt(text2).Select(number2 => (number1 + number2).ToString()));

In C#, foreach can be used with any enumerable object because the C# compiler generates a simple while loop calling the appropriate members, as shown before. We can think the same way about async/await and Monads.

We implemented a custom async method builder in the previous article to combine async/await with Optional<T>. But that was a dedicated implementation that worked only with Optional<T>. I started wondering if it would be possible to implement an async method builder for any Monad type. Finally, after a long fight, I “did it” :) However, this topic will be discussed in detail in the next article. For now, let’s focus on Monads and provide more examples.

Task

The next type, built-in Task<T>, is also a Monad.

public static class Monad
{
    public static Task<T> ReturnT<T>(this T value) => Task.FromResult(value);

    public static Task<R> Select<T, R>(this Task<T> task, Func<T, R> f)
        => task.ContinueWith(t => f(t.Result));

    public static Task<R> SelectMany<T, R>(this Task<T> task, Func<T, Task<R>> f)
        => task.ContinueWith(t => f(t.Result)).Unwrap();
}

Interestingly, there is a direct mapping between the Task built-in methods and the Monad functions. This is the power of abstractions. Once we see some abstraction, we can simplify and unify the API. We will see later that many of the .NET types we use daily are Monads, and unfortunately, Monad methods are named differently for each of them.
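The following sketch puts the two styles next to each other: the same computation written once with await and once with the SelectMany and Select functions defined above. GetNumberAsync is a hypothetical stand-in for any asynchronous operation.

```csharp
using System;
using System.Threading.Tasks;

public static class Monad
{
    public static Task<R> Select<T, R>(this Task<T> task, Func<T, R> f)
        => task.ContinueWith(t => f(t.Result));

    public static Task<R> SelectMany<T, R>(this Task<T> task, Func<T, Task<R>> f)
        => task.ContinueWith(t => f(t.Result)).Unwrap();
}

public static class TaskDemo
{
    // Hypothetical asynchronous source of a number.
    static async Task<int> GetNumberAsync(int value)
    {
        await Task.Delay(10);
        return value;
    }

    // Version using the built-in async/await support.
    public static async Task<string> SumWithAwait()
    {
        int a = await GetNumberAsync(1);
        int b = await GetNumberAsync(2);
        return (a + b).ToString();
    }

    // The same logic expressed only with the Monad functions.
    public static Task<string> SumWithSelectMany() =>
        GetNumberAsync(1).SelectMany(a =>
            GetNumberAsync(2).Select(b => (a + b).ToString()));

    public static async Task Main()
    {
        Console.WriteLine(await SumWithAwait());       // 3
        Console.WriteLine(await SumWithSelectMany());  // 3
    }
}
```

The two versions agree on the successful path. Error handling is not identical, because reading t.Result on a faulted task throws an AggregateException, so treat this as an illustration rather than a drop-in replacement for await.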

The goal of the series is to show that async/await can work not only with those types, but with any types supporting the necessary API. Writing custom async method builders is complex and error-prone; that’s why we used the concept of Monad. It’s much simpler to explain and implement Return, Select, and SelectMany methods than a detailed compiler-oriented async method builder.

Task<T> and ValueTask<T> are integrated with async/await by default; .NET provides the appropriate async method builders. It is possible to override the default builder by pointing to our own. But for now, let’s use a different approach: define a wrapper around Task<T> called TTask<T> and implement the Monad functions for it.

[AsyncMethodBuilder(typeof(TTaskMethodBuilder<>))]
public class TTask<T>
{
    public Task<T> Task { get; }
    public TTask(Task<T> task) => Task = task;
}

public static class Monad
{
    public static TTask<T> ReturnTT<T>(this T value) => new(ReturnT(value));

    public static TTask<R> Select<T, R>(this TTask<T> task, Func<T, R> f)
        => new(task.Task.Select(f));

    public static TTask<R> SelectMany<T, R>(this TTask<T> task, Func<T, TTask<R>> f)
        => new(task.Task.SelectMany(x => f(x).Task));
}

We introduced the helper type TTask<T> to configure a custom async method builder using the AsyncMethodBuilder attribute. Let’s write a simple async method using TTask instead of Task.

async TTask<string> Run()
{
    int a = await new TTask<int>(Task.Delay(1000).ContinueWith(_ => 1));
    int b = await new TTask<int>(Task.Delay(1000).ContinueWith(_ => 2));
    return (a + b).ToString();
}

The above method works exactly like a default async method using the Task or ValueTask types. In the next article, we will discuss the TTaskMethodBuilder type and the problem with implementing a custom async method builder for the built-in type Task (instead of TTask). It can be done, but we must be aware of one specific constraint.

IEnumerable

The Enumerable static class provides over 50 operators working with the IEnumerable<T> type, including the Monad functions Select and SelectMany (Return has no dedicated operator, but Enumerable.Repeat(value, 1) does the same job). For educational purposes, let’s implement them from scratch.

public static class Monad
{
    public static IEnumerable<T> ReturnE<T>(this T value) => new[] { value };

    public static IEnumerable<R> Select<T, R>(this IEnumerable<T> enumerable, Func<T, R> f)
    {
        foreach (var item in enumerable)
        {
            yield return f(item);
        }
    }

    public static IEnumerable<R> SelectMany<T, R>(this IEnumerable<T> enumerable, Func<T, IEnumerable<R>> f)
    {
        foreach (var item in enumerable)
        {
            foreach (var subitem in f(item))
            {
                yield return subitem;
            }
        }
    }
}

Next, let’s implement async methods based on the IEnumerable<T> type.

[AsyncMethodBuilder(typeof(IEnumerableMethodBuilder<>))]
async IEnumerable<string> Run()
{
    IEnumerable<int> items = new[] { 5, 10, 15 };
    int item = await items;
    return item + " zl";
}

This code does not work as expected. I would expect that calling Run().ToList() returns a collection with the values ["5 zl", "10 zl", "15 zl"], but it throws a stack overflow exception. It’s because the Run method returns an infinite sequence of values "5 zl", "5 zl", "5 zl", ... . Calling Run().Take(2).ToList(), which limits the number of items, returns ["5 zl", "5 zl"].

Unfortunately, the C# authors did not consider the Monad abstraction when designing the async/await feature. The feature was originally designed only for the Task type; the ability to define custom task-like types and async method builders was added later. And now, it’s too late. I have been trying to implement a universal Monad async method builder for a long time. My solution works for some Monads, but not all. The Run method above should be translated into items.SelectMany(item => ReturnE(item + " zl")), as it would be in languages like Haskell or F#, where the code works as expected. I don’t want to go into details, but the problem appears when the lambda function passed into SelectMany is called more than once. This starts the async method from the beginning, which is why the first value, 5, is returned over and over again.
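For comparison, here is a sketch of the SelectMany form that the Run method is meant to correspond to, written by hand with LINQ's built-in SelectMany and a local ReturnE helper. The LambdaCalls counter is an addition for illustration; it shows that the lambda runs once per item, which is exactly the situation described above as problematic for an async method builder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableDemo
{
    // Counts how many times the lambda passed to SelectMany is invoked.
    public static int LambdaCalls;

    public static IEnumerable<T> ReturnE<T>(T value) => new[] { value };

    // Hand-written equivalent of the intended behaviour of the async Run method.
    public static IEnumerable<string> Run()
    {
        IEnumerable<int> items = new[] { 5, 10, 15 };
        return items.SelectMany(item =>
        {
            LambdaCalls++;
            return ReturnE(item + " zl");
        });
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));  // 5 zl, 10 zl, 15 zl
        Console.WriteLine(LambdaCalls);               // 3
    }
}
```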

It might become clearer once you read the whole implementation of the IEnumerableMethodBuilder type in the next part of the series. Please note that this time, the AsyncMethodBuilder attribute decorates an async method instead of a type. The ability to decorate methods was added in C# 10 and .NET 6 in 2021 for scenarios where we cannot put the attribute on a type because we don’t own it.

IO

Let’s discuss a less obvious type, the IO Monad. I still remember the feeling when I finally understood how we could write whole programs in a pure language like Haskell and still perform IO operations. It was like magic.

A pure function must follow two rules:

It always returns the same result for the same parameters, so it’s deterministic. For example, (a,b) => a + b is pure, and x => DateTime.Now.Add(TimeSpan.FromDays(x)) is not.
It cannot have any “side effects”, such as changing its parameters or external state, calling the database or web service, or even printing to the console.

The ParseText method reads a text from the console, tries to parse it into an int, and then prints information back to the console.

async static IO<int> ParseText()
{
    await WriteLine("Give me a number:");
    var text = await ReadLine();
    if (int.TryParse(text, out var number))
    {
        await WriteLine($" '{number}' is a correct number");
        return number;
    }
    await WriteLine($" '{text}' is an incorrect number");
    return -1;
}

We don’t know how called methods are implemented yet, but trust me, the ParseText method is pure. The Haskell programming language is pure, meaning only pure functions can be written, and we cannot change any state, variables, parameters, or data structures like records or collections. Now, let’s discover the secret of the IO<T> type and how console functions are defined.

[AsyncMethodBuilder(typeof(IOMethodBuilder<>))]
public delegate T IO<T>();

public class Unit
{
    public static Unit V { get; } = new Unit();
}


public static class IOOperators
{
    public static IO<string> ReadLine() => () =>
        Console.ReadLine()!;

    public static IO<Unit> WriteLine(string text) => () =>
    {
        Console.WriteLine(text);
        return Unit.V;
    };

    public static void Run<T>(this IO<T> io) => io();
}

IO<T> is a parameterless delegate type that returns a value of the generic type T. The ReadLine and WriteLine methods wrap the built-in Console.ReadLine and Console.WriteLine methods. Wrapping some logic in a function to delay its execution is often called a “thunk”. The helper Unit type plays the role of void. It’s time for the Monad functions.
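The delaying effect of a thunk can be observed with a small sketch. To keep the side effect easy to inspect, it writes to an in-memory list instead of the console; the Output list and the Write method are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;

public delegate T IO<T>();

public sealed class Unit
{
    public static Unit V { get; } = new Unit();
}

public static class ThunkDemo
{
    // Hypothetical in-memory "console" so that side effects are easy to observe.
    public static readonly List<string> Output = new List<string>();

    // Returns a description of the effect; nothing is written when Write is called.
    public static IO<Unit> Write(string text) => () =>
    {
        Output.Add(text);
        return Unit.V;
    };

    public static void Main()
    {
        IO<Unit> hello = Write("hello");   // only builds the thunk
        Console.WriteLine(Output.Count);   // 0

        hello();                           // the effect runs now
        hello();                           // and again on every invocation
        Console.WriteLine(Output.Count);   // 2
    }
}
```

Calling Write is therefore deterministic and free of side effects: for the same text it always returns an equivalent delegate, and the effect only happens when somebody invokes that delegate.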

public static class Monad
{
    public static IO<T> ReturnIO<T>(this T value) => () => value;

    public static IO<R> Select<T, R>(this IO<T> io, Func<T, R> f) =>
        () => f(io());

    public static IO<R> SelectMany<T, R>(this IO<T> io, Func<T, IO<R>> f) =>
        () => f(io())();
}

We can write the whole program this way. Some methods will be standard ones that return types other than IO<T>, like strings, ints, and arrays. The other ones will return IO<T> types, but inside their body, the typical control flow constructs (if-then-else, loops, variables, …) will be used. All of them will be pure. The Main method will start the whole program by executing io.Run(), which calls the IO<T> delegate.
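To see what the awaits in ParseText stand for, the sketch below rewrites it by hand with explicit SelectMany calls. This is my reading of the intended correspondence, not the output of the IOMethodBuilder, and the null fallback in ReadLine is a small deviation from the original.

```csharp
using System;

public delegate T IO<T>();

public sealed class Unit
{
    public static Unit V { get; } = new Unit();
}

public static class IODemo
{
    public static IO<T> ReturnIO<T>(T value) => () => value;

    public static IO<R> SelectMany<T, R>(this IO<T> io, Func<T, IO<R>> f) =>
        () => f(io())();

    // Falls back to an empty string at end of input (a deviation from the article).
    public static IO<string> ReadLine() => () => Console.ReadLine() ?? "";

    public static IO<Unit> WriteLine(string text) => () =>
    {
        Console.WriteLine(text);
        return Unit.V;
    };

    // ParseText with every await replaced by an explicit SelectMany call.
    public static IO<int> ParseText() =>
        WriteLine("Give me a number:").SelectMany(_ =>
        ReadLine().SelectMany(text =>
            int.TryParse(text, out var number)
                ? WriteLine($" '{number}' is a correct number").SelectMany(done => ReturnIO(number))
                : WriteLine($" '{text}' is an incorrect number").SelectMany(done => ReturnIO(-1))));

    public static void Main()
    {
        IO<int> program = ParseText();  // builds the description; nothing is printed yet
        int result = program();         // runs the effects: prompt, read, report
        Console.WriteLine(result);
    }
}
```

Calling ParseText() only composes delegates. All console interaction is deferred until the resulting IO<int> is invoked in Main, which is where the impure part of the program lives.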

Summary

This article was all about Monads. We implemented Monad functions for the following types: Optional<T>, Task<T>, TTask<T>, IEnumerable<T>, and IO<T>. Many other types in .NET could potentially be a Monad, such as Nullable<T>, IObservable<T>, T[], etc.

I mentioned languages like Haskell and F# a few times. As an illustrative example, look at the “do notation” in Haskell.

-- assuming tryReadNumber :: Maybe Int
-- result :: Maybe String
result = do
  n <- tryReadNumber
  m <- tryReadNumber
  return (show n ++ "+" ++ show m ++ "=" ++ show (n + m))

-- is translated into ~
tryReadNumber >>= \n -> tryReadNumber >>= \m -> Just (show n ++ "+" ++ show m ++ "=" ++ show (n + m))

Or computation expressions in F#

let parseTwoInts str1 str2 =
    option {
        let! value1 = tryParseInt str1
        let! value2 = tryParseInt str2
        return value1 + value2
    }

// is translated into ~
let parseTwoInts str1 str2 =
    option.Bind(
        (tryParseInt str1),
        (fun value1 -> option.Bind(tryParseInt str2, (fun value2 -> option.Return(value1 + value2))))
    )

Here, option is not a keyword in the language, but just a variable defined like this:

type OptionBuilder() =
    member this.Return(value) = returnO value
    member this.Bind(monad, binder) = bindO binder monad

let option = OptionBuilder()

Even in C#, we have Monads inside LINQ expressions.

Option<string> result =
    from n in tryReadNumber()
    from m in tryReadNumber()
    select $"{n}+{m}={n + m}";

// is translated into
tryReadNumber().SelectMany(n => tryReadNumber(), (n, m) => $"{n}+{m}={n + m}");

// which is equivalent to
tryReadNumber().SelectMany(n => tryReadNumber().Select(m => $"{n}+{m}={n + m}"));
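The query syntax is pattern-based, so it also works with our own Optional<T> once the right methods are in scope. A query with two from clauses binds to a SelectMany overload that takes an additional result selector. The sketch below adds that overload on top of the two-parameter version; the class name OptionalLinq, the Sum method, and the ToString override are assumptions made for this example.

```csharp
using System;

// Simplified copy of the article's Optional<T>, without the async method builder.
public class Optional<T>
{
    public bool HasValue { get; }
    public T Value { get; }
    public Optional() { HasValue = false; Value = default!; }
    public Optional(T value) { HasValue = true; Value = value; }
    public static Optional<T> None { get; } = new Optional<T>();
    public override string ToString() => HasValue ? $"Some({Value})" : "None";
}

public static class OptionalLinq
{
    public static Optional<R> Select<T, R>(this Optional<T> optional, Func<T, R> f)
        => optional.HasValue ? new Optional<R>(f(optional.Value)) : Optional<R>.None;

    public static Optional<R> SelectMany<T, R>(this Optional<T> optional, Func<T, Optional<R>> f)
        => optional.HasValue ? f(optional.Value) : Optional<R>.None;

    // Overload the compiler binds to when a query contains two from clauses.
    public static Optional<R> SelectMany<T, C, R>(
        this Optional<T> optional, Func<T, Optional<C>> f, Func<T, C, R> project)
        => optional.SelectMany(t => f(t).Select(c => project(t, c)));

    public static Optional<int> TryParseInt(string text) =>
        int.TryParse(text, out var result) ? new Optional<int>(result) : Optional<int>.None;

    public static Optional<string> Sum(string text1, string text2) =>
        from n in TryParseInt(text1)
        from m in TryParseInt(text2)
        select $"{n}+{m}={n + m}";

    public static void Main()
    {
        Console.WriteLine(Sum("2", "3"));  // Some(2+3=5)
        Console.WriteLine(Sum("2", "x"));  // None
    }
}
```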

Monads are everywhere :)