Aug 31, 2008

Zero the value is not

One of my pet-peeves is code written like this,

if (0 != value) { ... }

This reads backwards to me. I hear echoes of Frank Oz in my head, "if zero the value is not". As cute as this dialect is coming from a pointy-eared muppet, it still takes a bit to translate back into the more common English phrasing, "if the value is not zero". It seems programmers who insist on writing in this style have forgotten that their source should be readable as well as executable. Don't force your readers to have to do grammar gymnastics as well as having to decipher your algorithmic gynmastics.

Once upon a time, this was considered good style in order to avoid accidentally writing something like,

if (value = 0) { ... }

In C#, this is illegal. In most (all?) modern C and C++ compilers, this will generate a warning. (You do compile with warnings reported as errors right?) There ceases to be a good reason to write test expressions backwards. Let's let this style rest in peace.

Even if you think the above form has some redeeming value, let's all agree that any programmer that voluntarily writes something like,

if (false == value) { ... }

should have an electric shock sent directly through their keyboard.

Jun 16, 2008

BeginInvoke's last parameter: What's in a name?

I have been playing around with asynchronous programming lately and it bothers me that the last parameter to the begin invoke pattern doesn't have a name that everyone agrees on. The begin invoke pattern is the asynchronous calling pattern were the BeginXxxx(), such as Stream.BeginRead(), takes a set of parameters, a callback method, and some, well, thing, at the end that is carried along with the asynchronous operation that eventually finds it way into the IAsyncResult. The problem is what do we call that thing? In an informal survey of the methods in the BCL that implement this pattern I have found a wide variation. Here is a partial list in somewhat order of popularity,

  • object
  • stateObject
  • state
  • asyncState
  • extraData
  • data

There seems to be little agreement about what to call it. We could pick the most prevalent but the name object occurs the most often because every delegate gets a BeginInvoke() method created for it and in this automatically generated code the parameter is called object. We can't standardize on object because it is a reserved word in several languages so either it would be impossible to specify or awkward (i.e. @object in C#). What I would like is a name that we can all use and we can all agree on.

To name the thing, we first must understand why it it is there at all. It exists to hold some state on behalf of the caller. Having the state object makes programming against the begin invoke pattern easier in languages that do not have anonymous methods that capture local variables. Consider the following program (which intentionally ignores errors because it is only an example),

const string uriFormat = 
    "http://messenger.services.live.com/users/{0}@apps.messenger.live.com/presence";
const string responsePattern = 
    @"""statusText""\s*:\s*""(?<status>\w+).*""displayName""\s*:\s*""(?<name>\w+)""";

static void Main(string[] args) {
    var uri = string.Format(uriFormat, args[0]);
    var request = WebRequest.Create(uri);
    var result = request.BeginGetResponse(PresenceCallback, request);

    // Other interesting work....

    result.AsyncWaitHandle.WaitOne();
}

private static void PresenceCallback(IAsyncResult result) {
    var request = result.AsyncState as WebRequest;
    var response = request.EndGetResponse(result);
    var s = new StreamReader(response.GetResponseStream()).ReadToEnd();
    var search = new Regex(responsePattern);
    var match = search.Match(s);    
    if (match.Success)
        Console.WriteLine("{0} is {1}", 
           match.Groups["name"].Captures[0].Value, 
           match.Groups["status"].Captures[0].Value);
    else
        Console.WriteLine("Unexpected response");
}

This will get the online status of a Windows Live Messanger account given the live ID account number. For example, passing 1afa695addc07e5 as an argument to the above will tell whether or not I am online. In this case I am using last parameter of the BeginGetResponse() method to pass the request itself. This then is cast back to WebRequest in the PresenceCallback() method so I can call EndGetResponse() to retrieve the actual response. As far at the BeginGetResponse() call is concerned, this value is opaque. It ignores it completely and just supplies it blindly in the IAsyncResult. It makes no assumptions about the data at all, it is just something the caller just wants carried around. If I was using anonymous delegates in C# this would look a lot better as,

static void Main(string[] args) {
    var uri = string.Format(uriFormat, args[0]);
    var request = WebRequest.Create(uri);
    request.BeginGetResponse(result => {
        var response = request.EndGetResponse(result);
        var s = new StreamReader(response.GetResponseStream()).ReadToEnd();
        var search = new Regex(responsePattern);
        var match = search.Match(s);    
        if (match.Success)
            Console.WriteLine("{0} is {1}", 
               match.Groups["name"].Captures[0].Value, 
               match.Groups["status"].Captures[0].Value);
        else
            Console.WriteLine("Unexpected response");
        }   
    }, null);

    // Other interesting work....

    result.AsyncWaitHandle.WaitOne();
}

Here the request local variable request is captured automatically by the C# compiler and placed into a compiler generated class as a field. The compiler generated class also contains the code I supplied in the lambda as an instance method. When I refer to response in the lambda the reference is automatically translated into a field lookup in the generated class. Since request is already accessible in the lambda I don't need the last parameter to carry anything useful so I pass null.

Anonymous methods makes using callbacks much easier but since not all languages support anonymous methods or lambdas the BCL standardized on a method pattern for begin invoke that can easily be used by those languages. If the calling pattern did not have a caller state object then the work performed automatically by the C# compiler would have to be repeated manually by the programmer in these, then, second class languages. The .NET team did not want such languages to be second class citizen (especially since neither C# nor VB.NET supported anonymous methods initially) so they required the presence of the caller state object parameter.

Now we know why it is there, what to do we call it? I like state because the parameter represents state the caller want's to preserve. I don't like object because it is a common reserved word. I don't like stateObject because a symbol's type should not be repeated in the name, we already know it is an object by its type. asyncState is acceptable, especially since that is the name it is given by IAsyncResult, but it is a bit redundant, we already know, by context, it is asynchronous state. Plus we should avoid abbreviation, like "async", in symbol names (though it is very common, and asynchronous is very very long, so not that bad). data seems fine to me, it is a synonym to state, but it is overused. extraData I cannot, for the life of me, figure out. Extra for whom? Extra as opposed to what? Unfortunately, this is the name given to the parameter by IHttpAsyncHandler (see, I told you "async" was common).  Its name tells me nothing about what I should do with it. It is very unclear that this value should be mapped to AsyncState in IAsyncResult.

I propose we call it state, with an acceptable variation of asyncState.

Now, if changing a parameter name was not a breaking change...


Trivia:

  • The above uses the Live Services that you can find more about here.
  • The IM presence service returns JSON which I parse using a fancy looking Regex instance. I recommend that production code use DataContractJsonSerializer() instead.

Mar 29, 2008

If you are using a loop, you're doing it wrong

"If you are using a loop, you're doing it wrong."

That is the advice one of my college professors told us when he was teaching us APL. APL was designed to perform vector and matrix operations. Programming in APL is an exercise in stringing operators together to perform strange and wonderful things. Using a loop is just wrong and slows things down.

It is similar with LINQ, if you are using a loop you are doing it wrong. I find myself doing a lot of prototyping lately and I am forcing myself to use LINQ; not because I don't like it, far from it, I really like LINQ, but using loops is so ingrained into my psyche that I have to stop myself and force myself to think in LINQ. Every time I am tempted to write a loop that involves a collection or an array I ask myself, could I use LINQ instead? Programmers with more of a database background seem to take to LINQ like a duck to water. They think in sets and vectors, I don't, but I am getting there.

For example sometimes I find myself needing to return an IEnumerable<T> when I have a List<T> but of a different T. This often happens when I want to keep internal implementation details private. My internal List<T> might have the actual class but I need to return an enumerable for some interface. Before I would simply write a loop using the C# 2.0's yield syntax,

List<Foo> foos = new List<Foo>();

...

public IEnumerable<IFoo> Foos {
    get {
        foreach (Foo f in foos)
            yield return f;
    }
}

This loop involves a collection, can I use LINQ? Sure! By using LINQ's Cast<T>() method this can be replaced with,

public IEnumerable<IFoo> Foos {
    get { return foos.Cast<IFoo>(); }
}

If you are trying to find if a list contains an object by some name, you could write a loop like,

public bool Contains(string value) {
    foreach (Foo foo in foos)
        if (foo.Name == value)
            return true;
    return false;
}

Using LINQ this might look like,

public bool Contains(string value) {
    return (from foo in foos where foo.Name == value select foo).Any();
}

A nice thing about LINQ is you can perform complicated queries in pieces. With deferred execution of enumerators, this is fairly efficient as well. This is really helpful in the debugger. In one chunk of code I was writing I needed to coalesce adjacent ranges that are marked with the same text value. Assume you have a structure called Range that looks like,

struct Range {
    public int Start;
    public int Length;
}

and another struct that labels the ranges with names,

struct NamedRange {
    public string Name;
    public Range Range;
}

Now lets have a routine that calculates the range information over some stream,

public IEnumerable<NamedRange> GetNamedRanges(Stream stream) {
    ...
}

Lets assume that name ranges are some things like "whtespace", "reserved", "identifier", "number" "string", etc. as you might expect to receive from a lexical scanner like found in this post.

What I want to do with these ranges is to convert the names into styles such as you might find referencing a CSS style sheet. So, in effect, I am mapping NameRange values to StyledRange values where StyleRange would look something like,

struct StyledRange {
    public Style Style;
    public Range Range;
}

Lets create a dictionary that maps range names to styles such as,

styleMap["number"] = new Style() { Name = "green" };
styleMap["reserved"] = new Style() { Name = "bold" };

I only wanted to highlight numbers and reserved words, for everything else I will use the default style,

defaultStyle = new Style { Name = "normal" };

We can translate our named ranges directly into styled ranges by using a LINQ query expression such as,

var ranges = GetNamedRanges(stream);
var styledRanges = from range in ranges
                   select new StyledRange() {
                       Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                       Range = range.Range
                   };

where MapOrDefault() is an extension method for IDictionary<TKey, TValue> that looks like,

public static TValue MapOrDefault(this IDictionary dictionary, TKey key) {
    TValue result;
    if (dictionary.TryGetValue(key, out result))
        return result;
    return default(TValue);
}

which is patterned after the existing LINQ methods for IEnumerable<T>, FirstOrDefault() and LastOrDefault().

Since many of the ranges that have different names will have the same style, it would be nice to coalesce adjacent styles together so no two adjacent ranges have the same style. In other words, we only want a new styled range when the style changes. The above query expression just produces a one-to-one mapping of named range to styled range. What we need is something that will merge adjacent ranges. Do do this I will introduce another extension method, Reduce<T>() for IEnumerable<T>,

public static IEnumerable<T> Reduce<T>(this IEnumerable<T> e,
    Func<T, T, bool> match,
    Func<T, T, T> reduce) {
    var en = e.GetEnumerator();
    T last;
    if (en.MoveNext()) {
        last = en.Current;
        while (en.MoveNext()) {
            if (!match(last, en.Current)) {
                yield return last;
                last = en.Current;
            }
            else
                last = reduce(last, en.Current);
        }
        yield return last;
    }
}

What this method does is if two adjacent elements match (as defined by the match delegate returning true) they will be reduced into one element by calling the reduce delegate. For example, the Sum<T>()standard extension method could be implemented using Reduce<T>() as,

var sum = numbers.Reduce((a, b) => true, (a, b) => a + b).First();

Now that we have Reduce<T>(), lets reduce the list of styled ranges to coalesce the adjacent ranges with the same style. This can be done by,

styledRanges = styledRanges.Reduce(
    (r1, r2) => r1.Style == r2.Style,
    (r1, r2) => new StyledRange() {
        Style = r1.Style,
        Range = MergeRanges(r1.Range, r2.Range)});

MergeRanges() referenced above, is,

Range MergeRanges(Range r1, Range r2) {
    return new Range() { Start = r1.Start, Length = r2.Start - r1.Start + r2.Length };
}

In my example, this took 18 ranges for a typical line of C# source down to 7 ranges. Pretty good. But I noticed that some of those ranges were styling whitespace as "normal". This seems like a waste; why switch back from green to black text for writing whitespace? Why not combine those with adjacent ranges instead? A simple approach to this is to add the following code prior to mapping the styles,

ranges = ranges.Reduce(
    (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
    (r1, r2) => new NamedRange() {
        Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
        Range = MergeRanges(r1.Range, r2.Range)
    });

This says to merge whitespace ranges with adjacent non-whitespace ranges. This reduces the range count from 7 to 4. Not bad from the original 18 and that was just for one line of source. This savings adds up quickly over an entire file.

The complete example, made as a function, looks like,

IEnumerable<StyledRange> StyleRanges(IEnumerable<NamedRange> ranges) {

    // Merge whitespace ranges with adjacent non-whitespace ranges
    ranges = ranges.Reduce(
        (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
        (r1, r2) => new NamedRange() {
            Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
            Range = MergeRanges(r1.Range, r2.Range)});

    // Map named ranges to styles.
    var styledRanges = from range in ranges 
                       select new StyledRange() { 
                           Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                           Range = range.Range };

    // Merge adjacent ranges with the same style.
    styledRanges = styledRanges.Reduce(
        (r1, r2) => r1.Style == r2.Style,
        (r1, r2) => new StyledRange() {
            Style = r1.Style,
            Range = MergeRanges(r1.Range, r2.Range)});

    return styledRanges;
}

There are a few nice things about this function. First, it builds up an IEnumerable<T> but this enumerable doesn't execute until one of its enumerators is enumerated. This is due to the deferred execution of enumerators. Second, even though deferred execution makes single step debugging challenging, you can get a picture of what the function will do by using ToArray() in the watch windows on the intermediate results. This allows you to inspect the intermediate result to see if the mapping or reducing is what you expected. Third, this routine has no loops. It operates on each enumerable as a set. Now some of you will rightly say there is a loop buried in the Reduce<T>() method. True; Reduce<T>() is just like many other LINQ extension methods, they contain loops implied by the returned enumerator. But LINQ allows me to write code that communicates what I am trying to do without it being obscured by the details of how it is done. I think of LINQ not so much as a set of features in C# but a way of programming. A way of programming that, if you are using loops, you are doing it wrong.

Dec 30, 2007

Scanners (not the movie)

In my previous post I introduced a program I wrote to that uses the method described in First-Order Logic by Raymond M. Smullyan to prove an arbitrary propositional logic expression is a tautology. In this program I use some techniques that I want to describe. The first is the scanner. When writing scanners I have always hand written the scanner. There are two primary reasons for this, first I have never found scanners that difficult to write and, second, they are very time critical, often taking more than 20% of the overall parsing time, and auto-generated scanners are hard to optimize. Tableaux uses a hand-written scanner. The code for the scanner is in Scanner.cs in the zip file found here.

A scanner is a routine that takes text input and classifies it into a stream of tokens. Scanners typically give one token at a time. The scanner is the lowest level of the parsing process. Parsers are built on top of scanners and use the scanner results to determine the meaning of the source using some kind of grammar. The distinction between the parser and the scanner is somewhat arbitrary. You could just consider each character of the input as a token and build the parser up from there. Doing this is typically slower than a traditional scanner, however, so most parsers use a scanner. The line between the scanner and the parser is usually drawn at the lexical elements such as identifiers, reserved words, numeric and string constants, comments, etc. More formally, a lexical element is a string of characters that can be matched and classified by a regular expression. Scanner generators, such as Lex, use regular expressions to specify the scanner.

The trick to a fast scanner is to reduce the per-character cost of the scanner. One way to do this is to make sure you are executing the fewest number of instructions per-character-scanned as possible. A few of tricks I use to reduce the per-character cost include,

  1. Use the switch statement
    • That included switch or case statement typically heavily optimize the result, often generating jump tables.
  2. Pull fields into local variables and only write them at the before returning.
    • Local variables can be mapped to registers during code generation. Fields never will be. Stores to global memory (i.e. a field of a object allocated from the heap) are very hard for code generators to optimize. Local variable are much easier.
  3. Make checking for boundaries (end of file or end of buffer) appear as a character.
    • This merges the end-of-file check with character classification.
    • In this case I use the character value 0 to indicate end of file. If I was willing to use unsafe code I would have taken advantage of the fact that the CLR ensures a 0 value at the end of every string.

One thing that looks slow that isn't is the GetChar() routine. Normally you would see that and think I am paying for a CALL/RET pair but his gets inlined in retail. I am paying for the end of buffer check every character but, as noted above, this can be removed if I wanted to use unsafe code, definitely overkill for a small program like tableaux.

I went a bit overboard, intentionally, with identifier. It handles C# style identifiers, handling Unicode letter and number classifications. To avoid calling the Unicode classification routines for every character I hard-coded the ASCII part and only call the Unicode classification routines when I encounter a non-ASCII potential identifier character. It is not that the classification routines are slow, it is that every cycle counts in the scanner and avoiding the CALL/RET and character table look is significant for file that contain mostly ASCII characters. Note also that I hard-code some non-identifier characters in the identifier internal loop. This avoids calling the Unicode classification routines for those character, which are the most common characters you might find after an identifier, which avoids calling the Unicode classification routines at all for most input.

Dec 19, 2007

To b | !b, that is the question

When I was reading First-Order Logic by Raymond M. Smullyan I was fascinated by the second chapter in which he describes a method for proving a propositional logic expression is a tautology. This method is called the Tableau Method. The Tableau Method is a rather straight forward way to prove a statement is always going to be true by demonstrating that asserting it is false will result in a contradiction. The propositional logic expression language Dr. Smullyan uses is rather simple. I present a modified version of it here. My version contains a not operator (!), a conjunction operator (&), a disjunction operator (|) and an implication operator (→).1 For example,

p & q

is true when both p and q are true;

p | q

is true when either p is true or q is true;

p → q

is true when p is false or p is true and q is also true; and finally,

!p

is true only when p is false. This should be very familiar to any programmer. A propositional logic expression is a tautology if it will always be true regardless of the value of its propositional variables. An example of a simple tautology is,

p | !p

which will be true if p is true or false. This statement is always true. This can be proven by simply demonstrating that asserting it is false results in a contradiction. For example, we assert a statement is false by prefixing it with a capital F or true with a capital T. For the above expression this would look like,

F p | !p

For this to be false both of the sub-expressions would also have to be false. This would then look like

F p
F !p

For !p to be false p would have to be true, which looks like,

T p

Now we have a contradiction, according to propositional logic, p cannot be both true and false so p | !p must always be true. Using the Tableau Method this would look like,

(1) F p | !p
(2) F p (1)
(3) F !p (1)
(4) T p (3)
X

where the X indicates that the tableau ends in a contradiction (i.e. is closed) and the numbers to the left label the expressions and the numbers to the right indicate which expression the assertion is derived from. To produce a tableau you first assert the expression is false then use the following table to add assertions,

Expression of the form Asserts
T !p F p
F !p T p
T p & q T p and T q
F p & q F p or F q
T p | q T p or T q
F p | q F p and F q
T p → q F p or T q
F p → q T p and F q

When two expressions are asserted the table might branch. If the above table says "and" then both statement are asserted directly, but if it contains an "or" then both assertions must lead to a contradiction independently. For example, consider wanting to prove the transitive property of the implication operator, that is (p→q & q→r) → (p→r), this would start off with adding asserts until we are required to branch,

(1) F (p→q & q→r) → (p→r)
(2) T p→q & q→r (1)
(3) F p→r (1)
(4) T p→q (2)
(5) T q→r (2)
(6) T p (3)
(7) F r (3)

We now have to handle (4) and (5) with require us to branch the table. The first branch is the F p branch,

(8) F p (4)
X

This lead immediately to a contradiction because of (6) so this branch of the tableau is closed. We now need handle the T q branch,

(9) T q (4)

This doesn't lead to an immediate contradiction so we need to branch again for (5). The first branch for (5) looks like

(10) F q (5)
X

This again leads to an immediate contradiction, we already asserted that q must be true in (9). The second branch for (5) is the T r branch.

(11) T r (5)
X

This also lead to an immediate contradiction because of (7). Since all branches lead to a contradiction the original expression must always be true.

Putting this altogether we get,

(1) F (p→q & q→r) → (p→r)
(2) T p→q & q→r (1)
(3) F p→r (1)
(4) T p→q (2)
(5) T q→r (2)
(6) T p (3)
(7) F r (3)

(8) F p (4)
X

(9) T q (4)

(10) F q (5)
X

(11) T r (5)
X

This is a simple and elegant way to demonstrate proof by contradiction. Of course I just had to write a program to tell me whether an arbitrary propositional logic expression is a tautology and automatically generate the tableau.

You can find the source project that is works with the just released Visual Studio 2008 here. It requires VS 2008 because I am addicted to some of the new C# features, especially the var keyword.


1 Dr. Sumullyan uses different characters for his operators but his characters are not as commonly included in browser fonts and are not as easy to type on a standard keyboard (using -> as a synonym of →) as the ones I use. << back

Jul 31, 2007

Operators, Generics and Policies II: Adapter Policy

The mixin technique in my last post was a bit awkward to use with generic methods. It is much less awkward to use for classes, however. For example, if you want an expression evaluator much like I posted here, but you want to leave the result type open, you can use this technique to do it.

First, I declared a policy interface for the operators I wanted to abstract over,

partial interface IOperatorPolicy<T> {
    T Add(T a, T b);
    T Subtract(T a, T b);
    T Multiply(T a, T b);
    T Divide(T a, T b);
}

Second, I created the operator policies for int and double. I could have created more, decimal, float, etc. but two is sufficient for demonstration.

partial struct IntOperatorPolicy : IOperatorPolicy<int> {
    int IOperatorPolicy<int>.Add(int a, int b) { return a + b; }
    int IOperatorPolicy<int>.Subtract(int a, int b) { return a - b; }
    int IOperatorPolicy<int>.Multiply(int a, int b) { return a * b; }
    int IOperatorPolicy<int>.Divide(int a, int b) { return a / b; }
}

partial struct DoubleOperatorPolicy : IOperatorPolicy<double> {
    double IOperatorPolicy<double>.Add(double a, double b) { return a + b;
    double IOperatorPolicy<double>.Subtract(double a, double b) { return a - b; }
    double IOperatorPolicy<double>.Multiply(double a, double b) { return a * b; }
    double IOperatorPolicy<double>.Divide(double a, double b) { return a / b; }
}

Next are the expression nodes themselves. I wrap them in a generic class that takes the parameters. This mitigates some of the clutter caused by the policy.

partial class Expression<T, OperatorPolicy>
    where OperatorPolicy : struct, IOperatorPolicy<T> {

    static OperatorPolicy _operatorPolicy = new OperatorPolicy();

    public abstract partial class Expr {
        public abstract T Evaluate();
    }

    public partial class Literal : Expr {
        private T _literal;

        public Literal(T literal) { _literal = literal; }

        public override T Evaluate() {
            return _literal;
        }

        public override string ToString() {
            return _literal.ToString();
        }
    }

    public abstract partial class BinaryOp : Expr {
        private Expr _left;
        private Expr _right;

        protected BinaryOp(Expr left, Expr right) { _left = left; _right = right; }

        public override T Evaluate() {
            return EvaluateOp(_left.Evaluate(), _right.Evaluate());
        }

        protected abstract T EvaluateOp(T left, T right);
    }

    public partial class Add : BinaryOp {
        public Add(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Add(left, right);
        }
    }
    
    public partial class Add : BinaryOp {
        public Add(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Add(left, right);
        }
    }

    public partial class Subtract : BinaryOp {
        public Subtract(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Subtract(left, right);
        }
    }

    public partial class Multiply : BinaryOp {
        public Multiply(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Multiply(left, right);
        }
    }

    public partial class Divide : BinaryOp {
        public Divide(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Divide(left, right);
        }
    }
}

Here the Expression class acts as a container of a family of classes. Once you have an instance of one of the Expr derived classes, calling Evaluate() is straight-forward. Unfortunately, constructing one is a bit of a challenge. To get a literal of the int expression you would need to do,

var i1 = new Expression<int, IntOperatorPolicy>.Literal(1);

You can make this a little better by introducing a static method into Expression like,

public static Expr L(T literal) {
    return new Literal(literal);
}

then the above becomes,

var i1 = Expression<int, IntOperatorPolicy>.L(1);

This still a bit awkward especially when you want to construct a more complicated expression,

var e = new Expression<int, IntOperatorPolicy>.Add(i1, i1);

This can be made significantly better by adding operator overloads to Expr as in,

public abstract partial class Expr {
    public abstract T Evaluate();
    public static Expr operator +(Expr a, Expr b) {
        return new Add(a, b);
    }
    public static Expr operator -(Expr a, Expr b) {
        return new Subtract(a, b);
    }
    public static Expr operator *(Expr a, Expr b) {
        return new Multiply(a, b);
    }
    public static Expr operator /(Expr a, Expr b) {
        return new Divide(a, b);
    }
}

now the e assignment can look like,

var e = i1 + i1;

and finally, to make it look even better you can put the code to create the expression in a static method of a class that derives from Expression and closes the type parameters. Here are examples of one for int and a similar one for double.

class TestInt : Expression<int, IntOperatorPolicy> {
    public static void Run() {
        var iv = L(1) + L(2) * L(3);
        Console.WriteLine(iv.Evaluate());
    }
}

class TestDouble : Expression<double, DoubleOperatorPolicy> {
    public static void Run() {
        var dv = L(1.2) + L(3.4) * L(5.6);
        Console.WriteLine(dv.Evaluate());
    }
}

This shows taking advantage of the C# compiler's operator overloading to produce an expression tree and the use of the L() method as a sort of cast to change a literal of T into a Literal<T>. The call to Evaluate() calculates the result of the expression as an int or double. The code generated, as discussed last time, is quite good when JIT'ing outside the debugger and using optimized retail bits.

This use of a policy struct is what I refer to as an adapter policy. The Expression class has an abstraction it supports, represented by IOperatorPolicy<T>, but this abstraction is not supported by any of the interesting types you would want to pass as Expression's T. This is solved by creating an adapter policy and passing that along with T. This allows a generic type to introduce an abstraction that is unknown to the types the generic wants to range over, allow us, as above, to pass int for Expression's T even though int doesn't implement IOperatorPolicy<T> itself.

Next time I will discuss another application of an adapter policy.

Jul 25, 2007

Operators, Generics and Policies

One of the limitations of C# generics is you cannot abstract over operators. That is not completely true, you can if the base class used in where clause has operators; but, since operators are more useful, and more common, on value types, that is only marginally helpful. Also, this abstraction is provided by the base class not by using generics. What I would like to do is declare that my type  parameter requires the type to implement a particular operator. For example, to create a generic Add() method I would like to specify that the type parameter must implement the + operator and then be able to use the + operator in my code such as,

// Invalid C#
static T Add<T>(T a, T b) where T: T operator +(T, T) {
    return a + b;
}

C# doesn't support this but you can get somewhat close by supplying a policy struct like I described in this post. The policy interface and the int policy struct would look like,

interface IAddPolicy<T> {
    T Add(T a, T b);
}

struct IntAddPolicy : IAddPolicy<int> {
    public int Add(int a, int b) { return a + b; }
}

This allows you to write the following,

static T Add<T, P>(T a, T b) where P: struct, IAddPolicy<T> {
    return new P().Add(a, b);
}

Unfortunately the policy struct cannot be inferred from the parameters so you have to supply the type parameter explicitly,

var i = Add<int, IntAddPolicy>(3, 4);

which doesn't look pretty and generates less than efficient code for the Add() method on an i386 architecture,

00000000  sub         esp,8 
00000003  xor         eax,eax 
00000005  mov         dword ptr [esp],eax 
00000008  mov         dword ptr [esp+4],eax 
0000000c  mov         byte ptr [esp],0 
00000010  movsx       eax,byte ptr [esp] 
00000014  mov         byte ptr [esp+4],al 
00000018  add         ecx,edx 
0000001a  mov         eax,ecx 
0000001c  add         esp,8 
0000001f  ret              

This generates code to initialize the policy structure we don't actually use. We can get rid of that by recasting this a bit to save the dummy struct allocation by putting the Add() method in a wrapping class that takes the type parameters. The type would allocate the struct instead of the method. This looks like,

class Adder<T, P>
    where P: struct, IAddPolicy<T>
{
    static P AddPolicy = new P();
    static public T Add(T a, T b) { return AddPolicy.Add(a, b); }
}

which generates the following code for Add(),

00000000  add         ecx,edx 
00000002  mov         eax,ecx 
00000004  ret              

This can be called like,

var i2 = Adder<int, IntAddPolicy>.Add(3, 4);

Note that the code in IntAddPolicy gets inlined into the Add() method. Even though this doesn't get inlined into the caller of Add(), it will still be pretty quick with the branch prediction and register aliasing of modern processors. This means that you are not paying much (and in many cases nothing) in code quality for using an AddPolicy over writing out the Add() method explicitly.

To use the Add() method above with another type, such as double, you need to create another add policy struct. This would look like,

struct DoubleAddPolicy : IAddPolicy<double> {
    public double Add(double a, double b) { return a + b; }
}

Using this struct to call Add() will produce the following code for the Add() method,

00000000  fld         qword ptr [esp+0Ch] 
00000004  fadd        qword ptr [esp+4] 
00000008  ret         10h  

This is not quite as good as the code generated for int because it uses the stack to pass the parameters but it is still pretty fast.

We now have a method that can add two values of any type T for which an AddPolicy can be created. This is pretty close to, and more general than, what I originally wanted. Unfortunately, even though the code generated is good, calling this code is very awkward; we have to call a static method of a parameterized class and the policy struct prevents the compiler from being able to use type inferencing to supply the parameters for us. Next time I will describe a more practical application of this technique where the awkwardness is not as noticiable.