Expression Trees

Introduction

An expression tree is a treelike data structure in which each node represents an expression, that is, a piece of code. The in-memory representation of a lambda expression is an expression tree: it holds the actual elements (i.e. the code) of the query, but not its result. Expression trees make the structure of a lambda expression transparent and explicit.

Syntax

Expression<TDelegate> name = lambdaExpression;

Parameters

TDelegate : The delegate type to be used for the expression
lambdaExpression : The lambda expression (e.g. num => num < 5)
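
As a minimal sketch of the syntax above, the following assigns the example lambda to an Expression<Func<int, bool>>, inspects the resulting tree, and compiles it into a delegate. The names SyntaxDemo and LessThanFive are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class SyntaxDemo
{
    // TDelegate is Func<int, bool>; the lambda is stored as data, not as executable code.
    public static readonly Expression<Func<int, bool>> LessThanFive = num => num < 5;

    public static void Main()
    {
        // The tree can be inspected...
        Console.WriteLine(LessThanFive.Body.NodeType);       // LessThan
        Console.WriteLine(LessThanFive.Parameters[0].Name);  // num

        // ...or compiled into an ordinary delegate and invoked.
        Func<int, bool> compiled = LessThanFive.Compile();
        Console.WriteLine(compiled(3));  // True
        Console.WriteLine(compiled(7));  // False
    }
}
```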

Intro to Expression Trees

Where we came from

Expression trees are all about consuming “source code” at runtime. Consider a method which calculates the sales tax due on a sales order: decimal CalculateTotalTaxDue(SalesOrder order). Using that method in a .NET program is easy; you just call it: decimal taxDue = CalculateTotalTaxDue(order); What if you want to apply it to all the results from a remote query (SQL, XML, a remote server, etc.)? Those remote query sources cannot call the method! Traditionally, you would have to invert the flow in all these cases: run the entire query, store the results in memory, then loop through the results and calculate the tax for each one.

How to avoid flow inversion’s memory and latency problems

Expression trees are tree-shaped data structures in which each node holds an expression. They are used to translate compiled instructions (such as methods used to filter data) into expressions that can be used outside the program’s environment, for example inside a database query.

The problem here is that a remote query cannot access our method. We could avoid this problem if instead, we sent the instructions for the method to the remote query. In our CalculateTotalTaxDue example, that means we send this information:

Create a variable to store the total tax
Loop through all the lines on the order
For each line, check if the product is taxable
If it is, multiply the line total by the applicable tax rate and add that amount to the total
Otherwise do nothing

With those instructions, the remote query can perform the work as it’s creating the data.
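
For reference, the steps listed above might look like this as an ordinary C# method. This is only a sketch: the text names CalculateTotalTaxDue and SalesOrder but not their contents, so OrderLine, Lines, Total, IsTaxable and TaxRate are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical order model; the property names are assumptions, not from the text.
public class OrderLine
{
    public decimal Total { get; set; }
    public bool IsTaxable { get; set; }
    public decimal TaxRate { get; set; }
}

public class SalesOrder
{
    public List<OrderLine> Lines { get; } = new List<OrderLine>();
}

public static class TaxCalculator
{
    public static decimal CalculateTotalTaxDue(SalesOrder order)
    {
        decimal totalTax = 0m;                       // create a variable to store the total tax
        foreach (OrderLine line in order.Lines)      // loop through all the lines on the order
        {
            if (line.IsTaxable)                      // check if the product is taxable
            {
                totalTax += line.Total * line.TaxRate;
            }
            // otherwise do nothing
        }
        return totalTax;
    }

    public static void Main()
    {
        var order = new SalesOrder();
        order.Lines.Add(new OrderLine { Total = 100m, IsTaxable = true, TaxRate = 0.10m });
        order.Lines.Add(new OrderLine { Total = 50m, IsTaxable = false, TaxRate = 0.10m });

        // Only the first line is taxable, so the tax due is 10% of 100.
        Console.WriteLine(CalculateTotalTaxDue(order));
    }
}
```

Compiled like this, the method is opaque to a remote system, which is the problem the rest of this section addresses.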

There are two challenges to implementing this. How do you transform a compiled .NET method into a list of instructions, and how do you format the instructions in a way that they can be consumed by the remote system?

Without expression trees, you could only solve the first problem with MSIL. (MSIL is the assembler-like code created by the .NET compiler.) Parsing MSIL is possible, but it’s not easy. Even when you do parse it properly, it can be hard to determine what the original programmer’s intent was with a particular routine.

Expression trees save the day

Expression trees address these exact issues. They represent program instructions as a tree data structure where each node represents one instruction and has references to all the information you need to execute that instruction. For example, a MethodCallExpression has references to 1) the MethodInfo it is going to call, 2) a list of Expressions it will pass to that method, and 3) for instance methods, the Expression you’ll call the method on. You can “walk the tree” and apply the instructions on your remote query.
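
To make the three parts of a MethodCallExpression concrete, here is a small sketch that builds a tree from a lambda and reads them back. The MethodCallDemo class and the StartsWithA predicate are hypothetical examples, not part of the original text.

```csharp
using System;
using System.Linq.Expressions;

public static class MethodCallDemo
{
    // A tree whose body is a call to the instance method string.StartsWith(string).
    public static readonly Expression<Func<string, bool>> StartsWithA = s => s.StartsWith("A");

    public static void Main()
    {
        var call = (MethodCallExpression)StartsWithA.Body;

        Console.WriteLine(call.Method.Name);      // 1) the MethodInfo being called: StartsWith
        Console.WriteLine(call.Arguments.Count);  // 2) the argument expressions: 1
        Console.WriteLine(call.Arguments[0]);     //    the single argument: "A"
        Console.WriteLine(call.Object);           // 3) the instance the method is called on: s
    }
}
```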

Creating expression trees

The easiest way to create an expression tree is with a lambda expression. These expressions look almost the same as normal C# methods. It’s important to realize this is compiler magic. When you first create a lambda expression, the compiler checks what you assign it to. If it’s a Delegate type (including Action or Func), the compiler converts the lambda expression into a delegate. If it’s a LambdaExpression (or an Expression<Action<T>> or Expression<Func<T>>, which are strongly typed LambdaExpressions), the compiler transforms it into a LambdaExpression. This is where the magic kicks in: behind the scenes, the compiler emits calls to the expression tree API that build the LambdaExpression from your lambda expression.
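
The difference between the two targets can be seen by assigning the same lambda text to both. A minimal sketch, with the hypothetical names LambdaTargetsDemo, AsDelegate and AsTree:

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaTargetsDemo
{
    // Assigned to a delegate type: the compiler emits executable code.
    public static readonly Func<int, int> AsDelegate = x => x * 2;

    // Assigned to Expression<TDelegate>: the compiler emits code that builds a tree.
    public static readonly Expression<Func<int, int>> AsTree = x => x * 2;

    public static void Main()
    {
        Console.WriteLine(AsDelegate(21));        // 42
        Console.WriteLine(AsTree.NodeType);       // Lambda
        Console.WriteLine(AsTree);                // x => (x * 2)
        Console.WriteLine(AsTree.Compile()(21));  // 42
    }
}
```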

Lambda expressions cannot create every type of expression tree. In those cases, you can use the Expressions API manually to create the tree you need. In the “Understanding the expressions API” example, we create the CalculateTotalSalesTax expression using the API.
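
As a smaller stand-in for that example, the sketch below builds by hand a tree that a lambda expression cannot produce, because it contains a local variable, an assignment and a loop. ManualTreeDemo and BuildFactorial are hypothetical names; the factory methods are from System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTreeDemo
{
    // Builds the equivalent of:
    //   n => { int result = 1; while (n > 1) { result *= n; n--; } return result; }
    public static Expression<Func<int, int>> BuildFactorial()
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        ParameterExpression result = Expression.Variable(typeof(int), "result");
        LabelTarget done = Expression.Label(typeof(int));

        BlockExpression body = Expression.Block(
            new[] { result },                                    // local variables of the block
            Expression.Assign(result, Expression.Constant(1)),
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.GreaterThan(n, Expression.Constant(1)),
                    Expression.Block(
                        Expression.MultiplyAssign(result, n),
                        Expression.PostDecrementAssign(n)),
                    Expression.Break(done, result)),             // leave the loop, yielding result
                done));

        return Expression.Lambda<Func<int, int>>(body, n);
    }

    public static void Main()
    {
        Func<int, int> factorial = BuildFactorial().Compile();
        Console.WriteLine(factorial(5));  // 120
    }
}
```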

NOTE: The names get a bit confusing here. A lambda expression (two words, lower case) refers to the block of code with a => indicator. It represents an anonymous method in C# and is converted into either a Delegate or Expression. A LambdaExpression (one word, PascalCase) refers to the node type within the Expression API which represents a method you can execute.

Expression Trees and LINQ

One of the most common uses of expression trees is with LINQ and database queries. LINQ pairs an expression tree with a query provider to apply your instructions to the target remote query. For example, the LINQ to Entity Framework query provider transforms an expression tree into SQL which is executed against the database directly.

Putting all the pieces together, you can see the real power behind LINQ.

Write a query using a lambda expression: products.Where(x => x.Cost > 5)
The compiler transforms that expression into an expression tree with the instructions “check if the Cost property of the parameter is greater than five”.
The query provider parses the expression tree and produces a valid SQL query: SELECT * FROM products WHERE Cost > 5
The ORM projects all the results into POCOs and you get a list of objects back.
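
The translation step can be illustrated with a toy translator. This is not how Entity Framework or any real query provider works; it is a deliberately tiny sketch that recognises only predicates of the form "property > constant". Product, ToySqlTranslator and ToSql are hypothetical names, and Cost is assumed to be an int to keep the tree simple.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity; the text only mentions a Cost property.
public class Product
{
    public int Cost { get; set; }
}

public static class ToySqlTranslator
{
    // Handles only predicates shaped like: parameter.Property > constant.
    public static string ToSql(Expression<Func<Product, bool>> predicate)
    {
        if (predicate.Body is BinaryExpression binary
            && binary.NodeType == ExpressionType.GreaterThan
            && binary.Left is MemberExpression member
            && binary.Right is ConstantExpression constant)
        {
            // A real provider would emit a SQL parameter instead of inlining the value.
            return $"SELECT * FROM products WHERE {member.Member.Name} > {constant.Value}";
        }

        throw new NotSupportedException("This sketch only understands 'property > constant'.");
    }

    public static void Main()
    {
        Console.WriteLine(ToSql(x => x.Cost > 5));
        // SELECT * FROM products WHERE Cost > 5
    }
}
```

Note that the lambda is never executed here: the translator only reads the tree the compiler built from it.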

Notes

Expression trees are immutable. To change an expression tree, you need to create a new one: copy the existing tree and apply the wanted changes as you go. You can use the ExpressionVisitor to traverse the existing tree while doing so.
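
A minimal sketch of that copy-and-modify pattern, assuming a hypothetical ConstantReplacer visitor that swaps one int constant for another. The original tree is left untouched and a new tree is returned.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical visitor: replaces every int constant equal to oldValue with newValue.
public class ConstantReplacer : ExpressionVisitor
{
    private readonly int _oldValue;
    private readonly int _newValue;

    public ConstantReplacer(int oldValue, int newValue)
    {
        _oldValue = oldValue;
        _newValue = newValue;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        if (node.Value is int value && value == _oldValue)
        {
            return Expression.Constant(_newValue);  // a new node; the old one is not modified
        }
        return base.VisitConstant(node);
    }
}

public static class ImmutabilityDemo
{
    public static void Main()
    {
        Expression<Func<int, bool>> original = num => num < 5;

        // Visit rebuilds the parts of the tree that changed and returns a new lambda.
        var rewritten = (Expression<Func<int, bool>>)new ConstantReplacer(5, 10).Visit(original);

        Console.WriteLine(original);                // num => (num < 5)
        Console.WriteLine(rewritten);               // num => (num < 10)
        Console.WriteLine(rewritten.Compile()(7));  // True
    }
}
```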
