Expression Trees
Introduction
Expression trees are expressions arranged in a tree-like data structure. Each node in the tree represents an expression, that is, a piece of code. The in-memory representation of a lambda expression is an expression tree, which holds the actual elements (i.e. the code) of the query, but not its result. Expression trees make the structure of a lambda expression transparent and explicit.
Syntax
Expression<TDelegate> name = lambdaExpression;
Parameters
- TDelegate : The delegate type to be used for the expression
- lambdaExpression : The lambda expression (e.g. num => num < 5)
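As a small illustration of the syntax above, the sketch below declares the example lambda as an expression tree and then compiles it into a delegate. The variable names (lessThanFive, compiled) are chosen for this example only.

```csharp
using System;
using System.Linq.Expressions;

// TDelegate is Func<int, bool>; the lambda is stored as data, not as compiled code.
Expression<Func<int, bool>> lessThanFive = num => num < 5;

// The tree can be printed, which shows the structure rather than a result.
Console.WriteLine(lessThanFive);

// Compile() turns the tree into an ordinary delegate that can be invoked.
Func<int, bool> compiled = lessThanFive.Compile();
Console.WriteLine(compiled(3));
Console.WriteLine(compiled(7));
```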
Intro to Expression Trees
Where we came from
Expression trees are all about consuming “source code” at runtime. Consider a method which calculates the sales tax due on a sales order: decimal CalculateTotalTaxDue(SalesOrder order). Using that method in a .NET program is easy: you just call it with decimal taxDue = CalculateTotalTaxDue(order);. What if you want to apply it to all the results from a remote query (SQL, XML, a remote server, etc.)? Those remote query sources cannot call the method! Traditionally, you would have to invert the flow in all these cases: run the entire query, store the results in memory, then loop through them and calculate the tax for each one.
How to avoid flow inversion’s memory and latency problems
Expression trees are tree-shaped data structures in which each node holds an expression. They are used to translate compiled instructions (like methods used to filter data) into expressions that can be used outside of the program environment, such as inside a database query.
The problem here is that a remote query cannot access our method. We could avoid this problem if, instead, we sent the instructions for the method to the remote query. In our CalculateTotalTaxDue example, that means we send this information:
- Create a variable to store the total tax
- Loop through all the lines on the order
- For each line, check if the product is taxable
- If it is, multiply the line total by the applicable tax rate and add that amount to the total
- Otherwise do nothing
With those instructions, the remote query can perform the work as it’s creating the data.
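For reference, the listed instructions correspond roughly to the ordinary C# below. This is only a sketch: the source never defines SalesOrder, so the order lines are modelled here as hypothetical tuples (LineTotal, IsTaxable, TaxRate), and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical line shape; stands in for the lines of a SalesOrder.
decimal CalculateTotalTaxDue(
    IEnumerable<(decimal LineTotal, bool IsTaxable, decimal TaxRate)> lines)
{
    decimal totalTax = 0m;                      // variable to store the total tax
    foreach (var line in lines)                 // loop through all the lines
    {
        if (line.IsTaxable)                     // check if the product is taxable
        {
            totalTax += line.LineTotal * line.TaxRate;
        }
        // otherwise do nothing
    }
    return totalTax;
}

// Made-up sample data for the sketch.
var orderLines = new[]
{
    (LineTotal: 100m, IsTaxable: true,  TaxRate: 0.10m),
    (LineTotal: 50m,  IsTaxable: false, TaxRate: 0.10m),
    (LineTotal: 20m,  IsTaxable: true,  TaxRate: 0.05m),
};

decimal taxDue = CalculateTotalTaxDue(orderLines);
Console.WriteLine(taxDue);
```

Compiled like this, the logic is only callable from inside the .NET process, which is exactly the limitation the rest of this section works around.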
There are two challenges to implementing this. How do you transform a compiled .NET method into a list of instructions, and how do you format the instructions in a way that they can be consumed by the remote system?
Without expression trees, you could only solve the first problem with MSIL. (MSIL is the assembler-like code created by the .NET compiler.) Parsing MSIL is possible, but it’s not easy. Even when you do parse it properly, it can be hard to determine what the original programmer’s intent was with a particular routine.
Expression trees save the day
Expression trees address these exact issues. They represent program instructions in a tree data structure where each node represents one instruction and has references to all the information you need to execute that instruction. For example, a MethodCallExpression has references to 1) the MethodInfo it is going to call, 2) a list of Expressions it will pass to that method, and 3) for instance methods, the Expression you’ll call the method on. You can “walk the tree” and apply the instructions on your remote query.
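The three parts of a MethodCallExpression can be inspected directly. The sketch below uses an arbitrary example lambda (s => s.StartsWith("A")) that is not from the original text; the properties Method, Arguments and Object are part of the System.Linq.Expressions API.

```csharp
using System;
using System.Linq.Expressions;

// Arbitrary example: the body of this lambda is a single instance method call.
Expression<Func<string, bool>> startsWithA = s => s.StartsWith("A");
var call = (MethodCallExpression)startsWithA.Body;

// 1) The MethodInfo that will be called.
Console.WriteLine(call.Method.Name);

// 2) The list of argument expressions passed to the method.
Console.WriteLine(call.Arguments.Count);
Console.WriteLine(call.Arguments[0].NodeType);

// 3) For instance methods, the expression the method is called on
//    (here, the lambda's own parameter). It is null for static methods.
Console.WriteLine(call.Object?.NodeType);
```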
Creating expression trees
The easiest way to create an expression tree is with a lambda expression. These expressions look almost the same as normal C# methods. It’s important to realize this is compiler magic. When you first create a lambda expression, the compiler checks what you assign it to. If it’s a delegate type (such as an Action or Func), the compiler converts the lambda expression into a delegate. If it’s an Expression<TDelegate> (such as an Expression<Action<T>> or Expression<Func<T>>, which are strongly typed LambdaExpressions), the compiler transforms it into a LambdaExpression. This is where the magic kicks in. Behind the scenes, the compiler uses the expression tree API to transform your lambda expression into a LambdaExpression.
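To make the two conversions concrete, the sketch below assigns the same lambda text to a delegate and to an expression tree. The variable names are illustrative only.

```csharp
using System;
using System.Linq.Expressions;

// Assigned to a delegate type: the compiler emits executable code.
Func<int, bool> asDelegate = num => num < 5;

// Assigned to Expression<TDelegate>: the compiler emits calls to the
// expression tree API that build a LambdaExpression describing the code.
Expression<Func<int, bool>> asTree = num => num < 5;

// A delegate can only be invoked.
Console.WriteLine(asDelegate(3));

// A tree can be inspected node by node...
Console.WriteLine(asTree is LambdaExpression);   // Expression<T> derives from LambdaExpression
Console.WriteLine(asTree.Parameters[0].Name);
Console.WriteLine(asTree.Body.NodeType);

// ...and still turned into a delegate when needed.
Console.WriteLine(asTree.Compile()(3));
```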
Lambda expressions cannot create every type of expression tree. In those cases, you can use the Expressions API manually to create the tree you need. In the Understanding the expressions API example, we create the CalculateTotalSalesTax expression using the API.
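As a minimal sketch of the manual approach (not the CalculateTotalSalesTax example itself), the code below first builds num => num < 5 by hand, then builds a small block with a local variable and assignments. A statement body like that is something the compiler will not convert from a lambda, so it has to go through the API. All names here are illustrative.

```csharp
using System;
using System.Linq.Expressions;

// Hand-built equivalent of: num => num < 5
ParameterExpression num = Expression.Parameter(typeof(int), "num");
Expression<Func<int, bool>> lessThanFive =
    Expression.Lambda<Func<int, bool>>(
        Expression.LessThan(num, Expression.Constant(5)),
        num);

// A block with a local variable and assignments, roughly:
//   x => { int total = x * 2; total += 1; return total; }
ParameterExpression x = Expression.Parameter(typeof(int), "x");
ParameterExpression total = Expression.Variable(typeof(int), "total");
BlockExpression body = Expression.Block(
    new[] { total },                                    // variables local to the block
    Expression.Assign(total, Expression.Multiply(x, Expression.Constant(2))),
    Expression.AddAssign(total, Expression.Constant(1)),
    total);                                             // last expression is the block's value
Expression<Func<int, int>> doubledPlusOne =
    Expression.Lambda<Func<int, int>>(body, x);

Console.WriteLine(lessThanFive.Compile()(3));
Console.WriteLine(doubledPlusOne.Compile()(5));
```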
NOTE: The names get a bit confusing here. A lambda expression (two words, lower case) refers to the block of code with a => indicator. It represents an anonymous method in C# and is converted into either a Delegate or Expression. A LambdaExpression (one word, PascalCase) refers to the node type within the Expression API which represents a method you can execute.
Expression Trees and LINQ
One of the most common uses of expression trees is with LINQ and database queries. LINQ pairs an expression tree with a query provider to apply your instructions to the target remote query. For example, the LINQ to Entity Framework query provider transforms an expression tree into SQL which is executed against the database directly.
Putting all the pieces together, you can see the real power behind LINQ.
- Write a query using a lambda expression: products.Where(x => x.Cost > 5)
- The compiler transforms that expression into an expression tree with the instructions “check if the Cost property of the parameter is greater than five”.
- The query provider parses the expression tree and produces a valid SQL query: SELECT * FROM products WHERE Cost > 5
- The ORM projects all the results into POCOs and you get a list of objects back.
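The steps above can be imitated in a few lines. The sketch below is a toy, not how Entity Framework or any real provider is implemented: it uses a hypothetical in-memory products array (anonymous objects with made-up values), lets Queryable.Where record the call as an expression tree, and then walks that tree with a deliberately tiny translator (ToSqlCondition, a name invented here) that only understands "member > constant".

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical in-memory data standing in for a remote "products" table.
var products = new[]
{
    new { Name = "Pen",  Cost = 3 },
    new { Name = "Book", Cost = 12 },
}.AsQueryable();

// Queryable.Where does not run the filter; it records the call as an expression tree.
var query = products.Where(x => x.Cost > 5);

// The recorded call has the shape: Where(source, Quote(x => x.Cost > 5)).
var whereCall = (MethodCallExpression)query.Expression;
var quoted = (UnaryExpression)whereCall.Arguments[1];
var predicate = (LambdaExpression)quoted.Operand;

// Toy translator: walks the tree and emits SQL text for the few node kinds it knows.
string ToSqlCondition(Expression node) => node switch
{
    BinaryExpression { NodeType: ExpressionType.GreaterThan } b
        => $"{ToSqlCondition(b.Left)} > {ToSqlCondition(b.Right)}",
    MemberExpression m => m.Member.Name,          // property access becomes a column name
    ConstantExpression c => $"{c.Value}",         // constants are inlined (a real provider would use parameters)
    _ => throw new NotSupportedException($"Unsupported node: {node.NodeType}"),
};

string sql = $"SELECT * FROM products WHERE {ToSqlCondition(predicate.Body)}";
Console.WriteLine(sql);
```

A real query provider handles many more node types and sends the values as query parameters rather than inlining them; the point here is only that the filter arrives as inspectable data.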
Notes
- Expression trees are immutable. To change an expression tree, you need to create a new one: copy the existing tree into the new one and make the wanted changes along the way. You can use an ExpressionVisitor to traverse the existing tree.
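A minimal sketch of what immutability means in practice: rather than editing a node, you ask for a modified copy. The example below uses BinaryExpression.Update on a single node to swap the constant in num => num < 5; an ExpressionVisitor subclass applies the same copy-on-change idea across a whole tree. Variable names are illustrative.

```csharp
using System;
using System.Linq.Expressions;

Expression<Func<int, bool>> lessThanFive = num => num < 5;
var comparison = (BinaryExpression)lessThanFive.Body;

// Update returns a new node when a child differs; the original node is left as it was.
BinaryExpression changed = comparison.Update(
    comparison.Left,
    comparison.Conversion,
    Expression.Constant(10));

// Wrap the new body in a new lambda, reusing the original parameter.
Expression<Func<int, bool>> lessThanTen =
    Expression.Lambda<Func<int, bool>>(changed, lessThanFive.Parameters);

Console.WriteLine(lessThanFive.Compile()(7));   // original tree still compares against 5
Console.WriteLine(lessThanTen.Compile()(7));    // new tree compares against 10
```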