
ToList() or not ToList(), that is the query.

A lot has been written about deferred execution in LINQ queries, so I won’t go into it in detail here. Suffice it to say that it’s nice when you can retain this behaviour, as it usually makes your code more efficient. It can, however, come at a cost, and here’s an example of why:

var query = from customer in db.Customers 
            where customer.City == "Paris" 
            select customer;
if (!query.Any()) // query executes here
{
    Console.WriteLine("No results found");
    return;
}

foreach (var customer in query) // and query executes again here
{
    Console.WriteLine(customer.CompanyName);
}

You have to make a choice here. Either you read all the results of the query into a list up front with a call to ToList(), which could take a while before anything is written to the console (due to network IO or whatever), or you execute the query twice, as seen above.
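To make the trade-off concrete, here is a minimal sketch that counts how often a deferred source is started under each approach. The DeferredExecutionDemo class and its ParisCustomers() method are hypothetical stand-ins for db.Customers, invented for illustration only; a real LINQ provider would hit the database instead of incrementing a counter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Number of times enumeration of the source has been started.
    public static int Executions;

    // Hypothetical stand-in for db.Customers. The body of an iterator
    // method runs from the top each time the sequence is enumerated.
    public static IEnumerable<string> ParisCustomers()
    {
        Executions++;
        yield return "Customer A";
        yield return "Customer B";
    }

    // Any() followed by foreach: the source is started twice.
    public static int CountWithAnyThenForeach()
    {
        Executions = 0;
        var query = ParisCustomers().Where(name => name.Length > 0);
        if (query.Any())
        {
            foreach (var name in query) { Console.WriteLine(name); }
        }
        return Executions;
    }

    // ToList() first: the source is started once, but every result is
    // read before anything is written out.
    public static int CountWithToList()
    {
        Executions = 0;
        var list = ParisCustomers().Where(name => name.Length > 0).ToList();
        if (list.Any())
        {
            foreach (var name in list) { Console.WriteLine(name); }
        }
        return Executions;
    }
}
```

In this sketch CountWithAnyThenForeach() returns 2 and CountWithToList() returns 1.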

Why not have the best of both worlds?

What we really want is an alternative to ToList() which still defers execution and reads from the query as it iterates, but keeps the results for subsequent iterations so that the heavy lifting only happens once (just like ToList()).

To achieve this I implemented an IEnumerator<T> called BufferedEnumerator<T>, which wraps an inner enumerator and buffers the results as calls to MoveNext() are made against it. Once the inner enumerator has been fully buffered, we can dispose it. To provide the BufferedEnumerator<T> I implemented an enumerable called BufferedEnumerable<T>.

One thing to note is that BufferedEnumerator<T> doesn’t dispose the inner enumerator when its own Dispose() method is called, because we might not be finished with it. This raises a question: if we don’t dispose it then, when do we? I got around this by making BufferedEnumerable<T> disposable and having it dispose the inner enumerator. To be clean and tidy, you should wrap the buffered result in a using statement, or call Dispose() on it when you’re done, so that any underlying query is not left open.
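The downloadable source isn’t reproduced in this post, so the following is only a rough sketch of the idea, not the actual implementation. For brevity it uses a C# iterator block in place of a hand-written BufferedEnumerator<T> class, the field names and the BufferedExtensions class are made up for illustration, and it makes no attempt at thread safety.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class BufferedEnumerable<T> : IEnumerable<T>, IDisposable
{
    private readonly IEnumerable<T> source;
    private readonly List<T> buffer = new List<T>();
    private IEnumerator<T> inner;  // created on first use, so execution stays deferred
    private bool exhausted;        // true once the inner enumerator is finished or disposed

    public BufferedEnumerable(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        this.source = source;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Each caller has its own position, but all callers share one buffer.
        // Disposing this enumerator deliberately leaves the inner one alone.
        for (int index = 0; ; index++)
        {
            if (index >= buffer.Count && !TryBufferNext())
                yield break;
            yield return buffer[index];
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    // Pulls one more item from the inner enumerator into the buffer.
    private bool TryBufferNext()
    {
        if (exhausted) return false;
        if (inner == null) inner = source.GetEnumerator();
        if (inner.MoveNext())
        {
            buffer.Add(inner.Current);
            return true;
        }
        Dispose();  // fully buffered, so the inner enumerator can be released
        return false;
    }

    // Releases the inner enumerator; items already buffered stay readable.
    public void Dispose()
    {
        exhausted = true;
        if (inner != null)
        {
            inner.Dispose();
            inner = null;
        }
    }
}

public static class BufferedExtensions
{
    public static BufferedEnumerable<T> Buffered<T>(this IEnumerable<T> source)
    {
        return new BufferedEnumerable<T>(source);
    }
}
```

In this sketch, calling Dispose() before the source is exhausted simply stops further reads from it, so later iterations see only what was buffered up to that point.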

Using the new Buffered() extension method

var query = from customer in db.Customers 
            where customer.City == "Paris" 
            select customer;

using (var buffer = query.Buffered())
{
    if (!buffer.Any()) // query starts executing here
    {
        Console.WriteLine("No results found");
        return;
    }
    foreach (var customer in buffer) // same query continues executing here
    {
        Console.WriteLine(customer.CompanyName);
    }
}

The good news is that this can now be used as a replacement for ToList() wherever you only need to iterate over the results. You get none of the negative side effects of ToList(), such as executing when it may not be necessary or reading all of the results when only the first will do. Just remember that it is truly deferred: if you call query.Buffered() and never iterate over it, the query will never be executed.

Enjoy!

Download Source
