IEnumerable: lazy and dangerous

Time and time again I am burnt by the same bug.

There are two kinds of enumerables in .NET, which I will call enumerators and generators. An enumerator walks over an existing collection and returns its items. It returns the same values no matter how many times you iterate over it; this changes only when someone adds or removes items from the collection.

Generators, on the other hand, compute their values anew on every iteration. The values may or may not be the same each time, depending on how the generator is built and what has happened around it. All those neat LINQ methods like Where and Select in fact return generators.

For example,

IEnumerable<string> names = new[] { "Paris", "Berlin", "London" }; // enumerator
IEnumerable<City> cities = names.Select(name=>new City(name)); // generator

Every time you do foreach (var city in cities), a new set of City objects is produced. E.g.

foreach (var city in cities) city.Visited = true;
...
foreach (var city in cities)
{
    Console.WriteLine(city.Visited); // prints False
}

One can easily convert a generator to an enumerator by adding ToList(), ToArray(), or ToDictionary(). This creates a “real” data structure whose items are computed once and then stay the same across iterations.

var cityList = names.Select(name=>new City(name)).ToList();
foreach (var city in cityList) city.Visited = true;
...
foreach (var city in cityList)
{
    Console.WriteLine(city.Visited); // prints True
}
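
To make the re-evaluation visible, here is a small self-contained sketch that counts how often the Select lambda runs. The City class and the LazyDemo and CountConstructions names are my own stand-ins for illustration, not anything from a library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the City type used in the snippets above.
public class City
{
    public string Name { get; }
    public bool Visited { get; set; }
    public City(string name) { Name = name; }
}

public static class LazyDemo
{
    // Returns how many City objects had been constructed after each step.
    public static int[] CountConstructions()
    {
        string[] names = { "Paris", "Berlin", "London" };
        int created = 0;
        var counts = new List<int>();

        // Building the query constructs nothing yet.
        IEnumerable<City> cities = names.Select(name =>
        {
            created++;
            return new City(name);
        });
        counts.Add(created); // 0

        // First pass: three objects are built, marked, and then dropped.
        foreach (var city in cities) city.Visited = true;
        counts.Add(created); // 3

        // Second pass: the lambda runs again and builds three new objects.
        foreach (var city in cities) { }
        counts.Add(created); // 6

        // ToList() runs the query once more and keeps the results.
        List<City> cityList = cities.ToList();
        counts.Add(created); // 9

        // Iterating the list no longer calls the lambda.
        foreach (var city in cityList) city.Visited = true;
        foreach (var city in cityList) { }
        counts.Add(created); // still 9

        return counts.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CountConstructions())); // 0, 3, 6, 9, 9
    }
}
```

The counter only moves while the generator is being enumerated. Once the results sit in a list, later loops reuse the same three objects.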

Even more subtle issues can occur when using conditions:

cityList[0].Visited = true;
var visitedCities = cityList.Where(city=>city.Visited);
Console.WriteLine( visitedCities.Count() ); // 1
...
cityList[0].Visited = false;
Console.WriteLine( visitedCities.Count() ); // 0

This is a clear case of a hidden dependency. We modified something inside cityList, but it “magically” affects the seemingly unrelated visitedCities. Hidden dependencies are evil, because they are, well, hidden, and people tend to forget about them.
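
One way to cut the dependency is to take a snapshot of the filter result at the moment you need it. The sketch below puts a live query and a snapshot side by side; City, SnapshotDemo and CountAfterChange are hypothetical names I made up for this standalone listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the City type used in the snippets above.
public class City
{
    public string Name { get; }
    public bool Visited { get; set; }
    public City(string name) { Name = name; }
}

public static class SnapshotDemo
{
    // Returns (live count, snapshot count) after the source has been modified.
    public static (int Live, int Snapshot) CountAfterChange()
    {
        var cityList = new List<City>
        {
            new City("Paris"), new City("Berlin"), new City("London")
        };
        cityList[0].Visited = true;

        // Live query: re-reads cityList every time it is enumerated.
        IEnumerable<City> liveVisited = cityList.Where(city => city.Visited);

        // Snapshot: the filter runs once, right here, and the result is stored.
        List<City> visitedSnapshot = liveVisited.ToList();

        cityList[0].Visited = false;

        return (liveVisited.Count(), visitedSnapshot.Count);
    }

    public static void Main()
    {
        var (live, snapshot) = CountAfterChange();
        Console.WriteLine(live);     // 0
        Console.WriteLine(snapshot); // 1
    }
}
```

Note that the snapshot freezes only the membership, not the objects themselves: the one City in visitedSnapshot is the same instance as cityList[0], so its Visited flag is now false as well.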

The most annoying part is that generators and enumerators look exactly the same, and it may be quite difficult, or even impossible, to tell whether a particular IEnumerable is a generator or an enumerator. Functional-style programming assumes read-only objects, so there it does not matter, but throwing in any modifiable state creates a dangerous mix.
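
The best I can offer is a rough heuristic: check whether the object also implements one of the collection interfaces. LooksMaterialized below is a hypothetical helper of mine, and it is only a hint. A type is free to implement these interfaces and still compute its items on demand, and a true result says nothing about whether someone else will modify the collection later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableChecks
{
    // Heuristic only: true when the sequence exposes a collection interface,
    // which usually (not always) means it is backed by stored items.
    public static bool LooksMaterialized<T>(IEnumerable<T> source)
    {
        return source is ICollection<T> || source is IReadOnlyCollection<T>;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };

        Console.WriteLine(LooksMaterialized(numbers));                   // True
        Console.WriteLine(LooksMaterialized(numbers.ToList()));          // True
        Console.WriteLine(LooksMaterialized(numbers.Where(n => n > 1))); // False
    }
}
```

When in doubt, I would rather call ToList() at the point where I need stable results than rely on a check like this.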

This does not mean one should not use IEnumerable in stateful scenarios, but it is better to be careful. You have been warned!
