TL;DR Python generator expressions can be iterated over only once.
```python
squares = (x*x for x in range(10))  # generator expression
print(list(squares))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(list(squares))  # []
```
I had mentally mapped Python generator expressions to C# IEnumerables, which can be iterated over multiple times. In reality, Python generator expressions behave differently: they can be iterated only once, and after that they silently stop yielding values, as the example above demonstrates.
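The reason for the one-shot behavior can be seen with a small experiment. This is only an illustrative sketch (the variable names are mine): a generator object is its own iterator, so every loop over it shares the same position, whereas a list hands out a fresh iterator each time.

```python
squares = (x * x for x in range(3))

# A generator is its own iterator: iter() hands back the same object,
# so every loop over it shares one position in the stream.
same_object = iter(squares) is squares
print(same_object)  # True

numbers = [0, 1, 4]
# A list returns a new, independent iterator on every iter() call.
fresh_each_time = iter(numbers) is not iter(numbers)
print(fresh_each_time)  # True

first_pass = list(squares)
second_pass = list(squares)  # the generator is already exhausted here
print(first_pass, second_pass)  # [0, 1, 4] []
```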
The Python documentation simply says that a generator expression “yields a new generator object”; it does not explicitly mention that the object can be iterated only once.
This Stack Overflow article specifically contrasts Python and C# generators.
It is possible to create multiple iterators over the same generator expression with itertools.tee(), but it buffers the output in memory, so every iterator replays the values from a single run of the generator. If you expect the generator to produce different values on each pass, tee() will not give you that. Also, tee() takes the number of iterators as an argument, so you have to decide up front how many you want rather than creating them on the fly.
```python
import itertools

squares = (x*x for x in range(10))
tees = itertools.tee(squares, 3)
for it in tees:  # "it" rather than "iter", to avoid shadowing the built-in
    print(list(it))
```
This prints:
```
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```
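To make the buffering visible, here is a small sketch with a source that records every value it actually produces. The `noisy_source` function and the `calls` list are hypothetical names used only for this illustration. The source runs once, and the second iterator is served from tee()'s buffer.

```python
import itertools

calls = []  # records each time the source generator actually produces a value

def noisy_source():
    # Hypothetical source, used only for this illustration.
    for n in range(3):
        calls.append(n)
        yield n * n

a, b = itertools.tee(noisy_source(), 2)

first = list(a)   # pulls from the source and fills tee's buffer
second = list(b)  # replayed from the buffer; the source is not run again

print(first)   # [0, 1, 4]
print(second)  # [0, 1, 4]
print(calls)   # [0, 1, 2]
```

If the source were very long, that buffer would hold every value `a` has consumed but `b` has not, which is where the memory cost comes from.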
In conclusion, one must be careful when using generator expressions in Python, since they can be iterated only once. In retrospect, I had some mysterious bugs related to this, which disappeared when I replaced generator expressions with lists.
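If a list is not an option, one way to get something closer to the C# behavior is an object whose `__iter__` builds a new generator on every call. The `Reiterable` class below is a hypothetical helper of my own naming, not a standard library type, and is only a minimal sketch.

```python
class Reiterable:
    """Hypothetical helper: wraps a zero-argument factory that builds a new generator."""

    def __init__(self, factory):
        self._factory = factory

    def __iter__(self):
        # Each iteration starts from a brand-new generator.
        return iter(self._factory())


squares = Reiterable(lambda: (x * x for x in range(5)))

first_pass = list(squares)
second_pass = list(squares)
print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # [0, 1, 4, 9, 16]
```

Unlike a list, this recomputes the values on every pass, so it trades memory for repeated work.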
On the other hand, accidental multiple enumerations could lead to bad performance. If there are many thousands of elements, it might be better to materialize them first.
Agreed. Dynamic enumerables have some surprises up their sleeve too.