• programming in python • a course for the curious •

Python generators

This is a simple introduction to generators in Python.

First let me explain what is going on when you do a for loop:

>>> for x in iterable_object:
...     print(x)

First, Python calls the iterable_object.__iter__ method to get the object over which it will iterate. Let me make a digression on how you can check this in the Python source code (yes, it is not difficult). First compile a simple loop and check the opcodes:

>>> loop = "for x in iter: pass"
>>> loop_code = compile(loop, '<string>', 'exec')
>>> from dis import dis
>>> dis(loop_code)  # disassemble the Python code object into single instructions (opcodes);
                    # each line of the listing below is a single opcode, which CPython then
                    # translates into C instructions (the exact listing depends on the Python version).
  1           0 SETUP_LOOP              14 (to 17)
              3 LOAD_NAME                0 (iter)
              6 GET_ITER
        >>    7 FOR_ITER                 6 (to 16)
             10 STORE_NAME               1 (x)
             13 JUMP_ABSOLUTE            7
        >>   16 POP_BLOCK
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE

Notice the GET_ITER opcode. If you open the Python/ceval.c file in the Python source code, you will see that it calls the PyObject_GetIter C function, which corresponds to the __iter__ method. OK, but this was a digression; I will come back to this later ...
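
The listing printed by dis changes from one CPython version to another, so reading it by eye is a bit fragile. Here is a minimal sketch that checks for the two opcodes programmatically. It assumes CPython 3.4 or later (where dis.get_instructions is available), and it uses the hypothetical name iterable instead of iter so as not to shadow the built-in:

```python
import dis

# Compile the same kind of loop as above; nothing is executed here,
# so the name ``iterable`` does not have to exist.
loop_code = compile("for x in iterable: pass", "<string>", "exec")

# Collect the opcode names instead of reading the printed listing.
opnames = [instr.opname for instr in dis.get_instructions(loop_code)]

print("GET_ITER" in opnames)
print("FOR_ITER" in opnames)
```

On the CPython versions this was written against, both lines should print True, even though the surrounding opcodes (SETUP_LOOP, POP_BLOCK and so on) may differ from the listing above.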

Python has many iterable objects (i.e. the ones which have an __iter__() method). The simplest examples are Python 2 str and unicode, or Python 3 bytes and str, but there are also others like tuple, list and dict. The bound method __iter__ does not accept any arguments and it is supposed to return an iterator object. The iterator object has a __next__ method which returns the successive x values until the iteration is exhausted. At that point it raises a StopIteration exception, which is caught internally by the for loop. There is, however, another way of getting iterator objects: using the yield keyword.
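
Before moving on to yield, it may help to see the protocol written out by hand. The sketch below is only an illustration: the class Countdown and the helper manual_for are hypothetical names, not part of Python, and manual_for is a rough approximation of what the for loop does, not its actual implementation.

```python
class Countdown:
    """Iterable and iterator in one: counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals that the iteration is exhausted.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def manual_for(iterable):
    """Roughly what ``for x in iterable`` does, written out by hand."""
    result = []
    iterator = iter(iterable)      # asks the object for its iterator
    while True:
        try:
            x = next(iterator)     # calls iterator.__next__()
        except StopIteration:
            break                  # the for loop swallows this exception
        result.append(x)
    return result


print(manual_for(Countdown(3)))    # [3, 2, 1]
print(list(Countdown(3)))          # [3, 2, 1], via the built-in machinery
```

A generator function saves you from writing such a class: the __iter__ and __next__ methods come for free.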

>>> def gen(n=5):
...     for x in range(n):  # in Python2 you'd use xrange
...         yield x
...
>>> for x in gen(3):
...     print(x, end=' ')  # Python 3; in Python 2 first do "from __future__ import print_function"
...
0 1 2 >>>

The function gen returns a generator object. It has an __iter__ method which returns the object itself, since generator objects can be iterated directly (a generator already has a __next__ method, among other special methods which you will see in a moment). So let us check it out:

>>> g = gen(2)
>>> g is g.__iter__()
True
>>> g.__next__()
0
>>> next(g)  # this is a wrapper around ``__next__``
1
>>> next(g)
Traceback (most recent call last):
  ...
StopIteration

More interestingly, generators have a send() method. It is used to send data back into the generator: it accepts a single argument, which is then sent to the generator. It turns out that yield is an expression (and when its value is used it should almost always be surrounded by parentheses), and expressions can be evaluated. Let us begin by noting that .send(None) is equivalent to __next__():

>>> g = gen(2)
>>> g.send(None)
0
>>> g.send(None)
1
>>> g.send(None)
Traceback (most recent call last):
  ...
StopIteration

The first time you call the send() method (unless you have already called __next__()) you have to pass None as the argument (or use __next__() instead). So let us check out how it works:

>>> def gen(n=5):
...   for x in range(n):
...     received = (yield x)
...     print('received={}'.format(repr(received)))
...
>>> g = gen(3)
>>> next(g)  # or g.send(None)
...          # advances the generator to the first yield statement
...          # and we get what was 'yielded':
0
>>> # now the generator is paused at a yield statement so we can send something to it
>>> g.send(object)
received=<class 'object'>
1

Actually two things happen when you use the send() method:

  • the (yield x) expression evaluates to what was passed as the argument of send(),
  • the generator then resumes just as if __next__() had been called: its code runs until the next yield statement is met, and the value yielded there is returned by send().
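
To see both steps at once, here is a small sketch with a hypothetical generator echo, which yields back whatever was sent to it. It also shows what happens if you ignore the rule about passing None on the first call:

```python
def echo():
    """Yield back whatever was sent in, starting with None."""
    received = None
    while True:
        # The generator pauses here. The value of the yield expression
        # is whatever the caller passes to send().
        received = (yield received)


# A just-started generator is not paused at a yield yet, so there is
# nothing that could receive a value: only None may be sent.
fresh = echo()
try:
    fresh.send("too early")
except TypeError as exc:
    print("TypeError:", exc)

g = echo()
first = next(g)         # run to the first yield; it yields None
second = g.send("a")    # (yield received) evaluates to "a", then "a" is yielded
third = g.send("b")
print(first, second, third)   # None a b
```

Each send() both delivers a value into the generator and brings back the next yielded value, which is why second and third mirror what was sent.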

There are two useful utility functions for generators:

>>> from functools import wraps
>>> def init(func):
...     """decorator which initialises the generator
...     """
...     @wraps(func)
...     def inner(*args, **kwargs):
...         g = func(*args, **kwargs)
...         g.send(None)
...         return g
...     return inner
...
>>> def consumer(gen, data):
...     """gen is an initialised generator
...     """
...     for d in data:
...         try:
...             gen.send(d)
...         except StopIteration:
...             break
...

Let us make a simple example:

>>> @init
... def gen(func):
...     x = (yield)
...     while True:
...         x = (yield func(x))
...

This is simply an implementation of the function func as a generator; maybe not very useful (since you can use the function by itself), but intriguing:

>>> g = gen(lambda x: x**2)
>>> g.send(2)
4
>>> g.send(9)
81
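
The consumer helper has not been used yet, so here is a minimal sketch of how the two utilities might work together. The init and consumer functions are repeated from above so that the block runs on its own; collect_first and its parameters limit and sink are hypothetical names made up for this illustration.

```python
from functools import wraps


def init(func):
    """Same decorator as above: advance the generator to its first yield."""
    @wraps(func)
    def inner(*args, **kwargs):
        g = func(*args, **kwargs)
        g.send(None)
        return g
    return inner


def consumer(gen, data):
    """Same helper as above: feed every item of data to gen."""
    for d in data:
        try:
            gen.send(d)
        except StopIteration:
            break


@init
def collect_first(limit, sink):
    """Hypothetical example: store at most `limit` received items in sink."""
    for _ in range(limit):
        item = (yield)
        sink.append(item)
    # Falling off the end raises StopIteration inside the caller's send().


seen = []
consumer(collect_first(3, seen), "abcdef")
print(seen)   # ['a', 'b', 'c']
```

Thanks to init there is no need to call next() before sending, and thanks to the try/except in consumer the generator can decide by itself when it has received enough.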

Python 3.3 also has yield from, which you can read about here. We also didn't touch the throw method of generators. There is a difference between yield from and looping over a generator when an exception occurs (yield from will print a more readable traceback). One more nice new feature in Python 3.3 is that a generator function can return a value: return with a value may now appear in the same function block as yield.
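
Just to give a taste of these three features, here is a minimal sketch, assuming Python 3.3 or later. The generator names inner and outer are hypothetical, and this is only one possible way to combine them:

```python
def inner():
    """Sub-generator: yields two values, then returns a result."""
    try:
        yield 1
        yield 2
    except ValueError:
        # An exception thrown into the outer generator arrives here.
        yield "recovered"
    return "inner done"     # return with a value inside a generator


def outer():
    # yield from delegates to inner(); the value returned by inner()
    # becomes the value of the yield from expression.
    result = yield from inner()
    yield result


print(list(outer()))        # [1, 2, 'inner done']

g = outer()
print(next(g))              # 1
print(g.throw(ValueError))  # recovered
print(next(g))              # inner done
```

Note that throw(), like send(), returns the next yielded value, and that yield from forwards the thrown exception to the sub-generator, which a plain for loop over inner() would not do.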

For a more serious application of generators see this article.

we are not associated with Python Software Foundation
© Copyright 2012, 2013 Accorda Institute.