1 ********************************
2 Functional Programming HOWTO
3 ********************************
5 :Author: A. M. Kuchling
8 (This is a first draft. Please send comments/error reports/suggestions to
11 In this document, we'll take a tour of Python's features suitable for
12 implementing programs in a functional style. After an introduction to the
13 concepts of functional programming, we'll look at language features such as
14 :term:`iterator`\s and :term:`generator`\s and relevant library modules such as
15 :mod:`itertools` and :mod:`functools`.
21 This section explains the basic concept of functional programming; if you're
22 just interested in learning about Python language features, skip to the next
25 Programming languages support decomposing problems in several different ways:
27 * Most programming languages are **procedural**: programs are lists of
28 instructions that tell the computer what to do with the program's input. C,
29 Pascal, and even Unix shells are procedural languages.
31 * In **declarative** languages, you write a specification that describes the
32 problem to be solved, and the language implementation figures out how to
33 perform the computation efficiently. SQL is the declarative language you're
34 most likely to be familiar with; a SQL query describes the data set you want
35 to retrieve, and the SQL engine decides whether to scan tables or use indexes,
36 which subclauses should be performed first, etc.
38 * **Object-oriented** programs manipulate collections of objects. Objects have
39 internal state and support methods that query or modify this internal state in
40 some way. Smalltalk and Java are object-oriented languages. C++ and Python
41 are languages that support object-oriented programming, but don't force the
42 use of object-oriented features.
44 * **Functional** programming decomposes a problem into a set of functions.
45 Ideally, functions only take inputs and produce outputs, and don't have any
46 internal state that affects the output produced for a given input. Well-known
47 functional languages include the ML family (Standard ML, OCaml, and other
48 variants) and Haskell.
50 The designers of some computer languages choose to emphasize one
51 particular approach to programming. This often makes it difficult to
52 write programs that use a different approach. Other languages are
53 multi-paradigm languages that support several different approaches.
54 Lisp, C++, and Python are multi-paradigm; you can write programs or
55 libraries that are largely procedural, object-oriented, or functional
56 in all of these languages. In a large program, different sections
57 might be written using different approaches; the GUI might be
58 object-oriented while the processing logic is procedural or
59 functional, for example.
61 In a functional program, input flows through a set of functions. Each function
62 operates on its input and produces some output. Functional style discourages
63 functions with side effects that modify internal state or make other changes
64 that aren't visible in the function's return value. Functions that have no side
65 effects at all are called **purely functional**. Avoiding side effects means
66 not using data structures that get updated as a program runs; every function's
67 output must only depend on its input.
69 Some languages are very strict about purity and don't even have assignment
70 statements such as ``a=3`` or ``c = a + b``, but it's difficult to avoid all
71 side effects. Printing to the screen or writing to a disk file are side
72 effects, for example. For example, in Python a ``print`` statement or a
73 ``time.sleep(1)`` both return no useful value; they're only called for their
74 side effects of sending some text to the screen or pausing execution for a
77 Python programs written in functional style usually won't go to the extreme of
78 avoiding all I/O or all assignments; instead, they'll provide a
79 functional-appearing interface but will use non-functional features internally.
80 For example, the implementation of a function will still use assignments to
81 local variables, but won't modify global variables or have other side effects.
83 Functional programming can be considered the opposite of object-oriented
84 programming. Objects are little capsules containing some internal state along
85 with a collection of method calls that let you modify this state, and programs
86 consist of making the right set of state changes. Functional programming wants
87 to avoid state changes as much as possible and works with data flowing between
88 functions. In Python you might combine the two approaches by writing functions
89 that take and return instances representing objects in your application (e-mail
90 messages, transactions, etc.).
92 Functional design may seem like an odd constraint to work under. Why should you
93 avoid objects and side effects? There are theoretical and practical advantages
94 to the functional style:
99 * Ease of debugging and testing.
105 A theoretical benefit is that it's easier to construct a mathematical proof that
106 a functional program is correct.
108 For a long time researchers have been interested in finding ways to
109 mathematically prove programs correct. This is different from testing a program
110 on numerous inputs and concluding that its output is usually correct, or reading
111 a program's source code and concluding that the code looks right; the goal is
112 instead a rigorous proof that a program produces the right result for all
115 The technique used to prove programs correct is to write down **invariants**,
116 properties of the input data and of the program's variables that are always
117 true. For each line of code, you then show that if invariants X and Y are true
118 **before** the line is executed, the slightly different invariants X' and Y' are
119 true **after** the line is executed. This continues until you reach the end of
120 the program, at which point the invariants should match the desired conditions
121 on the program's output.
123 Functional programming's avoidance of assignments arose because assignments are
124 difficult to handle with this technique; assignments can break invariants that
125 were true before the assignment without producing any new invariants that can be
128 Unfortunately, proving programs correct is largely impractical and not relevant
129 to Python software. Even trivial programs require proofs that are several pages
130 long; the proof of correctness for a moderately complicated program would be
131 enormous, and few or none of the programs you use daily (the Python interpreter,
132 your XML parser, your web browser) could be proven correct. Even if you wrote
133 down or generated a proof, there would then be the question of verifying the
134 proof; maybe there's an error in it, and you wrongly believe you've proved the
141 A more practical benefit of functional programming is that it forces you to
142 break apart your problem into small pieces. Programs are more modular as a
143 result. It's easier to specify and write a small function that does one thing
144 than a large function that performs a complicated transformation. Small
145 functions are also easier to read and to check for errors.
148 Ease of debugging and testing
149 -----------------------------
151 Testing and debugging a functional-style program is easier.
153 Debugging is simplified because functions are generally small and clearly
154 specified. When a program doesn't work, each function is an interface point
155 where you can check that the data are correct. You can look at the intermediate
156 inputs and outputs to quickly isolate the function that's responsible for a bug.
158 Testing is easier because each function is a potential subject for a unit test.
159 Functions don't depend on system state that needs to be replicated before
160 running a test; instead you only have to synthesize the right input and then
161 check that the output matches expectations.
167 As you work on a functional-style program, you'll write a number of functions
168 with varying inputs and outputs. Some of these functions will be unavoidably
169 specialized to a particular application, but others will be useful in a wide
170 variety of programs. For example, a function that takes a directory path and
171 returns all the XML files in the directory, or a function that takes a filename
172 and returns its contents, can be applied to many different situations.
174 Over time you'll form a personal library of utilities. Often you'll assemble
175 new programs by arranging existing functions in a new configuration and writing
176 a few functions specialized for the current task.
182 I'll start by looking at a Python language feature that's an important
183 foundation for writing functional-style programs: iterators.
185 An iterator is an object representing a stream of data; this object returns the
186 data one element at a time. A Python iterator must support a method called
187 ``next()`` that takes no arguments and always returns the next element of the
188 stream. If there are no more elements in the stream, ``next()`` must raise the
189 ``StopIteration`` exception. Iterators don't have to be finite, though; it's
190 perfectly reasonable to write an iterator that produces an infinite stream of
193 The built-in :func:`iter` function takes an arbitrary object and tries to return
194 an iterator that will return the object's contents or elements, raising
195 :exc:`TypeError` if the object doesn't support iteration. Several of Python's
196 built-in data types support iteration, the most common being lists and
197 dictionaries. An object is called an **iterable** object if you can get an
200 You can experiment with the iteration interface manually:
205 <...iterator object at ...>
213 Traceback (most recent call last):
214 File "<stdin>", line 1, in ?
218 Python expects iterable objects in several different contexts, the most
219 important being the ``for`` statement. In the statement ``for X in Y``, Y must
220 be an iterator or some object for which ``iter()`` can create an iterator.
221 These two statements are equivalent::
229 Iterators can be materialized as lists or tuples by using the :func:`list` or
230 :func:`tuple` constructor functions:
233 >>> iterator = iter(L)
234 >>> t = tuple(iterator)
238 Sequence unpacking also supports iterators: if you know an iterator will return
239 N elements, you can unpack them into an N-tuple:
242 >>> iterator = iter(L)
247 Built-in functions such as :func:`max` and :func:`min` can take a single
248 iterator argument and will return the largest or smallest element. The ``"in"``
249 and ``"not in"`` operators also support iterators: ``X in iterator`` is true if
250 X is found in the stream returned by the iterator. You'll run into obvious
251 problems if the iterator is infinite; ``max()``, ``min()``, and ``"not in"``
252 will never return, and if the element X never appears in the stream, the
253 ``"in"`` operator won't return either.
255 Note that you can only go forward in an iterator; there's no way to get the
256 previous element, reset the iterator, or make a copy of it. Iterator objects
257 can optionally provide these additional capabilities, but the iterator protocol
258 only specifies the ``next()`` method. Functions may therefore consume all of
259 the iterator's output, and if you need to do something different with the same
260 stream, you'll have to create a new iterator.
264 Data Types That Support Iterators
265 ---------------------------------
267 We've already seen how lists and tuples support iterators. In fact, any Python
268 sequence type, such as strings, will automatically support creation of an
271 Calling :func:`iter` on a dictionary returns an iterator that will loop over the
274 .. not a doctest since dict ordering varies across Pythons
278 >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
279 ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
281 ... print key, m[key]
295 Note that the order is essentially random, because it's based on the hash
296 ordering of the objects in the dictionary.
298 Applying ``iter()`` to a dictionary always loops over the keys, but dictionaries
299 have methods that return other iterators. If you want to iterate over keys,
300 values, or key/value pairs, you can explicitly call the ``iterkeys()``,
301 ``itervalues()``, or ``iteritems()`` methods to get an appropriate iterator.
303 The :func:`dict` constructor can accept an iterator that returns a finite stream
304 of ``(key, value)`` tuples:
306 >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
308 {'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'}
310 Files also support iteration by calling the ``readline()`` method until there
311 are no more lines in the file. This means you can read each line of a file like
315 # do something for each line
318 Sets can take their contents from an iterable and let you iterate over the set's
321 S = set((2, 3, 5, 7, 11, 13))
327 Generator expressions and list comprehensions
328 =============================================
330 Two common operations on an iterator's output are 1) performing some operation
331 for every element, 2) selecting a subset of elements that meet some condition.
332 For example, given a list of strings, you might want to strip off trailing
333 whitespace from each line or extract all the strings containing a given
336 List comprehensions and generator expressions (short form: "listcomps" and
337 "genexps") are a concise notation for such operations, borrowed from the
338 functional programming language Haskell (http://www.haskell.org). You can strip
339 all the whitespace from a stream of strings with the following code::
341 line_list = [' line 1\n', 'line 2 \n', ...]
343 # Generator expression -- returns iterator
344 stripped_iter = (line.strip() for line in line_list)
346 # List comprehension -- returns list
347 stripped_list = [line.strip() for line in line_list]
349 You can select only certain elements by adding an ``"if"`` condition::
351 stripped_list = [line.strip() for line in line_list
354 With a list comprehension, you get back a Python list; ``stripped_list`` is a
355 list containing the resulting lines, not an iterator. Generator expressions
356 return an iterator that computes the values as necessary, not needing to
357 materialize all the values at once. This means that list comprehensions aren't
358 useful if you're working with iterators that return an infinite stream or a very
359 large amount of data. Generator expressions are preferable in these situations.
361 Generator expressions are surrounded by parentheses ("()") and list
362 comprehensions are surrounded by square brackets ("[]"). Generator expressions
365 ( expression for expr in sequence1
367 for expr2 in sequence2
369 for expr3 in sequence3 ...
371 for exprN in sequenceN
374 Again, for a list comprehension only the outside brackets are different (square
375 brackets instead of parentheses).
377 The elements of the generated output will be the successive values of
378 ``expression``. The ``if`` clauses are all optional; if present, ``expression``
379 is only evaluated and added to the result when ``condition`` is true.
381 Generator expressions always have to be written inside parentheses, but the
382 parentheses signalling a function call also count. If you want to create an
383 iterator that will be immediately passed to a function you can write::
385 obj_total = sum(obj.count for obj in list_all_objects())
387 The ``for...in`` clauses contain the sequences to be iterated over. The
388 sequences do not have to be the same length, because they are iterated over from
389 left to right, **not** in parallel. For each element in ``sequence1``,
390 ``sequence2`` is looped over from the beginning. ``sequence3`` is then looped
391 over for each resulting pair of elements from ``sequence1`` and ``sequence2``.
393 To put it another way, a list comprehension or generator expression is
394 equivalent to the following Python code::
396 for expr1 in sequence1:
398 continue # Skip this element
399 for expr2 in sequence2:
401 continue # Skip this element
403 for exprN in sequenceN:
405 continue # Skip this element
407 # Output the value of
410 This means that when there are multiple ``for...in`` clauses but no ``if``
411 clauses, the length of the resulting output will be equal to the product of the
412 lengths of all the sequences. If you have two lists of length 3, the output
413 list is 9 elements long:
416 :options: +NORMALIZE_WHITESPACE
420 >>> [(x,y) for x in seq1 for y in seq2]
421 [('a', 1), ('a', 2), ('a', 3),
422 ('b', 1), ('b', 2), ('b', 3),
423 ('c', 1), ('c', 2), ('c', 3)]
425 To avoid introducing an ambiguity into Python's grammar, if ``expression`` is
426 creating a tuple, it must be surrounded with parentheses. The first list
427 comprehension below is a syntax error, while the second one is correct::
430 [ x,y for x in seq1 for y in seq2]
432 [ (x,y) for x in seq1 for y in seq2]
438 Generators are a special class of functions that simplify the task of writing
439 iterators. Regular functions compute a value and return it, but generators
440 return an iterator that returns a stream of values.
442 You're doubtless familiar with how regular function calls work in Python or C.
443 When you call a function, it gets a private namespace where its local variables
444 are created. When the function reaches a ``return`` statement, the local
445 variables are destroyed and the value is returned to the caller. A later call
446 to the same function creates a new private namespace and a fresh set of local
447 variables. But, what if the local variables weren't thrown away on exiting a
448 function? What if you could later resume the function where it left off? This
449 is what generators provide; they can be thought of as resumable functions.
451 Here's the simplest example of a generator function:
455 def generate_ints(N):
459 Any function containing a ``yield`` keyword is a generator function; this is
460 detected by Python's :term:`bytecode` compiler which compiles the function
461 specially as a result.
463 When you call a generator function, it doesn't return a single value; instead it
464 returns a generator object that supports the iterator protocol. On executing
465 the ``yield`` expression, the generator outputs the value of ``i``, similar to a
466 ``return`` statement. The big difference between ``yield`` and a ``return``
467 statement is that on reaching a ``yield`` the generator's state of execution is
468 suspended and local variables are preserved. On the next call to the
469 generator's ``.next()`` method, the function will resume executing.
471 Here's a sample usage of the ``generate_ints()`` generator:
473 >>> gen = generate_ints(3)
475 <generator object generate_ints at ...>
483 Traceback (most recent call last):
484 File "stdin", line 1, in ?
485 File "stdin", line 2, in generate_ints
488 You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
491 Inside a generator function, the ``return`` statement can only be used without a
492 value, and signals the end of the procession of values; after executing a
493 ``return`` the generator cannot return any further values. ``return`` with a
494 value, such as ``return 5``, is a syntax error inside a generator function. The
495 end of the generator's results can also be indicated by raising
496 ``StopIteration`` manually, or by just letting the flow of execution fall off
497 the bottom of the function.
499 You could achieve the effect of generators manually by writing your own class
500 and storing all the local variables of the generator as instance variables. For
501 example, returning a list of integers could be done by setting ``self.count`` to
502 0, and having the ``next()`` method increment ``self.count`` and return it.
503 However, for a moderately complicated generator, writing a corresponding class
506 The test suite included with Python's library, ``test_generators.py``, contains
507 a number of more interesting examples. Here's one generator that implements an
508 in-order traversal of a tree using generators recursively. ::
510 # A recursive generator that generates Tree leaves in in-order.
513 for x in inorder(t.left):
518 for x in inorder(t.right):
521 Two other examples in ``test_generators.py`` produce solutions for the N-Queens
522 problem (placing N queens on an NxN chess board so that no queen threatens
523 another) and the Knight's Tour (finding a route that takes a knight to every
524 square of an NxN chessboard without visiting any square twice).
528 Passing values into a generator
529 -------------------------------
531 In Python 2.4 and earlier, generators only produced output. Once a generator's
532 code was invoked to create an iterator, there was no way to pass any new
533 information into the function when its execution is resumed. You could hack
534 together this ability by making the generator look at a global variable or by
535 passing in some mutable object that callers then modify, but these approaches
538 In Python 2.5 there's a simple way to pass values into a generator.
539 :keyword:`yield` became an expression, returning a value that can be assigned to
540 a variable or otherwise operated on::
544 I recommend that you **always** put parentheses around a ``yield`` expression
545 when you're doing something with the returned value, as in the above example.
546 The parentheses aren't always necessary, but it's easier to always add them
547 instead of having to remember when they're needed.
549 (PEP 342 explains the exact rules, which are that a ``yield``-expression must
550 always be parenthesized except when it occurs at the top-level expression on the
551 right-hand side of an assignment. This means you can write ``val = yield i``
552 but have to use parentheses when there's an operation, as in ``val = (yield i)
555 Values are sent into a generator by calling its ``send(value)`` method. This
556 method resumes the generator's code and the ``yield`` expression returns the
557 specified value. If the regular ``next()`` method is called, the ``yield``
560 Here's a simple counter that increments by 1 and allows changing the value of
561 the internal counter.
565 def counter (maximum):
569 # If value provided, change counter
575 And here's an example of changing the counter:
587 Traceback (most recent call last):
588 File "t.py", line 15, in ?
592 Because ``yield`` will often be returning ``None``, you should always check for
593 this case. Don't just use its value in expressions unless you're sure that the
594 ``send()`` method will be the only method used resume your generator function.
596 In addition to ``send()``, there are two other new methods on generators:
598 * ``throw(type, value=None, traceback=None)`` is used to raise an exception
599 inside the generator; the exception is raised by the ``yield`` expression
600 where the generator's execution is paused.
602 * ``close()`` raises a :exc:`GeneratorExit` exception inside the generator to
603 terminate the iteration. On receiving this exception, the generator's code
604 must either raise :exc:`GeneratorExit` or :exc:`StopIteration`; catching the
605 exception and doing anything else is illegal and will trigger a
606 :exc:`RuntimeError`. ``close()`` will also be called by Python's garbage
607 collector when the generator is garbage-collected.
609 If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
610 using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
612 The cumulative effect of these changes is to turn generators from one-way
613 producers of information into both producers and consumers.
615 Generators also become **coroutines**, a more generalized form of subroutines.
616 Subroutines are entered at one point and exited at another point (the top of the
617 function, and a ``return`` statement), but coroutines can be entered, exited,
618 and resumed at many different points (the ``yield`` statements).
624 Let's look in more detail at built-in functions often used with iterators.
626 Two of Python's built-in functions, :func:`map` and :func:`filter`, are somewhat
627 obsolete; they duplicate the features of list comprehensions but return actual
628 lists instead of iterators.
630 ``map(f, iterA, iterB, ...)`` returns a list containing ``f(iterA[0], iterB[0]),
631 f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
636 >>> map(upper, ['sentence', 'fragment'])
637 ['SENTENCE', 'FRAGMENT']
639 >>> [upper(s) for s in ['sentence', 'fragment']]
640 ['SENTENCE', 'FRAGMENT']
642 As shown above, you can achieve the same effect with a list comprehension. The
643 :func:`itertools.imap` function does the same thing but can handle infinite
644 iterators; it'll be discussed later, in the section on the :mod:`itertools` module.
646 ``filter(predicate, iter)`` returns a list that contains all the sequence
647 elements that meet a certain condition, and is similarly duplicated by list
648 comprehensions. A **predicate** is a function that returns the truth value of
649 some condition; for use with :func:`filter`, the predicate must take a single
653 ... return (x % 2) == 0
655 >>> filter(is_even, range(10))
658 This can also be written as a list comprehension:
660 >>> [x for x in range(10) if is_even(x)]
663 :func:`filter` also has a counterpart in the :mod:`itertools` module,
664 :func:`itertools.ifilter`, that returns an iterator and can therefore handle
665 infinite sequences just as :func:`itertools.imap` can.
667 ``reduce(func, iter, [initial_value])`` doesn't have a counterpart in the
668 :mod:`itertools` module because it cumulatively performs an operation on all the
669 iterable's elements and therefore can't be applied to infinite iterables.
670 ``func`` must be a function that takes two elements and returns a single value.
671 :func:`reduce` takes the first two elements A and B returned by the iterator and
672 calculates ``func(A, B)``. It then requests the third element, C, calculates
673 ``func(func(A, B), C)``, combines this result with the fourth element returned,
674 and continues until the iterable is exhausted. If the iterable returns no
675 values at all, a :exc:`TypeError` exception is raised. If the initial value is
676 supplied, it's used as a starting point and ``func(initial_value, A)`` is the
680 >>> reduce(operator.concat, ['A', 'BB', 'C'])
682 >>> reduce(operator.concat, [])
683 Traceback (most recent call last):
685 TypeError: reduce() of empty sequence with no initial value
686 >>> reduce(operator.mul, [1,2,3], 1)
688 >>> reduce(operator.mul, [], 1)
691 If you use :func:`operator.add` with :func:`reduce`, you'll add up all the
692 elements of the iterable. This case is so common that there's a special
693 built-in called :func:`sum` to compute it:
695 >>> reduce(operator.add, [1,2,3,4], 0)
702 For many uses of :func:`reduce`, though, it can be clearer to just write the
703 obvious :keyword:`for` loop::
706 product = reduce(operator.mul, [1,2,3], 1)
714 ``enumerate(iter)`` counts off the elements in the iterable, returning 2-tuples
715 containing the count and each element.
717 >>> for item in enumerate(['subject', 'verb', 'object']):
723 :func:`enumerate` is often used when looping through a list and recording the
724 indexes at which certain conditions are met::
726 f = open('data.txt', 'r')
727 for i, line in enumerate(f):
728 if line.strip() == '':
729 print 'Blank line at line #%i' % i
731 ``sorted(iterable, [cmp=None], [key=None], [reverse=False])`` collects all the
732 elements of the iterable into a list, sorts the list, and returns the sorted
733 result. The ``cmp``, ``key``, and ``reverse`` arguments are passed through to
734 the constructed list's ``.sort()`` method. ::
737 >>> # Generate 8 random numbers between [0, 10000)
738 >>> rand_list = random.sample(range(10000), 8)
740 [769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
741 >>> sorted(rand_list)
742 [769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
743 >>> sorted(rand_list, reverse=True)
744 [9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
746 (For a more detailed discussion of sorting, see the Sorting mini-HOWTO in the
747 Python wiki at http://wiki.python.org/moin/HowTo/Sorting.)
749 The ``any(iter)`` and ``all(iter)`` built-ins look at the truth values of an
750 iterable's contents. :func:`any` returns True if any element in the iterable is
751 a true value, and :func:`all` returns True if all of the elements are true
768 Small functions and the lambda expression
769 =========================================
771 When writing functional-style programs, you'll often need little functions that
772 act as predicates or that combine elements in some way.
774 If there's a Python built-in or a module function that's suitable, you don't
775 need to define a new function at all::
777 stripped_lines = [line.strip() for line in lines]
778 existing_files = filter(os.path.exists, file_list)
780 If the function you need doesn't exist, you need to write it. One way to write
781 small functions is to use the ``lambda`` statement. ``lambda`` takes a number
782 of parameters and an expression combining these parameters, and creates a small
783 function that returns the value of the expression::
785 lowercase = lambda x: x.lower()
787 print_assign = lambda name, value: name + '=' + str(value)
789 adder = lambda x, y: x+y
791 An alternative is to just use the ``def`` statement and define a function in the
797 def print_assign(name, value):
798 return name + '=' + str(value)
803 Which alternative is preferable? That's a style question; my usual course is to
804 avoid using ``lambda``.
806 One reason for my preference is that ``lambda`` is quite limited in the
807 functions it can define. The result has to be computable as a single
808 expression, which means you can't have multiway ``if... elif... else``
809 comparisons or ``try... except`` statements. If you try to do too much in a
810 ``lambda`` statement, you'll end up with an overly complicated expression that's
811 hard to read. Quick, what's the following code doing?
815 total = reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
817 You can figure it out, but it takes time to disentangle the expression to figure
818 out what's going on. Using a short nested ``def`` statements makes things a
822 return 0, a[1] + b[1]
824 total = reduce(combine, items)[1]
826 But it would be best of all if I had simply used a ``for`` loop::
832 Or the :func:`sum` built-in and a generator expression::
834 total = sum(b for a,b in items)
836 Many uses of :func:`reduce` are clearer when written as ``for`` loops.
838 Fredrik Lundh once suggested the following set of rules for refactoring uses of
841 1) Write a lambda function.
842 2) Write a comment explaining what the heck that lambda does.
843 3) Study the comment for a while, and think of a name that captures the essence
845 4) Convert the lambda to a def statement, using that name.
846 5) Remove the comment.
848 I really like these rules, but you're free to disagree
849 about whether this lambda-free style is better.
855 The :mod:`itertools` module contains a number of commonly-used iterators as well
856 as functions for combining several iterators. This section will introduce the
857 module's contents by showing small examples.
859 The module's functions fall into a few broad classes:
861 * Functions that create a new iterator based on an existing iterator.
862 * Functions for treating an iterator's elements as function arguments.
863 * Functions for selecting portions of an iterator's output.
864 * A function for grouping an iterator's output.
866 Creating new iterators
867 ----------------------
869 ``itertools.count(n)`` returns an infinite stream of integers, increasing by 1
870 each time. You can optionally supply the starting number, which defaults to 0::
873 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
874 itertools.count(10) =>
875 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
877 ``itertools.cycle(iter)`` saves a copy of the contents of a provided iterable
878 and returns a new iterator that returns its elements from first to last. The
879 new iterator will repeat these elements infinitely. ::
881 itertools.cycle([1,2,3,4,5]) =>
882 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
884 ``itertools.repeat(elem, [n])`` returns the provided element ``n`` times, or
885 returns the element endlessly if ``n`` is not provided. ::
887 itertools.repeat('abc') =>
888 abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
889 itertools.repeat('abc', 5) =>
890 abc, abc, abc, abc, abc
892 ``itertools.chain(iterA, iterB, ...)`` takes an arbitrary number of iterables as
893 input, and returns all the elements of the first iterator, then all the elements
894 of the second, and so on, until all of the iterables have been exhausted. ::
896 itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
899 ``itertools.izip(iterA, iterB, ...)`` takes one element from each iterable and
900 returns them in a tuple::
902 itertools.izip(['a', 'b', 'c'], (1, 2, 3)) =>
903 ('a', 1), ('b', 2), ('c', 3)
905 It's similar to the built-in :func:`zip` function, but doesn't construct an
906 in-memory list and exhaust all the input iterators before returning; instead
907 tuples are constructed and returned only if they're requested. (The technical
908 term for this behaviour is `lazy evaluation
909 <http://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
911 This iterator is intended to be used with iterables that are all of the same
912 length. If the iterables are of different lengths, the resulting stream will be
913 the same length as the shortest iterable. ::
915 itertools.izip(['a', 'b'], (1, 2, 3)) =>
918 You should avoid doing this, though, because an element may be taken from the
919 longer iterators and discarded. This means you can't go on to use the iterators
920 further because you risk skipping a discarded element.
922 ``itertools.islice(iter, [start], stop, [step])`` returns a stream that's a
923 slice of the iterator. With a single ``stop`` argument, it will return the
924 first ``stop`` elements. If you supply a starting index, you'll get
925 ``stop-start`` elements, and if you supply a value for ``step``, elements will
926 be skipped accordingly. Unlike Python's string and list slicing, you can't use
927 negative values for ``start``, ``stop``, or ``step``. ::
929 itertools.islice(range(10), 8) =>
930 0, 1, 2, 3, 4, 5, 6, 7
931 itertools.islice(range(10), 2, 8) =>
933 itertools.islice(range(10), 2, 8, 2) =>
936 ``itertools.tee(iter, [n])`` replicates an iterator; it returns ``n``
937 independent iterators that will all return the contents of the source iterator.
938 If you don't supply a value for ``n``, the default is 2. Replicating iterators
939 requires saving some of the contents of the source iterator, so this can consume
940 significant memory if the iterator is large and one of the new iterators is
941 consumed more than the others. ::
943 itertools.tee( itertools.count() ) =>
947 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
950 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
953 Calling functions on elements
954 -----------------------------
956 Two functions are used for calling other functions on the contents of an
959 ``itertools.imap(f, iterA, iterB, ...)`` returns a stream containing
960 ``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``::
962 itertools.imap(operator.add, [5, 6, 5], [1, 2, 3]) =>
965 The ``operator`` module contains a set of functions corresponding to Python's
966 operators. Some examples are ``operator.add(a, b)`` (adds two values),
967 ``operator.ne(a, b)`` (same as ``a!=b``), and ``operator.attrgetter('id')``
968 (returns a callable that fetches the ``"id"`` attribute).
970 ``itertools.starmap(func, iter)`` assumes that the iterable will return a stream
971 of tuples, and calls ``f()`` using these tuples as the arguments::
973 itertools.starmap(os.path.join,
974 [('/usr', 'bin', 'java'), ('/bin', 'python'),
975 ('/usr', 'bin', 'perl'),('/usr', 'bin', 'ruby')])
977 /usr/bin/java, /bin/python, /usr/bin/perl, /usr/bin/ruby
983 Another group of functions chooses a subset of an iterator's elements based on a
986 ``itertools.ifilter(predicate, iter)`` returns all the elements for which the
987 predicate returns true::
992 itertools.ifilter(is_even, itertools.count()) =>
993 0, 2, 4, 6, 8, 10, 12, 14, ...
995 ``itertools.ifilterfalse(predicate, iter)`` is the opposite, returning all
996 elements for which the predicate returns false::
998 itertools.ifilterfalse(is_even, itertools.count()) =>
999 1, 3, 5, 7, 9, 11, 13, 15, ...
1001 ``itertools.takewhile(predicate, iter)`` returns elements for as long as the
1002 predicate returns true. Once the predicate returns false, the iterator will
1003 signal the end of its results.
1007 def less_than_10(x):
1010 itertools.takewhile(less_than_10, itertools.count()) =>
1011 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
1013 itertools.takewhile(is_even, itertools.count()) =>
1016 ``itertools.dropwhile(predicate, iter)`` discards elements while the predicate
1017 returns true, and then returns the rest of the iterable's results.
1021 itertools.dropwhile(less_than_10, itertools.count()) =>
1022 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
1024 itertools.dropwhile(is_even, itertools.count()) =>
1025 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
1031 The last function I'll discuss, ``itertools.groupby(iter, key_func=None)``, is
1032 the most complicated. ``key_func(elem)`` is a function that can compute a key
1033 value for each element returned by the iterable. If you don't supply a key
1034 function, the key is simply each element itself.
1036 ``groupby()`` collects all the consecutive elements from the underlying iterable
1037 that have the same key value, and returns a stream of 2-tuples containing a key
1038 value and an iterator for the elements with that key.
1042 city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
1043 ('Anchorage', 'AK'), ('Nome', 'AK'),
1044 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
1048 def get_state ((city, state)):
1051 itertools.groupby(city_list, get_state) =>
1054 ('AZ', iterator-3), ...
1058 ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
1060 ('Anchorage', 'AK'), ('Nome', 'AK')
1062 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
1064 ``groupby()`` assumes that the underlying iterable's contents will already be
1065 sorted based on the key. Note that the returned iterators also use the
1066 underlying iterable, so you have to consume the results of iterator-1 before
1067 requesting iterator-2 and its corresponding key.
1070 The functools module
1071 ====================
1073 The :mod:`functools` module in Python 2.5 contains some higher-order functions.
1074 A **higher-order function** takes one or more functions as input and returns a
1075 new function. The most useful tool in this module is the
1076 :func:`functools.partial` function.
1078 For programs written in a functional style, you'll sometimes want to construct
1079 variants of existing functions that have some of the parameters filled in.
1080 Consider a Python function ``f(a, b, c)``; you may wish to create a new function
1081 ``g(b, c)`` that's equivalent to ``f(1, b, c)``; you're filling in a value for
1082 one of ``f()``'s parameters. This is called "partial function application".
1084 The constructor for ``partial`` takes the arguments ``(function, arg1, arg2,
1085 ... kwarg1=value1, kwarg2=value2)``. The resulting object is callable, so you
1086 can just call it to invoke ``function`` with the filled-in arguments.
1088 Here's a small but realistic example::
1092 def log (message, subsystem):
1093 "Write the contents of 'message' to the specified subsystem."
1094 print '%s: %s' % (subsystem, message)
1097 server_log = functools.partial(log, subsystem='server')
1098 server_log('Unable to open socket')
1104 The :mod:`operator` module was mentioned earlier. It contains a set of
1105 functions corresponding to Python's operators. These functions are often useful
1106 in functional-style code because they save you from writing trivial functions
1107 that perform a single operation.
1109 Some of the functions in this module are:
1111 * Math operations: ``add()``, ``sub()``, ``mul()``, ``div()``, ``floordiv()``,
1113 * Logical operations: ``not_()``, ``truth()``.
1114 * Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
1115 * Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
1116 * Object identity: ``is_()``, ``is_not()``.
1118 Consult the operator module's documentation for a complete list.
1122 The functional module
1123 ---------------------
1125 Collin Winter's `functional module <http://oakwinter.com/code/functional/>`__
1126 provides a number of more advanced tools for functional programming. It also
1127 reimplements several Python built-ins, trying to make them more intuitive to
1128 those used to functional programming in other languages.
1130 This section contains an introduction to some of the most important functions in
1131 ``functional``; full documentation can be found at `the project's website
1132 <http://oakwinter.com/code/functional/documentation/>`__.
1134 ``compose(outer, inner, unpack=False)``
1136 The ``compose()`` function implements function composition. In other words, it
1137 returns a wrapper around the ``outer`` and ``inner`` callables, such that the
1138 return value from ``inner`` is fed directly to ``outer``. That is, ::
1146 >>> compose(double, add)(5, 6)
1151 >>> double(add(5, 6))
1154 The ``unpack`` keyword is provided to work around the fact that Python functions
1155 are not always `fully curried <http://en.wikipedia.org/wiki/Currying>`__. By
1156 default, it is expected that the ``inner`` function will return a single object
1157 and that the ``outer`` function will take a single argument. Setting the
1158 ``unpack`` argument causes ``compose`` to expect a tuple from ``inner`` which
1159 will be expanded before being passed to ``outer``. Put simply, ::
1169 compose(f, g, unpack=True)(5, 6)
1175 Even though ``compose()`` only accepts two functions, it's trivial to build up a
1176 version that will compose any number of functions. We'll use ``reduce()``,
1177 ``compose()`` and ``partial()`` (the last of which is provided by both
1178 ``functional`` and ``functools``). ::
1180 from functional import compose, partial
1182 multi_compose = partial(reduce, compose)
1185 We can also use ``map()``, ``compose()`` and ``partial()`` to craft a version of
1186 ``"".join(...)`` that converts its arguments to string::
1188 from functional import compose, partial
1190 join = compose("".join, partial(map, str))
1195 ``flip()`` wraps the callable in ``func`` and causes it to receive its
1196 non-keyword arguments in reverse order. ::
1198 >>> def triple(a, b, c):
1199 ... return (a, b, c)
1204 >>> flipped_triple = flip(triple)
1205 >>> flipped_triple(5, 6, 7)
1208 ``foldl(func, start, iterable)``
1210 ``foldl()`` takes a binary function, a starting value (usually some kind of
1211 'zero'), and an iterable. The function is applied to the starting value and the
1212 first element of the list, then the result of that and the second element of the
1213 list, then the result of that and the third element of the list, and so on.
1215 This means that a call such as::
1217 foldl(f, 0, [1, 2, 3])
1224 ``foldl()`` is roughly equivalent to the following recursive function::
1226 def foldl(func, start, seq):
1230 return foldl(func, func(start, seq[0]), seq[1:])
1232 Speaking of equivalence, the above ``foldl`` call can be expressed in terms of
1233 the built-in ``reduce`` like so::
1235 reduce(f, [1, 2, 3], 0)
1238 We can use ``foldl()``, ``operator.concat()`` and ``partial()`` to write a
1239 cleaner, more aesthetically-pleasing version of Python's ``"".join(...)``
1242 from functional import foldl, partial from operator import concat
1244 join = partial(foldl, concat, "")
1247 Revision History and Acknowledgements
1248 =====================================
1250 The author would like to thank the following people for offering suggestions,
1251 corrections and assistance with various drafts of this article: Ian Bicking,
1252 Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro
1253 Lameiro, Jussi Salmela, Collin Winter, Blake Winton.
1255 Version 0.1: posted June 30 2006.
1257 Version 0.11: posted July 1 2006. Typo fixes.
1259 Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one.
1262 Version 0.21: Added more references suggested on the tutor mailing list.
1264 Version 0.30: Adds a section on the ``functional`` module written by Collin
1265 Winter; adds short section on the operator module; a few other edits.
1274 **Structure and Interpretation of Computer Programs**, by Harold Abelson and
1275 Gerald Jay Sussman with Julie Sussman. Full text at
1276 http://mitpress.mit.edu/sicp/. In this classic textbook of computer science,
1277 chapters 2 and 3 discuss the use of sequences and streams to organize the data
1278 flow inside a program. The book uses Scheme for its examples, but many of the
1279 design approaches described in these chapters are applicable to functional-style
1282 http://www.defmacro.org/ramblings/fp.html: A general introduction to functional
1283 programming that uses Java examples and has a lengthy historical introduction.
1285 http://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry
1286 describing functional programming.
1288 http://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.
1290 http://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.
1295 http://gnosis.cx/TPiP/: The first chapter of David Mertz's book
1296 :title-reference:`Text Processing in Python` discusses functional programming
1297 for text processing, in the section titled "Utilizing Higher-Order Functions in
1300 Mertz also wrote a 3-part series of articles on functional programming
1301 for IBM's DeveloperWorks site; see
1302 `part 1 <http://www-128.ibm.com/developerworks/library/l-prog.html>`__,
1303 `part 2 <http://www-128.ibm.com/developerworks/library/l-prog2.html>`__, and
1304 `part 3 <http://www-128.ibm.com/developerworks/linux/library/l-prog3.html>`__,
1307 Python documentation
1308 --------------------
1310 Documentation for the :mod:`itertools` module.
1312 Documentation for the :mod:`operator` module.
1314 :pep:`289`: "Generator Expressions"
1316 :pep:`342`: "Coroutines via Enhanced Generators" describes the new generator
1317 features in Python 2.5.
1322 -----------------------------
1326 XXX Need a large example.
1328 But will an example add much? I'll post a first draft and see
1329 what the comments say.
1336 Programs built out of functions
1337 Functions are strictly input-output, no internal state
1338 Opposed to OO programming, where objects have state
1342 Assignment is difficult to reason about
1343 Not very relevant to Python
1345 Small functions that do one thing
1347 Easy to test due to lack of state
1348 Easy to verify output from intermediate steps
1350 You assemble a toolbox of functions that can be mixed
1353 Need a significant example
1357 The itertools module
1359 Small functions and the lambda statement
1367 Handy little function for printing part of an iterator -- used
1368 while writing this document.
1372 slice = itertools.islice(it, 10)
1373 for elem in slice[:-1]:
1374 sys.stdout.write(str(elem))
1375 sys.stdout.write(', ')