1 .. #!/usr/bin/env python
2 # -*- coding: iso-8859-1 -*-
4 Test the simplestates.py generic state machine
5 ==============================================
9 :Copyright: 2006 Guenter Milde.
10 Released under the terms of the GNU General Public License
13 .. default-role:: literal
16 .. contents:: :depth: 2
19 Abstract State Machine Class
20 ============================
22 First version of an abstract state machine
26 """generic state machine acting on iterable data
29 init_state -- name of the first state_handler method
35 * sets the data object to the `data` argument.
36 * remaining keyword arguments are stored as class attributes (or methods, if
37 they are function objects) overwriting class defaults (a neat little trick
38 I found somewhere on the net)
39 * the `state_handler` attribute is set to the method named in `init_state`
43 def __init__(self, data, **keyw):
44 """data -- iterable data object
45 (list, file, generator, string, ...)
46 **keyw -- all remaining keyword arguments are
47 stored as class attributes
50 self.__dict__.update(keyw)
52 The special `__iter__` method returns an iterator_. This allows to use
53 a class instance directly in an iteration loop. We define it as is a
54 generator_ method that sets the initial state and then iterates over the
55 data calling the state methods::
58 self.state_handler = getattr(self, self.init_state)
59 for token in self.data:
60 yield self.state_handler(token)
62 To use class instances as callable objects, we add a `__call__` method::
65 """Run state-machine and return tokens as a list"""
66 return [token for token in self]
68 Example 1: A two-state machine sorting numbers
69 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71 Our small example state machine subclasses the `SimpleStates1` class::
73 class Example1(SimpleStates1):
74 """A silly example two-state machine sorting numbers
75 in the categories "low" (< 3) and "high" (>= 3).
78 It will be completed by two state methods and a `__str__` method.
83 State methods are functions that are called to iterate over the data. They
86 * test the data token for a change of state indicator
87 * return the data token after some processing
89 In our example, the `low` method switches to `high` (and calls it with the
90 data token), if token is bigger than 3. If not, it returns "l(token)"::
93 # print "low(", token, ")",
95 self.state_handler = self.high
97 return self.state_handler(token)
100 The `high` method switches to `low`, if token is bigger than 3. If not, it
103 def high(self, token):
104 # print "high(", token, ")",
106 self.state_handler = self.low
108 return self.state_handler(token)
111 Conversion of the class instance to a string is done by joining the list
112 that is returned by a call to the instance with spaces::
115 return " ".join(self())
120 Testing is done with the nose_ test framework. This will collect and
121 execute all test functions and methods (basically everything that starts or
122 ends with "[Tt]est"). This is similar to the more known "py.test".
124 .. _nose: http://somethingaboutorange.com/mrl/projects/nose/
126 We set up some test data::
128 testdata = [1, 2, 3, 4, 5, 4, 3, 2, 1]
130 and define a test function::
133 statemachine = Example1(testdata, init_state='low')
134 for result in statemachine:
138 # Calling an instance should return a list of results
140 assert statemachine() == ['l(1)','l(2)','l(3)', # low numbers
141 'h(4)','h(5)','h(4)', # high numbers
142 'l(3)','l(2)','l(1)'] # low again
144 # Converting to a string should call the __str__ method::
145 print str(statemachine)
146 assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"
151 The sorting works as expected. However, as the state handlers get the data
152 token by token, acting on subsequent tokens or tests that combine the
153 knowledge of several tokens are hard to achieve.
155 An example would be a state handler that sums up the data tokens and
156 returns the result if it exceeds a threshold.
160 Varied State Machine Class Template
161 ===================================
163 The second version of an abstract state machine converts the test data to an
164 iterator which is shared by the state methods.
166 There is no need to pass this on via arguments, as class methods share the
167 class instances attributes (class variables).
169 We subclass our first version and modify to our needs::
171 class SimpleStates2(SimpleStates1):
172 """second version of the abstract state machine class
175 We add the initialization of the data to the `__iter__` method. The data is
176 transformed inta an iterator_ first. ::
179 self.data_iterator = iter(self.data)
180 self.state_handler = getattr(self, self.init_state)
181 # do not pass data tokens as argument
182 # (state methods should call self.data_iterator.next() instead)
184 yield self.state_handler()
186 Iterators "use up" the data, so the state methods will always get a "new"
187 token until the data is fully "used up" and `StopIteration` is raised
188 aborting the iteration.
190 Doing the conversion from iterable to iterator in `__iter__` and not in
191 `__init__` allows repeated iteration over the class instance (if the data is
192 a list or a file and not already a generator) as the "used up" generator is
193 replaced by a new one.
195 Example 2: Another two-state machine sorting numbers
196 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
198 Our small example state machine subclasses the `SimpleStates2` class
199 and adds 2 methods as state handlers. ::
201 class Example2(SimpleStates2):
202 """An example two-state machine sorting numbers
203 in the categories "low" (< 3) and "high" (>= 3).
209 This time, the state methods will get the data tokens not as argument but
210 take them from the `data_iterator`. Note that *backtracking* is impossible
211 with a standard iterator. See below for the problem this causes for our
212 sorting algorithm. ::
215 # print "low(", token, ")",
216 token = self.data_iterator.next()
218 self.state_handler = self.high
221 # print "high(", token, ")",
222 token = self.data_iterator.next()
224 self.state_handler = self.low
230 Define a second test function::
233 statemachine = Example2(testdata, init_state='low')
235 Calling an instance should return a list of results. However, as
236 we cannot backtrack on a normal iterator, the result is not as we expected:
237 There is a "hysteresis" the "switch triggering" token is always processed by
241 assert statemachine() == ['l(1)', 'l(2)', 'l(3)', # low numbers
242 'l(4)', 'h(5)', 'h(4)', # high numbers
243 'h(3)', 'l(2)', 'l(1)'] # low numbers
248 Missing backtracks break our number sorting machine. The remedy
249 is the use of an iterator with an appendleft() method (known from the
250 dqueue() standard class). We will come to this in `Example 4`__
252 __ `Example 4: A two-state machine with generators and backtracking`_
254 OTOH, as the state methods do the fetching of data tokens themself, a state
255 handler that sums up the data tokens and returns the result if it exceeds a
256 threshold would be easy to implement. We will do this in our next example
257 using state handler generators.
260 State Machine class using state_handler generators
261 ==================================================
263 The variations in `StateMachine2` complicate the StateMachine design. They
264 makes sense, however, if we use generated iterators to handle the states.
265 No changes are needed to the abstract base class, so that Example 3 can
266 build on `StateMachine2`::
268 class Example3(SimpleStates2):
270 Example 3: A two-state machine with state handler generators
271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
276 State Generators generate and return an iterator that will handle the next
277 data token(s) if its .next() method is called. This is easily achieved with a
278 for loop over self.data and the `yield` keyword.
281 def high_handler_generator(self):
282 """Return an iterator, whose next() method
283 returns "h(token)" and switches to `low`, if token > 3
285 for token in self.data_iterator:
287 self.state_handler = self.low
290 def low_handler_generator(self):
291 """Return an iterator, whose next() method sums up data tokens.
292 If the sum exceeds 8, it is returned and the state
296 for token in self.data_iterator:
299 self.state_handler = self.high
301 sum = 0 # next iteration continues here
302 # no more tokens but sum not reached
303 yield "p=%d"%sum # partial sum
305 The iterator must instanciate the state-iterators before starting the
309 """Generate and return an iterator
311 * convert `data` to an iterator
312 * convert the state generators into iterators
313 * (re) set the state_handler attribute to the init-state
314 * pass control to the active states state_handler
315 which should call and process self.data_iterator.next()
317 self.data_iterator = iter(self.data)
318 self.high = self.high_handler_generator().next
319 self.low = self.low_handler_generator().next
321 self.state_handler = getattr(self, self.init_state)
322 # now start the iteration, aborts if data is empty
324 yield self.state_handler()
329 Again define a test function that gets an instance of the Example3 class ::
332 statemachine = Example3(testdata, init_state='low')
334 Calling statemachine() should iterate over the test data and return the
335 processed values as list::
338 assert statemachine() == ['s=10','h(5)','h(4)','h(3)', 'p=3']
343 the iterqueue module provides an "extendable" iterator with, e.g.,
344 an `appendleft` method to push back values::
346 from iterqueue import XIter
348 Thus we can prepend a non-processed data item
349 to the data iterator for use by the next state handler
351 Example 4: A two-state machine with generators and backtracking
352 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
354 Again we start from the `SimpleStates2` base class::
356 class Example4(SimpleStates2):
357 """two-state machine with generators and backtracking
360 Let the iterator wrap the data in an XIter instance with `appendleft`
364 """Generate and return an iterator
366 * convert `data` to an iterator queue
367 * convert the state generators into iterators
368 * (re) set the state_handler attribute to the init-state
369 * pass control to the active states state_handler
370 which should call and process self.data_iterator.next()
372 self.data_iterator = XIter(self.data) # queue with `appendleft` method
373 self.high = self.high_handler_generator().next
374 self.low = self.low_handler_generator().next
375 self.state_handler = getattr(self, self.init_state)
376 # now start the iteration
378 yield self.state_handler()
380 Add state method generators that use the "backtracking" feature::
382 def high_handler_generator(self):
383 """Return an iterator, whose next() method
384 returns "h(token)" and switches to `low`, if token > 3
386 for token in self.data_iterator:
387 # print "high got", token
389 # push back data token
390 self.data_iterator.appendleft(token)
392 self.state_handler = self.low
393 # return non-value indicating the state switch
398 def low_handler_generator(self):
399 """Return an iterator, whose next() method
400 returns "l(token)" and switches to `high`, if token <=3
402 for token in self.data_iterator:
403 # print "low got", token
405 self.data_iterator.appendleft(token) # push back
406 # set and run the new state
407 self.state_handler = self.high
408 # alternatively, return the token processed by the new
410 yield self.state_handler()
414 The `__str__` converter should ignore the switch-indicator::
417 tokens = [token for token in self() if token != None]
418 return " ".join(tokens)
423 Again define a test function. This time with an instance of the Example4
427 statemachine = Example4(testdata, init_state='low')
429 Calling statemachine() should iterate over the test data and return the
430 processed values as list. If the state of the machine changes, the special
431 "non-value" `None` is returned. ::
433 print statemachine() # only printed if something goes wrong
434 assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
435 'h(4)', 'h(5)', 'h(4)', None, # switch indicator
436 'l(3)', 'l(2)', 'l(1)']
438 Converting to a string should skip the `None` values::
441 assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"
446 The `XIter` class allows backtracking also in a state machine with state
447 handlers acting on a common iterator object. The "high" and "low" handlers
448 demonstrate two possible actions for the state-transition with backtrack:
449 Either call the new state handler from the current one
450 (like the `low_handler_generator`) or return a "non-value" that signifies
451 that processing the data token did not produce any output data.
453 Using generators made the state handlers shorter and (once the concept of a
454 generator is clear) easier. Further advantages of the generator concept are
456 * internal variables are easily saved over subsequent invocations
457 * no function-call overhead (not relevant in this example but maybe for a
458 state machine that has to process long data lists.
461 Converting all state method generators with a generic function
462 ==============================================================
464 In `Example4`, we had to redefine the `__iter__` method to convert the
465 methode state generators into iterators. It would be nice if this could be
466 done in the base class.
468 `SimpleStates3` adds a generic function for this task that relies on a
469 simple naming convention: functions whose name matches
470 `<state>_handler_generator` should be converted to iterators and their
471 `.next()` method stored as `<state>`.
474 class SimpleStates5(SimpleStates2):
475 """generic state machine acting on iterable data
477 def _initialize_state_generators(self):
478 """Generic function to initialize state handlers from generators
480 functions whose name matches `[^_]<state>_handler_generator` should
481 be converted to iterators and their `.next()` method stored as
484 suffix = "_handler_generator"
485 shg_names = [name for name in dir(self)
486 if name.endswith(suffix)
487 and not name.startswith("_")]
488 for name in shg_names:
489 shg = getattr(self, name)
491 setattr(self, name[:-len(suffix)], shg().next)
495 """Generate and return an iterator
497 * convert `data` to an iterator queue
498 * convert the state generators into iterators
499 * (re) set the state_handler attribute to the init-state
500 * pass control to the active states state_handler
501 which should call and process self.data_iterator.next()
503 self.data_iterator = XIter(self.data) # queue with `appendleft` method
504 self._initialize_state_generators()
505 self.state_handler = getattr(self, self.init_state)
506 # now start the iteration
508 yield self.state_handler()
513 The next example combines the state handlers from Example 4 and the new
516 class Example5(Example4, SimpleStates5):
517 """one more example"""
523 A function that has the generator-suffix but is prefixed with an underscore,
524 should be skipped by the `_initialize_state_generators` method::
526 class Test_SimpleStates5:
527 E5 = Example5(testdata)
528 E5._bogus_handler_generator = "low"
529 def test_initialize_state_generators(self):
530 self.E5._initialize_state_generators()
532 A test function. This time with an instance of the Example5 class ::
535 statemachine = Example5(testdata, init_state='low')
536 print statemachine.__dict__
538 Calling statemachine() should iterate over the test data and return the
539 processed values as list. If the state of the machine changes, the special
540 "non-value" `None` is returned. ::
542 print statemachine() # only printed if something goes wrong
543 assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
544 'h(4)', 'h(5)', 'h(4)', None, # switch indicator
545 'l(3)', 'l(2)', 'l(1)']
547 Converting to a string should skip the `None` values::
550 assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"
555 The file `simplestates.py` contains the full definition of the `SimpleStates5`
556 class in a self-contained version.
561 The final Example is used to test whether we have put it together well. It
562 subclasses SimpleStates and adds state method generators for "high" and
566 class Example6(simplestates.SimpleStates):
567 """two-state machine with generators and backtracking
569 def high_handler_generator(self):
570 """Return an iterator, whose next() method
571 returns "h(token)" and switches to `low`, if token > 3
573 for token in self.data_iterator:
574 # print "high got", token
576 # push back data token
577 self.data_iterator.appendleft(token)
579 self.state_handler = self.low
580 # return the token processed by the new state handler
581 yield self.state_handler()
585 def low_handler_generator(self):
586 """Return an iterator, whose next() method
587 returns "l(token)" and switches to `high`, if token <=3
589 for token in self.data_iterator:
590 # print "low got", token
592 self.data_iterator.appendleft(token) # push back
593 # set and run the new state
594 self.state_handler = self.high
595 # return the token processed by the new state handler
596 yield self.state_handler()
603 In order not to make it dependent on the iterqueue module, the final
604 `SimpleStates` doesnot wrap the data in an XIter instance. This step should
605 be done at the instanciation of a state machine. ::
608 statemachine = Example5(XIter(testdata), init_state='low')
609 print statemachine.__dict__
611 Calling statemachine() should iterate over the test data and return the
612 processed values as list::
614 print statemachine() # only printed if something goes wrong
615 # reset the data iterator as it is "used up" now
616 statemachine.data = XIter(testdata)
617 assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
618 'h(4)', 'h(5)', 'h(4)', None,
619 'l(3)', 'l(2)', 'l(1)']
625 :_`generator`: A function with a `yield` keyword. Calling this function will
628 :_`iterator`: An object with a `next()` method. Calling `<iterator>.next()`
629 will (typically) return one data token (list element, line in
630 a file, ...). If there is no more data the `StopIteration`
636 running this script should explore it in the "nose" test framework::
638 if __name__ == "__main__":
640 # first run any doctests
642 # then run nose on this module
643 nose.runmodule() # requires nose 0.9.1