\documentclass{howto}
\usepackage{distutils}
% $Id$

% Fix XXX comments
% The easy_install stuff
% Stateful codec changes
% Count up the patches and bugs

\title{What's New in Python 2.5}
\release{0.1}
\author{A.M. Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.5. No release date
for Python 2.5 has been set; it will probably be released in the
autumn of 2006. \pep{356} describes the planned release schedule.

(This is still an early draft, and some sections are still skeletal or
completely missing. Comments on the present material will still be
welcomed.)

% XXX Compare with previous release in 2 - 3 sentences here.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.5.
% XXX add hyperlink when the documentation becomes available online.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.

%======================================================================
\section{PEP 243: Uploading Modules to PyPI}

PEP 243 describes an HTTP-based protocol for submitting software
packages to a central archive. The Python package index at
\url{http://cheeseshop.python.org} now supports package uploads, and
the new \command{upload} Distutils command will upload a package to the
repository.

Before a package can be uploaded, you must be able to build a
distribution using the \command{sdist} Distutils command. Once that
works, you can run \code{python setup.py upload} to add your package
to the PyPI archive. Optionally you can GPG-sign the package by
supplying the \programopt{--sign} and
\programopt{--identity} options.

\begin{seealso}

\seepep{243}{Module Repository Upload Mechanism}{PEP written by
Sean Reifschneider; implemented by Martin von~L\"owis
and Richard Jones. Note that the PEP doesn't exactly
describe what's implemented in PyPI.}

\end{seealso}

%======================================================================
\section{PEP 308: Conditional Expressions}

For a long time, people have been requesting a way to write
conditional expressions, expressions that return value A or value B
depending on whether a Boolean value is true or false. A conditional
expression lets you write a single assignment statement that has the
same effect as the following:

\begin{verbatim}
if condition:
    x = true_value
else:
    x = false_value
\end{verbatim}

There have been endless tedious discussions of syntax on both
python-dev and comp.lang.python. A vote was even held that found the
majority of voters wanted conditional expressions in some form,
but there was no syntax that was preferred by a clear majority.
Candidates included C's \code{cond ? true_v : false_v},
\code{if cond then true_v else false_v}, and 16 other variations.

Guido van~Rossum eventually chose a surprising syntax:

\begin{verbatim}
x = true_value if condition else false_value
\end{verbatim}

Evaluation is still lazy as in existing Boolean expressions, so the
order of evaluation jumps around a bit. The \var{condition}
expression in the middle is evaluated first, and the \var{true_value}
expression is evaluated only if the condition was true. Similarly,
the \var{false_value} expression is only evaluated when the condition
is false.

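The evaluation order can be checked with a small sketch. The helper
\function{trace()} and the \var{calls} list are names invented for this
illustration; they only record which operand expressions actually run.

```python
calls = []

def trace(label, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(label)
    return value

# The condition (in the middle) runs first; only one branch follows it.
x = (trace('true_value', 'yes') if trace('condition', True)
     else trace('false_value', 'no'))
print(x)        # yes
print(calls)    # ['condition', 'true_value']

del calls[:]
y = (trace('true_value', 'yes') if trace('condition', False)
     else trace('false_value', 'no'))
print(y)        # no
print(calls)    # ['condition', 'false_value']
```

In both runs the branch that isn't selected never appears in \var{calls}.
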
This syntax may seem strange and backwards; why does the condition go
in the \emph{middle} of the expression, and not in the front as in C's
\code{c ? x : y}? The decision was checked by applying the new syntax
to the modules in the standard library and seeing how the resulting
code read. In many cases where a conditional expression is used, one
value seems to be the ``common case'' and one value is an ``exceptional
case'', used only on rarer occasions when the condition isn't met. The
conditional syntax makes this pattern a bit more obvious:

\begin{verbatim}
contents = ((doc + '\n') if doc else '')
\end{verbatim}

I read the above statement as meaning ``here \var{contents} is
usually assigned a value of \code{doc+'\e n'}; sometimes
\var{doc} is empty, in which special case an empty string is returned.''
I doubt I will use conditional expressions very often where there
isn't a clear common and uncommon case.

There was some discussion of whether the language should require
surrounding conditional expressions with parentheses. The decision
was made to \emph{not} require parentheses in the Python language's
grammar, but as a matter of style I think you should always use them.
Consider these two statements:

\begin{verbatim}
# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)
\end{verbatim}

In the first version, I think a reader's eye might group the statement
into ``level = 1'', ``if logging'', ``else 0'', and think that the condition
decides whether the assignment to \var{level} is performed. The
second version reads better, in my opinion, because it makes it clear
that the assignment is always performed and the choice is being made
between two values.

Another reason for including the parentheses: a few odd combinations of
list comprehensions and lambdas could look like incorrect conditional
expressions. See \pep{308} for some examples. If you put parentheses
around your conditional expressions, you won't run into this case.


\begin{seealso}

\seepep{308}{Conditional Expressions}{PEP written by
Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
Wouters.}

\end{seealso}

%======================================================================
\section{PEP 309: Partial Function Application}

The \module{functional} module is intended to contain tools for
functional-style programming. Currently it only contains the
\class{partial} class, but new functions will probably be added
in future versions of Python.

For programs written in a functional style, it can be useful to
construct variants of existing functions that have some of the
parameters filled in. Consider a Python function \code{f(a, b, c)};
you could create a new function \code{g(b, c)} that was equivalent to
\code{f(1, b, c)}. This is called ``partial function application'',
and is provided by the \class{partial} class in the new
\module{functional} module.

The constructor for \class{partial} takes the arguments
\code{(\var{function}, \var{arg1}, \var{arg2}, ...
\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting
object is callable, so you can just call it to invoke \var{function}
with the filled-in arguments.

Here's a small but realistic example:

\begin{verbatim}
import functional

def log (message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functional.partial(log, subsystem='server')
server_log('Unable to open socket')
\end{verbatim}

Here's another example, from a program that uses PyGTK. Here a
context-sensitive pop-up menu is being constructed dynamically. The
callback provided for the menu option is a partially applied version
of the \method{open_item()} method, where the first argument has been
provided.

\begin{verbatim}
...
class Application:
    def open_item(self, path):
        ...
    def init (self):
        open_func = functional.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )
\end{verbatim}

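The following sketch fills in both a keyword argument and a positional
argument, and then inspects the resulting objects. It is written so that
it runs on a current interpreter, which involves two assumptions beyond
this draft: it imports \module{functools}, the name under which
\class{partial} is available in released versions of Python, and
\function{log()} returns its text instead of printing it. The function
\function{f()} is the hypothetical three-argument function from the
discussion above.

```python
import functools  # assumption: released name of the draft's 'functional' module

def log(message, subsystem):
    "Format 'message' for the specified subsystem."
    return '%s: %s' % (subsystem, message)

# Fill in a keyword argument ahead of time.
server_log = functools.partial(log, subsystem='server')
print(server_log('Unable to open socket'))   # server: Unable to open socket

def f(a, b, c):
    return (a, b, c)

# Fill in the first positional argument: g(b, c) behaves like f(1, b, c).
g = functools.partial(f, 1)
print(g(2, 3))               # (1, 2, 3)

# The partial object remembers what it was built from.
print(g.func is f)           # True
print(g.args)                # (1,)
print(server_log.keywords)   # {'subsystem': 'server'}
```
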

\begin{seealso}

\seepep{309}{Partial Function Application}{PEP proposed and written by
Peter Harris; implemented by Hye-Shik Chang, with adaptations by
Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 314: Metadata for Python Software Packages v1.1}

Some simple dependency support was added to Distutils. The
\function{setup()} function now has \code{requires}, \code{provides},
and \code{obsoletes} keyword parameters. When you build a source
distribution using the \code{sdist} command, the dependency
information will be recorded in the \file{PKG-INFO} file.

Another new keyword parameter is \code{download_url}, which should be
set to a URL for the package's source code. This means it's now
possible to look up an entry in the package index, determine the
dependencies for a package, and download the required packages.

% XXX put example here
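
As a rough sketch of how the new keyword parameters might be supplied,
the block below collects them in a dictionary. Every value in it (the
package name, the dependency lists, and the URL) is invented for
illustration, and the call to \function{setup()} is left as a comment so
that the fragment can be read and run without building anything.

```python
# Hypothetical metadata for an imaginary package; every value here is
# invented for illustration.
setup_kwargs = dict(
    name='example-package',
    version='1.0',
    # Simple dependency information (PEP 314).
    requires=['re', 'xml.parsers.expat (>1.0)'],
    provides=['example'],
    obsoletes=['oldexample'],
    # Where the package's source code can be downloaded.
    download_url='http://www.example.com/dist/example-package-1.0.tar.gz',
)

# In a real setup.py these would be passed to the Distutils:
#     from distutils.core import setup
#     setup(**setup_kwargs)
for key in ('requires', 'provides', 'obsoletes', 'download_url'):
    print('%s = %r' % (key, setup_kwargs[key]))
```
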

\begin{seealso}

\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed
and written by A.M. Kuchling, Richard Jones, and Fred Drake;
implemented by Richard Jones and Fred Drake.}

\end{seealso}


%======================================================================
\section{PEP 328: Absolute and Relative Imports}

The simpler part of PEP 328 was implemented in Python 2.4: parentheses
could now be used to enclose the names imported from a module using
the \code{from ... import ...} statement, making it easier to import
many different names.

The more complicated part has been implemented in Python 2.5:
importing a module can be specified to use absolute or
package-relative imports. The plan is to move toward making absolute
imports the default in future versions of Python.

Let's say you have a package directory like this:
\begin{verbatim}
pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py
\end{verbatim}

This defines a package named \module{pkg} containing the
\module{pkg.main} and \module{pkg.string} submodules.

Consider the code in the \file{main.py} module. What happens if it
executes the statement \code{import string}? In Python 2.4 and
earlier, it will first look in the package's directory to perform a
relative import, find \file{pkg/string.py}, import the contents of
that file as the \module{pkg.string} module, and bind that module
to the name \samp{string} in the \module{pkg.main} module's namespace.

That's fine if \module{pkg.string} was what you wanted. But what if
you wanted Python's standard \module{string} module? There was no clean
way to ignore \module{pkg.string} and look for the standard module;
generally you had to look at the contents of \code{sys.modules}, which
is slightly unclean.
Holger Krekel's \module{py.std} package provides a tidier way to perform
imports from the standard library, \code{import py ; py.std.string.join()},
but that package isn't available on all Python installations.

Reading code which relies on relative imports is also less clear,
because a reader may be confused about which module, \module{string}
or \module{pkg.string}, is intended to be used. Python users soon
learned not to duplicate the names of standard library modules in the
names of their packages' submodules, but you can't protect against
having your submodule's name being used for a new module added in a
future version of Python.

In Python 2.5, you can switch \keyword{import}'s behaviour to
absolute imports using a \code{from __future__ import absolute_import}
directive. This absolute-import behaviour will become the default in
a future version (probably Python 2.7). Once absolute imports
are the default, \code{import string} will
always find the standard library's version.
It's suggested that users should begin using absolute imports as much
as possible, so it's preferable to begin writing \code{from pkg import
string} in your code.

Relative imports are still possible by adding a leading period
to the module name when using the \code{from ... import} form:

\begin{verbatim}
# Import names from pkg.string
from .string import name1, name2
# Import pkg.string
from . import string
\end{verbatim}

This imports the \module{string} module relative to the current
package, so in \module{pkg.main} this will import \var{name1} and
\var{name2} from \module{pkg.string}. Additional leading periods
perform the relative import starting from the parent of the current
package. For example, code in the \module{A.B.C} module can do:

\begin{verbatim}
from . import D                 # Imports A.B.D
from .. import E                # Imports A.E
from ..F import G               # Imports A.F.G
\end{verbatim}

Leading periods cannot be used with the \code{import \var{modname}}
form of the import statement, only the \code{from ... import} form.

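To see the two kinds of import side by side, the sketch below builds the
\file{pkg/} layout from this section in a temporary directory and then
imports \module{pkg.main}. The contents written into the three files, and
the aliases \var{std_string} and \var{pkg_string}, are assumptions made
for the demonstration. The driver code uses \module{importlib} and is
meant for a current interpreter, not for Python 2.5 itself.

```python
import importlib
import os
import sys
import tempfile

# Build the pkg/ layout from the text in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'pkg')
os.mkdir(pkg_dir)

sources = {
    '__init__.py': "",
    'string.py': "name1 = 'defined in pkg.string'\n",
    'main.py': (
        "from __future__ import absolute_import\n"
        "import string as std_string         # absolute: standard library\n"
        "from . import string as pkg_string  # relative: pkg.string\n"
        "from .string import name1\n"
    ),
}
for filename, source in sources.items():
    with open(os.path.join(pkg_dir, filename), 'w') as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()
main = importlib.import_module('pkg.main')

print(main.std_string.__name__)   # string
print(main.pkg_string.__name__)   # pkg.string
print(main.name1)                 # defined in pkg.string
```
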

\begin{seealso}

\seepep{328}{Imports: Multi-Line and Absolute/Relative}
{PEP written by Aahz; implemented by Thomas Wouters.}

\seeurl{http://codespeak.net/py/current/doc/index.html}
{The py library by Holger Krekel, which contains the \module{py.std} package.}

\end{seealso}


%======================================================================
\section{PEP 338: Executing Modules as Scripts}

The \programopt{-m} switch added in Python 2.4 to execute a module as
a script gained a few more abilities. Instead of being implemented in
C code inside the Python interpreter, the switch now uses an
implementation in a new module, \module{runpy}.

The \module{runpy} module implements a more sophisticated import
mechanism so that it's now possible to run modules in a package such
as \module{pychecker.checker}. The module also supports alternative
import mechanisms such as the \module{zipimport} module. (This means
you can add a .zip archive's path to \code{sys.path} and then use the
\programopt{-m} switch to execute code from the archive.)

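The \module{runpy} module can also be called directly. The sketch below
writes a throwaway module to a temporary directory and executes it with
\function{runpy.run_module()}, which is roughly what the \programopt{-m}
switch does on your behalf. The module name \module{demo_script} and its
one-line body are made up for the example.

```python
import os
import runpy
import sys
import tempfile

# Create a throwaway module; 'demo_script' is a made-up name.
root = tempfile.mkdtemp()
with open(os.path.join(root, 'demo_script.py'), 'w') as f:
    f.write("greeting = 'running as %s' % __name__\n")

sys.path.insert(0, root)

# Locate the module through the import system and execute it with
# __name__ set to '__main__', as a script would be.
namespace = runpy.run_module('demo_script', run_name='__main__')
print(namespace['greeting'])   # running as __main__
```

\function{run_module()} returns the globals of the executed module, which
is why the value of \var{greeting} can be read back afterwards.
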

\begin{seealso}

\seepep{338}{Executing modules as scripts}{PEP written and
implemented by Nick Coghlan.}

\end{seealso}


%======================================================================
\section{PEP 341: Unified try/except/finally}

Until Python 2.5, the \keyword{try} statement came in two
flavours. You could use a \keyword{finally} block to ensure that code
is always executed, or a number of \keyword{except} blocks to catch an
exception. You couldn't combine both \keyword{except} blocks and a
\keyword{finally} block, because generating the right bytecode for the
combined version was complicated and it wasn't clear what the
semantics of the combined statement should be.

Guido van~Rossum spent some time working with Java, which does support the
equivalent of combining \keyword{except} blocks and a
\keyword{finally} block, and this clarified what the statement should
mean. In Python 2.5, you can now write:

\begin{verbatim}
try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block
\end{verbatim}

The code in \var{block-1} is executed. If the code raises an
exception, the handlers are tried in order: \var{handler-1},
\var{handler-2}, ... If no exception is raised, the \var{else-block}
is executed. No matter what happened previously, the
\var{final-block} is executed once the code block is complete and any
raised exceptions handled. Even if there's an error in an exception
handler or the \var{else-block} and a new exception is raised, the
\var{final-block} is still executed.

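The order in which the blocks run can be traced with a short sketch. The
function \function{run()} and the exception types chosen for the two
handlers are assumptions made for the illustration; each block just
appends its name to a list.

```python
def run(action):
    """Call action() inside a combined try statement and log each step."""
    events = []
    try:
        events.append('block-1')
        action()
    except ZeroDivisionError:
        events.append('handler-1')
    except KeyError:
        events.append('handler-2')
    else:
        events.append('else-block')
    finally:
        events.append('final-block')
    return events

print(run(lambda: None))      # ['block-1', 'else-block', 'final-block']
print(run(lambda: 1 / 0))     # ['block-1', 'handler-1', 'final-block']
print(run(lambda: {}['x']))   # ['block-1', 'handler-2', 'final-block']
```
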

\begin{seealso}

\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl;
implementation by Thomas Lee.}

\end{seealso}


%======================================================================
\section{PEP 342: New Generator Features}

Python 2.5 adds a simple way to pass values \emph{into} a generator.
As introduced in Python 2.3, generators only produce output; once a
generator's code is invoked to create an iterator, there's no way to
pass any new information into the function when its execution is
resumed. Sometimes the ability to pass in some information would be
useful. Hackish solutions to this include making the generator's code
look at a global variable and then changing the global variable's
value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example:

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1
\end{verbatim}

When you call \code{counter(10)}, the result is an iterator that
returns the values from 0 up to 9. On encountering the
\keyword{yield} statement, the iterator returns the provided value and
suspends the function's execution, preserving the local variables.
Execution resumes on the following call to the iterator's
\method{next()} method, picking up after the \keyword{yield} statement.

In Python 2.3, \keyword{yield} was a statement; it didn't return any
value. In 2.5, \keyword{yield} is now an expression, returning a
value that can be assigned to a variable or otherwise operated on:

\begin{verbatim}
val = (yield i)
\end{verbatim}

I recommend that you always put parentheses around a \keyword{yield}
expression when you're doing something with the returned value, as in
the above example. The parentheses aren't always necessary, but it's
easier to always add them instead of having to remember when they're
needed.\footnote{The exact rules are that a \keyword{yield}-expression must
always be parenthesized except when it occurs at the top-level
expression on the right-hand side of an assignment, meaning you can
write \code{val = yield i} but have to use parentheses when there's an
operation, as in \code{val = (yield i) + 12}.}

Values are sent into a generator by calling its
\method{send(\var{value})} method. The generator's code is then
resumed and the \keyword{yield} expression returns the specified
\var{value}. If the regular \method{next()} method is called, the
\keyword{yield} returns \constant{None}.

Here's the previous example, modified to allow changing the value of
the internal counter.

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
\end{verbatim}

And here's an example of changing the counter:

\begin{verbatim}
>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration
\end{verbatim}

Because \keyword{yield} will often be returning \constant{None}, you
should always check for this case. Don't just use its value in
expressions unless you're sure that the \method{send()} method
will be the only method used to resume your generator function.

In addition to \method{send()}, there are two other new methods on
generators:

\begin{itemize}

  \item \method{throw(\var{type}, \var{value}=None,
  \var{traceback}=None)} is used to raise an exception inside the
  generator; the exception is raised by the \keyword{yield} expression
  where the generator's execution is paused.

  \item \method{close()} raises a new \exception{GeneratorExit}
  exception inside the generator to terminate the iteration.
  On receiving this
  exception, the generator's code must either raise
  \exception{GeneratorExit} or \exception{StopIteration}; catching the
  exception and doing anything else is illegal and will trigger
  a \exception{RuntimeError}. \method{close()} will also be called by
  Python's garbage collection when the generator is garbage-collected.

  If you need to run cleanup code in case of a \exception{GeneratorExit},
  I suggest using a \code{try: ... finally:} suite instead of
  catching \exception{GeneratorExit}.

\end{itemize}
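
Here is a minimal sketch of \method{throw()} and \method{close()} in use.
The generator \function{resettable_counter()}, the choice of
\exception{ValueError} as a ``reset'' signal, and the \var{log} list are
all invented for the illustration. The code uses the built-in
\function{next()} so that it runs on a current interpreter, where the
\method{next()} method shown earlier is no longer available.

```python
def resettable_counter(log):
    """Count upward; a ValueError thrown in resets the count to zero."""
    i = 0
    try:
        while True:
            try:
                yield i
                i += 1
            except ValueError:
                # Raised at the paused yield by it.throw().
                log.append('reset requested')
                i = 0
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append('cleanup ran')

log = []
it = resettable_counter(log)
first = next(it)                            # 0
second = next(it)                           # 1
after_throw = it.throw(ValueError('reset')) # handled inside; yields 0 again
it.close()

print(first)        # 0
print(second)       # 1
print(after_throw)  # 0
print(log)          # ['reset requested', 'cleanup ran']
```

The cleanup code sits in a \keyword{finally} clause and never catches
\exception{GeneratorExit} itself, so the exception propagates out of the
generator as \method{close()} expects.
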

The cumulative effect of these changes is to turn generators from
one-way producers of information into both producers and consumers.

Generators also become \emph{coroutines}, a more generalized form of
subroutines. Subroutines are entered at one point and exited at
another point (the top of the function, and a \keyword{return}
statement), but coroutines can be entered, exited, and resumed at
many different points (the \keyword{yield} statements). We'll have to
figure out patterns for using coroutines effectively in Python.

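One simple pattern, offered only as a sketch and not as an established
idiom, is a generator that consumes the values sent to it and yields a
running result. The name \function{averager()} is hypothetical. The
generator has to be advanced to its first \keyword{yield} before
\method{send()} can deliver a value.

```python
def averager():
    """Coroutine: receive numbers via send(), yield the running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = (yield average)
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)              # advance to the first yield; nothing to report yet
a1 = avg.send(10)
a2 = avg.send(20)
a3 = avg.send(30)
print(a1)   # 10.0
print(a2)   # 15.0
print(a3)   # 20.0
avg.close()
```
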
The addition of the \method{close()} method has one side effect that
isn't obvious. \method{close()} is called when a generator is
garbage-collected, so the generator's code gets one last
chance to run before the generator is destroyed. This last chance
means that \code{try...finally} statements in generators can now be
guaranteed to work; the \keyword{finally} clause will now always get a
chance to run. The syntactic restriction that you couldn't mix
\keyword{yield} statements with a \code{try...finally} suite has
therefore been removed. This seems like a minor bit of language
trivia, but using generators and \code{try...finally} is actually
necessary in order to implement the \keyword{with} statement
described by PEP 343. We'll look at this new statement in the following
section.

\begin{seealso}

\seepep{342}{Coroutines via Enhanced Generators}{PEP written by
Guido van Rossum and Phillip J. Eby;
implemented by Phillip J. Eby. Includes examples of
some fancier uses of generators as coroutines.}

\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for
coroutines.}

\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An
explanation of coroutines from a Perl point of view, written by Dan
Sugalski.}

\end{seealso}


%======================================================================
\section{PEP 343: The 'with' statement}

The \keyword{with} statement allows a clearer
version of code that uses \code{try...finally} blocks.

First, I'll discuss the statement as it will commonly be used, and
then I'll discuss the detailed implementation and how to write objects
(called ``context managers'') that can be used with this statement.
Most people, who will only use \keyword{with} in company with an
existing object, don't need to know these details and can
just use objects that are documented to work as context managers.
Authors of new context managers will need to understand the details of
the underlying implementation.

The \keyword{with} statement is a new control-flow structure whose
basic structure is:

\begin{verbatim}
with expression as variable:
    with-block
\end{verbatim}

The expression is evaluated, and it should result in a type of object
that's called a context manager. The context manager can return a
value that will be bound to the name \var{variable}. (Note carefully:
\var{variable} is \emph{not} assigned the result of \var{expression}.)
One method of the context manager is run before \var{with-block} is
executed, and another method is run after the block is done, even if
the block raised an exception.

To enable the statement in Python 2.5, you need
to add the following directive to your module:

\begin{verbatim}
from __future__ import with_statement
\end{verbatim}

Some standard Python objects can now behave as context managers. For
example, file objects:

\begin{verbatim}
with open('/etc/passwd', 'r') as f:
    for line in f:
        print line

# f has been automatically closed at this point.
\end{verbatim}

The \module{threading} module's locks and condition variables
also support the \keyword{with} statement:

\begin{verbatim}
lock = threading.Lock()
with lock:
    # Critical section of code
    ...
\end{verbatim}

The lock is acquired before the block is executed, and released once
the block is complete.

The \module{decimal} module's contexts, which encapsulate the desired
precision and rounding characteristics for computations, can also be
used as context managers.

\begin{verbatim}
import decimal

v1 = decimal.Decimal('578')

# Displays with default precision of 28 digits
print v1.sqrt()

with decimal.Context(prec=16):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v1.sqrt()
\end{verbatim}

\subsection{Writing Context Managers}

% XXX write this

This section still needs to be written.

The new \module{contextlib} module provides some functions and a
decorator that are useful for writing context managers.
Future versions will go into more detail.

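Until this section is written, the following sketch may help. It rests on
assumptions that go beyond this draft: the method names
\method{__enter__()} and \method{__exit__()} and the
\function{contextlib.contextmanager} decorator are taken from the
protocol as it exists in released versions of Python, and the
\class{Tracker} class, the \function{tracker()} generator, and the
\var{events} list are invented for the example. On Python 2.5 the
module would also need the \code{with_statement} directive shown
earlier.

```python
import contextlib

events = []

class Tracker(object):
    """Context manager written as a class (hypothetical example)."""
    def __enter__(self):
        events.append('enter')
        return 'value from __enter__'   # this is what 'as' binds

    def __exit__(self, exc_type, exc_value, traceback):
        events.append('exit')
        return False                    # don't swallow exceptions

with Tracker() as variable:
    events.append('block')

print(variable)   # value from __enter__
print(events)     # ['enter', 'block', 'exit']

@contextlib.contextmanager
def tracker():
    """The same idea written as a generator."""
    events.append('enter')
    try:
        yield 'value from generator'
    finally:
        events.append('exit')           # runs even if the block raises

del events[:]
try:
    with tracker() as variable:
        raise ValueError('error inside the block')
except ValueError:
    pass
print(events)     # ['enter', 'exit']
```

Note that \var{variable} receives the value returned on entry, not the
\class{Tracker} instance itself, and that the exit code runs even though
the second block raised an exception.
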

% XXX describe further

\begin{seealso}

\seepep{343}{The ``with'' statement}{PEP written by
Guido van Rossum and Nick Coghlan. }

\end{seealso}


%======================================================================
\section{PEP 352: Exceptions as New-Style Classes}

Exception classes can now be new-style classes, not just classic
classes, and the built-in \exception{Exception} class and all the
standard built-in exceptions (\exception{NameError},
\exception{ValueError}, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit.
In 2.5, the inheritance relationships are:

\begin{verbatim}
BaseException       # New in Python 2.5
|- KeyboardInterrupt
|- SystemExit
|- Exception
   |- (all other current built-in exceptions)
\end{verbatim}

This rearrangement was done because people often want to catch all
exceptions that indicate program errors. \exception{KeyboardInterrupt} and
\exception{SystemExit} aren't errors, though, and usually represent an explicit
action such as the user hitting Control-C or code calling
\function{sys.exit()}. A bare \code{except:} will catch all exceptions,
so you commonly need to list \exception{KeyboardInterrupt} and
\exception{SystemExit} in order to re-raise them. The usual pattern is:

\begin{verbatim}
try:
    ...
except (KeyboardInterrupt, SystemExit):
    raise
except:
    # Log error...
    # Continue running program...
\end{verbatim}

In Python 2.5, you can now write \code{except Exception} to achieve
the same result, catching all the exceptions that usually indicate errors
but leaving \exception{KeyboardInterrupt} and
\exception{SystemExit} alone. As in previous versions,
a bare \code{except:} still catches all exceptions.

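The effect of the new hierarchy can be checked directly. In the sketch
below, \function{classify()} is a hypothetical helper that raises the
exception it is given and reports which \keyword{except} clause
caught it.

```python
def classify(exc):
    """Raise exc and report which except clause handles it."""
    try:
        raise exc
    except Exception:
        return 'caught by except Exception'
    except BaseException:
        return 'not caught by except Exception'

print(classify(ValueError('bad value')))   # caught by except Exception
print(classify(KeyboardInterrupt()))       # not caught by except Exception
print(classify(SystemExit()))              # not caught by except Exception

# The same relationships, read off the class hierarchy.
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True
print(issubclass(ValueError, Exception))             # True
```
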
The goal for Python 3.0 is to require any class raised as an exception
to derive from \exception{BaseException} or some descendant of
\exception{BaseException}, and future releases in the
Python 2.x series may begin to enforce this constraint. Therefore, I
suggest you begin making all your exception classes derive from
\exception{Exception} now. It's been suggested that the bare
\code{except:} form should be removed in Python 3.0, but Guido van~Rossum
hasn't decided whether to do this or not.

Raising of strings as exceptions, as in the statement \code{raise
"Error occurred"}, is deprecated in Python 2.5 and will trigger a
warning. The aim is to be able to remove the string-exception feature
in a few releases.


\begin{seealso}

\seepep{352}{Required Superclass for Exceptions}{PEP written by
Brett Cannon and Guido van Rossum; implemented by Brett Cannon.}

\end{seealso}


%======================================================================
\section{PEP 353: Using ssize_t as the index type\label{section-353}}

A wide-ranging change to Python's C API, using a new
\ctype{Py_ssize_t} type definition instead of \ctype{int},
will permit the interpreter to handle more data on 64-bit platforms.
This change doesn't affect Python's capacity on 32-bit platforms.

Various pieces of the Python interpreter used C's \ctype{int} type to
store sizes or counts; for example, the number of items in a list or
tuple was stored in an \ctype{int}. The C compilers for most 64-bit
platforms still define \ctype{int} as a 32-bit type, so that meant
that lists could only hold up to \code{2**31 - 1} = 2147483647 items.
(There are actually a few different programming models that 64-bit C
compilers can use -- see
\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a
discussion -- but the most commonly available model leaves \ctype{int}
as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform
because you'll run out of memory before hitting the length limit.
Each list item requires space for a pointer, which is 4 bytes, plus
space for a \ctype{PyObject} representing the item. 2147483647*4 is
already more bytes than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform,
however. The pointers for a list that size would only require 16GiB
of space, so it's not unreasonable that Python programmers might
construct lists that large. Therefore, the Python interpreter had to
be changed to use some type other than \ctype{int}, and this will be a
64-bit type on 64-bit platforms. The change will cause
incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still
relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit
machines, and the transition would be more painful then.)

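The figures in the last few paragraphs can be reproduced with a few lines
of arithmetic. The variable names are chosen for this sketch, and the
8-byte pointer size for 64-bit platforms is the assumption behind the
16GiB figure.

```python
max_items = 2**31 - 1        # largest count a signed 32-bit int can hold

# 32-bit platform: 4-byte pointers, 2**32 bytes of address space.
pointer_bytes_32 = max_items * 4
print(pointer_bytes_32)              # 8589934588
print(pointer_bytes_32 > 2**32)      # True: the pointers alone don't fit

# 64-bit platform: 8-byte pointers.
pointer_bytes_64 = max_items * 8
gib = pointer_bytes_64 / float(2**30)
print(gib)                           # just under 16 (GiB)
```
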
765 This change most strongly affects authors of C extension modules.
766 Python strings and container types such as lists and tuples
767 now use \ctype{Py_ssize_t} to store their size.
768 Functions such as \cfunction{PyList_Size()}
769 now return \ctype{Py_ssize_t}. Code in extension modules
770 may therefore need to have some variables changed to
771 \ctype{Py_ssize_t}.
773 The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions
774 have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}.
775 \cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output
776 \ctype{int} by default, but you can define the macro
777 \csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h}
778 to make them return \ctype{Py_ssize_t}.
780 \pep{353} has a section on conversion guidelines that
781 extension authors should read to learn about supporting 64-bit
782 platforms.
784 \begin{seealso}
786 \seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.}
788 \end{seealso}
791 %======================================================================
792 \section{PEP 357: The '__index__' method}
794 The NumPy developers had a problem that could only be solved by adding
795 a new special method, \method{__index__}. When using slice notation,
796 as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the
797 \var{start}, \var{stop}, and \var{step} indexes must all be either
798 integers or long integers. NumPy defines a variety of specialized
799 integer types corresponding to unsigned and signed integers of 8, 16,
800 32, and 64 bits, but there was no way to signal that these types could
801 be used as slice indexes.
803 Slicing can't just use the existing \method{__int__} method because
804 that method is also used to implement coercion to integers. If
805 slicing used \method{__int__}, floating-point numbers would also
806 become legal slice indexes and that's clearly an undesirable
807 behaviour.
809 Instead, a new special method called \method{__index__} was added. It
810 takes no arguments and returns an integer giving the slice index to
811 use. For example:
\begin{verbatim}
class C:
    def __index__ (self):
        return self.value
\end{verbatim}
819 The return value must be either a Python integer or long integer.
820 The interpreter will check that the type returned is correct, and
821 raises a \exception{TypeError} if this requirement isn't met.
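
As a minimal sketch of the protocol, the following shows an object
being accepted both as an index and as a slice bound, and a
non-integer return value being rejected.  The \class{Offset} and
\class{Broken} classes are hypothetical names invented for this
illustration, and \function{operator.index()} is assumed to be
available as the Python-level way to invoke the method directly.

\begin{verbatim}
import operator

class Offset(object):
    """Hypothetical integer-like wrapper; only __index__ is defined."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        return self.value

class Broken(object):
    """Deliberately wrong: __index__ must return an integer."""
    def __index__(self):
        return 1.5

letters = ['a', 'b', 'c', 'd', 'e']
middle = letters[Offset(1):Offset(4)]   # same as letters[1:4]
single = letters[Offset(2)]             # same as letters[2]
as_int = operator.index(Offset(3))      # explicit conversion

try:
    letters[Broken()]
    rejected = False
except TypeError:
    rejected = True                     # float result is refused

print(middle)
print(single)
\end{verbatim}

Because \class{Offset} defines no \method{__int__} method, it can be
used for indexing without also becoming convertible by \function{int()}.
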
823 A corresponding \member{nb_index} slot was added to the C-level
824 \ctype{PyNumberMethods} structure to let C extensions implement this
825 protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in
826 extension code to call the \method{__index__} function and retrieve
827 its result.
829 \begin{seealso}
831 \seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written
832 and implemented by Travis Oliphant.}
834 \end{seealso}
837 %======================================================================
838 \section{Other Language Changes}
840 Here are all of the changes that Python 2.5 makes to the core Python
841 language.
843 \begin{itemize}
845 \item The \function{min()} and \function{max()} built-in functions
846 gained a \code{key} keyword argument analogous to the \code{key}
847 argument for \method{sort()}. This argument supplies a function
848 that takes a single argument and is called for every value in the list;
849 \function{min()}/\function{max()} will return the element with the
850 smallest/largest return value from this function.
851 For example, to find the longest string in a list, you can do:
853 \begin{verbatim}
854 L = ['medium', 'longest', 'short']
855 # Prints 'longest'
856 print max(L, key=len)
857 # Prints 'short', because lexicographically 'short' has the largest value
858 print max(L)
859 \end{verbatim}
861 (Contributed by Steven Bethard and Raymond Hettinger.)
863 \item Two new built-in functions, \function{any()} and
864 \function{all()}, evaluate whether an iterator contains any true or
865 false values. \function{any()} returns \constant{True} if any value
866 returned by the iterator is true; otherwise it will return
867 \constant{False}. \function{all()} returns \constant{True} only if
868 all of the values returned by the iterator evaluate as being true.
(Suggested by Guido van Rossum, and implemented by Raymond Hettinger.)
871 \item ASCII is now the default encoding for modules. It's now
872 a syntax error if a module contains string literals with 8-bit
873 characters but doesn't have an encoding declaration. In Python 2.4
874 this triggered a warning, not a syntax error. See \pep{263}
875 for how to declare a module's encoding; for example, you might add
876 a line like this near the top of the source file:
878 \begin{verbatim}
879 # -*- coding: latin1 -*-
880 \end{verbatim}
882 \item The list of base classes in a class definition can now be empty.
883 As an example, this is now legal:
\begin{verbatim}
class C():
    pass
\end{verbatim}
889 (Implemented by Brett Cannon.)
891 % XXX __missing__ hook in dictionaries
893 \end{itemize}
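
To illustrate the \function{any()} and \function{all()} item in the
list above, here is a small sketch.  The \code{readings} list is
made-up sample data; the last two lines show how the functions behave
when the iterator yields nothing at all.

\begin{verbatim}
readings = [3, 0, 8, 5]          # hypothetical data; 0 counts as false

has_nonzero = any(readings)      # True: at least one value is true
all_nonzero = all(readings)      # False: the 0 is false
all_small = all(n < 10 for n in readings)   # works with any iterator

any_of_empty = any([])           # False: no true value was found
all_of_empty = all([])           # True: no false value was found

print(has_nonzero)
print(all_nonzero)
\end{verbatim}
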
896 %======================================================================
897 \subsection{Interactive Interpreter Changes}
899 In the interactive interpreter, \code{quit} and \code{exit}
900 have long been strings so that new users get a somewhat helpful message
901 when they try to quit:
903 \begin{verbatim}
904 >>> quit
905 'Use Ctrl-D (i.e. EOF) to exit.'
906 \end{verbatim}
908 In Python 2.5, \code{quit} and \code{exit} are now objects that still
909 produce string representations of themselves, but are also callable.
910 Newbies who try \code{quit()} or \code{exit()} will now exit the
911 interpreter as they expect. (Implemented by Georg Brandl.)
914 %======================================================================
915 \subsection{Optimizations}
917 \begin{itemize}
919 \item When they were introduced
920 in Python 2.4, the built-in \class{set} and \class{frozenset} types
921 were built on top of Python's dictionary type.
922 In 2.5 the internal data structure has been customized for implementing sets,
and as a result sets use a third less memory and are somewhat faster.
924 (Implemented by Raymond Hettinger.)
926 \item The performance of some Unicode operations has been improved.
927 % XXX provide details?
929 \item The code generator's peephole optimizer now performs
930 simple constant folding in expressions. If you write something like
931 \code{a = 2+3}, the code generator will do the arithmetic and produce
932 code corresponding to \code{a = 5}.
934 \end{itemize}
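
One way to observe the constant folding described above is to compile
a statement and look at the constants stored in the resulting code
object.  This is only a sketch: the \member{co_consts} attribute is a
CPython implementation detail, the variable name \code{timeout} is
made up for the example, and the exact contents of the tuple can
differ between versions.

\begin{verbatim}
# Compile a statement whose right-hand side is a constant expression
code = compile("timeout = 1000 * 60", "<demo>", "exec")

# After folding, the product itself is stored as a constant
folded = 60000 in code.co_consts

# Running the code object still gives the expected value
namespace = {}
exec(code, namespace)
print(namespace['timeout'])
\end{verbatim}
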
936 The net result of the 2.5 optimizations is that Python 2.5 runs the
937 pystone benchmark around XXX\% faster than Python 2.4.
940 %======================================================================
941 \section{New, Improved, and Deprecated Modules}
943 As usual, Python's standard library received a number of enhancements and
944 bug fixes. Here's a partial list of the most notable changes, sorted
945 alphabetically by module name. Consult the
946 \file{Misc/NEWS} file in the source tree for a more
947 complete list of changes, or look through the SVN logs for all the
948 details.
950 \begin{itemize}
952 % collections.deque now has .remove()
953 % collections.defaultdict
955 % the cPickle module no longer accepts the deprecated None option in the
956 % args tuple returned by __reduce__().
958 % csv module improvements
960 % datetime.datetime() now has a strptime class method which can be used to
961 % create datetime object using a string and format.
963 % fileinput: opening hook used to control how files are opened.
964 % .input() now has a mode parameter
965 % now has a fileno() function
966 % accepts Unicode filenames
968 \item In the \module{gc} module, the new \function{get_count()} function
969 returns a 3-tuple containing the current collection counts for the
970 three GC generations. This is accounting information for the garbage
971 collector; when these counts reach a specified threshold, a garbage
972 collection sweep will be made. The existing \function{gc.collect()}
973 function now takes an optional \var{generation} argument of 0, 1, or 2
974 to specify which generation to collect.
976 \item The \function{nsmallest()} and
977 \function{nlargest()} functions in the \module{heapq} module
978 now support a \code{key} keyword argument similar to the one
979 provided by the \function{min()}/\function{max()} functions
and the \method{sort()} methods.  For example:
983 \begin{verbatim}
984 >>> import heapq
985 >>> L = ["short", 'medium', 'longest', 'longer still']
986 >>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically
987 ['longer still', 'longest']
988 >>> heapq.nsmallest(2, L, key=len) # Return two shortest elements
989 ['short', 'medium']
990 \end{verbatim}
992 (Contributed by Raymond Hettinger.)
994 \item The \function{itertools.islice()} function now accepts
995 \code{None} for the start and step arguments. This makes it more
996 compatible with the attributes of slice objects, so that you can now write
997 the following:
999 \begin{verbatim}
1000 s = slice(5) # Create slice object
1001 itertools.islice(iterable, s.start, s.stop, s.step)
1002 \end{verbatim}
1004 (Contributed by Raymond Hettinger.)
1006 \item The \module{operator} module's \function{itemgetter()}
1007 and \function{attrgetter()} functions now support multiple fields.
1008 A call such as \code{operator.attrgetter('a', 'b')}
1009 will return a function
1010 that retrieves the \member{a} and \member{b} attributes. Combining
1011 this new feature with the \method{sort()} method's \code{key} parameter
1012 lets you easily sort lists using multiple fields.
1013 (Contributed by Raymond Hettinger.)
1016 \item The \module{os} module underwent a number of changes. The
1017 \member{stat_float_times} variable now defaults to true, meaning that
1018 \function{os.stat()} will now return time values as floats. (This
1019 doesn't necessarily mean that \function{os.stat()} will return times
1020 that are precise to fractions of a second; not all systems support
1021 such precision.)
1023 Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and
1024 \member{os.SEEK_END} have been added; these are the parameters to the
1025 \function{os.lseek()} function. Two new constants for locking are
1026 \member{os.O_SHLOCK} and \member{os.O_EXLOCK}.
1028 Two new functions, \function{wait3()} and \function{wait4()}, were
added.  They're similar to the \function{waitpid()} function, which waits
1030 for a child process to exit and returns a tuple of the process ID and
1031 its exit status, but \function{wait3()} and \function{wait4()} return
1032 additional information. \function{wait3()} doesn't take a process ID
1033 as input, so it waits for any child process to exit and returns a
1034 3-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage}
1035 as returned from the \function{resource.getrusage()} function.
1036 \function{wait4(\var{pid})} does take a process ID.
1037 (Contributed by Chad J. Schroeder.)
1039 On FreeBSD, the \function{os.stat()} function now returns
1040 times with nanosecond resolution, and the returned object
1041 now has \member{st_gen} and \member{st_birthtime}.
1042 The \member{st_flags} member is also available, if the platform supports it.
1043 (Contributed by Antti Louko and Diego Petten\`o.)
1044 % (Patch 1180695, 1212117)
1046 \item The old \module{regex} and \module{regsub} modules, which have been
1047 deprecated ever since Python 2.0, have finally been deleted.
1048 Other deleted modules: \module{statcache}, \module{tzparse},
1049 \module{whrandom}.
1051 \item The \file{lib-old} directory,
1052 which includes ancient modules such as \module{dircmp} and
1053 \module{ni}, was also deleted. \file{lib-old} wasn't on the default
1054 \code{sys.path}, so unless your programs explicitly added the directory to
1055 \code{sys.path}, this removal shouldn't affect your code.
1057 \item The \module{socket} module now supports \constant{AF_NETLINK}
1058 sockets on Linux, thanks to a patch from Philippe Biondi.
1059 Netlink sockets are a Linux-specific mechanism for communications
1060 between a user-space process and kernel code; an introductory
1061 article about them is at \url{http://www.linuxjournal.com/article/7356}.
1062 In Python code, netlink addresses are represented as a tuple of 2 integers,
1063 \code{(\var{pid}, \var{group_mask})}.
Socket objects also gained the accessor methods \method{getfamily()},
\method{gettype()}, and \method{getproto()} to retrieve the
family, type, and protocol values for the socket.
1069 \item New module: \module{spwd} provides functions for accessing the
1070 shadow password database on systems that support it.
1071 % XXX give example
1073 % XXX patch #1382163: sys.subversion, Py_GetBuildNumber()
1075 \item The \class{TarFile} class in the \module{tarfile} module now has
1076 an \method{extractall()} method that extracts all members from the
1077 archive into the current working directory. It's also possible to set
1078 a different directory as the extraction target, and to unpack only a
1079 subset of the archive's members.
1081 A tarfile's compression can be autodetected by
1082 using the mode \code{'r|*'}.
1083 % patch 918101
1084 (Contributed by Lars Gust\"abel.)
1086 \item The \module{unicodedata} module has been updated to use version 4.1.0
1087 of the Unicode character database. Version 3.2.0 is required
1088 by some specifications, so it's still available as
1089 \member{unicodedata.db_3_2_0}.
1091 % patch #754022: Greatly enhanced webbrowser.py (by Oleg Broytmann).
1094 \item The \module{xmlrpclib} module now supports returning
1095 \class{datetime} objects for the XML-RPC date type. Supply
1096 \code{use_datetime=True} to the \function{loads()} function
1097 or the \class{Unmarshaller} class to enable this feature.
1098 (Contributed by Skip Montanaro.)
1099 % Patch 1120353
1102 \end{itemize}
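
Returning to the \module{operator} item in the list above, the
following sketch shows multi-field sorting with
\function{attrgetter()} and \function{itemgetter()}.  The
\class{Employee} class and the sample records are hypothetical and
exist only for this illustration.

\begin{verbatim}
import operator

class Employee(object):
    """Hypothetical record type used only for this example."""
    def __init__(self, dept, name, salary):
        self.dept = dept
        self.name = name
        self.salary = salary

staff = [Employee('ops', 'Zed', 50),
         Employee('dev', 'Bea', 70),
         Employee('dev', 'Al', 65)]

# attrgetter('dept', 'name') returns a function that produces the
# tuple (obj.dept, obj.name), so the sort uses both fields.
staff.sort(key=operator.attrgetter('dept', 'name'))
ordered = [(e.dept, e.name) for e in staff]

# itemgetter() with several indexes does the same for sequences;
# here the rows are sorted by department, then by salary.
rows = [('ops', 'Zed', 50), ('dev', 'Bea', 70), ('dev', 'Al', 65)]
rows.sort(key=operator.itemgetter(0, 2))

print(ordered)
print(rows)
\end{verbatim}
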
1106 %======================================================================
1107 % whole new modules get described in subsections here
1109 \subsection{The ctypes package}
1111 The \module{ctypes} package, written by Thomas Heller, has been added
1112 to the standard library. \module{ctypes} lets you call arbitrary functions
1113 in shared libraries or DLLs. Long-time users may remember the \module{dl} module, which
1114 provides functions for loading shared libraries and calling functions in them. The \module{ctypes} package is much fancier.
1116 To load a shared library or DLL, you must create an instance of the
1117 \class{CDLL} class and provide the name or path of the shared library
1118 or DLL. Once that's done, you can call arbitrary functions
1119 by accessing them as attributes of the \class{CDLL} object.
1121 \begin{verbatim}
1122 import ctypes
1124 libc = ctypes.CDLL('libc.so.6')
1125 result = libc.printf("Line of output\n")
1126 \end{verbatim}
1128 Type constructors for the various C types are provided: \function{c_int},
1129 \function{c_float}, \function{c_double}, \function{c_char_p} (equivalent to \ctype{char *}), and so forth. Unlike Python's types, the C versions are all mutable; you can assign to their \member{value} attribute
1130 to change the wrapped value. Python integers and strings will be automatically
1131 converted to the corresponding C types, but for other types you
1132 must call the correct type constructor. (And I mean \emph{must};
1133 getting it wrong will often result in the interpreter crashing
1134 with a segmentation fault.)
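
The mutability of the C types can be seen without loading any shared
library at all.  A minimal sketch, in which the variable names are
arbitrary:

\begin{verbatim}
import ctypes

counter = ctypes.c_int(42)
before = counter.value      # the wrapped Python value, 42
counter.value = -7          # the same object now wraps a new value
after = counter.value

ratio = ctypes.c_double(2.5)

print(before)
print(after)
print(ratio.value)
\end{verbatim}
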
1136 You shouldn't use \function{c_char_p} with a Python string when the C function will be modifying the memory area, because Python strings are
1137 supposed to be immutable; breaking this rule will cause puzzling bugs. When you need a modifiable memory area,
1138 use \function{create_string_buffer()}:
1140 \begin{verbatim}
1141 s = "this is a string"
1142 buf = ctypes.create_string_buffer(s)
1143 libc.strfry(buf)
1144 \end{verbatim}
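
The \cfunction{strfry()} function used above is specific to the GNU C
library, so here is a platform-neutral sketch that modifies the buffer
from Python instead of from C.  It assumes byte strings are wanted
(hence the \method{encode()} calls) and shows that the buffer holds
its own copy of the data, leaving the original string untouched.

\begin{verbatim}
import ctypes

original = "this is a string".encode("ascii")
buf = ctypes.create_string_buffer(original)

# The buffer has room for the 16 characters plus a terminating NUL
size = ctypes.sizeof(buf)

# Assigning to .value overwrites the start of the buffer in place
buf.value = "THIS".encode("ascii")

print(size)
print(len(buf.raw))         # the buffer's size has not changed
\end{verbatim}
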
1146 C functions are assumed to return integers, but you can set
1147 the \member{restype} attribute of the function object to
1148 change this:
1150 \begin{verbatim}
1151 >>> libc.atof('2.71828')
1152 -1783957616
1153 >>> libc.atof.restype = ctypes.c_double
1154 >>> libc.atof('2.71828')
1155 2.71828
1156 \end{verbatim}
1158 \module{ctypes} also provides a wrapper for Python's C API
1159 as the \code{ctypes.pythonapi} object. This object does \emph{not}
1160 release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter's code.
1161 There's a \class{py_object()} type constructor that will create a
1162 \ctype{PyObject *} pointer. A simple usage:
\begin{verbatim}
import ctypes

d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
          ctypes.py_object("abc"), ctypes.py_object(1))
# d is now {'abc': 1}.
\end{verbatim}
1173 Don't forget to use \class{py_object()}; if it's omitted you end
1174 up with a segmentation fault.
\module{ctypes} has been around for a while, but people still write
and distribute hand-coded extension modules because you can't rely on \module{ctypes} being present.
1178 Perhaps developers will begin to write
1179 Python wrappers atop a library accessed through \module{ctypes} instead
1180 of extension modules, now that \module{ctypes} is included with core Python.
1182 % XXX write introduction
1184 \begin{seealso}
1186 \seeurl{http://starship.python.net/crew/theller/ctypes/}
1187 {The ctypes web page, with a tutorial, reference, and FAQ.}
1189 \end{seealso}
1191 \subsection{The ElementTree package}
1193 A subset of Fredrik Lundh's ElementTree library for processing XML has
1194 been added to the standard library as \module{xmlcore.etree}. The
1195 available modules are
1196 \module{ElementTree}, \module{ElementPath}, and
1197 \module{ElementInclude} from ElementTree 1.2.6.
1198 The \module{cElementTree} accelerator module is also included.
1200 The rest of this section will provide a brief overview of using
1201 ElementTree. Full documentation for ElementTree is available at
1202 \url{http://effbot.org/zone/element-index.htm}.
1204 ElementTree represents an XML document as a tree of element nodes.
The text content of the document is stored as the \member{.text}
and \member{.tail} attributes of each element node.
1207 (This is one of the major differences between ElementTree and
1208 the Document Object Model; in the DOM there are many different
1209 types of node, including \class{TextNode}.)
The most commonly used parsing function is \function{parse()}, which
takes either a string (assumed to contain a filename) or a file-like
1213 object and returns an \class{ElementTree} instance:
\begin{verbatim}
import urllib
from xmlcore.etree import ElementTree as ET

# Parse a file given its name
tree = ET.parse('ex-1.xml')

# Parse from a file-like object
feed = urllib.urlopen(
              'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)
\end{verbatim}
1225 Once you have an \class{ElementTree} instance, you
1226 can call its \method{getroot()} method to get the root \class{Element} node.
1228 There's also an \function{XML()} function that takes a string literal
1229 and returns an \class{Element} node (not an \class{ElementTree}).
1230 This function provides a tidy way to incorporate XML fragments,
1231 approaching the convenience of an XML literal:
\begin{verbatim}
svg = ET.XML("""<svg width="10px" version="1.0">
             </svg>""")
svg.set('height', '320px')
svg.append(elem1)       # elem1 is an Element created elsewhere
\end{verbatim}
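
The \function{XML()} function also makes it easy to see how the
\member{.text} and \member{.tail} attributes mentioned earlier divide
up a document's text.  This sketch assumes the package can be imported
as \module{xml.etree}; substitute the \module{xmlcore.etree} name used
above if that is what your build provides.  The XML fragment is
invented for the example.

\begin{verbatim}
from xml.etree import ElementTree as ET    # assumed import name

para = ET.XML('<p>Hello <b>bold</b> world</p>')
bold = para[0]

lead = para.text      # text before the first child: 'Hello '
inner = bold.text     # text inside the <b> element: 'bold'
trail = bold.tail     # text following </b>: ' world'

print(repr(lead))
print(repr(inner))
print(repr(trail))
\end{verbatim}

No separate text nodes are created; the text that follows an element's
closing tag is attached to that element as its \member{.tail}.
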
1240 Each XML element supports some dictionary-like and some list-like
1241 access methods. Dictionary-like operations are used to access attribute
1242 values, and list-like operations are used to access child nodes.
1244 \begin{tableii}{c|l}{code}{Operation}{Result}
1245 \lineii{elem[n]}{Returns n'th child element.}
\lineii{elem[m:n]}{Returns list of child elements from index \var{m} up to, but not including, index \var{n}.}
1247 \lineii{len(elem)}{Returns number of child elements.}
1248 \lineii{elem.getchildren()}{Returns list of child elements.}
1249 \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.}
1250 \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.}
1251 \lineii{del elem[n]}{Deletes n'th child element.}
1252 \lineii{elem.keys()}{Returns list of attribute names.}
1253 \lineii{elem.get(name)}{Returns value of attribute \var{name}.}
1254 \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.}
1255 \lineii{elem.attrib}{Retrieves the dictionary containing attributes.}
1256 \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.}
1257 \end{tableii}
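
Several of the operations in the table can be tried out on a small
tree.  As a sketch only: the document below is invented for the
example, the \function{Element()} factory is assumed to be the way to
create a new node, and the package is assumed to be importable as
\module{xml.etree} (use \module{xmlcore.etree} if that is the name
your build provides).

\begin{verbatim}
from xml.etree import ElementTree as ET    # assumed import name

root = ET.XML('<feed version="1.0">'
              '<entry>first</entry><entry>second</entry></feed>')

# List-like operations act on child elements
count = len(root)
first_text = root[0].text
extra = ET.Element('entry')
extra.text = 'third'
root.append(extra)
del root[0]
remaining = [child.text for child in root]

# Dictionary-like operations act on attributes
version = root.get('version')
root.set('lang', 'en')
names = sorted(root.keys())
del root.attrib['version']

print(remaining)
print(names)
\end{verbatim}
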
1259 Comments and processing instructions are also represented as
\class{Element} nodes.  To check if a node is a comment or processing
instruction:

\begin{verbatim}
if elem.tag is ET.Comment:
    pass    # elem is a comment node
elif elem.tag is ET.ProcessingInstruction:
    pass    # elem is a processing instruction
\end{verbatim}
1270 To generate XML output, you should call the
1271 \method{ElementTree.write()} method. Like \function{parse()},
1272 it can take either a string or a file-like object:
1274 \begin{verbatim}
1275 # Encoding is US-ASCII
1276 tree.write('output.xml')
1278 # Encoding is UTF-8
1279 f = open('output.xml', 'w')
1280 tree.write(f, 'utf-8')
1281 \end{verbatim}
1283 (Caution: the default encoding used for output is ASCII, which isn't
1284 very useful for general XML work, raising an exception if there are
1285 any characters with values greater than 127. You should always
1286 specify a different encoding such as UTF-8 that can handle any Unicode
1287 character.)
1289 This section is only a partial description of the ElementTree interfaces.
1290 Please read the package's official documentation for more details.
1292 \begin{seealso}
1294 \seeurl{http://effbot.org/zone/element-index.htm}
1295 {Official documentation for ElementTree.}
1298 \end{seealso}
1301 \subsection{The hashlib package}
1303 A new \module{hashlib} module has been added to replace the
1304 \module{md5} and \module{sha} modules. \module{hashlib} adds support
1305 for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512).
When available, the module uses OpenSSL for fast, platform-optimized
1307 implementations of algorithms.
1309 The old \module{md5} and \module{sha} modules still exist as wrappers
1310 around hashlib to preserve backwards compatibility. The new module's
1311 interface is very close to that of the old modules, but not identical.
1312 The most significant difference is that the constructor functions
1313 for creating new hashing objects are named differently.
\begin{verbatim}
# Old versions
h = md5.md5()
h = md5.new()

# New version
h = hashlib.md5()

# Old versions
h = sha.sha()
h = sha.new()

# New version
h = hashlib.sha1()

# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()

# Alternative form
h = hashlib.new('md5')          # Provide algorithm as a string
\end{verbatim}
1340 Once a hash object has been created, its methods are the same as before:
1341 \method{update(\var{string})} hashes the specified string into the
1342 current digest state, \method{digest()} and \method{hexdigest()}
1343 return the digest value as a binary string or a string of hex digits,
1344 and \method{copy()} returns a new hashing object with the same digest state.
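
A short sketch of these methods in use follows.  The input strings are
arbitrary, and they are encoded to byte strings first on the
assumption that the hash objects should be fed bytes rather than text.

\begin{verbatim}
import hashlib

part1 = "What's new in ".encode("ascii")
part2 = "Python 2.5".encode("ascii")

# Feeding the data in pieces gives the same digest as one call
incremental = hashlib.sha256()
incremental.update(part1)
incremental.update(part2)
one_shot = hashlib.sha256(part1 + part2)
same = incremental.hexdigest() == one_shot.hexdigest()

# copy() duplicates the digest state; the copy then evolves separately
snapshot = incremental.copy()
snapshot.update("!".encode("ascii"))
diverged = snapshot.digest() != incremental.digest()

# digest() is a binary string, hexdigest() is twice as long
digest_bytes = len(incremental.digest())
hex_chars = len(incremental.hexdigest())

# The same algorithm selected by name
by_name = hashlib.new('sha256', part1 + part2).hexdigest()

print(same)
print(diverged)
\end{verbatim}
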
1346 This module was contributed by Gregory P. Smith.
1349 \subsection{The sqlite3 package}
1351 The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the
1352 SQLite embedded database, has been added to the standard library under
1353 the package name \module{sqlite3}. SQLite is a C library that
1354 provides a SQL-language database that stores data in disk files
1355 without requiring a separate server process. pysqlite was written by
1356 Gerhard H\"aring, and provides a SQL interface that complies with the
1357 DB-API 2.0 specification described by \pep{249}. This means that it
1358 should be possible to write the first version of your applications
1359 using SQLite for data storage and, if switching to a larger database
1360 such as PostgreSQL or Oracle is necessary, the switch should be
1361 relatively easy.
1363 If you're compiling the Python source yourself, note that the source
1364 tree doesn't include the SQLite code itself, only the wrapper module.
1365 You'll need to have the SQLite libraries and headers installed before
1366 compiling Python, and the build process will compile the module when
1367 the necessary headers are available.
1369 To use the module, you must first create a \class{Connection} object
1370 that represents the database. Here the data will be stored in the
1371 \file{/tmp/example} file:
1373 \begin{verbatim}
1374 conn = sqlite3.connect('/tmp/example')
1375 \end{verbatim}
1377 You can also supply the special name \samp{:memory:} to create
1378 a database in RAM.
1380 Once you have a \class{Connection}, you can create a \class{Cursor}
1381 object and call its \method{execute()} method to perform SQL commands:
1383 \begin{verbatim}
1384 c = conn.cursor()
1386 # Create table
1387 c.execute('''create table stocks
1388 (date timestamp, trans varchar, symbol varchar,
1389 qty decimal, price decimal)''')
1391 # Insert a row of data
1392 c.execute("""insert into stocks
1393 values ('2006-01-05','BUY','RHAT',100, 35.14)""")
1394 \end{verbatim}
1396 Usually your SQL queries will need to reflect the value of Python
1397 variables. You shouldn't assemble your query using Python's string
1398 operations because doing so is insecure; it makes your program
1399 vulnerable to what's called an SQL injection attack. Instead, use
1400 SQLite's parameter substitution, putting \samp{?} as a placeholder
1401 wherever you want to use a value, and then provide a tuple of values
1402 as the second argument to the cursor's \method{execute()} method. For
1403 example:
\begin{verbatim}
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute("... where symbol = ?", t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
\end{verbatim}
1422 To retrieve data after executing a SELECT statement, you can either
1423 treat the cursor as an iterator, call the cursor's \method{fetchone()}
1424 method to retrieve a single matching row,
1425 or call \method{fetchall()} to get a list of the matching rows.
1427 This example uses the iterator form:
\begin{verbatim}
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...    print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
\end{verbatim}
1442 You should also use parameter substitution with SELECT statements:
1444 \begin{verbatim}
1445 >>> c.execute('select * from stocks where symbol=?', ('IBM',))
1446 >>> print c.fetchall()
1447 [(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0),
1448 (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)]
1449 \end{verbatim}
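
The pieces above can be combined into one self-contained sketch that
uses an in-memory database, so that nothing is written to disk.  The
column types chosen here are an assumption made for the example, and
the \code{hostile} string is a made-up input showing why parameter
substitution matters: it is compared as an ordinary value and never
interpreted as SQL.

\begin{verbatim}
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('''create table stocks
             (date text, trans text, symbol text,
              qty integer, price real)''')

for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
conn.commit()

# Placeholders work in SELECT statements too
c.execute('select trans, qty from stocks '
          'where symbol=? order by price', ('IBM',))
ibm_rows = c.fetchall()

# An injection attempt is treated as a plain (non-matching) string
hostile = "IBM' or '1'='1"
c.execute('select count(*) from stocks where symbol=?', (hostile,))
hostile_matches = c.fetchone()[0]

conn.close()
print(ibm_rows)
print(hostile_matches)
\end{verbatim}
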
1451 For more information about the SQL dialect supported by SQLite, see
1452 \url{http://www.sqlite.org}.
1454 \begin{seealso}
1456 \seeurl{http://www.pysqlite.org}
1457 {The pysqlite web page.}
1459 \seeurl{http://www.sqlite.org}
1460 {The SQLite web page; the documentation describes the syntax and the
1461 available data types for the supported SQL dialect.}
1463 \seepep{249}{Database API Specification 2.0}{PEP written by
1464 Marc-Andr\'e Lemburg.}
1466 \end{seealso}
1469 % ======================================================================
1470 \section{Build and C API Changes}
1472 Changes to Python's build process and to the C API include:
1474 \begin{itemize}
1476 \item The largest change to the C API came from \pep{353},
1477 which modifies the interpreter to use a \ctype{Py_ssize_t} type
1478 definition instead of \ctype{int}. See the earlier
section~\ref{section-353} for a discussion of this change.
1481 \item The design of the bytecode compiler has changed a great deal, to
1482 no longer generate bytecode by traversing the parse tree. Instead
1483 the parse tree is converted to an abstract syntax tree (or AST), and it is
1484 the abstract syntax tree that's traversed to produce the bytecode.
1486 It's possible for Python code to obtain AST objects by using the
1487 \function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST}
1488 as the value of the
1489 \var{flags} parameter:
\begin{verbatim}
from _ast import PyCF_ONLY_AST
ast = compile("""a=0
for i in range(10):
    a += i
""", "<string>", 'exec', PyCF_ONLY_AST)

assignment = ast.body[0]
for_loop = ast.body[1]
\end{verbatim}
1502 No documentation has been written for the AST code yet. To start
1503 learning about it, read the definition of the various AST nodes in
1504 \file{Parser/Python.asdl}. A Python script reads this file and
1505 generates a set of C structure definitions in
\file{Include/Python-ast.h}.  The \cfunction{PyParser_ASTFromString()}
and \cfunction{PyParser_ASTFromFile()} functions, defined in
1508 \file{Include/pythonrun.h}, take Python source as input and return the
1509 root of an AST representing the contents. This AST can then be turned
1510 into a code object by \cfunction{PyAST_Compile()}. For more
1511 information, read the source code, and then ask questions on
1512 python-dev.
1514 % List of names taken from Jeremy's python-dev post at
1515 % http://mail.python.org/pipermail/python-dev/2005-October/057500.html
1516 The AST code was developed under Jeremy Hylton's management, and
1517 implemented by (in alphabetical order) Brett Cannon, Nick Coghlan,
1518 Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters,
1519 Armin Rigo, and Neil Schemenauer, plus the participants in a number of
1520 AST sprints at conferences such as PyCon.
1522 \item The built-in set types now have an official C API. Call
1523 \cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a
1524 new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to
add and remove elements, and \cfunction{PySet_Contains()} and
\cfunction{PySet_Size()} to examine the set's state.
1528 \item The \cfunction{PyRange_New()} function was removed. It was
1529 never documented, never used in the core code, and had dangerously lax
1530 error checking.
1532 \end{itemize}
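
Since the AST code is still undocumented, one way to start exploring
the objects returned by \function{compile()} is simply to print the
class names of the nodes.  The sketch below does this for the same
source text as the AST example above; the node class names it reports
are whatever the interpreter produces, and may not be identical in
every version.

\begin{verbatim}
from _ast import PyCF_ONLY_AST

source = "a = 0\nfor i in range(10):\n    a += i\n"
tree = compile(source, "<string>", "exec", PyCF_ONLY_AST)

# The root node holds the top-level statements in its body attribute
root_name = tree.__class__.__name__
top_level = [node.__class__.__name__ for node in tree.body]

# Compound statements such as the for loop have a body of their own
loop_body = [node.__class__.__name__ for node in tree.body[1].body]

print(root_name)
print(top_level)
print(loop_body)
\end{verbatim}
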
1535 %======================================================================
1536 %\subsection{Port-Specific Changes}
1538 %Platform-specific changes go here.
1541 %======================================================================
1542 \section{Other Changes and Fixes \label{section-other}}
1544 As usual, there were a bunch of other improvements and bugfixes
1545 scattered throughout the source tree. A search through the SVN change
1546 logs finds there were XXX patches applied and YYY bugs fixed between
1547 Python 2.4 and 2.5. Both figures are likely to be underestimates.
1549 Some of the more notable changes are:
1551 \begin{itemize}
1553 \item Evan Jones's patch to obmalloc, first described in a talk
1554 at PyCon DC 2005, was applied. Python 2.4 allocated small objects in
1555 256K-sized arenas, but never freed arenas. With this patch, Python
1556 will free arenas when they're empty. The net effect is that on some
1557 platforms, when you allocate many objects, Python's memory usage may
1558 actually drop when you delete them, and the memory may be returned to
1559 the operating system. (Implemented by Evan Jones, and reworked by Tim
1560 Peters.)
1562 Note that this change means extension modules need to be more careful
1563 with how they allocate memory. Python's API has a number of different
1564 functions for allocating memory that are grouped into families. For
1565 example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and
1566 \cfunction{PyMem_Free()} are one family that allocates raw memory,
1567 while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()},
1568 and \cfunction{PyObject_Free()} are another family that's supposed to
1569 be used for creating Python objects.
1571 Previously these different families all reduced to the platform's
1572 \cfunction{malloc()} and \cfunction{free()} functions. This meant
1573 it didn't matter if you got things wrong and allocated memory with the
1574 \cfunction{PyMem} function but freed it with the \cfunction{PyObject}
1575 function. With the obmalloc change, these families now do different
1576 things, and mismatches will probably result in a segfault. You should
1577 carefully test your C extension modules with Python 2.5.
1579 \item Coverity, a company that markets a source code analysis tool
1580 called Prevent, provided the results of their examination of the Python
1581 source code. The analysis found a number of refcounting bugs, often
1582 in error-handling code. These bugs have been fixed.
1583 % XXX provide reference?
1585 \end{itemize}
1588 %======================================================================
1589 \section{Porting to Python 2.5}
1591 This section lists previously described changes that may require
1592 changes to your code:
1594 \begin{itemize}
1596 \item ASCII is now the default encoding for modules. It's now
1597 a syntax error if a module contains string literals with 8-bit
1598 characters but doesn't have an encoding declaration. In Python 2.4
1599 this triggered a warning, not a syntax error.
1601 \item The \module{pickle} module no longer uses the deprecated \var{bin} parameter.
1603 \item C API: Many functions now use \ctype{Py_ssize_t}
1604 instead of \ctype{int} to allow processing more data
1605 on 64-bit machines. Extension code may need to make
1606 the same change to avoid warnings and to support 64-bit machines.
1607 See the earlier
section~\ref{section-353} for a discussion of this change.
1610 \item C API:
1611 The obmalloc changes mean that
1612 you must be careful to not mix usage
1613 of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()}
1614 families of functions. Memory allocated with
1615 one family's \cfunction{*_Malloc()} must be
1616 freed with the corresponding family's \cfunction{*_Free()} function.
1618 \end{itemize}
1621 %======================================================================
1622 \section{Acknowledgements \label{acks}}
1624 The author would like to thank the following people for offering
1625 suggestions, corrections and assistance with various drafts of this
1626 article: Martin von~L\"owis, Mike Rovner, Thomas Wouters.
1628 \end{document}