Doc/faq/library.rst

   1 :tocdepth: 2
   2
   3 =========================
   4 Library and Extension FAQ
   5 =========================
   6
   7 .. contents::
   8
   9 General Library Questions
  10 =========================
  11
  12 How do I find a module or application to perform task X?
  13 --------------------------------------------------------
  14
Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module.  (Eventually you'll learn what's in the standard
library and will be able to skip this step.)
  18
  19 For third-party packages, search the `Python Package Index
  20 <http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
  21 another Web search engine.  Searching for "Python" plus a keyword or two for
  22 your topic of interest will usually find something helpful.
  23
  24
  25 Where is the math.py (socket.py, regex.py, etc.) source file?
  26 -------------------------------------------------------------
  27
If you can't find a source file for a module, it may be a builtin or dynamically
loaded module implemented in C, C++ or another compiled language.  In this case
you may not have the source file, or it may be something like ``mathmodule.c``,
somewhere in a C source directory (not on the Python path).
  32
  33 There are (at least) three kinds of modules in Python:
  34
  35 1) modules written in Python (.py);
  36 2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
  37 3) modules written in C and linked with the interpreter; to get a list of these,
  38    type::
  39
  40       import sys
  41       print sys.builtin_module_names
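
As a rough illustration of the three kinds, the sketch below classifies a module
by looking at ``sys.builtin_module_names`` and at the module's ``__file__``
attribute.  The helper name ``describe_module`` is made up for this example, and
the code is written for Python 3, which is an assumption (the snippet above
targets Python 2).

```python
import importlib
import sys

def describe_module(name):
    """Return a short description of how the named module is implemented."""
    if name in sys.builtin_module_names:
        return "linked into the interpreter"
    module = importlib.import_module(name)
    path = getattr(module, "__file__", None)
    if path is None:
        return "no source file available"
    if path.endswith((".py", ".pyc", ".pyo")):
        return "written in Python: " + path
    # Anything else (.so, .pyd, .dll, ...) is a compiled extension module.
    return "dynamically loaded extension: " + path

for name in ("sys", "json", "math"):
    print(name, "->", describe_module(name))
```

Whether a module such as ``math`` is linked in or loaded dynamically depends on
how the interpreter was built, so the output differs between installations.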
  42
  43
  44 How do I make a Python script executable on Unix?
  45 -------------------------------------------------
  46
  47 You need to do two things: the script file's mode must be executable and the
  48 first line must begin with ``#!`` followed by the path of the Python
  49 interpreter.
  50
  51 The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
  52 scriptfile``.
  53
  54 The second can be done in a number of ways.  The most straightforward way is to
  55 write ::
  56
  57   #!/usr/local/bin/python
  58
  59 as the very first line of your file, using the pathname for where the Python
  60 interpreter is installed on your platform.
  61
If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program.  Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::
  66
  67   #!/usr/bin/env python
  68
  69 *Don't* do this for CGI scripts.  The $PATH variable for CGI scripts is often
  70 very minimal, so you need to use the actual absolute pathname of the
  71 interpreter.
  72
  73 Occasionally, a user's environment is so full that the /usr/bin/env program
  74 fails; or there's no env program at all.  In that case, you can try the
  75 following hack (due to Alex Rezinsky)::
  76
  77    #! /bin/sh
  78    """:"
  79    exec python $0 ${1+"$@"}
  80    """
  81
  82 The minor disadvantage is that this defines the script's __doc__ string.
  83 However, you can fix that by adding ::
  84
  85    __doc__ = """...Whatever..."""
  86
  87
  88
  89 Is there a curses/termcap package for Python?
  90 ---------------------------------------------
  91
  92 .. XXX curses *is* built by default, isn't it?
  93
  94 For Unix variants: The standard Python source distribution comes with a curses
  95 module in the ``Modules/`` subdirectory, though it's not compiled by default
  96 (note that this is not available in the Windows distribution -- there is no
  97 curses module for Windows).
  98
  99 The curses module supports basic curses features as well as many additional
 100 functions from ncurses and SYSV curses such as colour, alternative character set
 101 support, pads, and mouse support. This means the module isn't compatible with
 102 operating systems that only have BSD curses, but there don't seem to be any
 103 currently maintained OSes that fall into this category.
 104
 105 For Windows: use `the consolelib module
 106 <http://effbot.org/zone/console-index.htm>`_.
 107
 108
 109 Is there an equivalent to C's onexit() in Python?
 110 -------------------------------------------------
 111
The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's ``onexit()``.
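
A minimal sketch of registering an exit handler follows.  The function name
``goodbye`` and its argument are placeholders, and the code assumes Python 3.

```python
import atexit

def goodbye(name):
    # Called when the interpreter shuts down normally.
    print("Goodbye, %s" % name)

# Extra arguments given to register() are passed on to the function.
registered = atexit.register(goodbye, "world")
```

``atexit.register()`` returns the function it was given, so it can also be used
as a decorator for handlers that take no arguments.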
 114
 115
 116 Why don't my signal handlers work?
 117 ----------------------------------
 118
 119 The most common problem is that the signal handler is declared with the wrong
 120 argument list.  It is called as ::
 121
 122    handler(signum, frame)
 123
 124 so it should be declared with two arguments::
 125
 126    def handler(signum, frame):
 127        ...
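
The sketch below installs a two-argument handler for ``SIGINT`` and then sends
that signal to the current process to show the handler being called.  It assumes
Python 3.8 or later, because it uses ``signal.raise_signal()``, and it must run
in the main thread.  The ``received`` list is only there for illustration.

```python
import signal

received = []

def handler(signum, frame):
    # signum is the signal number, frame is the interrupted stack frame.
    received.append(signum)

previous = signal.signal(signal.SIGINT, handler)   # install the handler
signal.raise_signal(signal.SIGINT)                 # deliver SIGINT to ourselves
signal.signal(signal.SIGINT, previous)             # put the old handler back

print("signals seen:", received)
```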
 128
 129
 130 Common tasks
 131 ============
 132
 133 How do I test a Python program or component?
 134 --------------------------------------------
 135
 136 Python comes with two testing frameworks.  The :mod:`doctest` module finds
 137 examples in the docstrings for a module and runs them, comparing the output with
 138 the expected output given in the docstring.
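
As a small sketch of the idea, the function below carries two examples in its
docstring and the code after it runs them.  The function ``average`` is invented
for the example and the code assumes Python 3.  In an ordinary module you would
usually just call ``doctest.testmod()``; the finder and runner are used here so
that the result can be inspected.

```python
import doctest

def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / float(len(values))

finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(average, "average", globs={"average": average}):
    runner.run(test)          # executes each example and compares the output
results = runner.summarize(verbose=False)
print(results)
```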
 139
 140 The :mod:`unittest` module is a fancier testing framework modelled on Java and
 141 Smalltalk testing frameworks.
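
A comparable sketch with :mod:`unittest` is shown below.  The names ``average``
and ``AverageTests`` are made up for the example, the code assumes Python 3, and
the runner writes to an in-memory stream only to keep the demonstration quiet.

```python
import io
import unittest

def average(values):
    return sum(values) / float(len(values))

class AverageTests(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([1, 2, 3]), 2.0)

    def test_empty_sequence_is_an_error(self):
        self.assertRaises(ZeroDivisionError, average, [])

suite = unittest.TestLoader().loadTestsFromTestCase(AverageTests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("ran %d tests, successful: %s" % (result.testsRun, result.wasSuccessful()))
```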
 142
 143 For testing, it helps to write the program so that it may be easily tested by
 144 using good modular design.  Your program should have almost all functionality
 145 encapsulated in either functions or class methods -- and this sometimes has the
 146 surprising and delightful effect of making the program run faster (because local
 147 variable accesses are faster than global accesses).  Furthermore the program
 148 should avoid depending on mutating global variables, since this makes testing
 149 much more difficult to do.
 150
 151 The "global main logic" of your program may be as simple as ::
 152
 153    if __name__ == "__main__":
 154        main_logic()
 155
 156 at the bottom of the main module of your program.
 157
 158 Once your program is organized as a tractable collection of functions and class
 159 behaviours you should write test functions that exercise the behaviours.  A test
 160 suite can be associated with each module which automates a sequence of tests.
 161 This sounds like a lot of work, but since Python is so terse and flexible it's
 162 surprisingly easy.  You can make coding much more pleasant and fun by writing
 163 your test functions in parallel with the "production code", since this makes it
 164 easy to find bugs and even design flaws earlier.
 165
 166 "Support modules" that are not intended to be the main module of a program may
 167 include a self-test of the module. ::
 168
 169    if __name__ == "__main__":
 170        self_test()
 171
 172 Even programs that interact with complex external interfaces may be tested when
 173 the external interfaces are unavailable by using "fake" interfaces implemented
 174 in Python.
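
One way to picture a "fake" interface is sketched below.  The code under test
only needs an object with a ``sendmail()`` method, so a test can hand it a
stand-in that records calls instead of talking to a mail server.  All names
(``FakeMailServer``, ``notify``, the addresses) are hypothetical, and the code
assumes Python 3.

```python
class FakeMailServer:
    """Stands in for a real SMTP connection and records what was sent."""

    def __init__(self):
        self.sent = []

    def sendmail(self, fromaddr, toaddrs, msg):
        self.sent.append((fromaddr, toaddrs, msg))

def notify(server, recipients, text):
    """Send text to every recipient through any object with sendmail()."""
    for address in recipients:
        server.sendmail("noreply@example.com", [address], text)
    return len(recipients)

fake = FakeMailServer()
count = notify(fake, ["a@example.com", "b@example.com"], "Build finished")
print(count, "messages recorded:", fake.sent)
```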
 175
 176
 177 How do I create documentation from doc strings?
 178 -----------------------------------------------
 179
 180 The :mod:`pydoc` module can create HTML from the doc strings in your Python
 181 source code.  An alternative for creating API documentation purely from
 182 docstrings is `epydoc <http://epydoc.sf.net/>`_.  `Sphinx
 183 <http://sphinx.pocoo.org>`_ can also include docstring content.
 184
 185
 186 How do I get a single keypress at a time?
 187 -----------------------------------------
 188
 189 For Unix variants: There are several solutions.  It's straightforward to do this
 190 using curses, but curses is a fairly large module to learn.  Here's a solution
 191 without curses::
 192
   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   # Turn off canonical (line-buffered) mode and echoing.
   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   # Make reads from stdin non-blocking.
   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while True:
           try:
               c = sys.stdin.read(1)
               print "Got character", repr(c)
           except IOError:
               pass
   finally:
       # Restore the original terminal settings and file flags.
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)
 213
 214 You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
 215 and I've only tried it on Linux, though it should work elsewhere.  In this code,
 216 characters are read and printed one at a time.
 217
:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode.  Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.
 222
 223
 224 Threads
 225 =======
 226
 227 How do I program using threads?
 228 -------------------------------
 229
 230 .. XXX it's _thread in py3k
 231
 232 Be sure to use the :mod:`threading` module and not the :mod:`thread` module.
 233 The :mod:`threading` module builds convenient abstractions on top of the
 234 low-level primitives provided by the :mod:`thread` module.
 235
 236 Aahz has a set of slides from his threading tutorial that are helpful; see
 237 http://www.pythoncraft.com/OSCON2001/.
 238
 239
 240 None of my threads seem to run: why?
 241 ------------------------------------
 242
 243 As soon as the main thread exits, all threads are killed.  Your main thread is
 244 running too quickly, giving the threads no time to do any work.
 245
 246 A simple fix is to add a sleep to the end of the program that's long enough for
 247 all the threads to finish::
 248
 249    import threading, time
 250
 251    def thread_task(name, n):
 252        for i in range(n): print name, i
 253
 254    for i in range(10):
 255        T = threading.Thread(target=thread_task, args=(str(i), i))
 256        T.start()
 257
 258    time.sleep(10) # <----------------------------!
 259
 260 But now (on many platforms) the threads don't run in parallel, but appear to run
 261 sequentially, one at a time!  The reason is that the OS thread scheduler doesn't
 262 start a new thread until the previous thread is blocked.
 263
 264 A simple fix is to add a tiny sleep to the start of the run function::
 265
 266    def thread_task(name, n):
 267        time.sleep(0.001) # <---------------------!
 268        for i in range(n): print name, i
 269
 270    for i in range(10):
 271        T = threading.Thread(target=thread_task, args=(str(i), i))
 272        T.start()
 273
 274    time.sleep(10)
 275
Instead of trying to guess a good value for the :func:`time.sleep` delay,
it's better to use some kind of semaphore mechanism.  One idea is to use the
:mod:`Queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.
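
The token idea could look roughly like this.  The sketch assumes Python 3, where
the module is named ``queue`` rather than ``Queue``, and the extra ``done``
parameter is an addition made for the example.

```python
import threading
import queue   # named Queue on Python 2

def thread_task(name, n, done):
    for i in range(n):
        pass             # placeholder for the real work
    done.put(name)       # tell the main thread that this worker has finished

done = queue.Queue()
num_threads = 10
for i in range(num_threads):
    t = threading.Thread(target=thread_task, args=(str(i), i, done))
    t.start()

# Block until one token per thread has arrived; no sleep time to guess.
finished = [done.get() for _ in range(num_threads)]
print("finished:", sorted(finished))
```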
 281
 282
 283 How do I parcel out work among a bunch of worker threads?
 284 ---------------------------------------------------------
 285
 286 Use the :mod:`Queue` module to create a queue containing a list of jobs.  The
 287 :class:`~Queue.Queue` class maintains a list of objects with ``.put(obj)`` to
 288 add an item to the queue and ``.get()`` to return an item.  The class will take
 289 care of the locking necessary to ensure that each job is handed out exactly
 290 once.
 291
 292 Here's a trivial example::
 293
 294    import threading, Queue, time
 295
 296    # The worker thread gets jobs off the queue.  When the queue is empty, it
 297    # assumes there will be no more work and exits.
 298    # (Realistically workers will run until terminated.)
 299    def worker ():
 300        print 'Running worker'
 301        time.sleep(0.1)
 302        while True:
 303            try:
 304                arg = q.get(block=False)
 305            except Queue.Empty:
 306                print 'Worker', threading.currentThread(),
 307                print 'queue empty'
 308                break
 309            else:
 310                print 'Worker', threading.currentThread(),
 311                print 'running with argument', arg
 312                time.sleep(0.5)
 313
 314    # Create queue
 315    q = Queue.Queue()
 316
 317    # Start a pool of 5 workers
 318    for i in range(5):
 319        t = threading.Thread(target=worker, name='worker %i' % (i+1))
 320        t.start()
 321
 322    # Begin adding work to the queue
 323    for i in range(50):
 324        q.put(i)
 325
 326    # Give threads time to run
 327    print 'Main thread sleeping'
 328    time.sleep(5)
 329
When run, this will produce the following output::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...
 345
 346 Consult the module's documentation for more details; the ``Queue`` class
 347 provides a featureful interface.
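
As one example of that interface, the variant below replaces the final sleep
with ``task_done()`` and ``join()``, so the main thread waits exactly until all
jobs have been processed.  It is a sketch written for Python 3 (module name
``queue``).  Using ``None`` as a "no more work" marker and collecting squares in
a ``results`` list are choices made for this example only.

```python
import threading
import queue   # named Queue on Python 2

q = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = q.get()            # blocks until a job is available
        if item is None:          # marker: no more work for this worker
            q.task_done()
            break
        with results_lock:
            results.append(item * item)
        q.task_done()             # report that this job is finished

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for job in range(50):
    q.put(job)
for _ in threads:
    q.put(None)                   # one marker per worker
q.join()                          # returns once every item has been processed
for t in threads:
    t.join()
print(len(results), "jobs done")
```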
 348
 349
 350 What kinds of global value mutation are thread-safe?
 351 ----------------------------------------------------
 352
A global interpreter lock (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time.  In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setcheckinterval`.  Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.
 359
 360 In theory, this means an exact accounting requires an exact understanding of the
 361 PVM bytecode implementation.  In practice, it means that operations on shared
 362 variables of builtin data types (ints, lists, dicts, etc) that "look atomic"
 363 really are.
 364
 365 For example, the following operations are all atomic (L, L1, L2 are lists, D,
 366 D1, D2 are dicts, x, y are objects, i, j are ints)::
 367
 368    L.append(x)
 369    L1.extend(L2)
 370    x = L[i]
 371    x = L.pop()
 372    L1[i:j] = L2
 373    L.sort()
 374    x = y
 375    x.field = y
 376    D[x] = y
 377    D1.update(D2)
 378    D.keys()
 379
 380 These aren't::
 381
 382    i = i+1
 383    L.append(L[-1])
 384    L[i] = L[j]
 385    D[x] = D[x] + 1
 386
 387 Operations that replace other objects may invoke those other objects'
 388 :meth:`__del__` method when their reference count reaches zero, and that can
 389 affect things.  This is especially true for the mass updates to dictionaries and
 390 lists.  When in doubt, use a mutex!
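
A minimal sketch of the mutex advice: a read-modify-write update is wrapped in a
:class:`threading.Lock` so that no increments are lost.  The ``Counter`` class is
invented for the example and the code assumes Python 3.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "value = value + 1" is a read followed by a write, so guard it.
        with self._lock:
            self.value = self.value + 1

counter = Counter()

def work(times):
    for _ in range(times):
        counter.increment()

threads = [threading.Thread(target=work, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```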
 391
 392
 393 Can't we get rid of the Global Interpreter Lock?
 394 ------------------------------------------------
 395
 396 .. XXX mention multiprocessing
 397 .. XXX link to dbeazley's talk about GIL?
 398
 399 The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
 400 deployment on high-end multiprocessor server machines, because a multi-threaded
 401 Python program effectively only uses one CPU, due to the insistence that
 402 (almost) all Python code can only run while the GIL is held.
 403
 404 Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
 405 patch set (the "free threading" patches) that removed the GIL and replaced it
 406 with fine-grained locking.  Unfortunately, even on Windows (where locks are very
 407 efficient) this ran ordinary Python code about twice as slow as the interpreter
 408 using the GIL.  On Linux the performance loss was even worse because pthread
 409 locks aren't as efficient.
 410
Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed.  Greg's
free threading patch set has not been kept up-to-date for later Python versions.
 415
 416 This doesn't mean that you can't make good use of Python on multi-CPU machines!
 417 You just have to be creative with dividing the work up between multiple
 418 *processes* rather than multiple *threads*.  Judicious use of C extensions will
 419 also help; if you use a C extension to perform a time-consuming task, the
 420 extension can release the GIL while the thread of execution is in the C code and
 421 allow other threads to get some work done.
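
One hedged way to picture the multi-process approach is shown below: the work is
cut into slices and each slice is handed to a separate interpreter started with
:mod:`subprocess`.  The names ``CHILD_CODE`` and ``parallel_sum_of_squares`` are
invented for this sketch and the code assumes Python 3.  The
:mod:`multiprocessing` module offers a higher-level interface to the same idea.

```python
import subprocess
import sys

# Each child interpreter sums the squares in its own slice of the range.
CHILD_CODE = (
    "import sys; lo, hi = int(sys.argv[1]), int(sys.argv[2]); "
    "print(sum(i * i for i in range(lo, hi)))"
)

def parallel_sum_of_squares(n, workers=4):
    step = (n + workers - 1) // workers
    children = []
    for lo in range(0, n, step):
        hi = min(lo + step, n)
        # All children are started before any result is collected,
        # so they can run on different CPUs at the same time.
        children.append(subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(lo), str(hi)],
            stdout=subprocess.PIPE))
    # communicate() waits for a child and returns its output.
    return sum(int(child.communicate()[0].decode()) for child in children)

total = parallel_sum_of_squares(100000)
print(total)
```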
 422
 423 It has been suggested that the GIL should be a per-interpreter-state lock rather
 424 than truly global; interpreters then wouldn't be able to share objects.
 425 Unfortunately, this isn't likely to happen either.  It would be a tremendous
 426 amount of work, because many object implementations currently have global state.
 427 For example, small integers and short strings are cached; these caches would
 428 have to be moved to the interpreter state.  Other object types have their own
 429 free list; these free lists would have to be moved to the interpreter state.
 430 And so on.
 431
 432 And I doubt that it can even be done in finite time, because the same problem
 433 exists for 3rd party extensions.  It is likely that 3rd party extensions are
 434 being written at a faster rate than you can convert them to store all their
 435 global state in the interpreter state.
 436
 437 And finally, once you have multiple interpreters not sharing any state, what
 438 have you gained over running each interpreter in a separate process?
 439
 440
 441 Input and Output
 442 ================
 443
 444 How do I delete a file? (And other file questions...)
 445 -----------------------------------------------------
 446
 447 Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
 448 the :mod:`os` module.  The two functions are identical; :func:`unlink` is simply
 449 the name of the Unix system call for this function.
 450
 451 To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
 452 ``os.makedirs(path)`` will create any intermediate directories in ``path`` that
 453 don't exist. ``os.removedirs(path)`` will remove intermediate directories as
 454 long as they're empty; if you want to delete an entire directory tree and its
 455 contents, use :func:`shutil.rmtree`.
 456
 457 To rename a file, use ``os.rename(old_path, new_path)``.
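
The calls above can be tried out safely in a scratch directory, as in the
following sketch.  The file and directory names are placeholders and the code
assumes Python 3.  Note that ``os.removedirs()`` keeps pruning parent
directories until it reaches one that is not empty, which is why a file is
placed in the scratch directory first.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                            # scratch directory
open(os.path.join(base, "keep.txt"), "w").close()    # keeps base non-empty
nested = os.path.join(base, "a", "b", "c")

os.makedirs(nested)                  # creates a, a/b and a/b/c
os.removedirs(nested)                # removes c, b and a; stops at base
after_removedirs = os.listdir(base)

os.makedirs(nested)
open(os.path.join(nested, "data.txt"), "w").close()
shutil.rmtree(os.path.join(base, "a"))   # deletes a/ and everything below it
after_rmtree = os.listdir(base)

os.rename(os.path.join(base, "keep.txt"), os.path.join(base, "renamed.txt"))
final = os.listdir(base)

shutil.rmtree(base)                  # clean up the scratch directory
```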
 458
To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position.  There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).
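
Both ways of truncating are sketched below on a temporary file.  The code
assumes Python 3 and the sizes chosen are arbitrary.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)

f = open(path, "r+")
f.truncate(4)                 # keep only the first four bytes
f.close()
size_after_truncate = os.path.getsize(path)

fd = os.open(path, os.O_RDWR)
os.ftruncate(fd, 2)           # same idea for a raw file descriptor
os.close(fd)
size_after_ftruncate = os.path.getsize(path)

os.remove(path)
print(size_after_truncate, size_after_ftruncate)
```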
 463
 464 The :mod:`shutil` module also contains a number of functions to work on files
 465 including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
 466 :func:`~shutil.rmtree`.
 467
 468
 469 How do I copy a file?
 470 ---------------------
 471
 472 The :mod:`shutil` module contains a :func:`~shutil.copyfile` function.  Note
 473 that on MacOS 9 it doesn't copy the resource fork and Finder info.
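
A short sketch of ``copyfile()`` in use, with placeholder file names and
assuming Python 3:

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "original.txt")
dst = os.path.join(workdir, "copy.txt")

with open(src, "w") as f:
    f.write("spam and eggs\n")

shutil.copyfile(src, dst)     # copies the file's contents only
with open(dst) as f:
    copied = f.read()

shutil.rmtree(workdir)        # clean up
```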
 474
 475
 476 How do I read (or write) binary data?
 477 -------------------------------------
 478
 479 To read or write complex binary data formats, it's best to use the :mod:`struct`
 480 module.  It allows you to take a string containing binary data (usually numbers)
 481 and convert it to Python objects; and vice versa.
 482
 483 For example, the following code reads two 2-byte integers and one 4-byte integer
 484 in big-endian format from a file::
 485
 486    import struct
 487
 488    f = open(filename, "rb")  # Open in binary mode for portability
 489    s = f.read(8)
 490    x, y, z = struct.unpack(">hhl", s)
 491
 492 The '>' in the format string forces big-endian data; the letter 'h' reads one
 493 "short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
 494 string.
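
The reverse direction works the same way.  The sketch below packs three integers
with the same format string and unpacks them again.  The variable names are
arbitrary and the code assumes Python 3, where packed data is a ``bytes`` object.

```python
import struct

record_format = ">hhl"    # two 2-byte ints and one 4-byte int, big-endian
packed = struct.pack(record_format, 1, 2, 3)
size = struct.calcsize(record_format)
x, y, z = struct.unpack(record_format, packed)
print(size, repr(packed), (x, y, z))
```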
 495
For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.
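
For instance, a sequence of floats can be turned into raw bytes and back as
sketched here.  The method names are the Python 3 ones, which is an assumption;
Python 2 spells them ``tostring()`` and ``fromstring()``.

```python
from array import array

samples = array("d", [1.0, 2.5, 4.0])   # "d" means C double
raw = samples.tobytes()                 # machine representation as bytes
restored = array("d")
restored.frombytes(raw)                 # rebuild the values from the bytes
print(len(raw), list(restored))
```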
 498
 499
 500 I can't seem to use os.read() on a pipe created with os.popen(); why?
 501 ---------------------------------------------------------------------
 502
 503 :func:`os.read` is a low-level function which takes a file descriptor, a small
 504 integer representing the opened file.  :func:`os.popen` creates a high-level
 505 file object, the same type returned by the builtin :func:`open` function.  Thus,
 506 to read n bytes from a pipe p created with :func:`os.popen`, you need to use
 507 ``p.read(n)``.
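
The difference between a descriptor and a file object can be seen with a plain
pipe, as in this sketch.  It uses ``os.pipe()`` and ``os.fdopen()`` instead of
``os.popen()`` so that it does not depend on an external command, and it assumes
Python 3.  The object returned by ``os.popen()`` behaves like ``reader`` here.

```python
import os

read_fd, write_fd = os.pipe()          # two low-level file descriptors
os.write(write_fd, b"hello world")
os.close(write_fd)

first = os.read(read_fd, 5)            # os.read() wants the descriptor itself

reader = os.fdopen(read_fd, "rb")      # wrap the descriptor in a file object
rest = reader.read()                   # file objects have their own read()
same_fd = reader.fileno() == read_fd   # the descriptor is still reachable
reader.close()
```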
 508
 509
 510 How do I run a subprocess with pipes connected to both input and output?
 511 ------------------------------------------------------------------------
 512
 513 .. XXX update to use subprocess
 514
 515 Use the :mod:`popen2` module.  For example::
 516
 517    import popen2
 518    fromchild, tochild = popen2.popen2("command")
 519    tochild.write("input\n")
 520    tochild.flush()
 521    output = fromchild.readline()
 522
 523 Warning: in general it is unwise to do this because you can easily cause a
 524 deadlock where your process is blocked waiting for output from the child while
 525 the child is blocked waiting for input from you.  This can be caused because the
 526 parent expects the child to output more text than it does, or it can be caused
 527 by data being stuck in stdio buffers due to lack of flushing.  The Python parent
 528 can of course explicitly flush the data it sends to the child before it reads
 529 any output, but if the child is a naive C program it may have been written to
 530 never explicitly flush its output, even if it is interactive, since flushing is
 531 normally automatic.
 532
 533 Note that a deadlock is also possible if you use :func:`popen3` to read stdout
 534 and stderr. If one of the two is too large for the internal buffer (increasing
 535 the buffer size does not help) and you ``read()`` the other one first, there is
 536 a deadlock, too.
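
With the :mod:`subprocess` module, one way to sidestep both deadlocks is
``Popen.communicate()``, sketched below.  The child process here is just another
Python interpreter that upper-cases its input; it stands in for a real command.
The code assumes Python 3.

```python
import subprocess
import sys

# A stand-in child process: it upper-cases whatever arrives on stdin.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() writes the input, closes stdin, and reads stdout and stderr
# together, so neither pipe can fill up while the other is being read.
out, err = proc.communicate(b"input\n")
print(proc.returncode, out, err)
```

This only fits commands that can be given all of their input up front;
``communicate()`` does not support a back-and-forth conversation with the child.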
 537
 538 Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
 539 finished child processes are never removed, and eventually calls to popen2 will
 540 fail because of a limit on the number of child processes.  Calling
 541 :func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
 542 place to insert such a call would be before calling ``popen2`` again.
 543
 544 In many cases, all you really need is to run some data through a command and get
 545 the result back.  Unless the amount of data is very large, the easiest way to do
 546 this is to write it to a temporary file and run the command with that temporary
 547 file as input.  The standard module :mod:`tempfile` exports a ``mktemp()``
 548 function to generate unique temporary file names. ::
 549
   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print Popen3('grep spam','\n\nhere spam\n\n').out
       """
       def __init__(self, command, input=None, capturestderr=None):
           # Redirect the command's stdout to a temporary file.
           outfile = tempfile.mktemp()
           command = "( %s ) > %s" % (command, outfile)
           if input:
               # Feed the input to the command through a second file.
               infile = tempfile.mktemp()
               open(infile, "w").write(input)
               command = command + " <" + infile
           if capturestderr:
               errfile = tempfile.mktemp()
               command = command + " 2>" + errfile
           # On Unix the exit code is the high byte of the wait status.
           self.errorlevel = os.system(command) >> 8
           self.out = open(outfile, "r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           self.err = ""   # stays empty unless stderr was captured
           if capturestderr:
               self.err = open(errfile, "r").read()
               os.remove(errfile)
 578
 579 Note that many interactive programs (e.g. vi) don't work well with pipes
 580 substituted for standard input and output.  You will have to use pseudo ttys
 581 ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
 582 "expect" library.  A Python extension that interfaces to expect is called "expy"
 583 and available from http://expectpy.sourceforge.net.  A pure Python solution that
 584 works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.
 585
 586
 587 How do I access the serial (RS232) port?
 588 ----------------------------------------
 589
 590 For Win32, POSIX (Linux, BSD, etc.), Jython:
 591
 592    http://pyserial.sourceforge.net
 593
 594 For Unix, see a Usenet post by Mitch Chapman:
 595
 596    http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com
 597
 598
 599 Why doesn't closing sys.stdout (stdin, stderr) really close it?
 600 ---------------------------------------------------------------
 601
 602 Python file objects are a high-level layer of abstraction on top of C streams,
 603 which in turn are a medium-level layer of abstraction on top of (among other
 604 things) low-level C file descriptors.
 605
 606 For most file objects you create in Python via the builtin ``file`` constructor,
 607 ``f.close()`` marks the Python file object as being closed from Python's point
 608 of view, and also arranges to close the underlying C stream.  This also happens
 609 automatically in f's destructor, when f becomes garbage.
 610
 611 But stdin, stdout and stderr are treated specially by Python, because of the
 612 special status also given to them by C.  Running ``sys.stdout.close()`` marks
 613 the Python-level file object as being closed, but does *not* close the
 614 associated C stream.
 615
 616 To close the underlying C stream for one of these three, you should first be
 617 sure that's what you really want to do (e.g., you may confuse extension modules
 618 trying to do I/O).  If it is, use os.close::
 619
 620     os.close(0)   # close C's stdin stream
 621     os.close(1)   # close C's stdout stream
 622     os.close(2)   # close C's stderr stream
 623
 624
 625 Network/Internet Programming
 626 ============================
 627
 628 What WWW tools are there for Python?
 629 ------------------------------------
 630
 631 See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
 632 Reference Manual.  Python has many modules that will help you build server-side
 633 and client-side web systems.
 634
 635 .. XXX check if wiki page is still up to date
 636
 637 A summary of available frameworks is maintained by Paul Boddie at
 638 http://wiki.python.org/moin/WebProgramming .
 639
 640 Cameron Laird maintains a useful set of pages about Python web technologies at
 641 http://phaseit.net/claird/comp.lang.python/web_python.
 642
 643
 644 How can I mimic CGI form submission (METHOD=POST)?
 645 --------------------------------------------------
 646
 647 I would like to retrieve web pages that are the result of POSTing a form. Is
 648 there existing code that would let me do this easily?
 649
 650 Yes. Here's a simple example that uses httplib::
 651
 652    #!/usr/local/bin/python
 653
 654    import httplib, sys, time
 655
 656    ### build the query string
 657    qs = "First=Josephine&MI=Q&Last=Public"
 658
 659    ### connect and send the server a path
 660    httpobj = httplib.HTTP('www.some-server.out-there', 80)
 661    httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
 662    ### now generate the rest of the HTTP headers...
 663    httpobj.putheader('Accept', '*/*')
 664    httpobj.putheader('Connection', 'Keep-Alive')
 665    httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
 666    httpobj.putheader('Content-length', '%d' % len(qs))
 667    httpobj.endheaders()
 668    httpobj.send(qs)
 669    ### find out what the server said in response...
 670    reply, msg, hdrs = httpobj.getreply()
 671    if reply != 200:
 672        sys.stdout.write(httpobj.getfile().read())
 673
Note that in general for URL-encoded POST operations, query strings must be
quoted by using :func:`urllib.quote`.  For example, to send name="Guy Steele,
Jr."::

   >>> from urllib import quote
   >>> x = quote("Guy Steele, Jr.")
   >>> x
   'Guy%20Steele%2C%20Jr.'
   >>> query_string = "name="+x
   >>> query_string
   'name=Guy%20Steele%2C%20Jr.'
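
When several form fields are involved, ``urlencode()`` quotes every key and
value and joins them in one step.  The sketch below uses the Python 3 location
of these functions (``urllib.parse``), which is an assumption; on Python 2 they
live directly in :mod:`urllib`.

```python
from urllib.parse import quote, urlencode

fields = {"name": "Guy Steele, Jr."}
query_string = urlencode(fields)      # quotes keys and values, joins with "&"
quoted_only = quote(fields["name"])   # percent-encodes a single value
print(query_string)
print(quoted_only)
```

``urlencode()`` follows the form-encoding convention of writing spaces as
``+``, while ``quote()`` writes them as ``%20``.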
 685
 686
 687 What module should I use to help with generating HTML?
 688 ------------------------------------------------------
 689
 690 .. XXX add modern template languages
 691
 692 There are many different modules available:
 693
 694 * HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
 695   tags. It's used when you are writing in Python and wish to synthesize HTML
 696   pages for generating a web or for CGI forms, etc.
 697
 698 * DocumentTemplate and Zope Page Templates are two different systems that are
 699   part of Zope.
 700
 701 * Quixote's PTL uses Python syntax to assemble strings of text.
 702
 703 Consult the `Web Programming wiki pages
 704 <http://wiki.python.org/moin/WebProgramming>`_ for more links.
 705
 706
 707 How do I send mail from a Python script?
 708 ----------------------------------------
 709
 710 Use the standard library module :mod:`smtplib`.
 711
 712 Here's a very simple interactive mail sender that uses it.  This method will
 713 work on any host that supports an SMTP listener. ::
 714
 715    import sys, smtplib
 716
 717    fromaddr = raw_input("From: ")
 718    toaddrs  = raw_input("To: ").split(',')
 719    print "Enter message, end with ^D:"
 720    msg = ''
 721    while True:
 722        line = sys.stdin.readline()
 723        if not line:
 724            break
 725        msg += line
 726
 727    # The actual mail send
 728    server = smtplib.SMTP('localhost')
 729    server.sendmail(fromaddr, toaddrs, msg)
 730    server.quit()
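
The sender above passes along whatever text is typed, so any headers have to be
typed by hand.  As a hedged sketch, the standard :mod:`email` package can build
a message with headers for you.  The helper ``build_message`` and the addresses
are made up for the example, nothing is actually sent, and the code assumes
Python 3.

```python
from email.mime.text import MIMEText

def build_message(fromaddr, toaddrs, subject, body):
    """Return a message string with From, To and Subject headers."""
    msg = MIMEText(body)
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg["Subject"] = subject
    return msg.as_string()

text = build_message("sender@example.com", ["receiver@example.com"],
                     "test", "Some text\nsome more text\n")
print(text)
```

The resulting string can be passed as the third argument of
``server.sendmail()``.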
 731
A Unix-only alternative uses sendmail.  The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``.  The sendmail manual page will help you out.  Here's
some sample code::
 736
 737    SENDMAIL = "/usr/sbin/sendmail" # sendmail location
 738    import os
 739    p = os.popen("%s -t -i" % SENDMAIL, "w")
 740    p.write("To: receiver@example.com\n")
 741    p.write("Subject: test\n")
 742    p.write("\n") # blank line separating headers from body
 743    p.write("Some text\n")
 744    p.write("some more text\n")
 745    sts = p.close()
 746    if sts != 0:
 747        print "Sendmail exit status", sts
 748
 749
 750 How do I avoid blocking in the connect() method of a socket?
 751 ------------------------------------------------------------
 752
The :mod:`select` module is commonly used to help with asynchronous I/O on sockets.
 754
 755 To prevent the TCP connect from blocking, you can set the socket to non-blocking
 756 mode.  Then when you do the ``connect()``, you will either connect immediately
 757 (unlikely) or get an exception that contains the error number as ``.errno``.
 758 ``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
 759 finished yet.  Different OSes will return different values, so you're going to
 760 have to check what's returned on your system.
 761
 762 You can use the ``connect_ex()`` method to avoid raising an exception.  It will
 763 just return the errno value.  To poll, you can call ``connect_ex()`` again later
 764 -- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
 765 socket to ``select.select()`` to check if it's writable.
 766
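The approach described above can be sketched as follows.  This is a minimal
illustration for POSIX-style systems rather than a complete implementation: the
helper name ``connect_nonblocking`` and its ``timeout`` parameter are invented
for this example, and the set of "in progress" error codes may need adjusting
for your platform. ::

   import errno
   import os
   import select
   import socket

   # errno values meaning "the connection attempt has started but not finished"
   IN_PROGRESS = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)

   def connect_nonblocking(host, port, timeout=5.0):
       """Connect without blocking in connect(); wait at most *timeout* seconds."""
       sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       sock.setblocking(False)
       # connect_ex() returns an errno value instead of raising an exception.
       err = sock.connect_ex((host, port))
       if err in (0, errno.EISCONN):
           return sock  # connected immediately
       if err not in IN_PROGRESS:
           sock.close()
           raise socket.error(err, os.strerror(err))
       # The socket becomes writable once the connection attempt completes.
       _, writable, _ = select.select([], [sock], [], timeout)
       if not writable:
           sock.close()
           raise socket.timeout("connect timed out")
       # SO_ERROR is 0 if the connect succeeded, otherwise the failure's errno.
       err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
       if err != 0:
           sock.close()
           raise socket.error(err, os.strerror(err))
       return sock

A call such as ``connect_nonblocking('localhost', 8000)`` (host and port are
placeholders) returns a connected socket that is still in non-blocking mode;
call ``setblocking(True)`` on it if you want ordinary blocking reads and writes
afterwards.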
 767
 768 Databases
 769 =========
 770
 771 Are there any interfaces to database packages in Python?
 772 --------------------------------------------------------
 773
 774 Yes.
 775
 776 .. XXX remove bsddb in py3k, fix other module names
 777
 778 Python 2.3 includes the :mod:`bsddb` package which provides an interface to the
 779 BerkeleyDB library.  Interfaces to disk-based hashes such as :mod:`DBM <dbm>`
 780 and :mod:`GDBM <gdbm>` are also included with standard Python.
 781
 782 Support for most relational databases is available.  See the
 783 `DatabaseProgramming wiki page
 784 <http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
 785
 786
 787 How do you implement persistent objects in Python?
 788 --------------------------------------------------
 789
 790 The :mod:`pickle` library module solves this in a very general way (though you
 791 still can't store things like open files, sockets or windows), and the
 792 :mod:`shelve` library module uses pickle and (g)dbm to create persistent
 793 mappings containing arbitrary Python objects.  For better performance, you can
 794 use the :mod:`cPickle` module.
 795
 796 A more awkward way of doing things is to use pickle's little sister, marshal.
 797 The :mod:`marshal` module provides very fast ways to store noncircular basic
 798 Python types to files and strings, and back again.  Although marshal does not do
 799 fancy things like store instances or handle shared references properly, it does
 800 run extremely fast.  For example, loading a half megabyte of data may take less
 801 than a third of a second.  This often beats doing something more complex and
 802 general such as using gdbm with pickle/shelve.
 803
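As a rough illustration of the two modules mentioned first, the sketch below
round-trips a nested structure through :mod:`pickle` and then stores it in a
:mod:`shelve` file.  The record contents and the file name ``accounts`` are
made up for the example, and the shelf lives in a temporary directory that is
removed at the end. ::

   import os
   import pickle
   import shelve
   import shutil
   import tempfile

   record = {"owner": "alice", "history": [10, -3, 25], "limits": (0, 500)}

   # pickle: turn the object into a byte string and back again.
   data = pickle.dumps(record)
   restored = pickle.loads(data)

   # shelve: a dictionary-like object whose values are pickled to disk.
   workdir = tempfile.mkdtemp()
   path = os.path.join(workdir, "accounts")

   db = shelve.open(path)
   try:
       db["alice"] = record
   finally:
       db.close()  # closing flushes the data to disk

   # Reopen the shelf, as a later run of the program would.
   db = shelve.open(path)
   try:
       reloaded = db["alice"]
   finally:
       db.close()

   shutil.rmtree(workdir)  # clean up the example files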
 804
 805 Why is cPickle so slow?
 806 -----------------------
 807
 808 .. XXX update this, default protocol is 2/3
 809
 810 The default format used by the pickle module is slow but readable.  The faster
 811 binary format is not the default, to keep backward compatibility; request it::
 812
 813     import cPickle
 814     largeString = 'z' * (100 * 1024)
 815     myPickle = cPickle.dumps(largeString, protocol=1)
 816
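To see the difference between the two formats, you can compare the size of the
same object pickled with protocol 0 and with protocol 1.  The sketch below uses
the plain :mod:`pickle` module and an arbitrary list of integers as sample
data; the exact sizes depend on your Python version, so none are quoted
here. ::

   import pickle

   data = list(range(1000))  # arbitrary sample data

   text_pickle = pickle.dumps(data, protocol=0)    # readable text format
   binary_pickle = pickle.dumps(data, protocol=1)  # binary format

   # Both formats restore the same object; the binary one is more compact.
   same_result = pickle.loads(text_pickle) == pickle.loads(binary_pickle)
   print(len(text_pickle))
   print(len(binary_pickle))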
 817
 818 If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
 819 ------------------------------------------------------------------------------------------
 820
 821 Databases opened for write access with the bsddb module (and often by the anydbm
 822 module, since it will preferentially use bsddb) must explicitly be closed using
 823 the ``.close()`` method of the database.  The underlying library caches database
 824 contents which need to be converted to on-disk form and written.
 825
 826 If you have initialized a new bsddb database but not written anything to it
 827 before the program crashes, you will often wind up with a zero-length file and
 828 encounter an exception the next time the file is opened.
 829
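One way to make sure ``.close()`` is always called is a ``try``/``finally``
block.  The sketch below is an assumption-laden illustration: it uses the
generic dbm interface (``anydbm`` in Python 2, ``dbm`` in Python 3) rather than
bsddb directly, and the file name ``example_db`` is made up.  It protects
against exceptions raised in your own code, not against the interpreter being
killed outright. ::

   import os
   import shutil
   import tempfile

   try:
       import anydbm as dbm  # Python 2 name
   except ImportError:
       import dbm            # Python 3 name

   workdir = tempfile.mkdtemp()
   path = os.path.join(workdir, "example_db")

   db = dbm.open(path, "c")  # "c": open for writing, create if missing
   try:
       db[b"greeting"] = b"hello"
   finally:
       db.close()  # runs even if the code above raises an exception

   # Reopen read-only to confirm that the data reached the disk.
   db = dbm.open(path, "r")
   try:
       stored = db[b"greeting"]
   finally:
       db.close()

   shutil.rmtree(workdir)  # clean up the example files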
 830
 831 I tried to open a Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
 832 ------------------------------------------------------------------------------------------------------------------------------
 833
 834 Don't panic! Your data is probably intact. The most frequent cause of the error
 835 is that you tried to open an earlier Berkeley DB file with a later version of
 836 the Berkeley DB library.
 837
 838 Many Linux systems now have all three versions of Berkeley DB available.  If you
 839 are migrating from version 1 to a newer version, use ``db_dump185`` to dump a plain
 840 text version of the database.  If you are migrating from version 2 to version 3,
 841 use ``db2_dump`` to create a plain text version of the database.  In either case,
 842 use ``db_load`` to create a new native database for the latest version installed on
 843 your computer.  If you have version 3 of Berkeley DB installed, you should be
 844 able to use ``db2_load`` to create a native version 2 database.
 845
 846 You should move away from Berkeley DB version 1 files because the hash file code
 847 contains known bugs that can corrupt your data.
 848
 849
 850 Mathematics and Numerics
 851 ========================
 852
 853 How do I generate random numbers in Python?
 854 -------------------------------------------
 855
 856 The standard module :mod:`random` implements a random number generator.  Usage
 857 is simple::
 858
 859    import random
 860    random.random()
 861
 862 This returns a random floating point number in the range [0, 1).
 863
 864 There are also many other specialized generators in this module, such as:
 865
 866 * ``randrange(a, b)`` chooses an integer in the range [a, b).
 867 * ``uniform(a, b)`` chooses a floating point number in the range [a, b).
 868 * ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.
 869
 870 Some higher-level functions operate on sequences directly, such as:
 871
 872 * ``choice(S)`` chooses a random element from a given sequence.
 873 * ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.
 874
 875 There's also a ``Random`` class you can instantiate to create multiple
 876 independent random number generators.
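
The following sketch shows two ``Random`` instances being used independently of
each other and of the module-level functions.  The seed value ``12345`` and the
sample data are arbitrary choices for the example; seeding both generators
identically simply makes it easy to see that they do not disturb one
another. ::

   import random

   # Two generators with the same seed produce the same sequence.
   gen_a = random.Random(12345)
   gen_b = random.Random(12345)

   seq_a = [gen_a.random() for _ in range(3)]
   random.random()  # the module-level generator does not affect gen_b
   seq_b = [gen_b.random() for _ in range(3)]

   # Instances offer the same methods as the module-level functions.
   die = gen_a.randrange(1, 7)              # integer in [1, 7)
   temperature = gen_a.uniform(15.0, 25.0)  # float between 15.0 and 25.0
   card = gen_a.choice(["ace", "king", "queen"])
   deck = list(range(10))
   gen_a.shuffle(deck)                      # permutes the list in place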