Doc/lib/libstdtypes.tex

   1 \section{Built-in Types \label{types}}
   2
   3 The following sections describe the standard types that are built into
   4 the interpreter.  Historically, Python's built-in types have differed
   5 from user-defined types because it was not possible to use the built-in
   6 types as the basis for object-oriented inheritance. With the 2.2
   7 release this situation has started to change, although the intended
   8 unification of user-defined and built-in types is as yet far from
   9 complete.
  10
The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
  13 \indexii{built-in}{types}
  14
  15 Some operations are supported by several object types; in particular,
  16 practically all objects can be compared, tested for truth value,
  17 and converted to a string (with the \code{`\textrm{\ldots}`} notation,
  18 the equivalent \function{repr()} function, or the slightly different
  19 \function{str()} function).  The latter
  20 function is implicitly used when an object is written by the
  21 \keyword{print}\stindex{print} statement.
(Information on the \ulink{\keyword{print} statement}{../ref/print.html}
  23 and other language statements can be found in the
  24 \citetitle[../ref/ref.html]{Python Reference Manual} and the
  25 \citetitle[../tut/tut.html]{Python Tutorial}.)
  26
  27
  28 \subsection{Truth Value Testing\label{truth}}
  29
  30 Any object can be tested for truth value, for use in an \keyword{if} or
  31 \keyword{while} condition or as operand of the Boolean operations below.
  32 The following values are considered false:
  33 \stindex{if}
  34 \stindex{while}
  35 \indexii{truth}{value}
  36 \indexii{Boolean}{operations}
  37 \index{false}
  38
  39 \begin{itemize}
  40
  41 \item   \code{None}
  42         \withsubitem{(Built-in object)}{\ttindex{None}}
  43
  44 \item   \code{False}
  45         \withsubitem{(Built-in object)}{\ttindex{False}}
  46
  47 \item   zero of any numeric type, for example, \code{0}, \code{0L},
  48         \code{0.0}, \code{0j}.
  49
  50 \item   any empty sequence, for example, \code{''}, \code{()}, \code{[]}.
  51
  52 \item   any empty mapping, for example, \code{\{\}}.
  53
  54 \item   instances of user-defined classes, if the class defines a
  55         \method{__nonzero__()} or \method{__len__()} method, when that
  56         method returns the integer zero or \class{bool} value
  57         \code{False}.\footnote{Additional
  58 information on these special methods may be found in the
  59 \citetitle[../ref/ref.html]{Python Reference Manual}.}
  60
  61 \end{itemize}
  62
  63 All other values are considered true --- so objects of many types are
  64 always true.
  65 \index{true}
  66
  67 Operations and built-in functions that have a Boolean result always
  68 return \code{0} or \code{False} for false and \code{1} or \code{True}
  69 for true, unless otherwise stated.  (Important exception: the Boolean
  70 operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
  71 return one of their operands.)
  72 \index{False}
  73 \index{True}
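
The list above can be checked with the built-in \function{bool()}.
The following is only a sketch, not part of the original text: the
\class{Basket} class is a hypothetical example, and it defines just
\method{__len__()} so that its truth value follows its length.

\begin{verbatim}
# Each of these values is considered false.
falsy = [None, False, 0, 0.0, 0j, '', (), [], {}]
all_false = True
for value in falsy:
    if value:
        all_false = False

class Basket:
    # Hypothetical container: its truth value follows __len__().
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):
        return len(self.items)

empty_is_true = bool(Basket([]))        # False: __len__() returns 0
full_is_true = bool(Basket(['egg']))    # True: __len__() returns 1
\end{verbatim}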
  74
  75 \subsection{Boolean Operations ---
  76             \keyword{and}, \keyword{or}, \keyword{not}
  77             \label{boolean}}
  78
  79 These are the Boolean operations, ordered by ascending priority:
  80 \indexii{Boolean}{operations}
  81
  82 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  83   \lineiii{\var{x} or \var{y}}
  84           {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
  85   \lineiii{\var{x} and \var{y}}
  86           {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
  87   \hline
  88   \lineiii{not \var{x}}
  89           {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
  90 \end{tableiii}
  91 \opindex{and}
  92 \opindex{or}
  93 \opindex{not}
  94
  95 \noindent
  96 Notes:
  97
  98 \begin{description}
  99
 100 \item[(1)]
 101 These only evaluate their second argument if needed for their outcome.
 102
 103 \item[(2)]
 104 \samp{not} has a lower priority than non-Boolean operators, so
 105 \code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
 106 \var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.
 107
 108 \end{description}
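
As an illustrative sketch of the table and notes above (the
\function{noisy()} helper is hypothetical and exists only to record
evaluations), the following shows that \samp{or} and \samp{and} return
one of their operands, that the second operand is skipped when it
cannot affect the outcome, and how \samp{not} groups with \code{==}:

\begin{verbatim}
calls = []

def noisy(value):
    # Hypothetical helper that records each evaluation.
    calls.append(value)
    return value

first = [] or 'default'     # 'default': [] is false, so y is returned
second = 'x' and 'y'        # 'y': 'x' is true, so y is returned
third = 0 and noisy(1)      # 0: noisy() is never called
fourth = 1 or noisy(2)      # 1: noisy() is never called
negated = not 1 == 2        # parsed as not (1 == 2), which is True
\end{verbatim}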
 109
 110
 111 \subsection{Comparisons \label{comparisons}}
 112
 113 Comparison operations are supported by all objects.  They all have the
 114 same priority (which is higher than that of the Boolean operations).
 115 Comparisons can be chained arbitrarily; for example, \code{\var{x} <
 116 \var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
 117 \var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
 118 in both cases \var{z} is not evaluated at all when \code{\var{x} <
 119 \var{y}} is found to be false).
 120 \indexii{chaining}{comparisons}
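
The difference between a chained comparison and its spelled-out form
can be observed by counting evaluations.  This is a minimal sketch;
\function{middle()} and \function{never()} are hypothetical helpers
introduced only for the demonstration:

\begin{verbatim}
evaluated = []

def middle():
    # Hypothetical helper that records how often it is evaluated.
    evaluated.append('y')
    return 5

def never():
    # Hypothetical helper that must not be reached.
    raise RuntimeError('z should not be evaluated')

chained = 1 < middle() <= 10                    # middle() runs once
count_chained = len(evaluated)

del evaluated[:]
spelled_out = 1 < middle() and middle() <= 10   # middle() runs twice
count_spelled_out = len(evaluated)

short_circuit = 9 < middle() <= never()         # never() is not called
\end{verbatim}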
 121
 122 This table summarizes the comparison operations:
 123
 124 \begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
 125   \lineiii{<}{strictly less than}{}
 126   \lineiii{<=}{less than or equal}{}
 127   \lineiii{>}{strictly greater than}{}
 128   \lineiii{>=}{greater than or equal}{}
 129   \lineiii{==}{equal}{}
 130   \lineiii{!=}{not equal}{(1)}
 131   \lineiii{<>}{not equal}{(1)}
 132   \lineiii{is}{object identity}{}
 133   \lineiii{is not}{negated object identity}{}
 134 \end{tableiii}
 135 \indexii{operator}{comparison}
 136 \opindex{==} % XXX *All* others have funny characters < ! >
 137 \opindex{is}
 138 \opindex{is not}
 139
 140 \noindent
 141 Notes:
 142
 143 \begin{description}
 144
 145 \item[(1)]
 146 \code{<>} and \code{!=} are alternate spellings for the same operator.
 147 \code{!=} is the preferred spelling; \code{<>} is obsolescent.
 148
 149 \end{description}
 150
 151 Objects of different types, except different numeric types and different string types, never
 152 compare equal; such objects are ordered consistently but arbitrarily
 153 (so that sorting a heterogeneous array yields a consistent result).
 154 Furthermore, some types (for example, file objects) support only a
 155 degenerate notion of comparison where any two objects of that type are
 156 unequal.  Again, such objects are ordered arbitrarily but
 157 consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
 158 operators will raise a \exception{TypeError} exception when any operand
 159 is a complex number.
 160 \indexii{object}{numeric}
 161 \indexii{objects}{comparing}
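
A short sketch of the equality rules and of the complex-number
restriction described above.  The arbitrary ordering of unrelated
types is deliberately not exercised, since its outcome is
implementation-dependent:

\begin{verbatim}
mixed_numbers_equal = (1 == 1.0)    # True: different numeric types
number_vs_string = (1 == '1')       # False: unrelated types

try:
    (1 + 2j) < (3 + 4j)
    complex_ordering = 'allowed'
except TypeError:
    complex_ordering = 'TypeError'  # ordering complex numbers fails
\end{verbatim}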
 162
 163 Instances of a class normally compare as non-equal unless the class
 164 \withsubitem{(instance method)}{\ttindex{__cmp__()}}
 165 defines the \method{__cmp__()} method.  Refer to the
 166 \citetitle[../ref/customization.html]{Python Reference Manual} for
 167 information on the use of this method to effect object comparisons.
 168
 169 \strong{Implementation note:} Objects of different types except
 170 numbers are ordered by their type names; objects of the same types
 171 that don't support proper comparison are ordered by their address.
 172
 173 Two more operations with the same syntactic priority,
 174 \samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
 175 only by sequence types (below).
 176
 177
 178 \subsection{Numeric Types ---
 179             \class{int}, \class{float}, \class{long}, \class{complex}
 180             \label{typesnumeric}}
 181
 182 There are four distinct numeric types: \dfn{plain integers},
 183 \dfn{long integers},
 184 \dfn{floating point numbers}, and \dfn{complex numbers}.
 185 In addition, Booleans are a subtype of plain integers.
 186 Plain integers (also just called \dfn{integers})
 187 are implemented using \ctype{long} in C, which gives them at least 32
 188 bits of precision.  Long integers have unlimited precision.  Floating
 189 point numbers are implemented using \ctype{double} in C.  All bets on
 190 their precision are off unless you happen to know the machine you are
 191 working with.
 192 \obindex{numeric}
 193 \obindex{Boolean}
 194 \obindex{integer}
 195 \obindex{long integer}
 196 \obindex{floating point}
 197 \obindex{complex number}
 198 \indexii{C}{language}
 199
 200 Complex numbers have a real and imaginary part, which are each
 201 implemented using \ctype{double} in C.  To extract these parts from
 202 a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
 203
 204 Numbers are created by numeric literals or as the result of built-in
 205 functions and operators.  Unadorned integer literals (including hex
 206 and octal numbers) yield plain integers unless the value they denote
 207 is too large to be represented as a plain integer, in which case
 208 they yield a long integer.  Integer literals with an
 209 \character{L} or \character{l} suffix yield long integers
 210 (\character{L} is preferred because \samp{1l} looks too much like
 211 eleven!).  Numeric literals containing a decimal point or an exponent
 212 sign yield floating point numbers.  Appending \character{j} or
 213 \character{J} to a numeric literal yields a complex number with a
 214 zero real part. A complex numeric literal is the sum of a real and
 215 an imaginary part.
 216 \indexii{numeric}{literals}
 217 \indexii{integer}{literals}
 218 \indexiii{long}{integer}{literals}
 219 \indexii{floating point}{literals}
 220 \indexii{complex number}{literals}
 221 \indexii{hexadecimal}{literals}
 222 \indexii{octal}{literals}
 223
 224 Python fully supports mixed arithmetic: when a binary arithmetic
 225 operator has operands of different numeric types, the operand with the
 226 ``narrower'' type is widened to that of the other, where plain
 227 integer is narrower than long integer is narrower than floating point is
 228 narrower than complex.
 229 Comparisons between numbers of mixed type use the same rule.\footnote{
 230         As a consequence, the list \code{[1, 2]} is considered equal
 231         to \code{[1.0, 2.0]}, and similarly for tuples.
 232 } The constructors \function{int()}, \function{long()}, \function{float()},
 233 and \function{complex()} can be used
 234 to produce numbers of a specific type.
 235 \index{arithmetic}
 236 \bifuncindex{int}
 237 \bifuncindex{long}
 238 \bifuncindex{float}
 239 \bifuncindex{complex}
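
The widening rule can be seen by inspecting the results of mixed
operations.  A minimal sketch, using only built-in functions; the
variable names are illustrative:

\begin{verbatim}
widened_to_float = 1 + 2.0              # 3.0, a float
widened_to_complex = 1.5 + 2j           # (1.5+2j), a complex

z = complex(3, 4)
parts = (z.real, z.imag)                # (3.0, 4.0)

lists_equal = ([1, 2] == [1.0, 2.0])    # True, as in the footnote

as_int = int('42')                      # constructors give a specific type
as_float = float(7)                     # 7.0
\end{verbatim}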
 240
 241 All numeric types (except complex) support the following operations,
 242 sorted by ascending priority (operations in the same box have the same
 243 priority; all numeric operations have a higher priority than
 244 comparison operations):
 245
 246 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 247   \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
 248   \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
 249   \hline
 250   \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
 251   \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
 252   \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{(4)}
 253   \hline
 254   \lineiii{-\var{x}}{\var{x} negated}{}
 255   \lineiii{+\var{x}}{\var{x} unchanged}{}
 256   \hline
 257   \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
 258   \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
 259   \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
 260   \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
 261   \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}.  \var{im} defaults to zero.}{}
 262   \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
 263   \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} // \var{y}, \var{x} \%{} \var{y})}}{(3)(4)}
 264   \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
 265   \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
 266 \end{tableiii}
 267 \indexiii{operations on}{numeric}{types}
 268 \withsubitem{(complex number method)}{\ttindex{conjugate()}}
 269
 270 \noindent
 271 Notes:
 272 \begin{description}
 273
 274 \item[(1)]
 275 For (plain or long) integer division, the result is an integer.
 276 The result is always rounded towards minus infinity: 1/2 is 0,
 277 (-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0.  Note that the result
 278 is a long integer if either operand is a long integer, regardless of
 279 the numeric value.
 280 \indexii{integer}{division}
 281 \indexiii{long}{integer}{division}
 282
 283 \item[(2)]
 284 Conversion from floating point to (long or plain) integer may round or
 285 truncate as in C; see functions \function{floor()} and
 286 \function{ceil()} in the \refmodule{math}\refbimodindex{math} module
 287 for well-defined conversions.
 288 \withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
 289 \indexii{numeric}{conversions}
 290 \indexii{C}{language}
 291
 292 \item[(3)]
 293 See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
 294 description.
 295
 296 \item[(4)]
 297 Complex floor division operator, modulo operator, and \function{divmod()}.
 298
 299 \deprecated{2.3}{Instead convert to float using \function{abs()}
 300 if appropriate.}
 301
 302 \end{description}
 303 % XXXJH exceptions: overflow (when? what operations?) zerodivision
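
The notes above can be illustrated as follows.  This sketch is an
addition to the original text; it spells floor division as \code{//}
rather than \code{/}, on the assumption that this keeps the integer
results independent of the division mode in effect:

\begin{verbatim}
# Floor division rounds towards minus infinity (note 1).
quotients = (1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2))   # (0, -1, -1, 0)

# divmod() pairs the floor quotient with the remainder.
pair = divmod(-7, 2)                    # (-4, 1)
consistent = (pair == (-7 // 2, -7 % 2))

power_function = pow(2, 10)             # 1024
power_operator = 2 ** 10                # 1024
magnitude = abs(-3.5)                   # 3.5
conjugate = (3 + 4j).conjugate()        # (3-4j)
\end{verbatim}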
 304
 305 \subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
 306 \nodename{Bit-string Operations}
 307
 308 Plain and long integer types support additional operations that make
 309 sense only for bit-strings.  Negative numbers are treated as their 2's
 310 complement value (for long integers, this assumes a sufficiently large
 311 number of bits that no overflow occurs during the operation).
 312
 313 The priorities of the binary bit-wise operations are all lower than
 314 the numeric operations and higher than the comparisons; the unary
 315 operation \samp{\~} has the same priority as the other unary numeric
 316 operations (\samp{+} and \samp{-}).
 317
 318 This table lists the bit-string operations sorted in ascending
 319 priority (operations in the same box have the same priority):
 320
 321 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 322   \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
 323   \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
 324   \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
 325   % The empty groups below prevent conversion to guillemets.
 326   \lineiii{\var{x} <{}< \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
 327   \lineiii{\var{x} >{}> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
 328   \hline
 329   \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
 330 \end{tableiii}
 331 \indexiii{operations on}{integer}{types}
 332 \indexii{bit-string}{operations}
 333 \indexii{shifting}{operations}
 334 \indexii{masking}{operations}
 335
 336 \noindent
 337 Notes:
 338 \begin{description}
 339 \item[(1)] Negative shift counts are illegal and cause a
 340 \exception{ValueError} to be raised.
 341 \item[(2)] A left shift by \var{n} bits is equivalent to
 342 multiplication by \code{pow(2, \var{n})} without overflow check.
 343 \item[(3)] A right shift by \var{n} bits is equivalent to
 344 division by \code{pow(2, \var{n})} without overflow check.
 345 \end{description}
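
A small worked example of the table and notes above; the operand
values are arbitrary and chosen only so that the bit patterns are easy
to follow:

\begin{verbatim}
x = 0x0F                    # binary 00001111
y = 0x35                    # binary 00110101

bit_or = x | y              # 0x3F
bit_xor = x ^ y             # 0x3A
bit_and = x & y             # 0x05
inverted = ~x               # -16, that is -x - 1

n = 3
left = x << n               # 120, same as x * pow(2, n)
right = y >> n              # 6, same as y // pow(2, n)

try:
    shift = -1
    x << shift
    negative_shift = 'allowed'
except ValueError:
    negative_shift = 'ValueError'   # note (1)
\end{verbatim}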
 346
 347
 348 \subsection{Iterator Types \label{typeiter}}
 349
 350 \versionadded{2.2}
 351 \index{iterator protocol}
 352 \index{protocol!iterator}
 353 \index{sequence!iteration}
 354 \index{container!iteration over}
 355
 356 Python supports a concept of iteration over containers.  This is
 357 implemented using two distinct methods; these are used to allow
 358 user-defined classes to support iteration.  Sequences, described below
 359 in more detail, always support the iteration methods.
 360
 361 One method needs to be defined for container objects to provide
 362 iteration support:
 363
 364 \begin{methoddesc}[container]{__iter__}{}
 365   Return an iterator object.  The object is required to support the
 366   iterator protocol described below.  If a container supports
 367   different types of iteration, additional methods can be provided to
 368   specifically request iterators for those iteration types.  (An
 369   example of an object supporting multiple forms of iteration would be
 370   a tree structure which supports both breadth-first and depth-first
 371   traversal.)  This method corresponds to the \member{tp_iter} slot of
 372   the type structure for Python objects in the Python/C API.
 373 \end{methoddesc}
 374
 375 The iterator objects themselves are required to support the following
 376 two methods, which together form the \dfn{iterator protocol}:
 377
 378 \begin{methoddesc}[iterator]{__iter__}{}
 379   Return the iterator object itself.  This is required to allow both
 380   containers and iterators to be used with the \keyword{for} and
 381   \keyword{in} statements.  This method corresponds to the
 382   \member{tp_iter} slot of the type structure for Python objects in
 383   the Python/C API.
 384 \end{methoddesc}
 385
 386 \begin{methoddesc}[iterator]{next}{}
 387   Return the next item from the container.  If there are no further
 388   items, raise the \exception{StopIteration} exception.  This method
 389   corresponds to the \member{tp_iternext} slot of the type structure
 390   for Python objects in the Python/C API.
 391 \end{methoddesc}
 392
 393 Python defines several iterator objects to support iteration over
 394 general and specific sequence types, dictionaries, and other more
 395 specialized forms.  The specific types are not important beyond their
 396 implementation of the iterator protocol.
 397
 398 The intention of the protocol is that once an iterator's
 399 \method{next()} method raises \exception{StopIteration}, it will
 400 continue to do so on subsequent calls.  Implementations that
 401 do not obey this property are deemed broken.  (This constraint
 402 was added in Python 2.3; in Python 2.2, various iterators are
 403 broken according to this rule.)
 404
 405 Python's generators provide a convenient way to implement the
 406 iterator protocol.  If a container object's \method{__iter__()}
 407 method is implemented as a generator, it will automatically
 408 return an iterator object (technically, a generator object)
 409 supplying the \method{__iter__()} and \method{next()} methods.
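
The two halves of the protocol might be sketched as follows.  Both
classes are hypothetical: \class{Countdown} is an iterator written by
hand, while \class{Tree} is a container whose \method{__iter__()} is a
generator.  The \code{__next__ = next} alias is an assumption added so
that the sketch also runs on later interpreter versions that spell the
method differently.

\begin{verbatim}
class Countdown:
    # Hypothetical iterator counting down from start to 1.
    def __init__(self, start):
        self.current = start
    def __iter__(self):
        return self                 # an iterator returns itself
    def next(self):
        if self.current <= 0:
            raise StopIteration     # keeps raising once exhausted
        self.current -= 1
        return self.current + 1
    __next__ = next                 # assumption: alias for later versions

class Tree:
    # Hypothetical container whose __iter__() is a generator.
    def __init__(self, leaves):
        self.leaves = leaves
    def __iter__(self):
        for leaf in self.leaves:
            yield leaf

counted = list(Countdown(3))                # [3, 2, 1]
leaves = [leaf for leaf in Tree('abc')]     # ['a', 'b', 'c']

it = Countdown(1)
first_pass = list(it)                       # [1]
second_pass = list(it)                      # []: still exhausted
\end{verbatim}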
 410
 411
 412 \subsection{Sequence Types ---
 413             \class{str}, \class{unicode}, \class{list},
 414             \class{tuple}, \class{buffer}, \class{xrange}
 415             \label{typesseq}}
 416
 417 There are six sequence types: strings, Unicode strings, lists,
 418 tuples, buffers, and xrange objects.
 419
 420 String literals are written in single or double quotes:
 421 \code{'xyzzy'}, \code{"frobozz"}.  See chapter 2 of the
 422 \citetitle[../ref/strings.html]{Python Reference Manual} for more about
 423 string literals.  Unicode strings are much like strings, but are
 424 specified in the syntax using a preceding \character{u} character:
 425 \code{u'abc'}, \code{u"def"}.  Lists are constructed with square brackets,
 426 separating items with commas: \code{[a, b, c]}.  Tuples are
 427 constructed by the comma operator (not within square brackets), with
 428 or without enclosing parentheses, but an empty tuple must have the
 429 enclosing parentheses, such as \code{a, b, c} or \code{()}.  A single
 430 item tuple must have a trailing comma, such as \code{(d,)}.
 431 \obindex{sequence}
 432 \obindex{string}
 433 \obindex{Unicode}
 434 \obindex{tuple}
 435 \obindex{list}
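
A brief sketch of the tuple and list syntax just described; the names
are illustrative only:

\begin{verbatim}
bare = 1, 2, 3              # parentheses are optional
wrapped = (1, 2, 3)
empty = ()                  # the empty tuple needs parentheses
single = ('d',)             # the trailing comma makes the tuple
not_a_tuple = ('d')         # just a parenthesized string
items = ['a', 'b', 'c']     # a list, built with square brackets
\end{verbatim}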
 436
 437 Buffer objects are not directly supported by Python syntax, but can be
created by calling the built-in function
 439 \function{buffer()}.\bifuncindex{buffer}  They don't support
 440 concatenation or repetition.
 441 \obindex{buffer}
 442
 443 Xrange objects are similar to buffers in that there is no specific
 444 syntax to create them, but they are created using the \function{xrange()}
 445 function.\bifuncindex{xrange}  They don't support slicing,
 446 concatenation or repetition, and using \code{in}, \code{not in},
 447 \function{min()} or \function{max()} on them is inefficient.
 448 \obindex{xrange}
 449
 450 Most sequence types support the following operations.  The \samp{in} and
 451 \samp{not in} operations have the same priorities as the comparison
 452 operations.  The \samp{+} and \samp{*} operations have the same
priority as the corresponding numeric operations.\footnote{They must
have the same priority, since the parser can't tell the type of the operands.}
 455
 456 This table lists the sequence operations sorted in ascending priority
 457 (operations in the same box have the same priority).  In the table,
 458 \var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
 459 and \var{j} are integers:
 460
 461 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 462   \lineiii{\var{x} in \var{s}}{\code{True} if an item of \var{s} is equal to \var{x}, else \code{False}}{(1)}
 463   \lineiii{\var{x} not in \var{s}}{\code{False} if an item of \var{s} is
 464 equal to \var{x}, else \code{True}}{(1)}
 465   \hline
 466   \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{(6)}
 467   \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
 468   \hline
 469   \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
 470   \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
 471   \lineiii{\var{s}[\var{i}:\var{j}:\var{k}]}{slice of \var{s} from \var{i} to \var{j} with step \var{k}}{(3), (5)}
 472   \hline
 473   \lineiii{len(\var{s})}{length of \var{s}}{}
 474   \lineiii{min(\var{s})}{smallest item of \var{s}}{}
 475   \lineiii{max(\var{s})}{largest item of \var{s}}{}
 476 \end{tableiii}
 477 \indexiii{operations on}{sequence}{types}
 478 \bifuncindex{len}
 479 \bifuncindex{min}
 480 \bifuncindex{max}
 481 \indexii{concatenation}{operation}
 482 \indexii{repetition}{operation}
 483 \indexii{subscript}{operation}
 484 \indexii{slice}{operation}
 485 \indexii{extended slice}{operation}
 486 \opindex{in}
 487 \opindex{not in}
 488
 489 \noindent
 490 Notes:
 491
 492 \begin{description}
 493 \item[(1)] When \var{s} is a string or Unicode string object the
 494 \code{in} and \code{not in} operations act like a substring test.  In
 495 Python versions before 2.3, \var{x} had to be a string of length 1.
 496 In Python 2.3 and beyond, \var{x} may be a string of any length.
 497
 498 \item[(2)] Values of \var{n} less than \code{0} are treated as
 499   \code{0} (which yields an empty sequence of the same type as
 500   \var{s}).  Note also that the copies are shallow; nested structures
 501   are not copied.  This often haunts new Python programmers; consider:
 502
 503 \begin{verbatim}
 504 >>> lists = [[]] * 3
 505 >>> lists
 506 [[], [], []]
 507 >>> lists[0].append(3)
 508 >>> lists
 509 [[3], [3], [3]]
 510 \end{verbatim}
 511
 512   What has happened is that \code{[[]]} is a one-element list containing
 513   an empty list, so all three elements of \code{[[]] * 3} are (pointers to)
 514   this single empty list.  Modifying any of the elements of \code{lists}
 515   modifies this single list.  You can create a list of different lists this
 516   way:
 517
 518 \begin{verbatim}
 519 >>> lists = [[] for i in range(3)]
 520 >>> lists[0].append(3)
 521 >>> lists[1].append(5)
 522 >>> lists[2].append(7)
 523 >>> lists
 524 [[3], [5], [7]]
 525 \end{verbatim}
 526
 527 \item[(3)] If \var{i} or \var{j} is negative, the index is relative to
  the end of the sequence: \code{len(\var{s}) + \var{i}} or
 529   \code{len(\var{s}) + \var{j}} is substituted.  But note that \code{-0} is
 530   still \code{0}.
 531
 532 \item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
 533   the sequence of items with index \var{k} such that \code{\var{i} <=
 534   \var{k} < \var{j}}.  If \var{i} or \var{j} is greater than
 535   \code{len(\var{s})}, use \code{len(\var{s})}.  If \var{i} is omitted,
 536   use \code{0}.  If \var{j} is omitted, use \code{len(\var{s})}.  If
 537   \var{i} is greater than or equal to \var{j}, the slice is empty.
 538
 539 \item[(5)] The slice of \var{s} from \var{i} to \var{j} with step
 540   \var{k} is defined as the sequence of items with index
 541   \code{\var{x} = \var{i} + \var{n}*\var{k}} such that
 542   $0 \leq n < \frac{j-i}{k}$.  In other words, the indices
 543   are \code{i}, \code{i+k}, \code{i+2*k}, \code{i+3*k} and so on, stopping when
 544   \var{j} is reached (but never including \var{j}).  If \var{i} or \var{j}
 545   is greater than \code{len(\var{s})}, use \code{len(\var{s})}.  If
  \var{i} or \var{j} is omitted, it becomes an ``end'' value
  (which end depends on the sign of \var{k}).  Note that \var{k} cannot
  be zero.
 549
 550 \item[(6)] If \var{s} and \var{t} are both strings, some Python
 551 implementations such as CPython can usually perform an in-place optimization
 552 for assignments of the form \code{\var{s}=\var{s}+\var{t}} or
 553 \code{\var{s}+=\var{t}}.  When applicable, this optimization makes
 554 quadratic run-time much less likely.  This optimization is both version
 555 and implementation dependent.  For performance sensitive code, it is
 556 preferable to use the \method{str.join()} method which assures consistent
 557 linear concatenation performance across versions and implementations.
 558 \versionchanged[Formerly, string concatenation never occurred in-place]{2.4}
 559
 560 \end{description}
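
The indexing and slicing rules in notes (1) through (5) can be tried
on a short string.  This is a sketch added for illustration, with the
expected values given in the comments:

\begin{verbatim}
s = 'abcdefgh'

last = s[-1]                # 'h': same as s[len(s) - 1]
middle = s[2:5]             # 'cde': indices 2, 3 and 4
clipped = s[5:100]          # 'fgh': j is clipped to len(s)
empty = s[5:2]              # '': i >= j gives an empty slice
stepped = s[1:7:2]          # 'bdf': indices 1, 3 and 5
backwards = s[::-1]         # 'hgfedcba': omitted ends, negative step
nothing = [1, 2] * -1       # []: negative n is treated as 0
has_sub = 'cde' in s        # True: substring test
\end{verbatim}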
 561
 562
 563 \subsubsection{String Methods \label{string-methods}}
 564
 565 These are the string methods which both 8-bit strings and Unicode
 566 objects support:
 567
 568 \begin{methoddesc}[string]{capitalize}{}
 569 Return a copy of the string with only its first character capitalized.
 570
 571 For 8-bit strings, this method is locale-dependent.
 572 \end{methoddesc}
 573
 574 \begin{methoddesc}[string]{center}{width\optional{, fillchar}}
Return the string centered in a string of length \var{width}. Padding is done
 576 using the specified \var{fillchar} (default is a space).
 577 \versionchanged[Support for the \var{fillchar} argument]{2.4}
 578 \end{methoddesc}
 579
 580 \begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
 581 Return the number of occurrences of substring \var{sub} in string
\code{\var{s}[\var{start}:\var{end}]}.  Optional arguments \var{start} and
 583 \var{end} are interpreted as in slice notation.
 584 \end{methoddesc}
 585
 586 \begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
Decode the string using the codec registered for \var{encoding}.
\var{encoding} defaults to the default string encoding.  \var{errors}
may be given to set a different error handling scheme.  The default is
\code{'strict'}, meaning that decoding errors raise
\exception{UnicodeError}.  Other possible values are \code{'ignore'},
\code{'replace'} and any other name registered via
\function{codecs.register_error}; see section~\ref{codec-base-classes}.
 594 \versionadded{2.2}
 595 \versionchanged[Support for other error handling schemes added]{2.3}
 596 \end{methoddesc}
 597
\begin{methoddesc}[string]{encode}{\optional{encoding\optional{, errors}}}
 599 Return an encoded version of the string.  Default encoding is the current
 600 default string encoding.  \var{errors} may be given to set a different
 601 error handling scheme.  The default for \var{errors} is
 602 \code{'strict'}, meaning that encoding errors raise a
 603 \exception{UnicodeError}.  Other possible values are \code{'ignore'},
 604 \code{'replace'}, \code{'xmlcharrefreplace'}, \code{'backslashreplace'}
 605 and any other name registered via \function{codecs.register_error},
 606 see section~\ref{codec-base-classes}.
 607 For a list of possible encodings, see section~\ref{standard-encodings}.
 608 \versionadded{2.0}
 609 \versionchanged[Support for \code{'xmlcharrefreplace'} and
 610 \code{'backslashreplace'} and other error handling schemes added]{2.3}
 611 \end{methoddesc}
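
A sketch of the error handling schemes, applied to a Unicode string
with one non-ASCII character.  Because the concrete type of an encoded
result depends on the interpreter version, the comments describe the
content rather than the exact representation; that framing is an
assumption of this example, not a statement from the original text.

\begin{verbatim}
text = u'caf\xe9'           # ends in LATIN SMALL LETTER E WITH ACUTE

try:
    text.encode('ascii')    # the default scheme is 'strict'
    strict_result = 'encoded'
except UnicodeError:
    strict_result = 'UnicodeError'

ignored = text.encode('ascii', 'ignore')              # character dropped
replaced = text.encode('ascii', 'replace')            # replaced by '?'
xml_refs = text.encode('ascii', 'xmlcharrefreplace')  # becomes '&#233;'
escaped = text.encode('ascii', 'backslashreplace')    # becomes '\xe9' escape

round_trip = text.encode('utf-8').decode('utf-8')     # equal to text
\end{verbatim}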
 612
 613 \begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
 614 Return \code{True} if the string ends with the specified \var{suffix},
 615 otherwise return \code{False}.  With optional \var{start}, test beginning at
 616 that position.  With optional \var{end}, stop comparing at that position.
 617 \end{methoddesc}
 618
 619 \begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
 620 Return a copy of the string where all tab characters are expanded
 621 using spaces.  If \var{tabsize} is not given, a tab size of \code{8}
 622 characters is assumed.
 623 \end{methoddesc}
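
A minimal sketch of \method{endswith()} and \method{expandtabs()}; the
file name is a made-up value used only for illustration:

\begin{verbatim}
name = 'report.tex'
is_tex = name.endswith('.tex')          # True
window = name.endswith('port', 0, 6)    # True: only name[0:6] is examined

default_tabs = 'a\tb'.expandtabs()      # 'a', seven spaces, then 'b'
narrow_tabs = 'a\tb'.expandtabs(4)      # 'a', three spaces, then 'b'
\end{verbatim}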
 624
 625 \begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
 626 Return the lowest index in the string where substring \var{sub} is
found, such that \var{sub} is contained in the slice
\code{\var{s}[\var{start}:\var{end}]}.  Optional arguments \var{start}
and \var{end} are interpreted as in slice notation.  Return \code{-1}
if \var{sub} is not found.
 631 \end{methoddesc}
 632
 633 \begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
 634 Like \method{find()}, but raise \exception{ValueError} when the
 635 substring is not found.
 636 \end{methoddesc}
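
The searching methods \method{count()}, \method{find()} and
\method{index()} differ mainly in what they report.  A sketch on an
arbitrary sample string:

\begin{verbatim}
s = 'spam, eggs and spam'

total = s.count('spam')             # 2
after_first = s.count('spam', 1)    # 1: only s[1:] is searched
first = s.find('spam')              # 0
later = s.find('spam', 1)           # 15
missing = s.find('ham')             # -1

try:
    s.index('ham')
    index_result = 'found'
except ValueError:
    index_result = 'ValueError'     # index() raises instead of returning -1
\end{verbatim}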
 637
 638 \begin{methoddesc}[string]{isalnum}{}
 639 Return true if all characters in the string are alphanumeric and there
 640 is at least one character, false otherwise.
 641
 642 For 8-bit strings, this method is locale-dependent.
 643 \end{methoddesc}
 644
 645 \begin{methoddesc}[string]{isalpha}{}
 646 Return true if all characters in the string are alphabetic and there
 647 is at least one character, false otherwise.
 648
 649 For 8-bit strings, this method is locale-dependent.
 650 \end{methoddesc}
 651
 652 \begin{methoddesc}[string]{isdigit}{}
 653 Return true if all characters in the string are digits and there
 654 is at least one character, false otherwise.
 655
 656 For 8-bit strings, this method is locale-dependent.
 657 \end{methoddesc}
 658
 659 \begin{methoddesc}[string]{islower}{}
 660 Return true if all cased characters in the string are lowercase and
 661 there is at least one cased character, false otherwise.
 662
 663 For 8-bit strings, this method is locale-dependent.
 664 \end{methoddesc}
 665
 666 \begin{methoddesc}[string]{isspace}{}
 667 Return true if there are only whitespace characters in the string and
 668 there is at least one character, false otherwise.
 669
 670 For 8-bit strings, this method is locale-dependent.
 671 \end{methoddesc}
 672
 673 \begin{methoddesc}[string]{istitle}{}
Return true if the string is a titlecased string and there is at least one
character; for example, uppercase characters may only follow uncased
characters and lowercase characters only cased ones.  Return false
otherwise.
 678
 679 For 8-bit strings, this method is locale-dependent.
 680 \end{methoddesc}
 681
 682 \begin{methoddesc}[string]{isupper}{}
 683 Return true if all cased characters in the string are uppercase and
 684 there is at least one cased character, false otherwise.
 685
 686 For 8-bit strings, this method is locale-dependent.
 687 \end{methoddesc}
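
The predicate methods above can be compared side by side.  This sketch
uses only ASCII sample strings, on the assumption that their
classification does not vary with the locale:

\begin{verbatim}
checks = {
    'alnum': 'abc123'.isalnum(),            # True
    'alpha': 'abc123'.isalpha(),            # False: contains digits
    'digit': '2004'.isdigit(),              # True
    'lower': 'python 2.4'.islower(),        # True: uncased characters ignored
    'space': ' \t\n'.isspace(),             # True
    'title': 'Built-In Types'.istitle(),    # True
    'upper': 'HTTP/1.1'.isupper(),          # True
    'empty': ''.isalnum(),                  # False: needs one character
}
\end{verbatim}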
 688
 689 \begin{methoddesc}[string]{join}{seq}
 690 Return a string which is the concatenation of the strings in the
 691 sequence \var{seq}.  The separator between elements is the string
 692 providing this method.
 693 \end{methoddesc}
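
A short sketch of \method{join()}, including the common idiom of
collecting pieces in a list and joining them once at the end; the
sample values are arbitrary:

\begin{verbatim}
words = ['built-in', 'types']
joined = ' '.join(words)                # 'built-in types'
csv_line = ','.join(['a', 'b', 'c'])    # 'a,b,c'

# Collect the pieces first, then join them in a single call.
pieces = []
for i in range(5):
    pieces.append(str(i))
digits = ''.join(pieces)                # '01234'
\end{verbatim}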
 694
 695 \begin{methoddesc}[string]{ljust}{width\optional{, fillchar}}
 696 Return the string left justified in a string of length \var{width}.
 697 Padding is done using the specified \var{fillchar} (default is a
 698 space).  The original string is returned if
 699 \var{width} is less than \code{len(\var{s})}.
 700 \versionchanged[Support for the \var{fillchar} argument]{2.4}
 701 \end{methoddesc}
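
Padding with \method{center()} and \method{ljust()} might look as
follows.  The widths are chosen so that the padding divides evenly,
which keeps the sketch independent of how an odd remainder is placed:

\begin{verbatim}
centered = 'ab'.center(6)           # '  ab  '
starred = 'ab'.center(6, '*')       # '**ab**'
left = 'ab'.ljust(5, '.')           # 'ab...'
unchanged = 'abcdef'.ljust(3)       # 'abcdef': width is less than len(s)
\end{verbatim}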
 702
 703 \begin{methoddesc}[string]{lower}{}
 704 Return a copy of the string converted to lowercase.
 705
 706 For 8-bit strings, this method is locale-dependent.
 707 \end{methoddesc}
 708
 709 \begin{methoddesc}[string]{lstrip}{\optional{chars}}
 710 Return a copy of the string with leading characters removed.  The
 711 \var{chars} argument is a string specifying the set of characters
 712 to be removed.  If omitted or \code{None}, the \var{chars} argument
 713 defaults to removing whitespace.  The \var{chars} argument is not
 714 a prefix; rather, all combinations of its values are stripped:
 715 \begin{verbatim}
 716     >>> '   spacious   '.lstrip()
 717     'spacious   '
 718     >>> 'www.example.com'.lstrip('cmowz.')
 719     'example.com'
 720 \end{verbatim}
 721 \versionchanged[Support for the \var{chars} argument]{2.2.2}
 722 \end{methoddesc}
 723
 724 \begin{methoddesc}[string]{replace}{old, new\optional{, count}}
 725 Return a copy of the string with all occurrences of substring
 726 \var{old} replaced by \var{new}.  If the optional argument
 727 \var{count} is given, only the first \var{count} occurrences are
 728 replaced.
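
For instance (sample text chosen only for illustration):

\begin{verbatim}
text = 'spam spam spam'
all_replaced = text.replace('spam', 'eggs')      # 'eggs eggs eggs'
# With count=2 only the first two occurrences are replaced.
first_two = text.replace('spam', 'eggs', 2)      # 'eggs eggs spam'
# The original string is left unchanged; a copy is returned.
\end{verbatim}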
 729 \end{methoddesc}
 730
 731 \begin{methoddesc}[string]{rfind}{sub \optional{,start \optional{,end}}}
 732 Return the highest index in the string where substring \var{sub} is
found, such that \var{sub} is contained within \code{\var{s}[\var{start}:\var{end}]}.  Optional
 734 arguments \var{start} and \var{end} are interpreted as in slice
 735 notation.  Return \code{-1} on failure.
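
A small sketch of the search from the right (the string \code{'abcabc'}
is just a sample value):

\begin{verbatim}
s = 'abcabc'
last = s.rfind('bc')             # 4: the rightmost occurrence
# Restricting the search to s[0:4] ('abca') finds the earlier one.
bounded = s.rfind('bc', 0, 4)    # 1
missing = s.rfind('x')           # -1: not found
\end{verbatim}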
 736 \end{methoddesc}
 737
 738 \begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
 739 Like \method{rfind()} but raises \exception{ValueError} when the
 740 substring \var{sub} is not found.
 741 \end{methoddesc}
 742
 743 \begin{methoddesc}[string]{rjust}{width\optional{, fillchar}}
 744 Return the string right justified in a string of length \var{width}.
 745 Padding is done using the specified \var{fillchar} (default is a space).
 746 The original string is returned if
 747 \var{width} is less than \code{len(\var{s})}.
 748 \versionchanged[Support for the \var{fillchar} argument]{2.4}
 749 \end{methoddesc}
 750
 751 \begin{methoddesc}[string]{rsplit}{\optional{sep \optional{,maxsplit}}}
 752 Return a list of the words in the string, using \var{sep} as the
 753 delimiter string.  If \var{maxsplit} is given, at most \var{maxsplit}
 754 splits are done, the \emph{rightmost} ones.  If \var{sep} is not specified
 755 or \code{None}, any whitespace string is a separator.  Except for splitting
 756 from the right, \method{rsplit()} behaves like \method{split()} which
 757 is described in detail below.
 758 \versionadded{2.4}
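
The difference from \method{split()} only shows when \var{maxsplit}
limits the number of splits, as in this sketch (the path is an
arbitrary sample string):

\begin{verbatim}
path = '/usr/local/bin'
# One split, counted from the right: separates the last component.
from_right = path.rsplit('/', 1)    # ['/usr/local', 'bin']
# One split, counted from the left: separates the first component.
from_left = path.split('/', 1)      # ['', 'usr/local/bin']
\end{verbatim}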
 759 \end{methoddesc}
 760
 761 \begin{methoddesc}[string]{rstrip}{\optional{chars}}
 762 Return a copy of the string with trailing characters removed.  The
 763 \var{chars} argument is a string specifying the set of characters
 764 to be removed.  If omitted or \code{None}, the \var{chars} argument
 765 defaults to removing whitespace.  The \var{chars} argument is not
 766 a suffix; rather, all combinations of its values are stripped:
 767 \begin{verbatim}
 768     >>> '   spacious   '.rstrip()
 769     '   spacious'
 770     >>> 'mississippi'.rstrip('ipz')
 771     'mississ'
 772 \end{verbatim}
 773 \versionchanged[Support for the \var{chars} argument]{2.2.2}
 774 \end{methoddesc}
 775
 776 \begin{methoddesc}[string]{split}{\optional{sep \optional{,maxsplit}}}
 777 Return a list of the words in the string, using \var{sep} as the
 778 delimiter string.  If \var{maxsplit} is given, at most \var{maxsplit}
splits are done (thus, the list will have at most \code{\var{maxsplit}+1}
 780 elements).  If \var{maxsplit} is not specified, then there
 781 is no limit on the number of splits (all possible splits are made).
 782 Consecutive delimiters are not grouped together and are
 783 deemed to delimit empty strings (for example, \samp{'1,,2'.split(',')}
 784 returns \samp{['1', '', '2']}).  The \var{sep} argument may consist of
 785 multiple characters (for example, \samp{'1, 2, 3'.split(', ')} returns
 786 \samp{['1', '2', '3']}).  Splitting an empty string with a specified
 787 separator returns \samp{['']}.
 788
 789 If \var{sep} is not specified or is \code{None}, a different splitting
 790 algorithm is applied.  First, whitespace characters (spaces, tabs,
 791 newlines, returns, and formfeeds) are stripped from both ends.  Then,
 792 words are separated by arbitrary length strings of whitespace
 793 characters. Consecutive whitespace delimiters are treated as a single
 794 delimiter (\samp{'1  2  3'.split()} returns \samp{['1', '2', '3']}).
 795 Splitting an empty string or a string consisting of just whitespace
 796 returns an empty list.
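
The two algorithms can be compared side by side; the following is a
brief sketch using made-up sample strings:

\begin{verbatim}
# Explicit separator: at most maxsplit splits, empty fields are kept.
limited = 'a,b,c'.split(',', 1)           # ['a', 'b,c']
empty_sep = ''.split(',')                 # ['']

# No separator (or None): runs of whitespace act as one delimiter
# and leading/trailing whitespace is ignored.
words = '  1  2  3  '.split()             # ['1', '2', '3']
two_splits = 'a b  c   d'.split(None, 2)  # ['a', 'b', 'c   d']
blank = '   '.split()                     # []
\end{verbatim}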
 797 \end{methoddesc}
 798
 799 \begin{methoddesc}[string]{splitlines}{\optional{keepends}}
 800 Return a list of the lines in the string, breaking at line
 801 boundaries.  Line breaks are not included in the resulting list unless
 802 \var{keepends} is given and true.
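
For example (sample text only; \var{keepends} is passed positionally
here):

\begin{verbatim}
text = 'one\ntwo\r\nthree'
lines = text.splitlines()          # ['one', 'two', 'three']
# With a true keepends the line breaks are retained.
kept = text.splitlines(True)       # ['one\n', 'two\r\n', 'three']
\end{verbatim}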
 803 \end{methoddesc}
 804
 805 \begin{methoddesc}[string]{startswith}{prefix\optional{,
 806                                        start\optional{, end}}}
Return \code{True} if the string starts with \var{prefix}, otherwise
return \code{False}.  With optional \var{start}, test the string beginning at
that position.  With optional \var{end}, stop comparing the string at that
 810 position.
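
A short sketch of the optional positions (the string \code{'python'}
is just a sample value):

\begin{verbatim}
s = 'python'
plain = s.startswith('py')           # True
# Start testing at index 2, i.e. compare against 'thon'.
shifted = s.startswith('th', 2)      # True
# Only s[2:3] ('t') is considered, which is too short for 'th'.
clipped = s.startswith('th', 2, 3)   # False
\end{verbatim}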
 811 \end{methoddesc}
 812
 813 \begin{methoddesc}[string]{strip}{\optional{chars}}
 814 Return a copy of the string with the leading and trailing characters
 815 removed.  The \var{chars} argument is a string specifying the set of
 816 characters to be removed.  If omitted or \code{None}, the \var{chars}
 817 argument defaults to removing whitespace.  The \var{chars} argument is not
 818 a prefix or suffix; rather, all combinations of its values are stripped:
 819 \begin{verbatim}
 820     >>> '   spacious   '.strip()
 821     'spacious'
 822     >>> 'www.example.com'.strip('cmowz.')
 823     'example'
 824 \end{verbatim}
 825 \versionchanged[Support for the \var{chars} argument]{2.2.2}
 826 \end{methoddesc}
 827
 828 \begin{methoddesc}[string]{swapcase}{}
 829 Return a copy of the string with uppercase characters converted to
 830 lowercase and vice versa.
 831
 832 For 8-bit strings, this method is locale-dependent.
 833 \end{methoddesc}
 834
 835 \begin{methoddesc}[string]{title}{}
 836 Return a titlecased version of the string: words start with uppercase
 837 characters, all remaining cased characters are lowercase.
 838
 839 For 8-bit strings, this method is locale-dependent.
 840 \end{methoddesc}
 841
 842 \begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
 843 Return a copy of the string where all characters occurring in the
 844 optional argument \var{deletechars} are removed, and the remaining
 845 characters have been mapped through the given translation table, which
 846 must be a string of length 256.
 847
 848 For Unicode objects, the \method{translate()} method does not
 849 accept the optional \var{deletechars} argument.  Instead, it
returns a copy of the string where all characters have been mapped
 851 through the given translation table which must be a mapping of
 852 Unicode ordinals to Unicode ordinals, Unicode strings or \code{None}.
 853 Unmapped characters are left untouched. Characters mapped to \code{None}
 854 are deleted.  Note, a more flexible approach is to create a custom
 855 character mapping codec using the \refmodule{codecs} module (see
 856 \module{encodings.cp1251} for an example).
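
The Unicode form can be sketched as follows; the mapping \code{table}
is a made-up example showing one entry of each permitted kind:

\begin{verbatim}
table = {
    ord(u'h'): ord(u'j'),   # ordinal -> ordinal
    ord(u'l'): u'LL',       # ordinal -> Unicode string
    ord(u'o'): None,        # ordinal -> None: character is deleted
}
# u'e' has no entry in the table and is left untouched.
mapped = u'hello'.translate(table)      # u'jeLLLL'
\end{verbatim}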
 857 \end{methoddesc}
 858
 859 \begin{methoddesc}[string]{upper}{}
 860 Return a copy of the string converted to uppercase.
 861
 862 For 8-bit strings, this method is locale-dependent.
 863 \end{methoddesc}
 864
 865 \begin{methoddesc}[string]{zfill}{width}
 866 Return the numeric string left filled with zeros in a string
 867 of length \var{width}. The original string is returned if
 868 \var{width} is less than \code{len(\var{s})}.
 869 \versionadded{2.2.2}
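
For example (sample values only; note that a leading sign is kept in
front of the zeros):

\begin{verbatim}
padded = '42'.zfill(5)         # '00042'
negative = '-42'.zfill(5)      # '-0042'
# A width smaller than the string length returns the string as is.
unchanged = '12345'.zfill(3)   # '12345'
\end{verbatim}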
 870 \end{methoddesc}
 871
 872
 873 \subsubsection{String Formatting Operations \label{typesseq-strings}}
 874
 875 \index{formatting, string (\%{})}
 876 \index{interpolation, string (\%{})}
 877 \index{string!formatting}
 878 \index{string!interpolation}
 879 \index{printf-style formatting}
 880 \index{sprintf-style formatting}
 881 \index{\protect\%{} formatting}
 882 \index{\protect\%{} interpolation}
 883
 884 String and Unicode objects have one unique built-in operation: the
 885 \code{\%} operator (modulo).  This is also known as the string
 886 \emph{formatting} or \emph{interpolation} operator.  Given
 887 \code{\var{format} \% \var{values}} (where \var{format} is a string or
 888 Unicode object), \code{\%} conversion specifications in \var{format}
 889 are replaced with zero or more elements of \var{values}.  The effect
 890 is similar to the using \cfunction{sprintf()} in the C language.  If
 891 \var{format} is a Unicode object, or if any of the objects being
 892 converted using the \code{\%s} conversion are Unicode objects, the
 893 result will also be a Unicode object.
 894
 895 If \var{format} requires a single argument, \var{values} may be a
 896 single non-tuple object.\footnote{To format only a tuple you
 897 should therefore provide a singleton tuple whose only element
 898 is the tuple to be formatted.}  Otherwise, \var{values} must be a tuple with
 899 exactly the number of items specified by the format string, or a
 900 single mapping object (for example, a dictionary).
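
The rules for \var{values} can be sketched like this (all names and
values below are illustrative only):

\begin{verbatim}
single = '%s' % 'spam'               # a single non-tuple object

pair = (1, 2)
two_items = '%s and %s' % pair       # '1 and 2': one item per specifier
as_tuple = '%s' % (pair,)            # '(1, 2)': singleton tuple wrapper

# Passing the bare tuple to a single %s supplies too many items.
try:
    '%s' % pair
    unwrapped_ok = True
except TypeError:
    unwrapped_ok = False

by_name = '%(count)d items' % {'count': 3}    # '3 items'
\end{verbatim}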
 901
 902 A conversion specifier contains two or more characters and has the
 903 following components, which must occur in this order:
 904
 905 \begin{enumerate}
 906   \item  The \character{\%} character, which marks the start of the
 907          specifier.
 908   \item  Mapping key (optional), consisting of a parenthesised sequence
 909          of characters (for example, \code{(somename)}).
 910   \item  Conversion flags (optional), which affect the result of some
 911          conversion types.
 912   \item  Minimum field width (optional).  If specified as an
 913          \character{*} (asterisk), the actual width is read from the
 914          next element of the tuple in \var{values}, and the object to
 915          convert comes after the minimum field width and optional
 916          precision.
 917   \item  Precision (optional), given as a \character{.} (dot) followed
 918          by the precision.  If specified as \character{*} (an
         asterisk), the actual precision is read from the next element of
 920          the tuple in \var{values}, and the value to convert comes after
 921          the precision.
 922   \item  Length modifier (optional).
 923   \item  Conversion type.
 924 \end{enumerate}
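
As a brief sketch of the \character{*} forms (the numbers are arbitrary
sample values), the width and precision are taken from the tuple
before the value to convert:

\begin{verbatim}
# Width 6 is read from the tuple, then 42 is converted.
padded = '%*d' % (6, 42)                 # '    42'
# Precision 2 is read from the tuple, then the float is converted.
fixed = '%.*f' % (2, 3.14159)            # '3.14'
# Width first, then precision, then the value.
both = '%*.*f' % (8, 3, 3.14159)         # '   3.142'
\end{verbatim}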
 925
 926 When the right argument is a dictionary (or other mapping type), then
 927 the formats in the string \emph{must} include a parenthesised mapping key into
 928 that dictionary inserted immediately after the \character{\%}
 929 character. The mapping key selects the value to be formatted from the
 930 mapping.  For example:
 931
 932 \begin{verbatim}
 933 >>> print '%(language)s has %(#)03d quote types.' % \
 934           {'language': "Python", "#": 2}
 935 Python has 002 quote types.
 936 \end{verbatim}
 937
 938 In this case no \code{*} specifiers may occur in a format (since they
 939 require a sequential parameter list).
 940
 941 The conversion flag characters are:
 942
 943 \begin{tableii}{c|l}{character}{Flag}{Meaning}
 944   \lineii{\#}{The value conversion will use the ``alternate form''
 945               (where defined below).}
 946   \lineii{0}{The conversion will be zero padded for numeric values.}
 947   \lineii{-}{The converted value is left adjusted (overrides
 948              the \character{0} conversion if both are given).}
 949   \lineii{{~}}{(a space) A blank should be left before a positive number
 950              (or empty string) produced by a signed conversion.}
 951   \lineii{+}{A sign character (\character{+} or \character{-}) will
             precede the conversion (overrides a ``space'' flag).}
 953 \end{tableii}
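
The flags can be tried out as in the following sketch (the values
42 and 255 are arbitrary):

\begin{verbatim}
alternate = '%#x' % 255        # '0xff'
zero_pad = '%05d' % 42         # '00042'
left = '%-5d|' % 42            # '42   |'
left_wins = '%-05d|' % 42      # '42   |'  ('-' overrides '0')
blank = '% d' % 42             # ' 42'
signed = '%+d' % 42            # '+42'
plus_wins = '%+ d' % 42        # '+42'     ('+' overrides the space)
\end{verbatim}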
 954
 955 A length modifier (\code{h}, \code{l}, or \code{L}) may be
 956 present, but is ignored as it is not necessary for Python.
 957
 958 The conversion types are:
 959
 960 \begin{tableiii}{c|l|c}{character}{Conversion}{Meaning}{Notes}
 961   \lineiii{d}{Signed integer decimal.}{}
 962   \lineiii{i}{Signed integer decimal.}{}
 963   \lineiii{o}{Unsigned octal.}{(1)}
 964   \lineiii{u}{Unsigned decimal.}{}
 965   \lineiii{x}{Unsigned hexadecimal (lowercase).}{(2)}
 966   \lineiii{X}{Unsigned hexadecimal (uppercase).}{(2)}
 967   \lineiii{e}{Floating point exponential format (lowercase).}{}
 968   \lineiii{E}{Floating point exponential format (uppercase).}{}
 969   \lineiii{f}{Floating point decimal format.}{}
 970   \lineiii{F}{Floating point decimal format.}{}
  \lineiii{g}{Same as \character{e} if exponent is less than -4 or
              not less than precision, \character{f} otherwise.}{}
  \lineiii{G}{Same as \character{E} if exponent is less than -4 or
              not less than precision, \character{F} otherwise.}{}
 975   \lineiii{c}{Single character (accepts integer or single character
 976               string).}{}
  \lineiii{r}{String (converts any Python object using
              \function{repr()}).}{(3)}
  \lineiii{s}{String (converts any Python object using
              \function{str()}).}{(4)}
 981   \lineiii{\%}{No argument is converted, results in a \character{\%}
 982                character in the result.}{}
 983 \end{tableiii}
 984
 985 \noindent
 986 Notes:
 987 \begin{description}
 988   \item[(1)]
 989     The alternate form causes a leading zero (\character{0}) to be
 990     inserted between left-hand padding and the formatting of the
 991     number if the leading character of the result is not already a
 992     zero.
 993   \item[(2)]
 994     The alternate form causes a leading \code{'0x'} or \code{'0X'}
 995     (depending on whether the \character{x} or \character{X} format
 996     was used) to be inserted between left-hand padding and the
 997     formatting of the number if the leading character of the result is
 998     not already a zero.
 999   \item[(3)]
1000     The \code{\%r} conversion was added in Python 2.0.
1001   \item[(4)]
1002     If the object or format provided is a \class{unicode} string,
1003     the resulting string will also be \class{unicode}.
1004 \end{description}
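
A few of the conversion types in use; this is only a sketch with
arbitrary sample values, and the exact exponent formatting is assumed
to follow the platform's usual two-digit convention:

\begin{verbatim}
decimal = '%d' % 42                 # '42'
hex_lower = '%x' % 255              # 'ff'
hex_upper = '%X' % 255              # 'FF'
exponent = '%e' % 1234.5            # '1.234500e+03'
fixed = '%f' % 1234.5               # '1234.500000'
general = '%g' % 1234.5             # '1234.5'
general_big = '%g' % 1234567.0      # '1.23457e+06'
general_small = '%g' % 0.00001234   # '1.234e-05'
char_int = '%c' % 65                # 'A'
char_str = '%c' % 'A'               # 'A'
as_repr = '%r' % 'spam'             # "'spam'"
as_str = '%s' % 'spam'              # 'spam'
percent = '%d%%' % 50               # '50%'
\end{verbatim}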
1005
1006 % XXX Examples?
1007
1008 Since Python strings have an explicit length, \code{\%s} conversions
1009 do not assume that \code{'\e0'} is the end of the string.
1010
1011 For safety reasons, floating point precisions are clipped to 50;
1012 \code{\%f} conversions for numbers whose absolute value is over 1e25
1013 are replaced by \code{\%g} conversions.\footnote{
1014   These numbers are fairly arbitrary.  They are intended to
1015   avoid printing endless strings of meaningless digits without hampering
1016   correct use and without having to know the exact precision of floating
1017   point values on a particular machine.
1018 }  All other errors raise exceptions.
1019
1020 Additional string operations are defined in standard modules
1021 \refmodule{string}\refstmodindex{string}\ and
1022 \refmodule{re}.\refstmodindex{re}
1023
1024
1025 \subsubsection{XRange Type \label{typesseq-xrange}}
1026
1027 The \class{xrange}\obindex{xrange} type is an immutable sequence which
1028 is commonly used for looping.  The advantage of the \class{xrange}
1029 type is that an \class{xrange} object will always take the same amount
1030 of memory, no matter the size of the range it represents.  There are
1031 no consistent performance advantages.
1032
1033 XRange objects have very little behavior: they only support indexing,
1034 iteration, and the \function{len()} function.
1035
1036
1037 \subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
1038
1039 List objects support additional operations that allow in-place
1040 modification of the object.
1041 Other mutable sequence types (when added to the language) should
1042 also support these operations.
1043 Strings and tuples are immutable sequence types: such objects cannot
1044 be modified once created.
1045 The following operations are defined on mutable sequence types (where
1046 \var{x} is an arbitrary object):
1047 \indexiii{mutable}{sequence}{types}
1048 \obindex{list}
1049
1050 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1051   \lineiii{\var{s}[\var{i}] = \var{x}}
1052         {item \var{i} of \var{s} is replaced by \var{x}}{}
1053   \lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
1054         {slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
1055   \lineiii{del \var{s}[\var{i}:\var{j}]}
1056         {same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
1057   \lineiii{\var{s}[\var{i}:\var{j}:\var{k}] = \var{t}}
1058         {the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} are replaced by those of \var{t}}{(1)}
1059   \lineiii{del \var{s}[\var{i}:\var{j}:\var{k}]}
1060         {removes the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} from the list}{}
1061   \lineiii{\var{s}.append(\var{x})}
1062         {same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(2)}
1063   \lineiii{\var{s}.extend(\var{x})}
1064         {same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(3)}
1065   \lineiii{\var{s}.count(\var{x})}
1066     {return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
1067   \lineiii{\var{s}.index(\var{x}\optional{, \var{i}\optional{, \var{j}}})}
1068     {return smallest \var{k} such that \code{\var{s}[\var{k}] == \var{x}} and
1069     \code{\var{i} <= \var{k} < \var{j}}}{(4)}
1070   \lineiii{\var{s}.insert(\var{i}, \var{x})}
1071         {same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}}{(5)}
1072   \lineiii{\var{s}.pop(\optional{\var{i}})}
1073     {same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(6)}
1074   \lineiii{\var{s}.remove(\var{x})}
1075         {same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(4)}
1076   \lineiii{\var{s}.reverse()}
1077         {reverses the items of \var{s} in place}{(7)}
1078   \lineiii{\var{s}.sort(\optional{\var{cmp}\optional{,
1079                         \var{key}\optional{, \var{reverse}}}})}
1080         {sort the items of \var{s} in place}{(7), (8), (9), (10)}
1081 \end{tableiii}
1082 \indexiv{operations on}{mutable}{sequence}{types}
1083 \indexiii{operations on}{sequence}{types}
1084 \indexiii{operations on}{list}{type}
1085 \indexii{subscript}{assignment}
1086 \indexii{slice}{assignment}
1087 \indexii{extended slice}{assignment}
1088 \stindex{del}
1089 \withsubitem{(list method)}{
1090   \ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
1091   \ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
1092   \ttindex{sort()}}
1093 \noindent
1094 Notes:
1095 \begin{description}
1096 \item[(1)] \var{t} must have the same length as the slice it is
1097   replacing.
1098
1099 \item[(2)] The C implementation of Python has historically accepted
1100   multiple parameters and implicitly joined them into a tuple; this
1101   no longer works in Python 2.0.  Use of this misfeature has been
1102   deprecated since Python 1.4.
1103
1104 \item[(3)] \var{x} can be any iterable object.
1105
1106 \item[(4)] Raises \exception{ValueError} when \var{x} is not found in
1107   \var{s}. When a negative index is passed as the second or third parameter
1108   to the \method{index()} method, the list length is added, as for slice
1109   indices.  If it is still negative, it is truncated to zero, as for
1110   slice indices.  \versionchanged[Previously, \method{index()} didn't
1111   have arguments for specifying start and stop positions]{2.3}
1112
1113 \item[(5)] When a negative index is passed as the first parameter to
1114   the \method{insert()} method, the list length is added, as for slice
1115   indices.  If it is still negative, it is truncated to zero, as for
1116   slice indices.  \versionchanged[Previously, all negative indices
1117   were truncated to zero]{2.3}
1118
1119 \item[(6)] The \method{pop()} method is only supported by the list and
1120   array types.  The optional argument \var{i} defaults to \code{-1},
1121   so that by default the last item is removed and returned.
1122
1123 \item[(7)] The \method{sort()} and \method{reverse()} methods modify the
1124   list in place for economy of space when sorting or reversing a large
1125   list.  To remind you that they operate by side effect, they don't return
1126   the sorted or reversed list.
1127
1128 \item[(8)] The \method{sort()} method takes optional arguments for
1129   controlling the comparisons.
1130
1131   \var{cmp} specifies a custom comparison function of two arguments
1132      (list items) which should return a negative, zero or positive number
1133      depending on whether the first argument is considered smaller than,
1134      equal to, or larger than the second argument:
1135      \samp{\var{cmp}=\keyword{lambda} \var{x},\var{y}:
1136      \function{cmp}(x.lower(), y.lower())}
1137
1138   \var{key} specifies a function of one argument that is used to
1139      extract a comparison key from each list element:
1140      \samp{\var{key}=\function{str.lower}}
1141
1142   \var{reverse} is a boolean value.  If set to \code{True}, then the
1143      list elements are sorted as if each comparison were reversed.
1144
1145   In general, the \var{key} and \var{reverse} conversion processes are
1146   much faster than specifying an equivalent \var{cmp} function.  This is
1147   because \var{cmp} is called multiple times for each list element while
1148   \var{key} and \var{reverse} touch each element only once.
1149
1150   \versionchanged[Support for \code{None} as an equivalent to omitting
1151   \var{cmp} was added]{2.3}
1152
1153   \versionchanged[Support for \var{key} and \var{reverse} was added]{2.4}
1154
1155 \item[(9)] Starting with Python 2.3, the \method{sort()} method is
1156   guaranteed to be stable.  A sort is stable if it guarantees not to
1157   change the relative order of elements that compare equal --- this is
1158   helpful for sorting in multiple passes (for example, sort by
1159   department, then by salary grade).
1160
1161 \item[(10)] While a list is being sorted, the effect of attempting to
1162   mutate, or even inspect, the list is undefined.  The C
1163   implementation of Python 2.3 and newer makes the list appear empty
1164   for the duration, and raises \exception{ValueError} if it can detect
1165   that the list has been mutated during a sort.
1166 \end{description}
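
The operations in the table can be followed step by step in this
sketch.  The lists \code{s}, \code{words} and \code{records} are
made-up sample data, and only the \var{key} and \var{reverse}
arguments of \method{sort()} are shown:

\begin{verbatim}
s = [0, 1, 2, 3, 4, 5]
s[1:3] = ['a', 'b', 'c']     # [0, 'a', 'b', 'c', 3, 4, 5]
del s[1:4]                   # [0, 3, 4, 5]
s[::2] = ['x', 'y']          # ['x', 3, 'y', 5]  (lengths must match)
s.append(6)                  # ['x', 3, 'y', 5, 6]
s.extend((7, 8))             # ['x', 3, 'y', 5, 6, 7, 8]
s.insert(-1, 'z')            # ['x', 3, 'y', 5, 6, 7, 'z', 8]
last = s.pop()               # 8
first = s.pop(0)             # 'x'; s is now [3, 'y', 5, 6, 7, 'z']
s.remove('y')                # [3, 5, 6, 7, 'z']
where = s.index(6, -3)       # 2: the search starts at len(s) - 3

words = ['banana', 'apple', 'Cherry']
returned = words.sort(key=str.lower)      # sorts in place, returns None
ascending = list(words)                   # ['apple', 'banana', 'Cherry']
words.sort(key=str.lower, reverse=True)   # ['Cherry', 'banana', 'apple']

# Stability: sort by grade first, then by department; records with
# the same department keep their grade order.
records = [('sales', 2), ('dev', 1), ('sales', 1), ('dev', 2)]
records.sort(key=lambda r: r[1])
records.sort(key=lambda r: r[0])
\end{verbatim}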
1167
1168 \subsection{Set Types ---
1169             \class{set}, \class{frozenset}
1170             \label{types-set}}
1171 \obindex{set}
1172
1173 A \dfn{set} object is an unordered collection of immutable values.
1174 Common uses include membership testing, removing duplicates from a sequence,
1175 and computing mathematical operations such as intersection, union, difference,
1176 and symmetric difference.
1177 \versionadded{2.4}
1178
1179 Like other collections, sets support \code{\var{x} in \var{set}},
1180 \code{len(\var{set})}, and \code{for \var{x} in \var{set}}.  Being an
1181 unordered collection, sets do not record element position or order of
1182 insertion.  Accordingly, sets do not support indexing, slicing, or
1183 other sequence-like behavior.
1184
There are currently two built-in set types, \class{set} and \class{frozenset}.
1186 The \class{set} type is mutable --- the contents can be changed using methods
1187 like \method{add()} and \method{remove()}.  Since it is mutable, it has no
1188 hash value and cannot be used as either a dictionary key or as an element of
1189 another set.  The \class{frozenset} type is immutable and hashable --- its
contents cannot be altered after it is created; however, it can be used as
1191 a dictionary key or as an element of another set.
1192
1193 Instances of \class{set} and \class{frozenset} provide the following operations:
1194
1195 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1196   \lineiii{len(\var{s})}{}{cardinality of set \var{s}}
1197
1198   \hline
1199   \lineiii{\var{x} in \var{s}}{}
1200          {test \var{x} for membership in \var{s}}
1201   \lineiii{\var{x} not in \var{s}}{}
1202          {test \var{x} for non-membership in \var{s}}
1203   \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
1204          {test whether every element in \var{s} is in \var{t}}
1205   \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
1206          {test whether every element in \var{t} is in \var{s}}
1207
1208   \hline
1209   \lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
1210          {new set with elements from both \var{s} and \var{t}}
1211   \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
1212          {new set with elements common to \var{s} and \var{t}}
1213   \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
1214          {new set with elements in \var{s} but not in \var{t}}
1215   \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
1216          {new set with elements in either \var{s} or \var{t} but not both}
1217   \lineiii{\var{s}.copy()}{}
1218          {new set with a shallow copy of \var{s}}
1219 \end{tableiii}
1220
Note, the non-operator versions of the \method{union()}, \method{intersection()},
\method{difference()}, \method{symmetric_difference()},
\method{issubset()}, and \method{issuperset()} methods will accept any
1224 iterable as an argument.  In contrast, their operator based counterparts
1225 require their arguments to be sets.  This precludes error-prone constructions
1226 like \code{set('abc') \&\ 'cbs'} in favor of the more readable
1227 \code{set('abc').intersection('cbs')}.
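
The distinction can be checked with a short sketch (the flag
\code{operator_ok} is a name introduced only for this illustration):

\begin{verbatim}
a = set('abc')

# The method form accepts any iterable, here a plain string.
common = a.intersection('cbs')      # a set containing 'b' and 'c'
subset = a.issubset('abcdef')       # True

# The operator form requires both operands to be sets.
try:
    a & 'cbs'
    operator_ok = True
except TypeError:
    operator_ok = False

both_sets = a & set('cbs')          # same result as the method form
\end{verbatim}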
1228
1229 Both \class{set} and \class{frozenset} support set to set comparisons.
1230 Two sets are equal if and only if every element of each set is contained in
1231 the other (each is a subset of the other).
1232 A set is less than another set if and only if the first set is a proper
1233 subset of the second set (is a subset, but is not equal).
1234 A set is greater than another set if and only if the first set is a proper
1235 superset of the second set (is a superset, but is not equal).
1236
1237 Instances of \class{set} are compared to instances of \class{frozenset} based
1238 on their members.  For example, \samp{set('abc') == frozenset('abc')} returns
1239 \code{True}.
1240
1241 The subset and equality comparisons do not generalize to a complete
1242 ordering function.  For example, any two disjoint sets are not equal and
1243 are not subsets of each other, so \emph{all} of the following return
\code{False}:  \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, and
1245 \code{\var{a}>\var{b}}.
1246 Accordingly, sets do not implement the \method{__cmp__} method.
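
A minimal sketch of these comparisons, using small sample sets:

\begin{verbatim}
small = set([1])
big = set([1, 2])
proper_subset = small < big         # True
not_proper = big < big              # False: equal sets
subset_or_equal = big <= big        # True

# Disjoint sets are not ordered with respect to each other.
a = set([1, 2])
b = set([3, 4])
results = (a < b, a == b, a > b)    # (False, False, False)
\end{verbatim}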
1247
1248 Since sets only define partial ordering (subset relationships), the output
1249 of the \method{list.sort()} method is undefined for lists of sets.
1250
1251 Set elements are like dictionary keys; they need to define both
1252 \method{__hash__} and \method{__eq__} methods.
1253
1254 Binary operations that mix \class{set} instances with \class{frozenset}
1255 return the type of the first operand.  For example:
1256 \samp{frozenset('ab') | set('bc')} returns an instance of \class{frozenset}.
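
Swapping the operands changes the result type but not the elements,
as this sketch with sample values suggests:

\begin{verbatim}
frozen_first = frozenset('ab') | set('bc')
set_first = set('ab') | frozenset('bc')

first_type = type(frozen_first)     # frozenset
second_type = type(set_first)       # set
same_members = frozen_first == set_first    # True
\end{verbatim}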
1257
1258 The following table lists operations available for \class{set}
1259 that do not apply to immutable instances of \class{frozenset}:
1260
1261 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1262   \lineiii{\var{s}.update(\var{t})}
1263          {\var{s} |= \var{t}}
1264          {update set \var{s}, adding elements from \var{t}}
1265   \lineiii{\var{s}.intersection_update(\var{t})}
1266          {\var{s} \&= \var{t}}
1267          {update set \var{s}, keeping only elements found in both \var{s} and \var{t}}
1268   \lineiii{\var{s}.difference_update(\var{t})}
1269          {\var{s} -= \var{t}}
1270          {update set \var{s}, removing elements found in \var{t}}
1271   \lineiii{\var{s}.symmetric_difference_update(\var{t})}
1272          {\var{s} \textasciicircum= \var{t}}
1273          {update set \var{s}, keeping only elements found in either \var{s} or \var{t}
1274           but not in both}
1275
1276   \hline
1277   \lineiii{\var{s}.add(\var{x})}{}
1278          {add element \var{x} to set \var{s}}
1279   \lineiii{\var{s}.remove(\var{x})}{}
         {remove \var{x} from set \var{s}; raises \exception{KeyError} if not present}
  \lineiii{\var{s}.discard(\var{x})}{}
         {remove \var{x} from set \var{s} if present}
1283   \lineiii{\var{s}.pop()}{}
1284          {remove and return an arbitrary element from \var{s}; raises
1285           \exception{KeyError} if empty}
1286   \lineiii{\var{s}.clear()}{}
1287          {remove all elements from set \var{s}}
1288 \end{tableiii}
1289
1290 Note, the non-operator versions of the \method{update()},
1291 \method{intersection_update()}, \method{difference_update()}, and
1292 \method{symmetric_difference_update()} methods will accept any iterable
1293 as an argument.
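
The in-place operations can be traced in the following sketch (the
numbers are arbitrary, and \code{removed} is a flag introduced only
for this illustration):

\begin{verbatim}
s = set([1, 2, 3])
s.update([3, 4])                         # elements: 1, 2, 3, 4
s.intersection_update((2, 3, 4, 5))      # elements: 2, 3, 4
s.difference_update([4])                 # elements: 2, 3
s.symmetric_difference_update([3, 9])    # elements: 2, 9
s.add(7)                                 # elements: 2, 7, 9

s.discard(100)           # absent element: silently ignored
try:
    s.remove(100)        # absent element: raises KeyError
    removed = True
except KeyError:
    removed = False
\end{verbatim}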
1294
1295 The design of the set types was based on lessons learned from the
1296 \module{sets} module.
1297
1298 \begin{seealso}
1299   \seelink{comparison-to-builtin-set.html}
1300           {Comparison to the built-in set types}
1301           {Differences between the \module{sets} module and the
1302            built-in set types.}
1303 \end{seealso}
1304
1305
1306 \subsection{Mapping Types --- \class{dict} \label{typesmapping}}
1307 \obindex{mapping}
1308 \obindex{dictionary}
1309
A \dfn{mapping} object maps immutable values to
arbitrary objects.  Mappings are mutable objects.  There is currently
only one standard mapping type, the \dfn{dictionary}.  A dictionary's keys are
almost arbitrary values.  The only values that may not be used as keys are
those containing lists, dictionaries or other mutable types (that are
compared by value rather than by object identity).
1316 Numeric types used for keys obey the normal rules for numeric
1317 comparison: if two numbers compare equal (such as \code{1} and
1318 \code{1.0}) then they can be used interchangeably to index the same
1319 dictionary entry.
1320
1321 Dictionaries are created by placing a comma-separated list of
1322 \code{\var{key}: \var{value}} pairs within braces, for example:
1323 \code{\{'jack': 4098, 'sjoerd': 4127\}} or
1324 \code{\{4098: 'jack', 4127: 'sjoerd'\}}.
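
The rules for keys can be illustrated with a short sketch; the
dictionaries \code{d} and \code{point} and the flag
\code{list_key_ok} are names made up for this example:

\begin{verbatim}
tel = {'jack': 4098, 'sjoerd': 4127}

# 1 and 1.0 compare equal, so they index the same entry.
d = {1: 'one'}
same_entry = d[1.0]          # 'one'
d[1.0] = 'uno'               # overwrites the existing entry
entries = len(d)             # still 1

# A tuple of immutable values is an acceptable key ...
point = {(0, 0): 'origin'}

# ... but a list is mutable and is rejected.
try:
    bad = {[0, 0]: 'origin'}
    list_key_ok = True
except TypeError:
    list_key_ok = False
\end{verbatim}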
1325
1326 The following operations are defined on mappings (where \var{a} and
1327 \var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
1328 arbitrary objects):
1329 \indexiii{operations on}{mapping}{types}
1330 \indexiii{operations on}{dictionary}{type}
1331 \stindex{del}
1332 \bifuncindex{len}
1333 \withsubitem{(dictionary method)}{
1334   \ttindex{clear()}
1335   \ttindex{copy()}
1336   \ttindex{has_key()}
1337   \ttindex{fromkeys()}
1338   \ttindex{items()}
1339   \ttindex{keys()}
1340   \ttindex{update()}
1341   \ttindex{values()}
1342   \ttindex{get()}
1343   \ttindex{setdefault()}
1344   \ttindex{pop()}
1345   \ttindex{popitem()}
1346   \ttindex{iteritems()}
1347   \ttindex{iterkeys()}
1348   \ttindex{itervalues()}}
1349
1350 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1351   \lineiii{len(\var{a})}{the number of items in \var{a}}{}
1352   \lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
1353   \lineiii{\var{a}[\var{k}] = \var{v}}
1354           {set \code{\var{a}[\var{k}]} to \var{v}}
1355           {}
1356   \lineiii{del \var{a}[\var{k}]}
1357           {remove \code{\var{a}[\var{k}]} from \var{a}}
1358           {(1)}
  \lineiii{\var{a}.clear()}{remove all items from \var{a}}{}
  \lineiii{\var{a}.copy()}{a (shallow) copy of \var{a}}{}
1361   \lineiii{\var{a}.has_key(\var{k})}
1362           {\code{True} if \var{a} has a key \var{k}, else \code{False}}
1363           {}
1364   \lineiii{\var{k} \code{in} \var{a}}
1365           {Equivalent to \var{a}.has_key(\var{k})}
1366           {(2)}
1367   \lineiii{\var{k} not in \var{a}}
1368           {Equivalent to \code{not} \var{a}.has_key(\var{k})}
1369           {(2)}
1370   \lineiii{\var{a}.items()}
1371           {a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
1372           {(3)}
1373   \lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
  \lineiii{\var{a}.update(\optional{\var{b}})}
          {updates \var{a} with key/value pairs from \var{b}, overwriting existing keys}
          {(9)}
1377   \lineiii{\var{a}.fromkeys(\var{seq}\optional{, \var{value}})}
1378           {Creates a new dictionary with keys from \var{seq} and values set to \var{value}}
1379           {(7)}
1380   \lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
1381   \lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
1382           {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1383            else \var{x}}
1384           {(4)}
1385   \lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
1386           {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1387            else \var{x} (also setting it)}
1388           {(5)}
1389   \lineiii{\var{a}.pop(\var{k}\optional{, \var{x}})}
          {\code{\var{a}[\var{k}]} (also removing \var{k}) if \code{\var{k} in \var{a}},
           else \var{x}}
1392           {(8)}
1393   \lineiii{\var{a}.popitem()}
1394           {remove and return an arbitrary (\var{key}, \var{value}) pair}
1395           {(6)}
1396   \lineiii{\var{a}.iteritems()}
1397           {return an iterator over (\var{key}, \var{value}) pairs}
1398           {(2), (3)}
1399   \lineiii{\var{a}.iterkeys()}
1400           {return an iterator over the mapping's keys}
1401           {(2), (3)}
1402   \lineiii{\var{a}.itervalues()}
1403           {return an iterator over the mapping's values}
1404           {(2), (3)}
1405 \end{tableiii}
1406
1407 \noindent
1408 Notes:
1409 \begin{description}
1410 \item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
1411 in the map.
1412
1413 \item[(2)] \versionadded{2.2}
1414
1415 \item[(3)] Keys and values are listed in an arbitrary order which is
1416 non-random, varies across Python implementations, and depends on the
1417 dictionary's history of insertions and deletions.
1418 If \method{items()}, \method{keys()}, \method{values()},
1419 \method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
1420 are called with no intervening modifications to the dictionary, the
1421 lists will directly correspond.  This allows the creation of
1422 \code{(\var{value}, \var{key})} pairs using \function{zip()}:
1423 \samp{pairs = zip(\var{a}.values(), \var{a}.keys())}.  The same
1424 relationship holds for the \method{iterkeys()} and
1425 \method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
1426 \var{a}.iterkeys())} provides the same value for \code{pairs}.
1427 Another way to create the same list is \samp{pairs = [(v, k) for (k,
1428 v) in \var{a}.iteritems()]}.
1429
\item[(4)] Never raises an exception if \var{k} is not in the map;
instead it returns \var{x}.  \var{x} is optional; when \var{x} is not
provided and \var{k} is not in the map, \code{None} is returned.
1433
1434 \item[(5)] \function{setdefault()} is like \function{get()}, except
1435 that if \var{k} is missing, \var{x} is both returned and inserted into
the dictionary as the value of \var{k}. \var{x} defaults to \code{None}.
1437
1438 \item[(6)] \function{popitem()} is useful to destructively iterate
1439 over a dictionary, as often used in set algorithms.  If the dictionary
1440 is empty, calling \function{popitem()} raises a \exception{KeyError}.
1441
1442 \item[(7)] \function{fromkeys()} is a class method that returns a
1443 new dictionary. \var{value} defaults to \code{None}.  \versionadded{2.3}
1444
1445 \item[(8)] \function{pop()} raises a \exception{KeyError} when no default
1446 value is given and the key is not found.  \versionadded{2.3}
1447
\item[(9)] \function{update()} accepts either another mapping object
or an iterable of key/value pairs (as tuples or other iterables of
length two).  If keyword arguments are specified, the mapping is
then updated with those key/value pairs:
\samp{d.update(red=1, blue=2)}.
1453 \versionchanged[Allowed the argument to be an iterable of key/value
1454                 pairs and allowed keyword arguments]{2.4}
1455
1456 \end{description}
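
The behaviour described in notes (3) to (9) can be sketched as follows.
This is only an illustration: the dictionary contents and variable names
are made up, and \method{pop()}, \method{fromkeys()} and the extended
forms of \method{update()} require the versions given in the notes.

```python
d = {'jack': 4098, 'sjoerd': 4127}

# Note (4): get() never raises; the default is None unless given.
missing = d.get('guido')
fallback = d.get('guido', 0)

# Note (5): setdefault() returns the default and also stores it.
inserted = d.setdefault('guido', 4127)

# Note (8): pop() removes the key; a default suppresses KeyError.
popped = d.pop('jack')
default_pop = d.pop('jack', None)
try:
    d.pop('jack')
    raised = False
except KeyError:
    raised = True

# Note (3): with no intervening modification, values() and keys()
# correspond, so both expressions build the same (value, key) pairs.
pairs = list(zip(d.values(), d.keys()))
inverted = [(v, k) for (k, v) in d.items()]

# Note (7): fromkeys() builds a new dictionary; values default to None.
fresh = dict.fromkeys(['a', 'b'])

# Note (9): update() from an iterable of pairs and from keywords.
d.update([('sjoerd', 1)])
d.update(guido=2)
```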
1457
1458 \subsection{File Objects
1459             \label{bltin-file-objects}}
1460
1461 File objects\obindex{file} are implemented using C's \code{stdio}
1462 package and can be created with the built-in constructor
1463 \function{file()}\bifuncindex{file} described in section
1464 \ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
1465 is new in Python 2.2.  The older built-in \function{open()} is an
1466 alias for \function{file()}.}  File objects are also returned
1467 by some other built-in functions and methods, such as
\function{os.popen()}, \function{os.fdopen()}, and the
1469 \method{makefile()} method of socket objects.
1470 \refstmodindex{os}
1471 \refbimodindex{socket}
1472
1473 When a file operation fails for an I/O-related reason, the exception
1474 \exception{IOError} is raised.  This includes situations where the
1475 operation is not defined for some reason, like \method{seek()} on a tty
1476 device or writing a file opened for reading.
1477
1478 Files have the following methods:
1479
1480
1481 \begin{methoddesc}[file]{close}{}
1482   Close the file.  A closed file cannot be read or written any more.
1483   Any operation which requires that the file be open will raise a
1484   \exception{ValueError} after the file has been closed.  Calling
1485   \method{close()} more than once is allowed.
1486 \end{methoddesc}
1487
1488 \begin{methoddesc}[file]{flush}{}
1489   Flush the internal buffer, like \code{stdio}'s
1490   \cfunction{fflush()}.  This may be a no-op on some file-like
1491   objects.
1492 \end{methoddesc}
1493
1494 \begin{methoddesc}[file]{fileno}{}
1495   \index{file descriptor}
1496   \index{descriptor, file}
1497   Return the integer ``file descriptor'' that is used by the
1498   underlying implementation to request I/O operations from the
1499   operating system.  This can be useful for other, lower level
1500   interfaces that use file descriptors, such as the
1501   \refmodule{fcntl}\refbimodindex{fcntl} module or
1502   \function{os.read()} and friends.  \note{File-like objects
1503   which do not have a real file descriptor should \emph{not} provide
1504   this method!}
1505 \end{methoddesc}
1506
1507 \begin{methoddesc}[file]{isatty}{}
1508   Return \code{True} if the file is connected to a tty(-like) device, else
1509   \code{False}.  \note{If a file-like object is not associated
1510   with a real file, this method should \emph{not} be implemented.}
1511 \end{methoddesc}
1512
1513 \begin{methoddesc}[file]{next}{}
A file object is its own iterator; that is, \code{iter(\var{f})} returns
\var{f} (unless \var{f} is closed).  When a file is used as an
1516 iterator, typically in a \keyword{for} loop (for example,
1517 \code{for line in f: print line}), the \method{next()} method is
1518 called repeatedly.  This method returns the next input line, or raises
1519 \exception{StopIteration} when \EOF{} is hit.  In order to make a
1520 \keyword{for} loop the most efficient way of looping over the lines of
1521 a file (a very common operation), the \method{next()} method uses a
1522 hidden read-ahead buffer.  As a consequence of using a read-ahead
1523 buffer, combining \method{next()} with other file methods (like
\method{readline()}) does not work correctly.  However, using
1525 \method{seek()} to reposition the file to an absolute position will
1526 flush the read-ahead buffer.
1527 \versionadded{2.3}
1528 \end{methoddesc}
1529
1530 \begin{methoddesc}[file]{read}{\optional{size}}
1531   Read at most \var{size} bytes from the file (less if the read hits
1532   \EOF{} before obtaining \var{size} bytes).  If the \var{size}
1533   argument is negative or omitted, read all data until \EOF{} is
1534   reached.  The bytes are returned as a string object.  An empty
1535   string is returned when \EOF{} is encountered immediately.  (For
1536   certain files, like ttys, it makes sense to continue reading after
1537   an \EOF{} is hit.)  Note that this method may call the underlying
1538   C function \cfunction{fread()} more than once in an effort to
1539   acquire as close to \var{size} bytes as possible. Also note that
1540   when in non-blocking mode, less data than what was requested may
1541   be returned, even if no \var{size} parameter was given.
1542 \end{methoddesc}
1543
1544 \begin{methoddesc}[file]{readline}{\optional{size}}
1545   Read one entire line from the file.  A trailing newline character is
1546   kept in the string (but may be absent when a file ends with an
1547   incomplete line).\footnote{
1548         The advantage of leaving the newline on is that
1549         returning an empty string is then an unambiguous \EOF{}
1550         indication.  It is also possible (in cases where it might
1551         matter, for example, if you
1552         want to make an exact copy of a file while scanning its lines)
1553         to tell whether the last line of a file ended in a newline
1554         or not (yes this happens!).
1555   }  If the \var{size} argument is present and
1556   non-negative, it is a maximum byte count (including the trailing
1557   newline) and an incomplete line may be returned.
1558   An empty string is returned \emph{only} when \EOF{} is encountered
1559   immediately.  \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
1560   returned string contains null characters (\code{'\e 0'}) if they
1561   occurred in the input.}
1562 \end{methoddesc}
1563
1564 \begin{methoddesc}[file]{readlines}{\optional{sizehint}}
1565   Read until \EOF{} using \method{readline()} and return a list containing
1566   the lines thus read.  If the optional \var{sizehint} argument is
1567   present, instead of reading up to \EOF, whole lines totalling
1568   approximately \var{sizehint} bytes (possibly after rounding up to an
1569   internal buffer size) are read.  Objects implementing a file-like
1570   interface may choose to ignore \var{sizehint} if it cannot be
1571   implemented, or cannot be implemented efficiently.
1572 \end{methoddesc}
1573
1574 \begin{methoddesc}[file]{xreadlines}{}
1575   This method returns the same thing as \code{iter(f)}.
1576   \versionadded{2.1}
1577   \deprecated{2.3}{Use \samp{for \var{line} in \var{file}} instead.}
1578 \end{methoddesc}
1579
1580 \begin{methoddesc}[file]{seek}{offset\optional{, whence}}
1581   Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
1582   The \var{whence} argument is optional and defaults to \code{0}
1583   (absolute file positioning); other values are \code{1} (seek
1584   relative to the current position) and \code{2} (seek relative to the
1585   file's end).  There is no return value.  Note that if the file is
1586   opened for appending (mode \code{'a'} or \code{'a+'}), any
1587   \method{seek()} operations will be undone at the next write.  If the
1588   file is only opened for writing in append mode (mode \code{'a'}),
1589   this method is essentially a no-op, but it remains useful for files
1590   opened in append mode with reading enabled (mode \code{'a+'}).  If the
1591   file is opened in text mode (without \code{'b'}), only offsets returned
1592   by \method{tell()} are legal.  Use of other offsets causes undefined
1593   behavior.
1594
1595   Note that not all file objects are seekable.
1596 \end{methoddesc}
1597
1598 \begin{methoddesc}[file]{tell}{}
1599   Return the file's current position, like \code{stdio}'s
1600   \cfunction{ftell()}.
1601
1602   \note{On Windows, \method{tell()} can return illegal values (after an
1603   \cfunction{fgets()}) when reading files with \UNIX{}-style line-endings.
1604   Use binary mode (\code{'rb'}) to circumvent this problem.}
1605 \end{methoddesc}
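
The three \var{whence} values of \method{seek()} and the result of
\method{tell()} can be tried out with a short sketch such as the one
below.  It assumes a scratch file created with
\function{tempfile.mkstemp()} and uses binary mode, so that arbitrary
offsets are legal; the file contents and variable names are invented
for the example.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'wb')
f.write('0123456789'.encode('ascii'))
f.close()

f = open(path, 'rb')
f.seek(3)               # whence defaults to 0: absolute position
at_three = f.tell()
f.seek(2, 1)            # whence 1: relative to the current position
at_five = f.tell()
f.seek(-4, 2)           # whence 2: relative to the file's end
tail = f.read()         # reads the last four bytes
f.close()
os.remove(path)
```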
1606
1607 \begin{methoddesc}[file]{truncate}{\optional{size}}
1608   Truncate the file's size.  If the optional \var{size} argument is
1609   present, the file is truncated to (at most) that size.  The size
1610   defaults to the current position.  The current file position is
1611   not changed.  Note that if a specified size exceeds the file's
1612   current size, the result is platform-dependent:  possibilities
1613   include that the file may remain unchanged, increase to the specified
1614   size as if zero-filled, or increase to the specified size with
1615   undefined new content.
1616   Availability:  Windows, many \UNIX{} variants.
1617 \end{methoddesc}
1618
1619 \begin{methoddesc}[file]{write}{str}
1620   Write a string to the file.  There is no return value.  Due to
1621   buffering, the string may not actually show up in the file until
1622   the \method{flush()} or \method{close()} method is called.
1623 \end{methoddesc}
1624
1625 \begin{methoddesc}[file]{writelines}{sequence}
1626   Write a sequence of strings to the file.  The sequence can be any
1627   iterable object producing strings, typically a list of strings.
1628   There is no return value.
1629   (The name is intended to match \method{readlines()};
1630   \method{writelines()} does not add line separators.)
1631 \end{methoddesc}
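
The interplay of \method{writelines()} and \method{readline()} can be
sketched as follows.  The example assumes a scratch file from
\function{tempfile.mkstemp()}; the strings written are arbitrary.  It
shows that \method{writelines()} adds no separators, that
\method{readline()} keeps the trailing newline, and that an empty
string signals \EOF{}.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w')
f.writelines(['first\n', 'sec', 'ond'])   # no separators are added
f.close()

f = open(path)
line1 = f.readline()    # newline is kept
line2 = f.readline()    # incomplete last line: no newline
line3 = f.readline()    # empty string: end of file
f.close()
os.remove(path)
```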
1632
1633
1634 Files support the iterator protocol.  Each iteration returns the same
1635 result as \code{\var{file}.readline()}, and iteration ends when the
1636 \method{readline()} method returns an empty string.
1637
1638
1639 File objects also offer a number of other interesting attributes.
1640 These are not required for file-like objects, but should be
1641 implemented if they make sense for the particular object.
1642
1643 \begin{memberdesc}[file]{closed}
Boolean indicating the current state of the file object (\code{True}
if the file is closed).  This is a
read-only attribute; the \method{close()} method changes the value.
1646 It may not be available on all file-like objects.
1647 \end{memberdesc}
1648
1649 \begin{memberdesc}[file]{encoding}
1650 The encoding that this file uses. When Unicode strings are written
1651 to a file, they will be converted to byte strings using this encoding.
1652 In addition, when the file is connected to a terminal, the attribute
1653 gives the encoding that the terminal is likely to use (that
1654 information might be incorrect if the user has misconfigured the
1655 terminal). The attribute is read-only and may not be present on
1656 all file-like objects. It may also be \code{None}, in which case
1657 the file uses the system default encoding for converting Unicode
1658 strings.
1659
1660 \versionadded{2.3}
1661 \end{memberdesc}
1662
1663 \begin{memberdesc}[file]{mode}
1664 The I/O mode for the file.  If the file was created using the
1665 \function{open()} built-in function, this will be the value of the
1666 \var{mode} parameter.  This is a read-only attribute and may not be
1667 present on all file-like objects.
1668 \end{memberdesc}
1669
1670 \begin{memberdesc}[file]{name}
1671 If the file object was created using \function{open()}, the name of
1672 the file.  Otherwise, some string that indicates the source of the
1673 file object, of the form \samp{<\mbox{\ldots}>}.  This is a read-only
1674 attribute and may not be present on all file-like objects.
1675 \end{memberdesc}
1676
1677 \begin{memberdesc}[file]{newlines}
1678 If Python was built with the \longprogramopt{with-universal-newlines}
1679 option to \program{configure} (the default) this read-only attribute
1680 exists, and for files opened in
1681 universal newline read mode it keeps track of the types of newlines
1682 encountered while reading the file. The values it can take are
1683 \code{'\e r'}, \code{'\e n'}, \code{'\e r\e n'}, \code{None} (unknown,
1684 no newlines read yet) or a tuple containing all the newline
1685 types seen, to indicate that multiple
1686 newline conventions were encountered. For files not opened in universal
1687 newline read mode the value of this attribute will be \code{None}.
1688 \end{memberdesc}
1689
1690 \begin{memberdesc}[file]{softspace}
1691 Boolean that indicates whether a space character needs to be printed
1692 before another value when using the \keyword{print} statement.
1693 Classes that are trying to simulate a file object should also have a
1694 writable \member{softspace} attribute, which should be initialized to
1695 zero.  This will be automatic for most classes implemented in Python
1696 (care may be needed for objects that override attribute access); types
1697 implemented in C will have to provide a writable
1698 \member{softspace} attribute.
1699 \note{This attribute is not used to control the
1700 \keyword{print} statement, but to allow the implementation of
1701 \keyword{print} to keep track of its internal state.}
1702 \end{memberdesc}
1703
1704
1705 \subsection{Other Built-in Types \label{typesother}}
1706
1707 The interpreter supports several other kinds of objects.
1708 Most of these support only one or two operations.
1709
1710
1711 \subsubsection{Modules \label{typesmodules}}
1712
1713 The only special operation on a module is attribute access:
1714 \code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
1715 accesses a name defined in \var{m}'s symbol table.  Module attributes
1716 can be assigned to.  (Note that the \keyword{import} statement is not,
1717 strictly speaking, an operation on a module object; \code{import
1718 \var{foo}} does not require a module object named \var{foo} to exist,
1719 rather it requires an (external) \emph{definition} for a module named
1720 \var{foo} somewhere.)
1721
1722 A special member of every module is \member{__dict__}.
1723 This is the dictionary containing the module's symbol table.
1724 Modifying this dictionary will actually change the module's symbol
1725 table, but direct assignment to the \member{__dict__} attribute is not
1726 possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
1727 defines \code{\var{m}.a} to be \code{1}, but you can't write
1728 \code{\var{m}.__dict__ = \{\}}).  Modifying \member{__dict__} directly
1729 is not recommended.
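
A minimal sketch of the point above, using a throwaway module object
created with \code{types.ModuleType} (the module name \code{'demo'} is
made up).  Which exception the rejected assignment raises is treated
here as an assumption that may vary between interpreter versions, so
both likely candidates are caught.

```python
import types

m = types.ModuleType('demo')    # hypothetical, empty module
m.__dict__['a'] = 1             # changes the module's symbol table
value = m.a

try:
    m.__dict__ = {}             # direct assignment is refused
    replaced = True
except (TypeError, AttributeError):
    replaced = False
```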
1730
1731 Modules built into the interpreter are written like this:
1732 \code{<module 'sys' (built-in)>}.  If loaded from a file, they are
1733 written as \code{<module 'os' from
1734 '/usr/local/lib/python\shortversion/os.pyc'>}.
1735
1736
1737 \subsubsection{Classes and Class Instances \label{typesobjects}}
1738 \nodename{Classes and Instances}
1739
1740 See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
1741 Reference Manual} for these.
1742
1743
1744 \subsubsection{Functions \label{typesfunctions}}
1745
1746 Function objects are created by function definitions.  The only
1747 operation on a function object is to call it:
1748 \code{\var{func}(\var{argument-list})}.
1749
1750 There are really two flavors of function objects: built-in functions
1751 and user-defined functions.  Both support the same operation (to call
1752 the function), but the implementation is different, hence the
1753 different object types.
1754
1755 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1756 information.
1757
1758 \subsubsection{Methods \label{typesmethods}}
1759 \obindex{method}
1760
1761 Methods are functions that are called using the attribute notation.
1762 There are two flavors: built-in methods (such as \method{append()} on
1763 lists) and class instance methods.  Built-in methods are described
1764 with the types that support them.
1765
1766 The implementation adds two special read-only attributes to class
1767 instance methods: \code{\var{m}.im_self} is the object on which the
1768 method operates, and \code{\var{m}.im_func} is the function
1769 implementing the method.  Calling \code{\var{m}(\var{arg-1},
1770 \var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
1771 calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
1772 \var{arg-2}, \textrm{\ldots}, \var{arg-n})}.
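
The equivalence can be checked with a small sketch.  The class
\code{C} and its \method{greet()} method are invented for the example.
As an assumption that goes beyond this text, the code falls back to the
spellings \code{__func__} and \code{__self__} on interpreters that do
not provide \code{im_func} and \code{im_self}.

```python
class C(object):
    def greet(self, name):
        return 'hello, ' + name

c = C()
m = c.greet                      # a bound method

# Fall back to the alternative spellings where im_func / im_self
# are not available (an assumption, see the text above).
func = getattr(m, 'im_func', None) or m.__func__
obj = getattr(m, 'im_self', None) or m.__self__

direct = m('world')
spelled_out = func(obj, 'world')
```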
1773
1774 Class instance methods are either \emph{bound} or \emph{unbound},
1775 referring to whether the method was accessed through an instance or a
1776 class, respectively.  When a method is unbound, its \code{im_self}
1777 attribute will be \code{None} and if called, an explicit \code{self}
1778 object must be passed as the first argument.  In this case,
1779 \code{self} must be an instance of the unbound method's class (or a
1780 subclass of that class), otherwise a \code{TypeError} is raised.
1781
Like function objects, method objects support getting
1783 arbitrary attributes.  However, since method attributes are actually
1784 stored on the underlying function object (\code{meth.im_func}),
1785 setting method attributes on either bound or unbound methods is
1786 disallowed.  Attempting to set a method attribute results in a
1787 \code{TypeError} being raised.  In order to set a method attribute,
1788 you need to explicitly set it on the underlying function object:
1789
1790 \begin{verbatim}
1791 class C:
1792     def method(self):
1793         pass
1794
1795 c = C()
1796 c.method.im_func.whoami = 'my name is c'
1797 \end{verbatim}
1798
1799 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1800 information.
1801
1802
1803 \subsubsection{Code Objects \label{bltin-code-objects}}
1804 \obindex{code}
1805
1806 Code objects are used by the implementation to represent
1807 ``pseudo-compiled'' executable Python code such as a function body.
1808 They differ from function objects because they don't contain a
1809 reference to their global execution environment.  Code objects are
1810 returned by the built-in \function{compile()} function and can be
1811 extracted from function objects through their \member{func_code}
1812 attribute.
1813 \bifuncindex{compile}
1814 \withsubitem{(function object attribute)}{\ttindex{func_code}}
1815
1816 A code object can be executed or evaluated by passing it (instead of a
1817 source string) to the \keyword{exec} statement or the built-in
1818 \function{eval()} function.
1819 \stindex{exec}
1820 \bifuncindex{eval}
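
For instance, an expression can be compiled once and then evaluated
with \function{eval()}.  This is a sketch only; the expression, the
file name \code{'<example>'} and the variable names are arbitrary.

```python
# Compile an expression into a code object ('eval' mode).
code = compile('x * 2 + 1', '<example>', 'eval')

# The code object carries no globals of its own; they are supplied
# at evaluation time.
result = eval(code, {'x': 20})
other = eval(code, {'x': 0})
```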
1821
1822 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1823 information.
1824
1825
1826 \subsubsection{Type Objects \label{bltin-type-objects}}
1827
1828 Type objects represent the various object types.  An object's type is
1829 accessed by the built-in function \function{type()}.  There are no special
1830 operations on types.  The standard module \refmodule{types} defines names
1831 for all standard built-in types.
1832 \bifuncindex{type}
1833 \refstmodindex{types}
1834
1835 Types are written like this: \code{<type 'int'>}.
1836
1837
1838 \subsubsection{The Null Object \label{bltin-null-object}}
1839
1840 This object is returned by functions that don't explicitly return a
1841 value.  It supports no special operations.  There is exactly one null
1842 object, named \code{None} (a built-in name).
1843
1844 It is written as \code{None}.
1845
1846
1847 \subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}
1848
1849 This object is used by extended slice notation (see the
1850 \citetitle[../ref/ref.html]{Python Reference Manual}).  It supports no
1851 special operations.  There is exactly one ellipsis object, named
1852 \constant{Ellipsis} (a built-in name).
1853
1854 It is written as \code{Ellipsis}.
1855
1856 \subsubsection{Boolean Values}
1857
1858 Boolean values are the two constant objects \code{False} and
1859 \code{True}.  They are used to represent truth values (although other
1860 values can also be considered false or true).  In numeric contexts
1861 (for example when used as the argument to an arithmetic operator),
1862 they behave like the integers 0 and 1, respectively.  The built-in
1863 function \function{bool()} can be used to cast any value to a Boolean,
if the value can be interpreted as a truth value (see section
\ref{truth}, ``Truth Value Testing,'' above).
1866
1867 They are written as \code{False} and \code{True}, respectively.
1868 \index{False}
1869 \index{True}
1870 \indexii{Boolean}{values}
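
A short sketch of both points, with made-up variable names: Boolean
values take part in arithmetic as 0 and 1, and \function{bool()}
applies the usual truth value test.

```python
total = True + True                            # behaves like 1 + 1
count = sum([x > 2 for x in [1, 2, 3, 4]])     # counts the True results

empty_is_true = bool([])        # an empty list tests false
text_is_true = bool('0')        # a non-empty string tests true
```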
1871
1872
1873 \subsubsection{Internal Objects \label{typesinternal}}
1874
1875 See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
1876 information.  It describes stack frame objects, traceback objects, and
1877 slice objects.
1878
1879
1880 \subsection{Special Attributes \label{specialattrs}}
1881
1882 The implementation adds a few special read-only attributes to several
1883 object types, where they are relevant.  Some of these are not reported
1884 by the \function{dir()} built-in function.
1885
1886 \begin{memberdesc}[object]{__dict__}
1887 A dictionary or other mapping object used to store an
1888 object's (writable) attributes.
1889 \end{memberdesc}
1890
1891 \begin{memberdesc}[object]{__methods__}
1892 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1893 list of an object's attributes.  This attribute is no longer available.}
1894 \end{memberdesc}
1895
1896 \begin{memberdesc}[object]{__members__}
1897 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1898 list of an object's attributes.  This attribute is no longer available.}
1899 \end{memberdesc}
1900
1901 \begin{memberdesc}[instance]{__class__}
1902 The class to which a class instance belongs.
1903 \end{memberdesc}
1904
1905 \begin{memberdesc}[class]{__bases__}
1906 The tuple of base classes of a class object.  If there are no base
1907 classes, this will be an empty tuple.
1908 \end{memberdesc}
1909
1910 \begin{memberdesc}[class]{__name__}
1911 The name of the class or type.
1912 \end{memberdesc}
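
The attributes above can be inspected directly.  The following sketch
uses two invented classes, \code{Base} and \code{Derived}, and is meant
only as an illustration.

```python
class Base(object):
    pass

class Derived(Base):
    pass

d = Derived()
d.x = 1                         # stored in the instance's __dict__

instance_class = d.__class__
bases = Derived.__bases__
name = Derived.__name__
attrs = d.__dict__
```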