1 \section{Built-in Types \label{types}}
3 The following sections describe the standard types that are built into
4 the interpreter. Historically, Python's built-in types have differed
5 from user-defined types because it was not possible to use the built-in
6 types as the basis for object-oriented inheritance. With the 2.2
7 release this situation has started to change, although the intended
8 unification of user-defined and built-in types is as yet far from
9 complete.
The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
13 \indexii{built-in}{types}
15 Some operations are supported by several object types; in particular,
16 practically all objects can be compared, tested for truth value,
17 and converted to a string (with the \code{`\textrm{\ldots}`} notation,
18 the equivalent \function{repr()} function, or the slightly different
19 \function{str()} function). The latter
20 function is implicitly used when an object is written by the
21 \keyword{print}\stindex{print} statement.
22 (Information on \ulink{\keyword{print} statement}{../ref/print.html}
23 and other language statements can be found in the
24 \citetitle[../ref/ref.html]{Python Reference Manual} and the
25 \citetitle[../tut/tut.html]{Python Tutorial}.)
28 \subsection{Truth Value Testing\label{truth}}
30 Any object can be tested for truth value, for use in an \keyword{if} or
31 \keyword{while} condition or as operand of the Boolean operations below.
32 The following values are considered false:
33 \stindex{if}
34 \stindex{while}
35 \indexii{truth}{value}
36 \indexii{Boolean}{operations}
37 \index{false}
39 \begin{itemize}
41 \item \code{None}
42 \withsubitem{(Built-in object)}{\ttindex{None}}
44 \item \code{False}
45 \withsubitem{(Built-in object)}{\ttindex{False}}
47 \item zero of any numeric type, for example, \code{0}, \code{0L},
48 \code{0.0}, \code{0j}.
50 \item any empty sequence, for example, \code{''}, \code{()}, \code{[]}.
52 \item any empty mapping, for example, \code{\{\}}.
54 \item instances of user-defined classes, if the class defines a
55 \method{__nonzero__()} or \method{__len__()} method, when that
56 method returns the integer zero or \class{bool} value
57 \code{False}.\footnote{Additional
58 information on these special methods may be found in the
59 \citetitle[../ref/ref.html]{Python Reference Manual}.}
61 \end{itemize}
63 All other values are considered true --- so objects of many types are
64 always true.
65 \index{true}
67 Operations and built-in functions that have a Boolean result always
68 return \code{0} or \code{False} for false and \code{1} or \code{True}
69 for true, unless otherwise stated. (Important exception: the Boolean
70 operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
71 return one of their operands.)
72 \index{False}
73 \index{True}
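The rules above can be checked with the built-in \function{bool()}.
The following is a minimal sketch rather than part of the library: the
\class{Basket} class and all variable names are hypothetical and exist
only for illustration.

\begin{verbatim}
class Basket:
    """Hypothetical container; its truth value follows __len__()."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


# Each of these values is considered false.
false_values = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([])]
# Non-zero numbers and non-empty containers are considered true.
true_values = [1, -0.5, 'x', (0,), [None], {'k': 0}, Basket(['egg'])]

false_results = [bool(value) for value in false_values]
true_results = [bool(value) for value in true_values]
\end{verbatim}

Here every entry of \code{false_results} is \code{False} and every entry
of \code{true_results} is \code{True}. Note that \code{(0,)} and
\code{[None]} are true because they are non-empty, whatever they contain.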
75 \subsection{Boolean Operations ---
76 \keyword{and}, \keyword{or}, \keyword{not}
77 \label{boolean}}
79 These are the Boolean operations, ordered by ascending priority:
80 \indexii{Boolean}{operations}
82 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
83 \lineiii{\var{x} or \var{y}}
84 {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
85 \lineiii{\var{x} and \var{y}}
86 {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
87 \hline
88 \lineiii{not \var{x}}
89 {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
90 \end{tableiii}
91 \opindex{and}
92 \opindex{or}
93 \opindex{not}
95 \noindent
96 Notes:
98 \begin{description}
100 \item[(1)]
101 These only evaluate their second argument if needed for their outcome.
103 \item[(2)]
104 \samp{not} has a lower priority than non-Boolean operators, so
105 \code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
106 \var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.
108 \end{description}
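The table and notes above can be illustrated with a short sketch. The
helper \function{note()} and the variable names are hypothetical; the
helper only records whether its call was evaluated.

\begin{verbatim}
evaluated = []

def note(label, value):
    """Record that this operand was evaluated, then return it."""
    evaluated.append(label)
    return value


name = '' or 'anonymous'              # '' is false, so the result is 'anonymous'
count = 0 and note('skipped', 99)     # 0 is false, so the right side never runs
both = [1] and note('used', 'yes')    # [1] is true, so the result is 'yes'
negated = not 1 == 2                  # parsed as not (1 == 2)
\end{verbatim}

After this runs, \code{evaluated} contains only \code{'used'}, which
shows that \samp{and} did not evaluate its second operand when the first
was already false.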
111 \subsection{Comparisons \label{comparisons}}
113 Comparison operations are supported by all objects. They all have the
114 same priority (which is higher than that of the Boolean operations).
115 Comparisons can be chained arbitrarily; for example, \code{\var{x} <
116 \var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
117 \var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
118 in both cases \var{z} is not evaluated at all when \code{\var{x} <
119 \var{y}} is found to be false).
120 \indexii{chaining}{comparisons}
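The evaluation order of a chained comparison can be observed with a
small sketch. The helper \function{operand()} is hypothetical and simply
records which operands were evaluated.

\begin{verbatim}
evaluated = []

def operand(name, value):
    """Record the operand name, then return its value."""
    evaluated.append(name)
    return value


# All three operands are needed, and y is evaluated only once.
in_order = operand('x', 1) < operand('y', 5) <= operand('z', 10)
full_trace = list(evaluated)

# x < y is false here, so z is never evaluated.
del evaluated[:]
short_cut = operand('x', 9) < operand('y', 5) <= operand('z', 10)
short_trace = list(evaluated)
\end{verbatim}

The first trace is \code{['x', 'y', 'z']} and the second is
\code{['x', 'y']}.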
122 This table summarizes the comparison operations:
124 \begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
125 \lineiii{<}{strictly less than}{}
126 \lineiii{<=}{less than or equal}{}
127 \lineiii{>}{strictly greater than}{}
128 \lineiii{>=}{greater than or equal}{}
129 \lineiii{==}{equal}{}
130 \lineiii{!=}{not equal}{(1)}
131 \lineiii{<>}{not equal}{(1)}
132 \lineiii{is}{object identity}{}
133 \lineiii{is not}{negated object identity}{}
134 \end{tableiii}
135 \indexii{operator}{comparison}
136 \opindex{==} % XXX *All* others have funny characters < ! >
137 \opindex{is}
138 \opindex{is not}
140 \noindent
141 Notes:
143 \begin{description}
145 \item[(1)]
146 \code{<>} and \code{!=} are alternate spellings for the same operator.
147 \code{!=} is the preferred spelling; \code{<>} is obsolescent.
149 \end{description}
151 Objects of different types, except different numeric types and different string types, never
152 compare equal; such objects are ordered consistently but arbitrarily
153 (so that sorting a heterogeneous array yields a consistent result).
154 Furthermore, some types (for example, file objects) support only a
155 degenerate notion of comparison where any two objects of that type are
156 unequal. Again, such objects are ordered arbitrarily but
157 consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
158 operators will raise a \exception{TypeError} exception when any operand
159 is a complex number.
160 \indexii{object}{numeric}
161 \indexii{objects}{comparing}
163 Instances of a class normally compare as non-equal unless the class
164 \withsubitem{(instance method)}{\ttindex{__cmp__()}}
165 defines the \method{__cmp__()} method. Refer to the
166 \citetitle[../ref/customization.html]{Python Reference Manual} for
167 information on the use of this method to effect object comparisons.
169 \strong{Implementation note:} Objects of different types except
170 numbers are ordered by their type names; objects of the same types
171 that don't support proper comparison are ordered by their address.
173 Two more operations with the same syntactic priority,
174 \samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
175 only by sequence types (below).
178 \subsection{Numeric Types ---
179 \class{int}, \class{float}, \class{long}, \class{complex}
180 \label{typesnumeric}}
182 There are four distinct numeric types: \dfn{plain integers},
183 \dfn{long integers},
184 \dfn{floating point numbers}, and \dfn{complex numbers}.
185 In addition, Booleans are a subtype of plain integers.
186 Plain integers (also just called \dfn{integers})
187 are implemented using \ctype{long} in C, which gives them at least 32
188 bits of precision. Long integers have unlimited precision. Floating
189 point numbers are implemented using \ctype{double} in C. All bets on
190 their precision are off unless you happen to know the machine you are
191 working with.
192 \obindex{numeric}
193 \obindex{Boolean}
194 \obindex{integer}
195 \obindex{long integer}
196 \obindex{floating point}
197 \obindex{complex number}
198 \indexii{C}{language}
200 Complex numbers have a real and imaginary part, which are each
201 implemented using \ctype{double} in C. To extract these parts from
202 a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
204 Numbers are created by numeric literals or as the result of built-in
205 functions and operators. Unadorned integer literals (including hex
206 and octal numbers) yield plain integers unless the value they denote
207 is too large to be represented as a plain integer, in which case
208 they yield a long integer. Integer literals with an
209 \character{L} or \character{l} suffix yield long integers
210 (\character{L} is preferred because \samp{1l} looks too much like
211 eleven!). Numeric literals containing a decimal point or an exponent
212 sign yield floating point numbers. Appending \character{j} or
213 \character{J} to a numeric literal yields a complex number with a
214 zero real part. A complex numeric literal is the sum of a real and
215 an imaginary part.
216 \indexii{numeric}{literals}
217 \indexii{integer}{literals}
218 \indexiii{long}{integer}{literals}
219 \indexii{floating point}{literals}
220 \indexii{complex number}{literals}
221 \indexii{hexadecimal}{literals}
222 \indexii{octal}{literals}
224 Python fully supports mixed arithmetic: when a binary arithmetic
225 operator has operands of different numeric types, the operand with the
226 ``narrower'' type is widened to that of the other, where plain
227 integer is narrower than long integer is narrower than floating point is
228 narrower than complex.
229 Comparisons between numbers of mixed type use the same rule.\footnote{
230 As a consequence, the list \code{[1, 2]} is considered equal
231 to \code{[1.0, 2.0]}, and similarly for tuples.
232 } The constructors \function{int()}, \function{long()}, \function{float()},
233 and \function{complex()} can be used
234 to produce numbers of a specific type.
235 \index{arithmetic}
236 \bifuncindex{int}
237 \bifuncindex{long}
238 \bifuncindex{float}
239 \bifuncindex{complex}
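As a brief illustration of the widening rule and of the constructors,
consider the following sketch (the variable names are arbitrary):

\begin{verbatim}
int_plus_float = 1 + 2.0          # the integer is widened to floating point
float_plus_complex = 1.5 + 2j     # the float is widened to complex
result_types = (type(int_plus_float), type(float_plus_complex))

# Mixed-type comparison uses the same rule, element by element.
lists_equal = [1, 2] == [1.0, 2.0]

# Constructors produce a number of a specific type.
converted = (int('42'), float(3), complex(1, 2))
\end{verbatim}

Here \code{result_types} is \code{(float, complex)} and
\code{lists_equal} is true.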
241 All numeric types (except complex) support the following operations,
242 sorted by ascending priority (operations in the same box have the same
243 priority; all numeric operations have a higher priority than
244 comparison operations):
246 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
247 \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
248 \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
249 \hline
250 \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
251 \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
252 \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{(4)}
253 \hline
254 \lineiii{-\var{x}}{\var{x} negated}{}
255 \lineiii{+\var{x}}{\var{x} unchanged}{}
256 \hline
257 \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
258 \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
259 \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
260 \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
261 \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}. \var{im} defaults to zero.}{}
262 \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
263 \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} // \var{y}, \var{x} \%{} \var{y})}}{(3)(4)}
264 \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
265 \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
266 \end{tableiii}
267 \indexiii{operations on}{numeric}{types}
268 \withsubitem{(complex number method)}{\ttindex{conjugate()}}
270 \noindent
271 Notes:
272 \begin{description}
274 \item[(1)]
275 For (plain or long) integer division, the result is an integer.
276 The result is always rounded towards minus infinity: 1/2 is 0,
277 (-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0. Note that the result
278 is a long integer if either operand is a long integer, regardless of
279 the numeric value.
280 \indexii{integer}{division}
281 \indexiii{long}{integer}{division}
283 \item[(2)]
284 Conversion from floating point to (long or plain) integer may round or
285 truncate as in C; see functions \function{floor()} and
286 \function{ceil()} in the \refmodule{math}\refbimodindex{math} module
287 for well-defined conversions.
288 \withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
289 \indexii{numeric}{conversions}
290 \indexii{C}{language}
292 \item[(3)]
293 See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
294 description.
296 \item[(4)]
The floor division operator, the modulo operator, and \function{divmod()}
are deprecated for complex numbers.
\deprecated{2.3}{Instead convert to float using \function{abs()}
if appropriate.}
302 \end{description}
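Notes (1) to (3) can be illustrated with a short sketch. It uses the
\code{//} operator that appears in the \function{divmod()} row; this is
an assumption made so that the results do not depend on how \code{/}
treats integer operands. The variable names are arbitrary.

\begin{verbatim}
import math

# Floor division rounds towards minus infinity.
floor_results = [1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)]   # [0, -1, -1, 0]

# divmod() returns the floor quotient together with the matching remainder.
quotient, remainder = divmod(-7, 2)                            # (-4, 1)
identity_holds = quotient * 2 + remainder == -7

# floor() and ceil() give well-defined rounding for floating point values.
rounded_down = math.floor(-2.5)
rounded_up = math.ceil(-2.5)
\end{verbatim}

The remainder has the same sign as the divisor, which is why
\code{divmod(-7, 2)} yields a remainder of \code{1} rather than
\code{-1}.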
303 % XXXJH exceptions: overflow (when? what operations?) zerodivision
305 \subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
306 \nodename{Bit-string Operations}
308 Plain and long integer types support additional operations that make
309 sense only for bit-strings. Negative numbers are treated as their 2's
310 complement value (for long integers, this assumes a sufficiently large
311 number of bits that no overflow occurs during the operation).
313 The priorities of the binary bit-wise operations are all lower than
314 the numeric operations and higher than the comparisons; the unary
315 operation \samp{\~} has the same priority as the other unary numeric
316 operations (\samp{+} and \samp{-}).
318 This table lists the bit-string operations sorted in ascending
319 priority (operations in the same box have the same priority):
321 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
322 \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
323 \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
324 \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
325 % The empty groups below prevent conversion to guillemets.
326 \lineiii{\var{x} <{}< \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
327 \lineiii{\var{x} >{}> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
328 \hline
329 \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
330 \end{tableiii}
331 \indexiii{operations on}{integer}{types}
332 \indexii{bit-string}{operations}
333 \indexii{shifting}{operations}
334 \indexii{masking}{operations}
336 \noindent
337 Notes:
338 \begin{description}
339 \item[(1)] Negative shift counts are illegal and cause a
340 \exception{ValueError} to be raised.
341 \item[(2)] A left shift by \var{n} bits is equivalent to
342 multiplication by \code{pow(2, \var{n})} without overflow check.
343 \item[(3)] A right shift by \var{n} bits is equivalent to
344 division by \code{pow(2, \var{n})} without overflow check.
345 \end{description}
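The following sketch exercises each operation in the table on small
values; the variable names are arbitrary.

\begin{verbatim}
x, y = 5, 3            # bit patterns 101 and 011

bit_or = x | y         # 7  (111)
bit_xor = x ^ y        # 6  (110)
bit_and = x & y        # 1  (001)
inverted = ~x          # -6, that is -x - 1 in 2's complement

left = x << 2          # 20
right = -21 >> 2       # -6
shift_matches_pow = (left == x * pow(2, 2)) and (right == -21 // pow(2, 2))

# A negative shift count raises ValueError.
try:
    1 << -1
except ValueError:
    negative_shift_rejected = True
else:
    negative_shift_rejected = False
\end{verbatim}

The right shift of a negative number rounds towards minus infinity,
matching the integer division described earlier.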
348 \subsection{Iterator Types \label{typeiter}}
350 \versionadded{2.2}
351 \index{iterator protocol}
352 \index{protocol!iterator}
353 \index{sequence!iteration}
354 \index{container!iteration over}
356 Python supports a concept of iteration over containers. This is
357 implemented using two distinct methods; these are used to allow
358 user-defined classes to support iteration. Sequences, described below
359 in more detail, always support the iteration methods.
361 One method needs to be defined for container objects to provide
362 iteration support:
364 \begin{methoddesc}[container]{__iter__}{}
365 Return an iterator object. The object is required to support the
366 iterator protocol described below. If a container supports
367 different types of iteration, additional methods can be provided to
368 specifically request iterators for those iteration types. (An
369 example of an object supporting multiple forms of iteration would be
370 a tree structure which supports both breadth-first and depth-first
371 traversal.) This method corresponds to the \member{tp_iter} slot of
372 the type structure for Python objects in the Python/C API.
373 \end{methoddesc}
375 The iterator objects themselves are required to support the following
376 two methods, which together form the \dfn{iterator protocol}:
378 \begin{methoddesc}[iterator]{__iter__}{}
379 Return the iterator object itself. This is required to allow both
380 containers and iterators to be used with the \keyword{for} and
381 \keyword{in} statements. This method corresponds to the
382 \member{tp_iter} slot of the type structure for Python objects in
383 the Python/C API.
384 \end{methoddesc}
386 \begin{methoddesc}[iterator]{next}{}
387 Return the next item from the container. If there are no further
388 items, raise the \exception{StopIteration} exception. This method
389 corresponds to the \member{tp_iternext} slot of the type structure
390 for Python objects in the Python/C API.
391 \end{methoddesc}
393 Python defines several iterator objects to support iteration over
394 general and specific sequence types, dictionaries, and other more
395 specialized forms. The specific types are not important beyond their
396 implementation of the iterator protocol.
398 The intention of the protocol is that once an iterator's
399 \method{next()} method raises \exception{StopIteration}, it will
400 continue to do so on subsequent calls. Implementations that
401 do not obey this property are deemed broken. (This constraint
402 was added in Python 2.3; in Python 2.2, various iterators are
403 broken according to this rule.)
405 Python's generators provide a convenient way to implement the
406 iterator protocol. If a container object's \method{__iter__()}
407 method is implemented as a generator, it will automatically
408 return an iterator object (technically, a generator object)
409 supplying the \method{__iter__()} and \method{next()} methods.
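A minimal sketch of both approaches follows. The classes
\class{Countdown} and \class{Deck} are hypothetical. The
\code{__next__} alias is an assumption added so that the sketch also
runs on interpreters that spell the method \method{__next__()}; it is
not part of the protocol described here.

\begin{verbatim}
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself.
        return self

    def next(self):
        if self.current <= 0:
            # Keeps raising StopIteration on every later call.
            raise StopIteration
        value = self.current
        self.current = value - 1
        return value

    __next__ = next   # assumption: alias for interpreters that use __next__()


class Deck:
    """Hypothetical container whose __iter__() is a generator."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __iter__(self):
        for card in self.cards:
            yield card


counted = list(Countdown(3))          # [3, 2, 1]
dealt = [card for card in Deck('ab')] # ['a', 'b']

spent = Countdown(1)
first_pass = list(spent)              # [1]
second_pass = list(spent)             # [] because the iterator stays exhausted
\end{verbatim}

Each call to \code{Deck.__iter__()} returns a fresh generator, so a
\class{Deck} can be iterated over repeatedly, whereas a
\class{Countdown} instance is used up after one pass.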
412 \subsection{Sequence Types ---
413 \class{str}, \class{unicode}, \class{list},
414 \class{tuple}, \class{buffer}, \class{xrange}
415 \label{typesseq}}
417 There are six sequence types: strings, Unicode strings, lists,
418 tuples, buffers, and xrange objects.
420 String literals are written in single or double quotes:
421 \code{'xyzzy'}, \code{"frobozz"}. See chapter 2 of the
422 \citetitle[../ref/strings.html]{Python Reference Manual} for more about
423 string literals. Unicode strings are much like strings, but are
424 specified in the syntax using a preceding \character{u} character:
425 \code{u'abc'}, \code{u"def"}. Lists are constructed with square brackets,
426 separating items with commas: \code{[a, b, c]}. Tuples are
427 constructed by the comma operator (not within square brackets), with
428 or without enclosing parentheses, but an empty tuple must have the
429 enclosing parentheses, such as \code{a, b, c} or \code{()}. A single
430 item tuple must have a trailing comma, such as \code{(d,)}.
431 \obindex{sequence}
432 \obindex{string}
433 \obindex{Unicode}
434 \obindex{tuple}
435 \obindex{list}
437 Buffer objects are not directly supported by Python syntax, but can be
created by calling the built-in function
439 \function{buffer()}.\bifuncindex{buffer} They don't support
440 concatenation or repetition.
441 \obindex{buffer}
443 Xrange objects are similar to buffers in that there is no specific
444 syntax to create them, but they are created using the \function{xrange()}
445 function.\bifuncindex{xrange} They don't support slicing,
446 concatenation or repetition, and using \code{in}, \code{not in},
447 \function{min()} or \function{max()} on them is inefficient.
448 \obindex{xrange}
450 Most sequence types support the following operations. The \samp{in} and
451 \samp{not in} operations have the same priorities as the comparison
452 operations. The \samp{+} and \samp{*} operations have the same
453 priority as the corresponding numeric operations.\footnote{They must
454 have since the parser can't tell the type of the operands.}
456 This table lists the sequence operations sorted in ascending priority
457 (operations in the same box have the same priority). In the table,
458 \var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
459 and \var{j} are integers:
461 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
462 \lineiii{\var{x} in \var{s}}{\code{True} if an item of \var{s} is equal to \var{x}, else \code{False}}{(1)}
463 \lineiii{\var{x} not in \var{s}}{\code{False} if an item of \var{s} is
464 equal to \var{x}, else \code{True}}{(1)}
465 \hline
466 \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{(6)}
467 \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
468 \hline
469 \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
470 \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
471 \lineiii{\var{s}[\var{i}:\var{j}:\var{k}]}{slice of \var{s} from \var{i} to \var{j} with step \var{k}}{(3), (5)}
472 \hline
473 \lineiii{len(\var{s})}{length of \var{s}}{}
474 \lineiii{min(\var{s})}{smallest item of \var{s}}{}
475 \lineiii{max(\var{s})}{largest item of \var{s}}{}
476 \end{tableiii}
477 \indexiii{operations on}{sequence}{types}
478 \bifuncindex{len}
479 \bifuncindex{min}
480 \bifuncindex{max}
481 \indexii{concatenation}{operation}
482 \indexii{repetition}{operation}
483 \indexii{subscript}{operation}
484 \indexii{slice}{operation}
485 \indexii{extended slice}{operation}
486 \opindex{in}
487 \opindex{not in}
489 \noindent
490 Notes:
492 \begin{description}
493 \item[(1)] When \var{s} is a string or Unicode string object the
494 \code{in} and \code{not in} operations act like a substring test. In
495 Python versions before 2.3, \var{x} had to be a string of length 1.
496 In Python 2.3 and beyond, \var{x} may be a string of any length.
498 \item[(2)] Values of \var{n} less than \code{0} are treated as
499 \code{0} (which yields an empty sequence of the same type as
500 \var{s}). Note also that the copies are shallow; nested structures
501 are not copied. This often haunts new Python programmers; consider:
\begin{verbatim}
>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
\end{verbatim}
512 What has happened is that \code{[[]]} is a one-element list containing
513 an empty list, so all three elements of \code{[[]] * 3} are (pointers to)
514 this single empty list. Modifying any of the elements of \code{lists}
515 modifies this single list. You can create a list of different lists this
516 way:
\begin{verbatim}
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
\end{verbatim}
527 \item[(3)] If \var{i} or \var{j} is negative, the index is relative to
528 the end of the string: \code{len(\var{s}) + \var{i}} or
529 \code{len(\var{s}) + \var{j}} is substituted. But note that \code{-0} is
530 still \code{0}.
532 \item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
533 the sequence of items with index \var{k} such that \code{\var{i} <=
534 \var{k} < \var{j}}. If \var{i} or \var{j} is greater than
535 \code{len(\var{s})}, use \code{len(\var{s})}. If \var{i} is omitted,
536 use \code{0}. If \var{j} is omitted, use \code{len(\var{s})}. If
537 \var{i} is greater than or equal to \var{j}, the slice is empty.
539 \item[(5)] The slice of \var{s} from \var{i} to \var{j} with step
540 \var{k} is defined as the sequence of items with index
541 \code{\var{x} = \var{i} + \var{n}*\var{k}} such that
542 $0 \leq n < \frac{j-i}{k}$. In other words, the indices
543 are \code{i}, \code{i+k}, \code{i+2*k}, \code{i+3*k} and so on, stopping when
544 \var{j} is reached (but never including \var{j}). If \var{i} or \var{j}
545 is greater than \code{len(\var{s})}, use \code{len(\var{s})}. If
\var{i} or \var{j} is omitted, it becomes an ``end'' value
(which end depends on the sign of \var{k}). Note that \var{k} cannot
be zero.
550 \item[(6)] If \var{s} and \var{t} are both strings, some Python
551 implementations such as CPython can usually perform an in-place optimization
552 for assignments of the form \code{\var{s}=\var{s}+\var{t}} or
553 \code{\var{s}+=\var{t}}. When applicable, this optimization makes
554 quadratic run-time much less likely. This optimization is both version
555 and implementation dependent. For performance sensitive code, it is
556 preferable to use the \method{str.join()} method which assures consistent
557 linear concatenation performance across versions and implementations.
558 \versionchanged[Formerly, string concatenation never occurred in-place]{2.4}
560 \end{description}
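The indexing and slicing rules in notes (3) to (5) can be tried on a
short string. This is only a sketch; the variable names are arbitrary.

\begin{verbatim}
s = 'abcdefgh'

last = s[-1]            # 'h', the same item as s[len(s) - 1]
middle = s[2:5]         # 'cde'
clipped = s[5:100]      # 'fgh', an end beyond len(s) is clipped
head = s[:3]            # 'abc', an omitted start means 0
empty = s[5:2]          # '', because the start is not below the end
stepped = s[1:7:2]      # 'bdf', indices 1, 3 and 5
backwards = s[::-1]     # 'hgfedcba'

# A step of zero is rejected.
try:
    s[::0]
except ValueError:
    zero_step_rejected = True
else:
    zero_step_rejected = False
\end{verbatim}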
563 \subsubsection{String Methods \label{string-methods}}
565 These are the string methods which both 8-bit strings and Unicode
566 objects support:
568 \begin{methoddesc}[string]{capitalize}{}
569 Return a copy of the string with only its first character capitalized.
571 For 8-bit strings, this method is locale-dependent.
572 \end{methoddesc}
574 \begin{methoddesc}[string]{center}{width\optional{, fillchar}}
Return the string centered in a string of length \var{width}. Padding is done
576 using the specified \var{fillchar} (default is a space).
577 \versionchanged[Support for the \var{fillchar} argument]{2.4}
578 \end{methoddesc}
580 \begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
Return the number of occurrences of substring \var{sub} in the slice
\code{\var{s}[\var{start}:\var{end}]} of the string \var{s}. Optional
arguments \var{start} and \var{end} are interpreted as in slice notation.
584 \end{methoddesc}
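As a short illustration of \method{count()} with and without the
optional bounds (the variable names are arbitrary):

\begin{verbatim}
word = 'banana'

total = word.count('an')          # 2
from_two = word.count('a', 2)     # 2, counted in 'nana'
window = word.count('a', 2, 4)    # 1, counted in 'na'
\end{verbatim}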
586 \begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
587 Decodes the string using the codec registered for \var{encoding}.
588 \var{encoding} defaults to the default string encoding. \var{errors}
589 may be given to set a different error handling scheme. The default is
590 \code{'strict'}, meaning that encoding errors raise
591 \exception{UnicodeError}. Other possible values are \code{'ignore'},
592 \code{'replace'} and any other name registered via
593 \function{codecs.register_error}, see section~\ref{codec-base-classes}.
594 \versionadded{2.2}
595 \versionchanged[Support for other error handling schemes added]{2.3}
596 \end{methoddesc}
\begin{methoddesc}[string]{encode}{\optional{encoding\optional{, errors}}}
599 Return an encoded version of the string. Default encoding is the current
600 default string encoding. \var{errors} may be given to set a different
601 error handling scheme. The default for \var{errors} is
602 \code{'strict'}, meaning that encoding errors raise a
603 \exception{UnicodeError}. Other possible values are \code{'ignore'},
604 \code{'replace'}, \code{'xmlcharrefreplace'}, \code{'backslashreplace'}
605 and any other name registered via \function{codecs.register_error},
606 see section~\ref{codec-base-classes}.
607 For a list of possible encodings, see section~\ref{standard-encodings}.
608 \versionadded{2.0}
609 \versionchanged[Support for \code{'xmlcharrefreplace'} and
610 \code{'backslashreplace'} and other error handling schemes added]{2.3}
611 \end{methoddesc}
613 \begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
614 Return \code{True} if the string ends with the specified \var{suffix},
615 otherwise return \code{False}. With optional \var{start}, test beginning at
616 that position. With optional \var{end}, stop comparing at that position.
617 \end{methoddesc}
619 \begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
620 Return a copy of the string where all tab characters are expanded
621 using spaces. If \var{tabsize} is not given, a tab size of \code{8}
622 characters is assumed.
623 \end{methoddesc}
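A brief sketch of \method{expandtabs()} follows; the variable names are
arbitrary. Each tab is replaced by enough spaces to reach the next
column that is a multiple of the tab size.

\begin{verbatim}
line = 'a\tbc\td'

default_tabs = line.expandtabs()     # tab stops every 8 columns
narrow_tabs = line.expandtabs(4)     # 'a   bc  d'
\end{verbatim}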
625 \begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
Return the lowest index in the string where substring \var{sub} is
found, such that \var{sub} is contained in the slice
\code{\var{s}[\var{start}:\var{end}]}. Optional arguments \var{start} and
\var{end} are interpreted as in slice notation. Return \code{-1} if
\var{sub} is not found.
631 \end{methoddesc}
633 \begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
634 Like \method{find()}, but raise \exception{ValueError} when the
635 substring is not found.
636 \end{methoddesc}
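The difference between \method{find()} and \method{index()} shows up
only when the substring is missing, as in this sketch (the variable
names are arbitrary):

\begin{verbatim}
text = 'hello world'

first_o = text.find('o')        # 4
later_o = text.find('o', 5)     # 7, the search starts at index 5
missing = text.find('z')        # -1

# index() signals a missing substring with an exception instead.
try:
    text.index('z')
except ValueError:
    index_raised = True
else:
    index_raised = False
\end{verbatim}

Because \code{-1} is also a valid index (the last character), the
return value of \method{find()} should be compared with \code{-1}
before it is used for subscripting.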
638 \begin{methoddesc}[string]{isalnum}{}
639 Return true if all characters in the string are alphanumeric and there
640 is at least one character, false otherwise.
642 For 8-bit strings, this method is locale-dependent.
643 \end{methoddesc}
645 \begin{methoddesc}[string]{isalpha}{}
646 Return true if all characters in the string are alphabetic and there
647 is at least one character, false otherwise.
649 For 8-bit strings, this method is locale-dependent.
650 \end{methoddesc}
652 \begin{methoddesc}[string]{isdigit}{}
653 Return true if all characters in the string are digits and there
654 is at least one character, false otherwise.
656 For 8-bit strings, this method is locale-dependent.
657 \end{methoddesc}
659 \begin{methoddesc}[string]{islower}{}
660 Return true if all cased characters in the string are lowercase and
661 there is at least one cased character, false otherwise.
663 For 8-bit strings, this method is locale-dependent.
664 \end{methoddesc}
666 \begin{methoddesc}[string]{isspace}{}
667 Return true if there are only whitespace characters in the string and
668 there is at least one character, false otherwise.
670 For 8-bit strings, this method is locale-dependent.
671 \end{methoddesc}
673 \begin{methoddesc}[string]{istitle}{}
674 Return true if the string is a titlecased string and there is at least one
675 character, for example uppercase characters may only follow uncased
676 characters and lowercase characters only cased ones. Return false
677 otherwise.
679 For 8-bit strings, this method is locale-dependent.
680 \end{methoddesc}
682 \begin{methoddesc}[string]{isupper}{}
683 Return true if all cased characters in the string are uppercase and
684 there is at least one cased character, false otherwise.
686 For 8-bit strings, this method is locale-dependent.
687 \end{methoddesc}
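The predicate methods above can be compared side by side. The sketch
below uses plain ASCII strings so that the locale plays no part; the
variable names are arbitrary.

\begin{verbatim}
alnum = 'abc123'.isalnum()           # True
empty_alnum = ''.isalnum()           # False, at least one character is required
alpha = 'abc'.isalpha()              # True
digits = '123'.isdigit()             # True
space = ' \t\n'.isspace()            # True

lower = 'abc1'.islower()             # True, only cased characters are examined
no_cased = '123'.islower()           # False, there is no cased character
upper = 'ABC'.isupper()              # True

title = 'Hello World'.istitle()      # True
not_title = 'Hello world'.istitle()  # False
\end{verbatim}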
689 \begin{methoddesc}[string]{join}{seq}
690 Return a string which is the concatenation of the strings in the
691 sequence \var{seq}. The separator between elements is the string
692 providing this method.
693 \end{methoddesc}
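A short sketch of \method{join()} follows, including the common idiom
of collecting pieces in a list and joining them once instead of
concatenating repeatedly. The variable names are arbitrary.

\begin{verbatim}
words = ['spam', 'eggs', 'ham']

comma_separated = ', '.join(words)   # 'spam, eggs, ham'
run_together = ''.join(words)        # 'spameggsham'

# Collect the pieces first, then join once.
pieces = []
for number in range(5):
    pieces.append(str(number))
digits = ''.join(pieces)             # '01234'
\end{verbatim}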
695 \begin{methoddesc}[string]{ljust}{width\optional{, fillchar}}
696 Return the string left justified in a string of length \var{width}.
697 Padding is done using the specified \var{fillchar} (default is a
698 space). The original string is returned if
699 \var{width} is less than \code{len(\var{s})}.
700 \versionchanged[Support for the \var{fillchar} argument]{2.4}
701 \end{methoddesc}
703 \begin{methoddesc}[string]{lower}{}
704 Return a copy of the string converted to lowercase.
706 For 8-bit strings, this method is locale-dependent.
707 \end{methoddesc}
709 \begin{methoddesc}[string]{lstrip}{\optional{chars}}
710 Return a copy of the string with leading characters removed. The
711 \var{chars} argument is a string specifying the set of characters
712 to be removed. If omitted or \code{None}, the \var{chars} argument
713 defaults to removing whitespace. The \var{chars} argument is not
714 a prefix; rather, all combinations of its values are stripped:
\begin{verbatim}
>>> ' spacious '.lstrip()
'spacious '
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'
\end{verbatim}
721 \versionchanged[Support for the \var{chars} argument]{2.2.2}
722 \end{methoddesc}
724 \begin{methoddesc}[string]{replace}{old, new\optional{, count}}
725 Return a copy of the string with all occurrences of substring
726 \var{old} replaced by \var{new}. If the optional argument
727 \var{count} is given, only the first \var{count} occurrences are
728 replaced.
729 \end{methoddesc}
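As a brief illustration of the optional \var{count} argument (the
variable names are arbitrary):

\begin{verbatim}
menu = 'spam spam spam'

all_replaced = menu.replace('spam', 'eggs')     # 'eggs eggs eggs'
first_two = menu.replace('spam', 'eggs', 2)     # 'eggs eggs spam'

# replace() returns a copy; the original string is not modified.
original_intact = (menu == 'spam spam spam')
\end{verbatim}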
\begin{methoddesc}[string]{rfind}{sub\optional{, start\optional{, end}}}
Return the highest index in the string where substring \var{sub} is
found, such that \var{sub} is contained within
\code{\var{s}[\var{start}:\var{end}]}. Optional
arguments \var{start} and \var{end} are interpreted as in slice
notation. Return \code{-1} on failure.
736 \end{methoddesc}
738 \begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
739 Like \method{rfind()} but raises \exception{ValueError} when the
740 substring \var{sub} is not found.
741 \end{methoddesc}
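The difference between \method{rfind()} and \method{rindex()} is only in
how failure is reported. A small sketch, using an invented sample string:

```python
path = 'a/b/c'
last_slash = path.rfind('/')        # 3: the highest matching index
in_slice = path.rfind('/', 0, 3)    # 1: only path[0:3] == 'a/b' is searched
missing = path.rfind('x')           # -1 on failure

try:
    path.rindex('x')
    raised = False
except ValueError:
    raised = True                   # rindex() raises instead of returning -1
```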
743 \begin{methoddesc}[string]{rjust}{width\optional{, fillchar}}
744 Return the string right justified in a string of length \var{width}.
745 Padding is done using the specified \var{fillchar} (default is a space).
746 The original string is returned if
\var{width} is less than or equal to \code{len(\var{s})}.
748 \versionchanged[Support for the \var{fillchar} argument]{2.4}
749 \end{methoddesc}
\begin{methoddesc}[string]{rsplit}{\optional{sep\optional{, maxsplit}}}
752 Return a list of the words in the string, using \var{sep} as the
753 delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
754 splits are done, the \emph{rightmost} ones. If \var{sep} is not specified
755 or \code{None}, any whitespace string is a separator. Except for splitting
756 from the right, \method{rsplit()} behaves like \method{split()} which
757 is described in detail below.
758 \versionadded{2.4}
759 \end{methoddesc}
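When \var{maxsplit} limits the number of splits, \method{split()} and
\method{rsplit()} keep different ends of the string intact. A minimal
sketch with an invented path-like string:

```python
record = 'usr/local/lib/python'
from_left = record.split('/', 1)     # ['usr', 'local/lib/python']
from_right = record.rsplit('/', 1)   # ['usr/local/lib', 'python']
no_limit = record.rsplit('/')        # same result as record.split('/')
```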
761 \begin{methoddesc}[string]{rstrip}{\optional{chars}}
762 Return a copy of the string with trailing characters removed. The
763 \var{chars} argument is a string specifying the set of characters
764 to be removed. If omitted or \code{None}, the \var{chars} argument
765 defaults to removing whitespace. The \var{chars} argument is not
766 a suffix; rather, all combinations of its values are stripped:
\begin{verbatim}
>>> '   spacious   '.rstrip()
'   spacious'
>>> 'mississippi'.rstrip('ipz')
'mississ'
\end{verbatim}
773 \versionchanged[Support for the \var{chars} argument]{2.2.2}
774 \end{methoddesc}
\begin{methoddesc}[string]{split}{\optional{sep\optional{, maxsplit}}}
Return a list of the words in the string, using \var{sep} as the
delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
splits are done (thus, the list will have at most \code{\var{maxsplit}+1}
elements). If \var{maxsplit} is not specified, then there
is no limit on the number of splits (all possible splits are made).
782 Consecutive delimiters are not grouped together and are
783 deemed to delimit empty strings (for example, \samp{'1,,2'.split(',')}
784 returns \samp{['1', '', '2']}). The \var{sep} argument may consist of
785 multiple characters (for example, \samp{'1, 2, 3'.split(', ')} returns
786 \samp{['1', '2', '3']}). Splitting an empty string with a specified
787 separator returns \samp{['']}.
789 If \var{sep} is not specified or is \code{None}, a different splitting
790 algorithm is applied. First, whitespace characters (spaces, tabs,
791 newlines, returns, and formfeeds) are stripped from both ends. Then,
792 words are separated by arbitrary length strings of whitespace
793 characters. Consecutive whitespace delimiters are treated as a single
794 delimiter (\samp{'1 2 3'.split()} returns \samp{['1', '2', '3']}).
795 Splitting an empty string or a string consisting of just whitespace
796 returns an empty list.
797 \end{methoddesc}
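The two splitting algorithms of \method{split()} can be compared side by
side. The following sketch only restates the rules above with invented
sample strings:

```python
# Explicit separator: consecutive delimiters delimit empty strings.
with_sep = '1,,2'.split(',')             # ['1', '', '2']
empty_with_sep = ''.split(',')           # ['']

# No separator (or None): runs of whitespace act as a single delimiter,
# and leading and trailing whitespace is ignored.
with_none = ' 1  2   3 '.split()         # ['1', '2', '3']
blank_with_none = '   '.split()          # []

# maxsplit still applies when sep is None.
limited = 'a b c d'.split(None, 2)       # ['a', 'b', 'c d']
```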
799 \begin{methoddesc}[string]{splitlines}{\optional{keepends}}
800 Return a list of the lines in the string, breaking at line
801 boundaries. Line breaks are not included in the resulting list unless
802 \var{keepends} is given and true.
803 \end{methoddesc}
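A short sketch of the \var{keepends} argument of \method{splitlines()},
using an invented string that mixes two line-ending conventions:

```python
text = 'first\nsecond\r\nthird'
lines = text.splitlines()        # ['first', 'second', 'third']
kept = text.splitlines(True)     # ['first\n', 'second\r\n', 'third']
```

Joining the second list with an empty string should give back the original
text, since no characters are dropped when \var{keepends} is true.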
805 \begin{methoddesc}[string]{startswith}{prefix\optional{,
806 start\optional{, end}}}
807 Return \code{True} if string starts with the \var{prefix}, otherwise
808 return \code{False}. With optional \var{start}, test string beginning at
809 that position. With optional \var{end}, stop comparing string at that
810 position.
811 \end{methoddesc}
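The optional \var{start} and \var{end} arguments of \method{startswith()}
restrict the comparison to a slice of the string. A minimal sketch (the
file name is just sample data):

```python
name = 'libstdtypes.tex'
plain = name.startswith('lib')                 # True
offset = name.startswith('std', 3)             # True: test begins at index 3
bounded = name.startswith('stdtypes', 3, 6)    # False: name[3:6] is only 'std'
```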
813 \begin{methoddesc}[string]{strip}{\optional{chars}}
814 Return a copy of the string with the leading and trailing characters
815 removed. The \var{chars} argument is a string specifying the set of
816 characters to be removed. If omitted or \code{None}, the \var{chars}
817 argument defaults to removing whitespace. The \var{chars} argument is not
818 a prefix or suffix; rather, all combinations of its values are stripped:
\begin{verbatim}
>>> '   spacious   '.strip()
'spacious'
>>> 'www.example.com'.strip('cmowz.')
'example'
\end{verbatim}
825 \versionchanged[Support for the \var{chars} argument]{2.2.2}
826 \end{methoddesc}
828 \begin{methoddesc}[string]{swapcase}{}
829 Return a copy of the string with uppercase characters converted to
830 lowercase and vice versa.
832 For 8-bit strings, this method is locale-dependent.
833 \end{methoddesc}
835 \begin{methoddesc}[string]{title}{}
836 Return a titlecased version of the string: words start with uppercase
837 characters, all remaining cased characters are lowercase.
839 For 8-bit strings, this method is locale-dependent.
840 \end{methoddesc}
842 \begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
843 Return a copy of the string where all characters occurring in the
844 optional argument \var{deletechars} are removed, and the remaining
845 characters have been mapped through the given translation table, which
846 must be a string of length 256.
848 For Unicode objects, the \method{translate()} method does not
849 accept the optional \var{deletechars} argument. Instead, it
returns a copy of \var{s} where all characters have been mapped
through the given translation table, which must be a mapping of
852 Unicode ordinals to Unicode ordinals, Unicode strings or \code{None}.
853 Unmapped characters are left untouched. Characters mapped to \code{None}
854 are deleted. Note, a more flexible approach is to create a custom
855 character mapping codec using the \refmodule{codecs} module (see
856 \module{encodings.cp1251} for an example).
857 \end{methoddesc}
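The Unicode form of \method{translate()} can be sketched as follows. The
table contents are invented for the illustration, and the sketch assumes an
interpreter that accepts the \code{u''} literal prefix:

```python
# Map 'a' to another ordinal, 'b' to a longer string, and delete 'c'.
table = {
    ord(u'a'): ord(u'A'),   # ordinal -> ordinal
    ord(u'b'): u'[b]',      # ordinal -> Unicode string
    ord(u'c'): None,        # ordinal -> None deletes the character
}
result = u'abcd'.translate(table)   # 'd' is unmapped and left untouched
```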
859 \begin{methoddesc}[string]{upper}{}
860 Return a copy of the string converted to uppercase.
862 For 8-bit strings, this method is locale-dependent.
863 \end{methoddesc}
865 \begin{methoddesc}[string]{zfill}{width}
866 Return the numeric string left filled with zeros in a string
867 of length \var{width}. The original string is returned if
\var{width} is less than or equal to \code{len(\var{s})}.
869 \versionadded{2.2.2}
870 \end{methoddesc}
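A small sketch of \method{zfill()}. The handling of a leading sign shown
here is observed behaviour rather than something stated in the description
above, so treat it as an assumption to verify on your interpreter:

```python
padded = '42'.zfill(5)          # '00042'
unchanged = '12345'.zfill(3)    # '12345': width is smaller than the length

# A leading sign stays in front of the zeros, unlike plain right-justification.
signed = '-42'.zfill(5)         # '-0042'
naive = '-42'.rjust(5, '0')     # '00-42'
```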
873 \subsubsection{String Formatting Operations \label{typesseq-strings}}
875 \index{formatting, string (\%{})}
876 \index{interpolation, string (\%{})}
877 \index{string!formatting}
878 \index{string!interpolation}
879 \index{printf-style formatting}
880 \index{sprintf-style formatting}
881 \index{\protect\%{} formatting}
882 \index{\protect\%{} interpolation}
884 String and Unicode objects have one unique built-in operation: the
885 \code{\%} operator (modulo). This is also known as the string
886 \emph{formatting} or \emph{interpolation} operator. Given
887 \code{\var{format} \% \var{values}} (where \var{format} is a string or
888 Unicode object), \code{\%} conversion specifications in \var{format}
889 are replaced with zero or more elements of \var{values}. The effect
is similar to using \cfunction{sprintf()} in the C language. If
891 \var{format} is a Unicode object, or if any of the objects being
892 converted using the \code{\%s} conversion are Unicode objects, the
893 result will also be a Unicode object.
895 If \var{format} requires a single argument, \var{values} may be a
896 single non-tuple object.\footnote{To format only a tuple you
897 should therefore provide a singleton tuple whose only element
898 is the tuple to be formatted.} Otherwise, \var{values} must be a tuple with
899 exactly the number of items specified by the format string, or a
900 single mapping object (for example, a dictionary).
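The accepted forms of \var{values} can be sketched as follows. The format
strings and data are invented for the illustration:

```python
# One conversion: a single non-tuple object is enough.
single = '%s apples' % 3                        # '3 apples'

# Several conversions: a tuple with exactly that many items.
several = '%s has %d items' % ('cart', 3)       # 'cart has 3 items'

# Formatting a tuple itself: wrap it in a singleton tuple.
point = (1, 2)
as_tuple = 'point: %s' % (point,)               # 'point: (1, 2)'

# A single mapping object, selected by parenthesised keys.
mapped = '%(name)s=%(value)d' % {'name': 'x', 'value': 7}   # 'x=7'
```

Passing \code{point} directly instead of \code{(point,)} would be read as
two values for a format that requires one, and raises
\exception{TypeError}.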
902 A conversion specifier contains two or more characters and has the
903 following components, which must occur in this order:
905 \begin{enumerate}
906 \item The \character{\%} character, which marks the start of the
907 specifier.
908 \item Mapping key (optional), consisting of a parenthesised sequence
909 of characters (for example, \code{(somename)}).
910 \item Conversion flags (optional), which affect the result of some
911 conversion types.
912 \item Minimum field width (optional). If specified as an
913 \character{*} (asterisk), the actual width is read from the
914 next element of the tuple in \var{values}, and the object to
915 convert comes after the minimum field width and optional
916 precision.
  \item Precision (optional), given as a \character{.} (dot) followed
        by the precision. If specified as \character{*} (an
        asterisk), the actual precision is read from the next element of
        the tuple in \var{values}, and the value to convert comes after
        the precision.
922 \item Length modifier (optional).
923 \item Conversion type.
924 \end{enumerate}
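The \character{*} form of the width and precision components can be
sketched as follows; the numbers are arbitrary sample values:

```python
width, precision, value = 8, 3, 3.14159

fixed = '%8.3f' % value                          # '   3.142'
# With '*', width and precision are taken from the tuple, before the value.
starred = '%*.*f' % (width, precision, value)    # '   3.142'
```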
926 When the right argument is a dictionary (or other mapping type), then
927 the formats in the string \emph{must} include a parenthesised mapping key into
928 that dictionary inserted immediately after the \character{\%}
929 character. The mapping key selects the value to be formatted from the
930 mapping. For example:
\begin{verbatim}
>>> print '%(language)s has %(#)03d quote types.' % \
          {'language': "Python", "#": 2}
Python has 002 quote types.
\end{verbatim}
938 In this case no \code{*} specifiers may occur in a format (since they
939 require a sequential parameter list).
941 The conversion flag characters are:
943 \begin{tableii}{c|l}{character}{Flag}{Meaning}
944 \lineii{\#}{The value conversion will use the ``alternate form''
945 (where defined below).}
946 \lineii{0}{The conversion will be zero padded for numeric values.}
947 \lineii{-}{The converted value is left adjusted (overrides
948 the \character{0} conversion if both are given).}
949 \lineii{{~}}{(a space) A blank should be left before a positive number
950 (or empty string) produced by a signed conversion.}
  \lineii{+}{A sign character (\character{+} or \character{-}) will
             precede the conversion (overrides a ``space'' flag).}
953 \end{tableii}
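The flags in the table can be tried out with a few one-line formats. This
is only a sketch with arbitrary sample numbers; the trailing \code{|} is
added so that padding on the right is visible:

```python
zero_padded = '%05d' % 42        # '00042'
left_adjusted = '%-5d|' % 42     # '42   |'
left_wins = '%-05d|' % 42        # '42   |'  ('-' overrides '0')
blank_sign = '% d' % 42          # ' 42'
plus_sign = '%+d' % 42           # '+42'
plus_wins = '%+ d' % 42          # '+42'     ('+' overrides the space flag)
alt_hex = '%#x' % 255            # '0xff'    (alternate form)
```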
955 A length modifier (\code{h}, \code{l}, or \code{L}) may be
956 present, but is ignored as it is not necessary for Python.
958 The conversion types are:
960 \begin{tableiii}{c|l|c}{character}{Conversion}{Meaning}{Notes}
961 \lineiii{d}{Signed integer decimal.}{}
962 \lineiii{i}{Signed integer decimal.}{}
963 \lineiii{o}{Unsigned octal.}{(1)}
964 \lineiii{u}{Unsigned decimal.}{}
965 \lineiii{x}{Unsigned hexadecimal (lowercase).}{(2)}
966 \lineiii{X}{Unsigned hexadecimal (uppercase).}{(2)}
967 \lineiii{e}{Floating point exponential format (lowercase).}{}
968 \lineiii{E}{Floating point exponential format (uppercase).}{}
969 \lineiii{f}{Floating point decimal format.}{}
970 \lineiii{F}{Floating point decimal format.}{}
  \lineiii{g}{Same as \character{e} if exponent is less than -4 or
              not less than precision, \character{f} otherwise.}{}
  \lineiii{G}{Same as \character{E} if exponent is less than -4 or
              not less than precision, \character{F} otherwise.}{}
975 \lineiii{c}{Single character (accepts integer or single character
976 string).}{}
  \lineiii{r}{String (converts any Python object using
              \function{repr()}).}{(3)}
  \lineiii{s}{String (converts any Python object using
              \function{str()}).}{(4)}
981 \lineiii{\%}{No argument is converted, results in a \character{\%}
982 character in the result.}{}
983 \end{tableiii}
985 \noindent
986 Notes:
987 \begin{description}
988 \item[(1)]
989 The alternate form causes a leading zero (\character{0}) to be
990 inserted between left-hand padding and the formatting of the
991 number if the leading character of the result is not already a
992 zero.
993 \item[(2)]
994 The alternate form causes a leading \code{'0x'} or \code{'0X'}
995 (depending on whether the \character{x} or \character{X} format
996 was used) to be inserted between left-hand padding and the
997 formatting of the number if the leading character of the result is
998 not already a zero.
999 \item[(3)]
1000 The \code{\%r} conversion was added in Python 2.0.
1001 \item[(4)]
1002 If the object or format provided is a \class{unicode} string,
1003 the resulting string will also be \class{unicode}.
1004 \end{description}
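A few of the conversion types can be illustrated as follows. The values
are arbitrary, and the sketch assumes the default precision of 6 for the
floating point conversions:

```python
decimal = '%d' % 42                   # '42'
hexa = '%x %X' % (255, 255)           # 'ff FF'
expo = '%e' % 1234.5                  # '1.234500e+03'
fixed = '%f' % 1234.5                 # '1234.500000'

# %g switches to exponential form when the exponent is less than -4
# or not less than the precision.
g_small = '%g' % 0.00001234           # '1.234e-05'
g_mid = '%g' % 1234.5                 # '1234.5'
g_large = '%g' % 12345678.0           # '1.23457e+07'

char = '%c%c' % (65, 'b')             # 'Ab'
repr_vs_str = '%r %s' % ('a', 'a')    # "'a' a"
percent = '100%%' % ()                # '100%'
```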
1006 % XXX Examples?
1008 Since Python strings have an explicit length, \code{\%s} conversions
1009 do not assume that \code{'\e0'} is the end of the string.
1011 For safety reasons, floating point precisions are clipped to 50;
1012 \code{\%f} conversions for numbers whose absolute value is over 1e25
1013 are replaced by \code{\%g} conversions.\footnote{
1014 These numbers are fairly arbitrary. They are intended to
1015 avoid printing endless strings of meaningless digits without hampering
1016 correct use and without having to know the exact precision of floating
1017 point values on a particular machine.
1018 } All other errors raise exceptions.
1020 Additional string operations are defined in standard modules
1021 \refmodule{string}\refstmodindex{string}\ and
1022 \refmodule{re}.\refstmodindex{re}
1025 \subsubsection{XRange Type \label{typesseq-xrange}}
1027 The \class{xrange}\obindex{xrange} type is an immutable sequence which
1028 is commonly used for looping. The advantage of the \class{xrange}
1029 type is that an \class{xrange} object will always take the same amount
1030 of memory, no matter the size of the range it represents. There are
1031 no consistent performance advantages.
1033 XRange objects have very little behavior: they only support indexing,
1034 iteration, and the \function{len()} function.
1037 \subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
1039 List objects support additional operations that allow in-place
1040 modification of the object.
1041 Other mutable sequence types (when added to the language) should
1042 also support these operations.
1043 Strings and tuples are immutable sequence types: such objects cannot
1044 be modified once created.
1045 The following operations are defined on mutable sequence types (where
1046 \var{x} is an arbitrary object):
1047 \indexiii{mutable}{sequence}{types}
1048 \obindex{list}
1050 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1051 \lineiii{\var{s}[\var{i}] = \var{x}}
1052 {item \var{i} of \var{s} is replaced by \var{x}}{}
1053 \lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
1054 {slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
1055 \lineiii{del \var{s}[\var{i}:\var{j}]}
1056 {same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
1057 \lineiii{\var{s}[\var{i}:\var{j}:\var{k}] = \var{t}}
1058 {the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} are replaced by those of \var{t}}{(1)}
1059 \lineiii{del \var{s}[\var{i}:\var{j}:\var{k}]}
1060 {removes the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} from the list}{}
1061 \lineiii{\var{s}.append(\var{x})}
1062 {same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(2)}
1063 \lineiii{\var{s}.extend(\var{x})}
1064 {same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(3)}
1065 \lineiii{\var{s}.count(\var{x})}
1066 {return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
1067 \lineiii{\var{s}.index(\var{x}\optional{, \var{i}\optional{, \var{j}}})}
1068 {return smallest \var{k} such that \code{\var{s}[\var{k}] == \var{x}} and
1069 \code{\var{i} <= \var{k} < \var{j}}}{(4)}
1070 \lineiii{\var{s}.insert(\var{i}, \var{x})}
1071 {same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}}{(5)}
1072 \lineiii{\var{s}.pop(\optional{\var{i}})}
1073 {same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(6)}
1074 \lineiii{\var{s}.remove(\var{x})}
1075 {same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(4)}
1076 \lineiii{\var{s}.reverse()}
1077 {reverses the items of \var{s} in place}{(7)}
1078 \lineiii{\var{s}.sort(\optional{\var{cmp}\optional{,
1079 \var{key}\optional{, \var{reverse}}}})}
1080 {sort the items of \var{s} in place}{(7), (8), (9), (10)}
1081 \end{tableiii}
1082 \indexiv{operations on}{mutable}{sequence}{types}
1083 \indexiii{operations on}{sequence}{types}
1084 \indexiii{operations on}{list}{type}
1085 \indexii{subscript}{assignment}
1086 \indexii{slice}{assignment}
1087 \indexii{extended slice}{assignment}
1088 \stindex{del}
1089 \withsubitem{(list method)}{
1090 \ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
1091 \ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
1092 \ttindex{sort()}}
1093 \noindent
1094 Notes:
1095 \begin{description}
1096 \item[(1)] \var{t} must have the same length as the slice it is
1097 replacing.
1099 \item[(2)] The C implementation of Python has historically accepted
1100 multiple parameters and implicitly joined them into a tuple; this
1101 no longer works in Python 2.0. Use of this misfeature has been
1102 deprecated since Python 1.4.
1104 \item[(3)] \var{x} can be any iterable object.
1106 \item[(4)] Raises \exception{ValueError} when \var{x} is not found in
1107 \var{s}. When a negative index is passed as the second or third parameter
1108 to the \method{index()} method, the list length is added, as for slice
1109 indices. If it is still negative, it is truncated to zero, as for
1110 slice indices. \versionchanged[Previously, \method{index()} didn't
1111 have arguments for specifying start and stop positions]{2.3}
1113 \item[(5)] When a negative index is passed as the first parameter to
1114 the \method{insert()} method, the list length is added, as for slice
1115 indices. If it is still negative, it is truncated to zero, as for
1116 slice indices. \versionchanged[Previously, all negative indices
1117 were truncated to zero]{2.3}
1119 \item[(6)] The \method{pop()} method is only supported by the list and
1120 array types. The optional argument \var{i} defaults to \code{-1},
1121 so that by default the last item is removed and returned.
1123 \item[(7)] The \method{sort()} and \method{reverse()} methods modify the
1124 list in place for economy of space when sorting or reversing a large
1125 list. To remind you that they operate by side effect, they don't return
1126 the sorted or reversed list.
1128 \item[(8)] The \method{sort()} method takes optional arguments for
1129 controlling the comparisons.
1131 \var{cmp} specifies a custom comparison function of two arguments
1132 (list items) which should return a negative, zero or positive number
1133 depending on whether the first argument is considered smaller than,
1134 equal to, or larger than the second argument:
1135 \samp{\var{cmp}=\keyword{lambda} \var{x},\var{y}:
1136 \function{cmp}(x.lower(), y.lower())}
1138 \var{key} specifies a function of one argument that is used to
1139 extract a comparison key from each list element:
1140 \samp{\var{key}=\function{str.lower}}
1142 \var{reverse} is a boolean value. If set to \code{True}, then the
1143 list elements are sorted as if each comparison were reversed.
1145 In general, the \var{key} and \var{reverse} conversion processes are
1146 much faster than specifying an equivalent \var{cmp} function. This is
1147 because \var{cmp} is called multiple times for each list element while
1148 \var{key} and \var{reverse} touch each element only once.
1150 \versionchanged[Support for \code{None} as an equivalent to omitting
1151 \var{cmp} was added]{2.3}
1153 \versionchanged[Support for \var{key} and \var{reverse} was added]{2.4}
1155 \item[(9)] Starting with Python 2.3, the \method{sort()} method is
1156 guaranteed to be stable. A sort is stable if it guarantees not to
1157 change the relative order of elements that compare equal --- this is
1158 helpful for sorting in multiple passes (for example, sort by
1159 department, then by salary grade).
1161 \item[(10)] While a list is being sorted, the effect of attempting to
1162 mutate, or even inspect, the list is undefined. The C
1163 implementation of Python 2.3 and newer makes the list appear empty
1164 for the duration, and raises \exception{ValueError} if it can detect
1165 that the list has been mutated during a sort.
1166 \end{description}
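Several of the operations and notes above can be sketched together. The
data is invented for the illustration, and only the \var{key} argument of
\method{sort()} is used:

```python
# Plain slice assignment may change the length of the list.
s = [0, 1, 2, 3, 4, 5]
s[1:3] = ['a', 'b', 'c']
after_slice = list(s)                # [0, 'a', 'b', 'c', 3, 4, 5]

# Extended slice assignment must supply exactly as many items (note 1).
s[::2] = ['w', 'x', 'y', 'z']
after_extended = list(s)             # ['w', 'a', 'x', 'c', 'y', 4, 'z']
try:
    s[::2] = ['only one']
    length_checked = False
except ValueError:
    length_checked = True

# pop() defaults to the last item (note 6).
last = s.pop()                       # 'z'

# A stable sort allows sorting in passes: minor key first, major key last
# (note 9).
staff = [('ops', 2, 'Ann'), ('dev', 1, 'Bob'),
         ('ops', 1, 'Cy'), ('dev', 2, 'Di')]
staff.sort(key=lambda row: row[1])   # by salary grade
staff.sort(key=lambda row: row[0])   # by department; grades stay ordered
names = [row[2] for row in staff]    # ['Bob', 'Di', 'Cy', 'Ann']

# sort() works in place and returns None (note 7).
words = ['pear', 'Apple', 'fig']
returned = words.sort(key=str.lower)   # words is now ['Apple', 'fig', 'pear']
```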
1168 \subsection{Set Types ---
1169 \class{set}, \class{frozenset}
1170 \label{types-set}}
1171 \obindex{set}
1173 A \dfn{set} object is an unordered collection of immutable values.
1174 Common uses include membership testing, removing duplicates from a sequence,
1175 and computing mathematical operations such as intersection, union, difference,
1176 and symmetric difference.
1177 \versionadded{2.4}
1179 Like other collections, sets support \code{\var{x} in \var{set}},
1180 \code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
1181 unordered collection, sets do not record element position or order of
1182 insertion. Accordingly, sets do not support indexing, slicing, or
1183 other sequence-like behavior.
1185 There are currently two builtin set types, \class{set} and \class{frozenset}.
1186 The \class{set} type is mutable --- the contents can be changed using methods
1187 like \method{add()} and \method{remove()}. Since it is mutable, it has no
1188 hash value and cannot be used as either a dictionary key or as an element of
1189 another set. The \class{frozenset} type is immutable and hashable --- its
contents cannot be altered after it is created; however, it can be used as
1191 a dictionary key or as an element of another set.
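The practical consequence of hashability can be sketched as follows (the
names are invented for this illustration):

```python
mutable = set('abc')
frozen = frozenset('abc')

index = {frozen: 'usable as a key'}    # a frozenset is hashable
nested = set([frozen])                 # and can be an element of a set

try:
    index[mutable] = 'not allowed'
    set_is_hashable = True
except TypeError:
    set_is_hashable = False            # a set has no hash value
```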
1193 Instances of \class{set} and \class{frozenset} provide the following operations:
1195 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1196 \lineiii{len(\var{s})}{}{cardinality of set \var{s}}
1198 \hline
1199 \lineiii{\var{x} in \var{s}}{}
1200 {test \var{x} for membership in \var{s}}
1201 \lineiii{\var{x} not in \var{s}}{}
1202 {test \var{x} for non-membership in \var{s}}
1203 \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
1204 {test whether every element in \var{s} is in \var{t}}
1205 \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
1206 {test whether every element in \var{t} is in \var{s}}
1208 \hline
1209 \lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
1210 {new set with elements from both \var{s} and \var{t}}
1211 \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
1212 {new set with elements common to \var{s} and \var{t}}
1213 \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
1214 {new set with elements in \var{s} but not in \var{t}}
1215 \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
1216 {new set with elements in either \var{s} or \var{t} but not both}
1217 \lineiii{\var{s}.copy()}{}
1218 {new set with a shallow copy of \var{s}}
1219 \end{tableiii}
Note, the non-operator versions of \method{union()}, \method{intersection()},
\method{difference()}, \method{symmetric_difference()},
\method{issubset()}, and \method{issuperset()} will accept any
iterable as an argument. In contrast, their operator based counterparts
1225 require their arguments to be sets. This precludes error-prone constructions
1226 like \code{set('abc') \&\ 'cbs'} in favor of the more readable
1227 \code{set('abc').intersection('cbs')}.
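The difference between the method and operator forms can be sketched as
follows, with an invented sample set:

```python
vowels = set('aeiou')

# The method accepts any iterable, here a plain string.
common = vowels.intersection('audio')   # the set of 'a', 'u', 'i', 'o'

# The operator requires both operands to be sets.
try:
    vowels & 'audio'
    operator_ok = True
except TypeError:
    operator_ok = False
```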
1229 Both \class{set} and \class{frozenset} support set to set comparisons.
1230 Two sets are equal if and only if every element of each set is contained in
1231 the other (each is a subset of the other).
1232 A set is less than another set if and only if the first set is a proper
1233 subset of the second set (is a subset, but is not equal).
1234 A set is greater than another set if and only if the first set is a proper
1235 superset of the second set (is a superset, but is not equal).
1237 Instances of \class{set} are compared to instances of \class{frozenset} based
1238 on their members. For example, \samp{set('abc') == frozenset('abc')} returns
1239 \code{True}.
1241 The subset and equality comparisons do not generalize to a complete
1242 ordering function. For example, any two disjoint sets are not equal and
1243 are not subsets of each other, so \emph{all} of the following return
1244 \code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or
1245 \code{\var{a}>\var{b}}.
1246 Accordingly, sets do not implement the \method{__cmp__} method.
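The partial ordering described above can be checked with two small,
arbitrary sets:

```python
a, b = set([1, 2]), set([3, 4])

# Disjoint sets: none of the three comparisons holds.
results = (a < b, a == b, a > b)            # (False, False, False)

# '<' means proper subset, '<=' means subset.
proper = set([1]) < set([1, 2])             # True
not_proper = set([1, 2]) < set([1, 2])      # False
subset_or_equal = set([1, 2]) <= set([1, 2])   # True
```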
1248 Since sets only define partial ordering (subset relationships), the output
1249 of the \method{list.sort()} method is undefined for lists of sets.
1251 Set elements are like dictionary keys; they need to define both
1252 \method{__hash__} and \method{__eq__} methods.
1254 Binary operations that mix \class{set} instances with \class{frozenset}
1255 return the type of the first operand. For example:
1256 \samp{frozenset('ab') | set('bc')} returns an instance of \class{frozenset}.
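A short sketch of mixing the two set types; the operands are arbitrary:

```python
mixed_left = frozenset('ab') | set('bc')    # result is a frozenset
mixed_right = set('ab') | frozenset('bc')   # result is a set

# Comparison is based on members, not on the type.
equal = set('abc') == frozenset('abc')      # True
```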
1258 The following table lists operations available for \class{set}
1259 that do not apply to immutable instances of \class{frozenset}:
1261 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1262 \lineiii{\var{s}.update(\var{t})}
1263 {\var{s} |= \var{t}}
1264 {update set \var{s}, adding elements from \var{t}}
1265 \lineiii{\var{s}.intersection_update(\var{t})}
1266 {\var{s} \&= \var{t}}
1267 {update set \var{s}, keeping only elements found in both \var{s} and \var{t}}
1268 \lineiii{\var{s}.difference_update(\var{t})}
1269 {\var{s} -= \var{t}}
1270 {update set \var{s}, removing elements found in \var{t}}
1271 \lineiii{\var{s}.symmetric_difference_update(\var{t})}
1272 {\var{s} \textasciicircum= \var{t}}
1273 {update set \var{s}, keeping only elements found in either \var{s} or \var{t}
1274 but not in both}
1276 \hline
1277 \lineiii{\var{s}.add(\var{x})}{}
1278 {add element \var{x} to set \var{s}}
1279 \lineiii{\var{s}.remove(\var{x})}{}
          {remove \var{x} from set \var{s}; raises \exception{KeyError} if not present}
  \lineiii{\var{s}.discard(\var{x})}{}
          {remove \var{x} from set \var{s} if present}
1283 \lineiii{\var{s}.pop()}{}
1284 {remove and return an arbitrary element from \var{s}; raises
1285 \exception{KeyError} if empty}
1286 \lineiii{\var{s}.clear()}{}
1287 {remove all elements from set \var{s}}
1288 \end{tableiii}
1290 Note, the non-operator versions of the \method{update()},
1291 \method{intersection_update()}, \method{difference_update()}, and
1292 \method{symmetric_difference_update()} methods will accept any iterable
1293 as an argument.
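The in-place operations can be followed step by step in this sketch. The
strings passed as arguments are arbitrary iterables chosen for the
illustration:

```python
s = set('abc')
s.update('cd')                         # a, b, c, d
s.intersection_update('abdx')          # a, b, d
s.difference_update('a')               # b, d
s.symmetric_difference_update('dz')    # b, z

s.discard('q')                         # absent element: silently ignored
try:
    s.remove('q')                      # absent element: KeyError
    remove_raised = False
except KeyError:
    remove_raised = True
```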
1295 The design of the set types was based on lessons learned from the
1296 \module{sets} module.
1298 \begin{seealso}
1299 \seelink{comparison-to-builtin-set.html}
1300 {Comparison to the built-in set types}
1301 {Differences between the \module{sets} module and the
1302 built-in set types.}
1303 \end{seealso}
1306 \subsection{Mapping Types --- \class{dict} \label{typesmapping}}
1307 \obindex{mapping}
1308 \obindex{dictionary}
1310 A \dfn{mapping} object maps immutable values to
1311 arbitrary objects. Mappings are mutable objects. There is currently
1312 only one standard mapping type, the \dfn{dictionary}. A dictionary's keys are
1313 almost arbitrary values. Only values containing lists, dictionaries
1314 or other mutable types (that are compared by value rather than by
1315 object identity) may not be used as keys.
1316 Numeric types used for keys obey the normal rules for numeric
1317 comparison: if two numbers compare equal (such as \code{1} and
1318 \code{1.0}) then they can be used interchangeably to index the same
1319 dictionary entry.
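The rules for keys can be sketched as follows; the dictionary contents are
invented for the illustration:

```python
d = {1: 'one'}
same = d[1.0]             # 'one': 1 and 1.0 compare equal
d[1.0] = 'uno'            # overwrites the existing entry
size = len(d)             # still 1

# A list is mutable and compared by value, so it cannot be a key.
try:
    d[[1, 2]] = 'list key'
    list_key_ok = True
except TypeError:
    list_key_ok = False

tuple_key = {(1, 2): 'fine'}   # a tuple of immutable values is accepted
```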
1321 Dictionaries are created by placing a comma-separated list of
1322 \code{\var{key}: \var{value}} pairs within braces, for example:
1323 \code{\{'jack': 4098, 'sjoerd': 4127\}} or
1324 \code{\{4098: 'jack', 4127: 'sjoerd'\}}.
1326 The following operations are defined on mappings (where \var{a} and
1327 \var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
1328 arbitrary objects):
1329 \indexiii{operations on}{mapping}{types}
1330 \indexiii{operations on}{dictionary}{type}
1331 \stindex{del}
1332 \bifuncindex{len}
1333 \withsubitem{(dictionary method)}{
1334 \ttindex{clear()}
1335 \ttindex{copy()}
1336 \ttindex{has_key()}
1337 \ttindex{fromkeys()}
1338 \ttindex{items()}
1339 \ttindex{keys()}
1340 \ttindex{update()}
1341 \ttindex{values()}
1342 \ttindex{get()}
1343 \ttindex{setdefault()}
1344 \ttindex{pop()}
1345 \ttindex{popitem()}
1346 \ttindex{iteritems()}
1347 \ttindex{iterkeys()}
1348 \ttindex{itervalues()}}
1350 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1351 \lineiii{len(\var{a})}{the number of items in \var{a}}{}
1352 \lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
  \lineiii{\var{a}[\var{k}] = \var{v}}
          {set \code{\var{a}[\var{k}]} to \var{v}}
          {}
1356 \lineiii{del \var{a}[\var{k}]}
1357 {remove \code{\var{a}[\var{k}]} from \var{a}}
1358 {(1)}
1359 \lineiii{\var{a}.clear()}{remove all items from \code{a}}{}
1360 \lineiii{\var{a}.copy()}{a (shallow) copy of \code{a}}{}
  \lineiii{\var{a}.has_key(\var{k})}
          {\code{True} if \var{a} has a key \var{k}, else \code{False}}
          {}
1364 \lineiii{\var{k} \code{in} \var{a}}
1365 {Equivalent to \var{a}.has_key(\var{k})}
1366 {(2)}
1367 \lineiii{\var{k} not in \var{a}}
1368 {Equivalent to \code{not} \var{a}.has_key(\var{k})}
1369 {(2)}
1370 \lineiii{\var{a}.items()}
1371 {a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
1372 {(3)}
1373 \lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
1374 \lineiii{\var{a}.update(\optional{\var{b}})}
1375 {updates (and overwrites) key/value pairs from \var{b}}
1376 {(9)}
1377 \lineiii{\var{a}.fromkeys(\var{seq}\optional{, \var{value}})}
1378 {Creates a new dictionary with keys from \var{seq} and values set to \var{value}}
1379 {(7)}
1380 \lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
1381 \lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
1382 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1383 else \var{x}}
1384 {(4)}
1385 \lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
1386 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1387 else \var{x} (also setting it)}
1388 {(5)}
1389 \lineiii{\var{a}.pop(\var{k}\optional{, \var{x}})}
1390 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
           else \var{x} (and remove \var{k})}
1392 {(8)}
1393 \lineiii{\var{a}.popitem()}
1394 {remove and return an arbitrary (\var{key}, \var{value}) pair}
1395 {(6)}
1396 \lineiii{\var{a}.iteritems()}
1397 {return an iterator over (\var{key}, \var{value}) pairs}
1398 {(2), (3)}
1399 \lineiii{\var{a}.iterkeys()}
1400 {return an iterator over the mapping's keys}
1401 {(2), (3)}
1402 \lineiii{\var{a}.itervalues()}
1403 {return an iterator over the mapping's values}
1404 {(2), (3)}
1405 \end{tableiii}
1407 \noindent
1408 Notes:
1409 \begin{description}
1410 \item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
1411 in the map.
1413 \item[(2)] \versionadded{2.2}
1415 \item[(3)] Keys and values are listed in an arbitrary order which is
1416 non-random, varies across Python implementations, and depends on the
1417 dictionary's history of insertions and deletions.
1418 If \method{items()}, \method{keys()}, \method{values()},
1419 \method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
1420 are called with no intervening modifications to the dictionary, the
1421 lists will directly correspond. This allows the creation of
1422 \code{(\var{value}, \var{key})} pairs using \function{zip()}:
1423 \samp{pairs = zip(\var{a}.values(), \var{a}.keys())}. The same
1424 relationship holds for the \method{iterkeys()} and
1425 \method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
1426 \var{a}.iterkeys())} provides the same value for \code{pairs}.
1427 Another way to create the same list is \samp{pairs = [(v, k) for (k,
1428 v) in \var{a}.iteritems()]}.
\item[(4)] Never raises an exception if \var{k} is not in the map;
instead, it returns \var{x}. \var{x} is optional; when \var{x} is not
provided and \var{k} is not in the map, \code{None} is returned.
1434 \item[(5)] \function{setdefault()} is like \function{get()}, except
1435 that if \var{k} is missing, \var{x} is both returned and inserted into
the dictionary as the value of \var{k}. \var{x} defaults to \code{None}.
1438 \item[(6)] \function{popitem()} is useful to destructively iterate
1439 over a dictionary, as often used in set algorithms. If the dictionary
1440 is empty, calling \function{popitem()} raises a \exception{KeyError}.
1442 \item[(7)] \function{fromkeys()} is a class method that returns a
1443 new dictionary. \var{value} defaults to \code{None}. \versionadded{2.3}
1445 \item[(8)] \function{pop()} raises a \exception{KeyError} when no default
1446 value is given and the key is not found. \versionadded{2.3}
1448 \item[(9)] \function{update()} accepts either another mapping object
1449 or an iterable of key/value pairs (as a tuple or other iterable of
length two). If keyword arguments are specified, the mapping is
then updated with those key/value pairs:
1452 \samp{d.update(red=1, blue=2)}.
1453 \versionchanged[Allowed the argument to be an iterable of key/value
1454 pairs and allowed keyword arguments]{2.4}
1456 \end{description}
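The behaviour described in the notes above can be sketched as follows.
This is only an illustration; the dictionary contents and the variable
names (\code{d}, \code{pairs}, \code{blank}, \code{seen}) are
assumptions chosen for the example.

\begin{verbatim}
d = {'red': 1}

# get() never raises; it returns the default (None if omitted)
# and leaves the dictionary untouched.
assert d.get('blue') is None
assert d.get('blue', 0) == 0
assert 'blue' not in d

# setdefault() returns the default *and* stores it.
assert d.setdefault('blue', 2) == 2
assert d['blue'] == 2

# update() accepts a mapping, an iterable of pairs, or keywords.
d.update({'green': 3})
d.update([('cyan', 4)])
d.update(magenta=5)

# With no modification in between, values() and keys() correspond,
# so zip() yields (value, key) pairs.  Sorting makes the result
# independent of the dictionary's internal order.
pairs = sorted(zip(d.values(), d.keys()))

# pop() raises KeyError only when no default is supplied.
assert d.pop('missing', 'n/a') == 'n/a'
try:
    d.pop('missing')
except KeyError:
    pass

# fromkeys() builds a new dictionary; values default to None.
blank = dict.fromkeys(['a', 'b'])

# popitem() consumes the dictionary one item at a time.
seen = []
while blank:
    seen.append(blank.popitem())
\end{verbatim}

Calling \method{popitem()} once more after the loop would raise
\exception{KeyError}, since \code{blank} is then empty.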
1458 \subsection{File Objects
1459 \label{bltin-file-objects}}
1461 File objects\obindex{file} are implemented using C's \code{stdio}
1462 package and can be created with the built-in constructor
1463 \function{file()}\bifuncindex{file} described in section
1464 \ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
1465 is new in Python 2.2. The older built-in \function{open()} is an
1466 alias for \function{file()}.} File objects are also returned
1467 by some other built-in functions and methods, such as
1468 \function{os.popen()} and \function{os.fdopen()} and the
1469 \method{makefile()} method of socket objects.
1470 \refstmodindex{os}
1471 \refbimodindex{socket}
1473 When a file operation fails for an I/O-related reason, the exception
1474 \exception{IOError} is raised. This includes situations where the
1475 operation is not defined for some reason, like \method{seek()} on a tty
1476 device or writing a file opened for reading.
1478 Files have the following methods:
1481 \begin{methoddesc}[file]{close}{}
1482 Close the file. A closed file cannot be read or written any more.
1483 Any operation which requires that the file be open will raise a
1484 \exception{ValueError} after the file has been closed. Calling
1485 \method{close()} more than once is allowed.
1486 \end{methoddesc}
1488 \begin{methoddesc}[file]{flush}{}
1489 Flush the internal buffer, like \code{stdio}'s
1490 \cfunction{fflush()}. This may be a no-op on some file-like
1491 objects.
1492 \end{methoddesc}
1494 \begin{methoddesc}[file]{fileno}{}
1495 \index{file descriptor}
1496 \index{descriptor, file}
1497 Return the integer ``file descriptor'' that is used by the
1498 underlying implementation to request I/O operations from the
1499 operating system. This can be useful for other, lower level
1500 interfaces that use file descriptors, such as the
1501 \refmodule{fcntl}\refbimodindex{fcntl} module or
1502 \function{os.read()} and friends. \note{File-like objects
1503 which do not have a real file descriptor should \emph{not} provide
1504 this method!}
1505 \end{methoddesc}
1507 \begin{methoddesc}[file]{isatty}{}
1508 Return \code{True} if the file is connected to a tty(-like) device, else
1509 \code{False}. \note{If a file-like object is not associated
1510 with a real file, this method should \emph{not} be implemented.}
1511 \end{methoddesc}
1513 \begin{methoddesc}[file]{next}{}
A file object is its own iterator; that is, \code{iter(\var{f})} returns
\var{f} (unless \var{f} is closed). When a file is used as an
1516 iterator, typically in a \keyword{for} loop (for example,
1517 \code{for line in f: print line}), the \method{next()} method is
1518 called repeatedly. This method returns the next input line, or raises
1519 \exception{StopIteration} when \EOF{} is hit. In order to make a
1520 \keyword{for} loop the most efficient way of looping over the lines of
1521 a file (a very common operation), the \method{next()} method uses a
hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining \method{next()} with other file methods (like
\method{readline()}) does not work reliably: data already held in the
buffer is not seen by the other methods. However, using
\method{seek()} to reposition the file to an absolute position will
flush the read-ahead buffer.
1527 \versionadded{2.3}
1528 \end{methoddesc}
1530 \begin{methoddesc}[file]{read}{\optional{size}}
1531 Read at most \var{size} bytes from the file (less if the read hits
1532 \EOF{} before obtaining \var{size} bytes). If the \var{size}
1533 argument is negative or omitted, read all data until \EOF{} is
1534 reached. The bytes are returned as a string object. An empty
1535 string is returned when \EOF{} is encountered immediately. (For
1536 certain files, like ttys, it makes sense to continue reading after
1537 an \EOF{} is hit.) Note that this method may call the underlying
1538 C function \cfunction{fread()} more than once in an effort to
1539 acquire as close to \var{size} bytes as possible. Also note that
1540 when in non-blocking mode, less data than what was requested may
1541 be returned, even if no \var{size} parameter was given.
1542 \end{methoddesc}
1544 \begin{methoddesc}[file]{readline}{\optional{size}}
1545 Read one entire line from the file. A trailing newline character is
1546 kept in the string (but may be absent when a file ends with an
1547 incomplete line).\footnote{
1548 The advantage of leaving the newline on is that
1549 returning an empty string is then an unambiguous \EOF{}
1550 indication. It is also possible (in cases where it might
1551 matter, for example, if you
1552 want to make an exact copy of a file while scanning its lines)
1553 to tell whether the last line of a file ended in a newline
1554 or not (yes this happens!).
1555 } If the \var{size} argument is present and
1556 non-negative, it is a maximum byte count (including the trailing
1557 newline) and an incomplete line may be returned.
1558 An empty string is returned \emph{only} when \EOF{} is encountered
1559 immediately. \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
1560 returned string contains null characters (\code{'\e 0'}) if they
1561 occurred in the input.}
1562 \end{methoddesc}
1564 \begin{methoddesc}[file]{readlines}{\optional{sizehint}}
1565 Read until \EOF{} using \method{readline()} and return a list containing
1566 the lines thus read. If the optional \var{sizehint} argument is
1567 present, instead of reading up to \EOF, whole lines totalling
1568 approximately \var{sizehint} bytes (possibly after rounding up to an
1569 internal buffer size) are read. Objects implementing a file-like
1570 interface may choose to ignore \var{sizehint} if it cannot be
1571 implemented, or cannot be implemented efficiently.
1572 \end{methoddesc}
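The following sketch shows how \method{read()}, \method{readline()}
and \method{readlines()} share one file position. The scratch file
created with \function{tempfile.mkstemp()} and all variable names are
assumptions made for the example.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()      # scratch file for the example
os.close(fd)

f = open(path, 'w')
f.write('first\nsecond\nthird')    # last line has no newline
f.close()

f = open(path)
head = f.read(3)           # at most 3 bytes: 'fir'
rest = f.readline()        # remainder of the line: 'st\n'
partial = f.readline(3)    # size limit gives an incomplete line: 'sec'
remaining = f.readlines()  # everything up to EOF: ['ond\n', 'third']
at_eof = f.readline()      # '' signals EOF
f.close()
os.remove(path)
\end{verbatim}

The last element of \code{remaining} lacks a trailing newline because
the file ends with an incomplete line.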
1574 \begin{methoddesc}[file]{xreadlines}{}
1575 This method returns the same thing as \code{iter(f)}.
1576 \versionadded{2.1}
1577 \deprecated{2.3}{Use \samp{for \var{line} in \var{file}} instead.}
1578 \end{methoddesc}
1580 \begin{methoddesc}[file]{seek}{offset\optional{, whence}}
1581 Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
1582 The \var{whence} argument is optional and defaults to \code{0}
1583 (absolute file positioning); other values are \code{1} (seek
1584 relative to the current position) and \code{2} (seek relative to the
1585 file's end). There is no return value. Note that if the file is
1586 opened for appending (mode \code{'a'} or \code{'a+'}), any
1587 \method{seek()} operations will be undone at the next write. If the
1588 file is only opened for writing in append mode (mode \code{'a'}),
1589 this method is essentially a no-op, but it remains useful for files
1590 opened in append mode with reading enabled (mode \code{'a+'}). If the
1591 file is opened in text mode (without \code{'b'}), only offsets returned
1592 by \method{tell()} are legal. Use of other offsets causes undefined
1593 behavior.
1595 Note that not all file objects are seekable.
1596 \end{methoddesc}
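As an illustration of the three \var{whence} values, the sketch below
repositions a small binary file. The scratch file and the variable
names are assumptions made for the example; binary mode is used so that
arbitrary offsets are legal.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()      # scratch file for the example
os.close(fd)

f = open(path, 'w+b')              # binary mode, reading and writing
# encode() keeps the example usable where write() expects bytes
f.write('0123456789'.encode('ascii'))

f.seek(0)            # whence defaults to 0: absolute position
first = f.read(2)    # bytes 0 and 1
f.seek(3, 1)         # 3 bytes forward from the current position
pos = f.tell()       # 5
f.seek(-2, 2)        # 2 bytes before the end of the file
last = f.read()      # bytes 8 and 9

f.close()
os.remove(path)
\end{verbatim}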
1598 \begin{methoddesc}[file]{tell}{}
1599 Return the file's current position, like \code{stdio}'s
1600 \cfunction{ftell()}.
1602 \note{On Windows, \method{tell()} can return illegal values (after an
1603 \cfunction{fgets()}) when reading files with \UNIX{}-style line-endings.
1604 Use binary mode (\code{'rb'}) to circumvent this problem.}
1605 \end{methoddesc}
1607 \begin{methoddesc}[file]{truncate}{\optional{size}}
1608 Truncate the file's size. If the optional \var{size} argument is
1609 present, the file is truncated to (at most) that size. The size
1610 defaults to the current position. The current file position is
1611 not changed. Note that if a specified size exceeds the file's
1612 current size, the result is platform-dependent: possibilities
1613 include that the file may remain unchanged, increase to the specified
1614 size as if zero-filled, or increase to the specified size with
1615 undefined new content.
1616 Availability: Windows, many \UNIX{} variants.
1617 \end{methoddesc}
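A minimal sketch of shrinking a file with \method{truncate()} follows.
The scratch file and variable names are assumptions made for the
example; only the well-defined case, where \var{size} is smaller than
the current file size, is shown.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()      # scratch file for the example
os.close(fd)

f = open(path, 'w')
f.write('0123456789')
f.truncate(4)                      # keep only the first 4 bytes
f.close()

size = os.path.getsize(path)       # 4
f = open(path)
content = f.read()                 # '0123'
f.close()
os.remove(path)
\end{verbatim}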
1619 \begin{methoddesc}[file]{write}{str}
1620 Write a string to the file. There is no return value. Due to
1621 buffering, the string may not actually show up in the file until
1622 the \method{flush()} or \method{close()} method is called.
1623 \end{methoddesc}
1625 \begin{methoddesc}[file]{writelines}{sequence}
1626 Write a sequence of strings to the file. The sequence can be any
1627 iterable object producing strings, typically a list of strings.
1628 There is no return value.
1629 (The name is intended to match \method{readlines()};
1630 \method{writelines()} does not add line separators.)
1631 \end{methoddesc}
1634 Files support the iterator protocol. Each iteration returns the same
1635 result as \code{\var{file}.readline()}, and iteration ends when the
1636 \method{readline()} method returns an empty string.
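The sketch below writes a file with \method{write()} and
\method{writelines()} and then reads it back by iterating over the file
object. The scratch file and variable names are assumptions made for
the example.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()      # scratch file for the example
os.close(fd)

f = open(path, 'w')
f.write('alpha\n')
f.writelines(['beta\n', 'gamma\n'])
f.writelines(['no', 'separator'])  # no line separators are added
f.close()

f = open(path)
lines = []
for line in f:                     # same results as repeated readline()
    lines.append(line)
f.close()
os.remove(path)
\end{verbatim}

After the loop, \code{lines} holds four strings; the last one is
\code{'noseparator'}, because \method{writelines()} wrote its two
pieces back to back.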
1639 File objects also offer a number of other interesting attributes.
1640 These are not required for file-like objects, but should be
1641 implemented if they make sense for the particular object.
1643 \begin{memberdesc}[file]{closed}
Boolean indicating whether the file object has been closed. This is a
1645 read-only attribute; the \method{close()} method changes the value.
1646 It may not be available on all file-like objects.
1647 \end{memberdesc}
1649 \begin{memberdesc}[file]{encoding}
1650 The encoding that this file uses. When Unicode strings are written
1651 to a file, they will be converted to byte strings using this encoding.
1652 In addition, when the file is connected to a terminal, the attribute
1653 gives the encoding that the terminal is likely to use (that
1654 information might be incorrect if the user has misconfigured the
1655 terminal). The attribute is read-only and may not be present on
1656 all file-like objects. It may also be \code{None}, in which case
1657 the file uses the system default encoding for converting Unicode
1658 strings.
1660 \versionadded{2.3}
1661 \end{memberdesc}
1663 \begin{memberdesc}[file]{mode}
1664 The I/O mode for the file. If the file was created using the
1665 \function{open()} built-in function, this will be the value of the
1666 \var{mode} parameter. This is a read-only attribute and may not be
1667 present on all file-like objects.
1668 \end{memberdesc}
1670 \begin{memberdesc}[file]{name}
1671 If the file object was created using \function{open()}, the name of
1672 the file. Otherwise, some string that indicates the source of the
1673 file object, of the form \samp{<\mbox{\ldots}>}. This is a read-only
1674 attribute and may not be present on all file-like objects.
1675 \end{memberdesc}
1677 \begin{memberdesc}[file]{newlines}
1678 If Python was built with the \longprogramopt{with-universal-newlines}
1679 option to \program{configure} (the default) this read-only attribute
1680 exists, and for files opened in
1681 universal newline read mode it keeps track of the types of newlines
1682 encountered while reading the file. The values it can take are
1683 \code{'\e r'}, \code{'\e n'}, \code{'\e r\e n'}, \code{None} (unknown,
1684 no newlines read yet) or a tuple containing all the newline
1685 types seen, to indicate that multiple
1686 newline conventions were encountered. For files not opened in universal
1687 newline read mode the value of this attribute will be \code{None}.
1688 \end{memberdesc}
1690 \begin{memberdesc}[file]{softspace}
1691 Boolean that indicates whether a space character needs to be printed
1692 before another value when using the \keyword{print} statement.
1693 Classes that are trying to simulate a file object should also have a
1694 writable \member{softspace} attribute, which should be initialized to
1695 zero. This will be automatic for most classes implemented in Python
1696 (care may be needed for objects that override attribute access); types
1697 implemented in C will have to provide a writable
1698 \member{softspace} attribute.
1699 \note{This attribute is not used to control the
1700 \keyword{print} statement, but to allow the implementation of
1701 \keyword{print} to keep track of its internal state.}
1702 \end{memberdesc}
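The \member{closed}, \member{mode} and \member{name} attributes
described above can be inspected as in the following sketch. The
scratch file and variable names are assumptions made for the example.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()      # scratch file for the example
os.close(fd)

f = open(path, 'w')
was_closed = f.closed      # False while the file is open
mode = f.mode              # the mode passed to open(): 'w'
name = f.name              # the name passed to open()

f.close()
f.close()                  # closing more than once is allowed
is_closed = f.closed       # True after close()

try:
    f.write('x')           # any operation needing an open file fails
    rejected = False
except ValueError:
    rejected = True

os.remove(path)
\end{verbatim}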
1705 \subsection{Other Built-in Types \label{typesother}}
1707 The interpreter supports several other kinds of objects.
1708 Most of these support only one or two operations.
1711 \subsubsection{Modules \label{typesmodules}}
1713 The only special operation on a module is attribute access:
1714 \code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
1715 accesses a name defined in \var{m}'s symbol table. Module attributes
can be assigned to. (Note that the \keyword{import} statement is not,
strictly speaking, an operation on a module object: \code{import
\var{foo}} does not require a module object named \var{foo} to exist;
rather, it requires an (external) \emph{definition} for a module named
\var{foo} somewhere.)
1722 A special member of every module is \member{__dict__}.
1723 This is the dictionary containing the module's symbol table.
1724 Modifying this dictionary will actually change the module's symbol
1725 table, but direct assignment to the \member{__dict__} attribute is not
1726 possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
1727 defines \code{\var{m}.a} to be \code{1}, but you can't write
1728 \code{\var{m}.__dict__ = \{\}}). Modifying \member{__dict__} directly
1729 is not recommended.
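The relationship between a module and its \member{__dict__} can be
sketched as follows. The module is created on the fly with
\code{types.ModuleType}, and the module name \code{'demo'} is an
assumption made for the example.

\begin{verbatim}
import types

m = types.ModuleType('demo')   # an empty module object

m.__dict__['a'] = 1            # changes the module's symbol table
value_a = m.a                  # 1

m.b = 2                        # attribute assignment is the usual way
value_b = m.__dict__['b']      # 2

try:
    m.__dict__ = {}            # direct assignment is refused
    replaced = True
except (TypeError, AttributeError):
    replaced = False
\end{verbatim}

Both exception types are caught because the exception raised for the
refused assignment may differ between Python versions.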
1731 Modules built into the interpreter are written like this:
1732 \code{<module 'sys' (built-in)>}. If loaded from a file, they are
1733 written as \code{<module 'os' from
1734 '/usr/local/lib/python\shortversion/os.pyc'>}.
1737 \subsubsection{Classes and Class Instances \label{typesobjects}}
1738 \nodename{Classes and Instances}
1740 See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
1741 Reference Manual} for these.
1744 \subsubsection{Functions \label{typesfunctions}}
1746 Function objects are created by function definitions. The only
1747 operation on a function object is to call it:
1748 \code{\var{func}(\var{argument-list})}.
1750 There are really two flavors of function objects: built-in functions
1751 and user-defined functions. Both support the same operation (to call
1752 the function), but the implementation is different, hence the
1753 different object types.
1755 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1756 information.
1758 \subsubsection{Methods \label{typesmethods}}
1759 \obindex{method}
1761 Methods are functions that are called using the attribute notation.
1762 There are two flavors: built-in methods (such as \method{append()} on
1763 lists) and class instance methods. Built-in methods are described
1764 with the types that support them.
1766 The implementation adds two special read-only attributes to class
1767 instance methods: \code{\var{m}.im_self} is the object on which the
1768 method operates, and \code{\var{m}.im_func} is the function
1769 implementing the method. Calling \code{\var{m}(\var{arg-1},
1770 \var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
1771 calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
1772 \var{arg-2}, \textrm{\ldots}, \var{arg-n})}.
1774 Class instance methods are either \emph{bound} or \emph{unbound},
1775 referring to whether the method was accessed through an instance or a
1776 class, respectively. When a method is unbound, its \code{im_self}
1777 attribute will be \code{None} and if called, an explicit \code{self}
1778 object must be passed as the first argument. In this case,
1779 \code{self} must be an instance of the unbound method's class (or a
1780 subclass of that class), otherwise a \code{TypeError} is raised.
Like function objects, method objects support getting
1783 arbitrary attributes. However, since method attributes are actually
1784 stored on the underlying function object (\code{meth.im_func}),
1785 setting method attributes on either bound or unbound methods is
1786 disallowed. Attempting to set a method attribute results in a
1787 \code{TypeError} being raised. In order to set a method attribute,
1788 you need to explicitly set it on the underlying function object:
\begin{verbatim}
class C:
    def method(self):
        pass

c = C()
c.method.im_func.whoami = 'my name is c'
\end{verbatim}
1799 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1800 information.
1803 \subsubsection{Code Objects \label{bltin-code-objects}}
1804 \obindex{code}
1806 Code objects are used by the implementation to represent
1807 ``pseudo-compiled'' executable Python code such as a function body.
1808 They differ from function objects because they don't contain a
1809 reference to their global execution environment. Code objects are
1810 returned by the built-in \function{compile()} function and can be
1811 extracted from function objects through their \member{func_code}
1812 attribute.
1813 \bifuncindex{compile}
1814 \withsubitem{(function object attribute)}{\ttindex{func_code}}
1816 A code object can be executed or evaluated by passing it (instead of a
1817 source string) to the \keyword{exec} statement or the built-in
1818 \function{eval()} function.
1819 \stindex{exec}
1820 \bifuncindex{eval}
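A minimal sketch of creating and running code objects follows. The
source strings, the file name \code{'<example>'} and the variable names
are assumptions made for the example. Only \function{eval()} is used
here; it also accepts code objects compiled in \code{'exec'} mode, in
which case it returns \code{None}.

\begin{verbatim}
# An expression compiled once and evaluated with explicit globals.
expr = compile('x * 2 + 1', '<example>', 'eval')
value = eval(expr, {'x': 20})          # 41

# Statements compiled in 'exec' mode; running them binds y.
stmts = compile('y = x + 1', '<example>', 'exec')
namespace = {'x': 1}
result = eval(stmts, namespace)        # None
y = namespace['y']                     # 2
\end{verbatim}

Because a code object carries no reference to a global environment, the
same object can be evaluated again with a different dictionary.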
1822 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1823 information.
1826 \subsubsection{Type Objects \label{bltin-type-objects}}
1828 Type objects represent the various object types. An object's type is
1829 accessed by the built-in function \function{type()}. There are no special
1830 operations on types. The standard module \refmodule{types} defines names
1831 for all standard built-in types.
1832 \bifuncindex{type}
1833 \refstmodindex{types}
1835 Types are written like this: \code{<type 'int'>}.
1838 \subsubsection{The Null Object \label{bltin-null-object}}
1840 This object is returned by functions that don't explicitly return a
1841 value. It supports no special operations. There is exactly one null
1842 object, named \code{None} (a built-in name).
1844 It is written as \code{None}.
1847 \subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}
1849 This object is used by extended slice notation (see the
1850 \citetitle[../ref/ref.html]{Python Reference Manual}). It supports no
1851 special operations. There is exactly one ellipsis object, named
1852 \constant{Ellipsis} (a built-in name).
1854 It is written as \code{Ellipsis}.
1856 \subsubsection{Boolean Values}
1858 Boolean values are the two constant objects \code{False} and
1859 \code{True}. They are used to represent truth values (although other
1860 values can also be considered false or true). In numeric contexts
1861 (for example when used as the argument to an arithmetic operator),
1862 they behave like the integers 0 and 1, respectively. The built-in
function \function{bool()} can be used to cast any value to a Boolean,
if the value can be interpreted as a truth value (see
section~\ref{truth}, ``Truth Value Testing'').
1867 They are written as \code{False} and \code{True}, respectively.
1868 \index{False}
1869 \index{True}
1870 \indexii{Boolean}{values}
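For illustration, the numeric behaviour of Boolean values and the
conversion performed by \function{bool()} can be sketched as follows;
the sample values and variable names are assumptions made for the
example.

\begin{verbatim}
two = True + True                            # behaves like 1 + 1
hits = sum([n > 2 for n in [1, 2, 3, 4]])    # counts the true results

empty = bool([])        # False: an empty sequence is false
nonempty = bool([0])    # True: the list has one element
zero = bool(0)          # False
text = bool('0')        # True: a non-empty string
\end{verbatim}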
1873 \subsubsection{Internal Objects \label{typesinternal}}
1875 See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
1876 information. It describes stack frame objects, traceback objects, and
1877 slice objects.
1880 \subsection{Special Attributes \label{specialattrs}}
1882 The implementation adds a few special read-only attributes to several
1883 object types, where they are relevant. Some of these are not reported
1884 by the \function{dir()} built-in function.
1886 \begin{memberdesc}[object]{__dict__}
1887 A dictionary or other mapping object used to store an
1888 object's (writable) attributes.
1889 \end{memberdesc}
1891 \begin{memberdesc}[object]{__methods__}
1892 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1893 list of an object's attributes. This attribute is no longer available.}
1894 \end{memberdesc}
1896 \begin{memberdesc}[object]{__members__}
1897 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1898 list of an object's attributes. This attribute is no longer available.}
1899 \end{memberdesc}
1901 \begin{memberdesc}[instance]{__class__}
1902 The class to which a class instance belongs.
1903 \end{memberdesc}
1905 \begin{memberdesc}[class]{__bases__}
1906 The tuple of base classes of a class object. If there are no base
1907 classes, this will be an empty tuple.
1908 \end{memberdesc}
1910 \begin{memberdesc}[class]{__name__}
1911 The name of the class or type.
1912 \end{memberdesc}
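The special attributes above can be inspected as in the following
sketch. The classes \class{Base} and \class{Derived} and the attribute
\member{color} are hypothetical names introduced for the example.

\begin{verbatim}
class Base(object):
    pass

class Derived(Base):
    pass

obj = Derived()
obj.color = 'red'

cls = obj.__class__          # Derived
bases = Derived.__bases__    # (Base,)
name = Derived.__name__      # 'Derived'
attrs = obj.__dict__         # {'color': 'red'}
\end{verbatim}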