\documentclass{howto}

\title{Idioms and Anti-Idioms in Python}

\release{0.00}

\author{Moshe Zadka}
\authoraddress{howto@zadka.site.co.il}

\begin{document}
\maketitle
This document is placed in the public domain.

\begin{abstract}
\noindent
This document can be considered a companion to the tutorial. It
shows how to use Python, and even more importantly, how {\em not}
to use Python.
\end{abstract}

\tableofcontents
\section{Language Constructs You Should Not Use}

While Python has relatively few gotchas compared to other languages, it
still has some constructs which are only useful in corner cases, or are
plain dangerous.
\subsection{from module import *}

\subsubsection{Inside Function Definitions}

\code{from module import *} is {\em invalid} inside function definitions.
While many versions of Python do not check for the invalidity, that does not
make it more valid, any more than having a smart lawyer makes a man innocent.
Do not use it like that, ever. Even in versions where it was accepted, it made
the function execution slower, because the compiler could not be certain
which names are local and which are global. In Python 2.1 this construct
causes warnings, and sometimes even errors.
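
As a small illustration, Python 3 interpreters go one step further and
reject the construct at compile time. The sketch below assumes Python 3
(so \code{print} is a function, unlike in the other examples in this
document); the helper \function{compiles} and the two sample source
strings are made up for this example and are not part of any library.

\begin{verbatim}
def compiles(source):
    """Return True if the source text compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# Star-import at module level: accepted by the compiler.
at_module_level = "from os import *\n"

# Star-import inside a function body: rejected by Python 3.
inside_function = "def f():\n    from os import *\n"

print(compiles(at_module_level))
print(compiles(inside_function))
\end{verbatim}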
\subsubsection{At Module Level}

While it is valid to use \code{from module import *} at module level, it
is usually a bad idea. For one, this loses an important property Python
otherwise has --- you can know where each top-level name is defined by
a simple ``search'' function in your favourite editor. You also open yourself
to trouble in the future, if some module grows additional functions or
classes.

One of the most awful questions asked on the newsgroup is why this code:
\begin{verbatim}
f = open("www")
f.read()
\end{verbatim}
does not work. Of course, it works just fine (assuming you have a file
called ``www''). But it does not work if somewhere in the module, the
statement \code{from os import *} is present. The \module{os} module
has a function called \function{open()} which returns an integer. While
it is very useful, shadowing builtins is one of its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- \code{from module import name1, name2}, or keep them in
the module and access on a per-need basis ---
\code{import module;print module.name}.
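
The shadowing is easy to reproduce. The following sketch assumes a
Python 3 interpreter, where \code{print} and \code{exec} are functions.
It runs the star-import inside a scratch dictionary, called
\code{namespace} here purely for illustration, so that the demonstration
does not clobber any names in the module that runs it.

\begin{verbatim}
import builtins
import os

# Perform the star-import in a throwaway namespace.
namespace = {}
exec("from os import *", namespace)

# The name "open" now refers to the low-level os.open,
# not to the builtin that returns a file object.
shadowed = namespace["open"]
print(shadowed is os.open)
print(shadowed is builtins.open)
\end{verbatim}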
\subsubsection{When It Is Just Fine}

There are situations in which \code{from module import *} is just fine:

\begin{itemize}

\item The interactive prompt. For example, \code{from math import *} makes
Python an amazing scientific calculator.

\item When extending a module in C with a module in Python.

\item When the module advertises itself as \code{from import *} safe.

\end{itemize}
\subsection{Unadorned \keyword{exec}, \function{execfile} and friends}

The word ``unadorned'' refers to the use without an explicit dictionary,
in which case those constructs evaluate code in the {\em current} environment.
This is dangerous for the same reasons \code{from import *} is dangerous ---
it might step over variables you are counting on and mess up things for
the rest of your code. Simply do not do that.
Bad examples:

\begin{verbatim}
>>> for name in sys.argv[1:]:
>>>     exec "%s=1" % name
>>> def func(s, **kw):
>>>     for var, val in kw.items():
>>>         exec "s.%s=val" % var  # invalid!
>>> execfile("handler.py")
>>> handle()
\end{verbatim}
Good examples:

\begin{verbatim}
>>> d = {}
>>> for name in sys.argv[1:]:
>>>     d[name] = 1
>>> def func(s, **kw):
>>>     for var, val in kw.items():
>>>         setattr(s, var, val)
>>> d = {}
>>> execfile("handler.py", d, d)
>>> handle = d['handle']
>>> handle()
\end{verbatim}
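
For readers on Python 3, where \keyword{exec} is a function and
\function{execfile} no longer exists, the same advice can be sketched as
follows. The names \function{collect_flags}, \code{Settings},
\function{configure} and the inline \code{source} string are hypothetical
stand-ins chosen for this example; in real code the source text would
typically be read from a file.

\begin{verbatim}
def collect_flags(names):
    """Record each name in a dictionary instead of creating variables."""
    flags = {}
    for name in names:
        flags[name] = 1
    return flags

class Settings:
    """Empty container whose attributes are set by configure()."""

def configure(obj, **kw):
    """Set attributes with setattr() rather than building code strings."""
    for var, val in kw.items():
        setattr(obj, var, val)
    return obj

# Run foreign code in an explicit dictionary, then pull out what is needed.
source = "def handle():\n    return 'handled'\n"
namespace = {}
exec(source, namespace)
handle = namespace["handle"]

print(collect_flags(["verbose", "debug"]))
print(configure(Settings(), colour="red").colour)
print(handle())
\end{verbatim}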
\subsection{from module import name1, name2}

This is a ``don't'' which is much weaker than the previous ``don't''s
but is still something you should not do unless you have a good reason.
It is usually a bad idea because you suddenly
have an object which lives in two separate namespaces. When the binding
in one namespace changes, the binding in the other will not, so there
will be a discrepancy between them. This happens when, for example,
one module is reloaded, or changes the definition of a function at runtime.
Bad example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a
\end{verbatim}
Good example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
\end{verbatim}
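
The discrepancy can be observed in a single file. This sketch assumes
Python 3 and uses \code{types.ModuleType} to build a stand-in for the
\module{foo} module, so no separate \code{foo.py} is needed; the plain
assignment \code{a = foo.a} mimics what \code{from foo import a} does to
the importing namespace.

\begin{verbatim}
import types

# Stand-in for foo.py, which contains "a = 1".
foo = types.ModuleType("foo")
foo.a = 1

# Equivalent of "from foo import a": copy the current binding.
a = foo.a

# Later, the module rebinds its own name...
foo.a = 2

# ...and the copied name is left behind with the old value.
print(a, foo.a)
\end{verbatim}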
\subsection{except:}

Python has the \code{except:} clause, which catches all exceptions.
Since {\em every} error in Python raises an exception, this makes many
programming errors look like runtime problems, and hinders
the debugging process.
The following code shows a great example:

\begin{verbatim}
try:
    foo = opne("file") # misspelled "open"
except:
    sys.exit("could not open file!")
\end{verbatim}
The second line triggers a \exception{NameError} which is caught by the
except clause. The program will exit, and you will have no idea that
this has nothing to do with the readability of \code{"file"}.

The example above is better written as:
\begin{verbatim}
try:
    foo = opne("file") # will be changed to "open" as soon as we run it
except IOError:
    sys.exit("could not open file")
\end{verbatim}
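
To make the difference concrete, here is a sketch that runs both styles
side by side. It assumes Python 3, where \exception{IOError} is an alias
of \exception{OSError}. The two function names are invented for this
example, and they return a message instead of calling
\function{sys.exit} so that the outcome can be inspected.

\begin{verbatim}
def read_with_bare_except(path):
    try:
        return opne(path).read()  # deliberate typo for "open"
    except:
        # Swallows the NameError caused by the typo.
        return "could not open file!"

def read_with_specific_except(path):
    try:
        return opne(path).read()  # same deliberate typo
    except OSError:
        return "could not open file"

# The bare clause hides the bug behind a misleading message.
print(read_with_bare_except("file"))

# The specific clause lets the NameError through, exposing the bug.
try:
    read_with_specific_except("file")
except NameError as error:
    print("bug exposed:", error)
\end{verbatim}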
There are some situations in which the \code{except:} clause is useful:
for example, in a framework when running callbacks, it is good not to
let any callback disturb the framework.
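
A minimal sketch of such a framework loop, assuming Python 3, might look
like the following. The function \function{run_callbacks} is hypothetical.
Note that it catches \exception{Exception} rather than using a completely
bare clause, so that \exception{KeyboardInterrupt} and
\exception{SystemExit} still propagate; that is a design choice of this
sketch, and it records each traceback so that failures are not silently
lost.

\begin{verbatim}
import traceback

def run_callbacks(callbacks):
    """Call every callback; collect tracebacks instead of stopping."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception:
            # Keep the full traceback so the bug can still be diagnosed.
            failures.append(traceback.format_exc())
    return failures

calls = []

def good():
    calls.append("good")

def bad():
    raise ValueError("boom")

# The failing callback does not prevent the later one from running.
failures = run_callbacks([bad, good])
print(calls)
print(len(failures))
\end{verbatim}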
\section{Exceptions}

Exceptions are a useful feature of Python. You should learn to raise
them whenever something unexpected occurs, and catch them only where
you can do something about them.

The following is a very popular anti-idiom:
\begin{verbatim}
def get_status(file):
    if not os.path.exists(file):
        print "file not found"
        sys.exit(1)
    return open(file).readline()
\end{verbatim}
Consider the case where the file gets deleted between the time the call to
\function{os.path.exists} is made and the time \function{open} is called.
Then the last line will raise an \exception{IOError}. The same would
happen if \var{file} exists but has no read permission. Since testing this
on a normal machine with existing and non-existing files makes it seem bugless,
the test results will look fine, and the code will get
shipped. Then an unhandled \exception{IOError} escapes to the user, who
has to watch the ugly traceback.

Here is a better way to do it.
\begin{verbatim}
def get_status(file):
    try:
        return open(file).readline()
    except (IOError, OSError):
        print "file not found"
        sys.exit(1)
\end{verbatim}
In this version, {\em either} the file gets opened and the line is read
(so it works even on flaky NFS or SMB connections), or the message
is printed and the application aborted.

Still, \function{get_status} makes too many assumptions --- that it
will only be used in a short running script, and not, say, in a long
running server. Sure, the caller could do something like
\begin{verbatim}
try:
    status = get_status(log)
except SystemExit:
    status = None
\end{verbatim}
So, try to have as few \code{except} clauses in your code as possible --- those will
usually be a catch-all in the \function{main}, or inside calls which
should always succeed.

So, the best version is probably
\begin{verbatim}
def get_status(file):
    return open(file).readline()
\end{verbatim}
The caller can deal with the exception if it wants (for example, if it
tries several files in a loop), or just let the exception filter upwards
to {\em its} caller.

The last version is not very good either --- due to implementation details,
the file would not be closed when an exception is raised until the handler
finishes, and perhaps not at all in non-C implementations (e.g., Jython).
Closing the file explicitly avoids the problem:
\begin{verbatim}
def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()
\end{verbatim}
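
In Python versions that provide the \keyword{with} statement, the same
guarantee can be written more compactly. The sketch below assumes
Python 3. The function \function{first_status} is a hypothetical caller,
added only to illustrate the ``tries several files in a loop'' case
mentioned above; it catches \exception{OSError}, of which
\exception{IOError} is an alias in Python 3.

\begin{verbatim}
def get_status(path):
    # The file is closed when the block exits, even on an exception.
    with open(path) as fp:
        return fp.readline()

def first_status(paths):
    """Return the status line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:
            # This file is missing or unreadable; try the next one.
            continue
    return None
\end{verbatim}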
\section{Using the Batteries}

Every so often, people seem to rewrite things that are already in the
Python library, usually poorly. While the occasional module has a poor interface,
it is usually much better to use the rich standard library and data
types that come with Python than to invent your own.

A useful module very few people know about is \module{os.path}. It
always has the correct path arithmetic for your operating system, and
will usually be much better than whatever you come up with yourself.

Compare:
\begin{verbatim}
# ugh!
return dir+"/"+file
# better
return os.path.join(dir, file)
\end{verbatim}

More useful functions in \module{os.path}: \function{basename},
\function{dirname} and \function{splitext}.
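
A short sketch of these functions in use, assuming Python 3; the path
components are made up for the example. The exact separator in the
results depends on the operating system, which is precisely why the
module is worth using.

\begin{verbatim}
import os.path

# Build a path with the separator of the current platform.
path = os.path.join("logs", "2001", "server.log")

print(os.path.basename(path))  # final component: server.log
print(os.path.dirname(path))   # everything before the final component
print(os.path.splitext(path))  # (path without extension, ".log")
\end{verbatim}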
There are also many useful builtin functions people seem not to be
aware of for some reason: \function{min()} and \function{max()} can
find the minimum/maximum of any sequence with comparable semantics,
for example, yet many people write their own max/min. Another highly
useful function is \function{reduce()}. A classic use of \function{reduce()}
is something like
\begin{verbatim}
import sys, operator
nums = map(float, sys.argv[1:])
print reduce(operator.add, nums)/len(nums)
\end{verbatim}

This cute little script prints the average of all numbers given on the
command line. The \function{reduce()} adds up all the numbers, and
the rest is just some pre- and postprocessing.
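
In Python 3, \function{reduce()} lives in the \module{functools} module
and \function{map()} returns an iterator, so the same idea needs small
changes. The following is a sketch under those assumptions; the function
name \function{average} is chosen for this example, and it takes a list
of strings rather than reading \code{sys.argv} so that it can be tried
from the interactive prompt.

\begin{verbatim}
from functools import reduce
import operator

def average(strings):
    """Return the mean of the numbers given as strings."""
    # A list is needed because len() does not work on a map iterator.
    nums = [float(s) for s in strings]
    return reduce(operator.add, nums) / len(nums)

print(average(["1", "2", "3"]))
print(min([3, 1, 2]), max([3, 1, 2]))
\end{verbatim}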
On the same note, \function{float()}, \function{int()} and
\function{long()} all accept arguments of type string, and so are
suited to parsing --- assuming you are ready to deal with the
\exception{ValueError} they raise.
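
Dealing with the \exception{ValueError} can be as simple as the
following sketch, which assumes Python 3 (where \function{long()} has
been merged into \function{int()}). The helper \function{parse_number}
is hypothetical; returning \code{None} for unparsable input is just one
possible policy.

\begin{verbatim}
def parse_number(text):
    """Return an int or float parsed from text, or None if neither fits."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            # Not valid for this type; try the next converter.
            continue
    return None

print(parse_number("42"))
print(parse_number("2.5"))
print(parse_number("spam"))
\end{verbatim}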
\section{Using Backslash to Continue Statements}

Since Python treats a newline as a statement terminator,
and since statements are often longer than is comfortable to put
on one line, many people do:
\begin{verbatim}
if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
   calculate_number(10, 20) != forbulate(500, 360):
    pass
\end{verbatim}
You should realize that this is dangerous: a stray space after the
\code{\\} makes the line invalid, and stray spaces are notoriously
hard to see in editors. Python reports the stray space as a syntax
error, here and in an expression such as:

\begin{verbatim}
value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
        + calculate_number(10, 20)*forbulate(500, 360)
\end{verbatim}

but the error points at a character you cannot see, which makes the
mistake hard to track down.
It is usually much better to use the implicit continuation inside
parentheses. This version is bulletproof:
\begin{verbatim}
value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
         + calculate_number(10, 20)*forbulate(500, 360))
\end{verbatim}
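
The fragility of the backslash can be checked without an editor. This
sketch assumes Python 3 and feeds three small source strings to the
builtin \function{compile()}; the helper \function{compiles} and the
variable names are invented for the example. A backslash followed by a
space is rejected, while the same stray space inside parentheses is
harmless.

\begin{verbatim}
def compiles(source):
    """Return True if the source text compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# Backslash immediately followed by the newline: fine.
clean = "total = 1 + 2 \\\n    + 3\n"

# Backslash followed by a stray space: syntax error.
stray = "total = 1 + 2 \\ \n    + 3\n"

# Parentheses: a trailing space at the end of the line does no harm.
parens = "total = (1 + 2 \n         + 3)\n"

print(compiles(clean), compiles(stray), compiles(parens))
\end{verbatim}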
\end{document}