2 :mod:`rfc822` --- Parse RFC 2822 mail headers
3 =============================================
6 :synopsis: Parse 2822 style mail messages.
11 The :mod:`email` package should be used in preference to the :mod:`rfc822`
12 module. This module is present only to maintain backward compatibility, and
13 has been removed in 3.0.
15 This module defines a class, :class:`Message`, which represents an "email
16 message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
17 consist of a collection of message headers, and a message body. This module
18 also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
addresses. Please refer to the RFC for information on the specific syntax of
:rfc:`2822` addresses.
22 .. index:: module: mailbox
24 The :mod:`mailbox` module provides classes to read mailboxes produced by
25 various end-user mail programs.
28 .. class:: Message(file[, seekable])
30 A :class:`Message` instance is instantiated with an input object as parameter.
31 Message relies only on the input object having a :meth:`readline` method; in
32 particular, ordinary file objects qualify. Instantiation reads headers from the
33 input object up to a delimiter line (normally a blank line) and stores them in
34 the instance. The message body, following the headers, is not consumed.
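The header-reading behaviour described above can be sketched in a few lines. This is only an illustration, not the module's implementation: ``read_header_lines`` is a hypothetical helper, and it ignores continuation lines, comments and illegal lines.

```python
import io


def read_header_lines(fp):
    """Collect raw header lines, consuming the blank delimiter line.

    Hypothetical helper: it relies only on fp.readline(), as Message does.
    """
    lines = []
    while True:
        line = fp.readline()
        # Stop at end of input or at the blank delimiter line.
        if not line or line in ("\n", "\r\n"):
            break
        lines.append(line)
    return lines


stream = io.StringIO("From: jack@cwi.nl\nSubject: hello\n\nBody text\n")
header_lines = read_header_lines(stream)
body = stream.read()  # the body has not been consumed by the header reader
```

After the call, the stream is positioned at the start of the body, so the caller can read the body from the same object.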
36 This class can work with any input object that supports a :meth:`readline`
37 method. If the input object has seek and tell capability, the
38 :meth:`rewindbody` method will work; also, illegal lines will be pushed back
39 onto the input stream. If the input object lacks seek but has an :meth:`unread`
40 method that can push back a line of input, :class:`Message` will use that to
41 push back illegal lines. Thus this class can be used to parse messages coming
42 from a buffered stream.
44 The optional *seekable* argument is provided as a workaround for certain stdio
45 libraries in which :cfunc:`tell` discards buffered data before discovering that
46 the :cfunc:`lseek` system call doesn't work. For maximum portability, you
47 should set the seekable argument to zero to prevent that initial :meth:`tell`
when passing in an unseekable object such as a file object created from a socket
object.
Input lines as read from the file may be terminated either by CR-LF or by a
single linefeed; a terminating CR-LF is replaced by a single linefeed before the
line is stored.
All header matching is case-insensitive; for example,
``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
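Because :mod:`rfc822` is not available in Python 3, the following sketch shows the same case-insensitive lookup with the :mod:`email` package that supersedes it. The sample message text is made up for illustration.

```python
import email

# A made-up message: two headers, a blank line, then the body.
m = email.message_from_string(
    "From: jack@cwi.nl (Jack Jansen)\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)

# Header names are matched without regard to case.
same = m["From"] == m["from"] == m["FROM"]
```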
59 .. class:: AddressList(field)
61 You may instantiate the :class:`AddressList` helper class using a single string
62 parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
63 parameter ``None`` yields an empty list.)
66 .. function:: quote(str)
68 Return a new string with backslashes in *str* replaced by two backslashes and
69 double quotes replaced by backslash-double quote.
72 .. function:: unquote(str)
74 Return a new string which is an *unquoted* version of *str*. If *str* ends and
75 begins with double quotes, they are stripped off. Likewise if *str* ends and
76 begins with angle brackets, they are stripped off.
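As a hedged illustration of the two functions above, this sketch uses ``email.utils.quote`` and ``email.utils.unquote``, the same-named helpers in the :mod:`email` package, since :mod:`rfc822` itself cannot be imported in Python 3. The input strings are arbitrary examples.

```python
from email.utils import quote, unquote

# Backslashes are doubled and double quotes are backslash-escaped.
quoted = quote('say "hi" \\ bye')

# Surrounding double quotes or angle brackets are stripped.
name = unquote('"Jack Jansen"')
addr = unquote("<jack@cwi.nl>")
```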
79 .. function:: parseaddr(address)
81 Parse *address*, which should be the value of some address-containing field such
82 as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
83 "email address" parts. Returns a tuple of that information, unless the parse
84 fails, in which case a 2-tuple ``(None, None)`` is returned.
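A short sketch of address parsing, using ``email.utils.parseaddr`` as a stand-in because :mod:`rfc822` is unavailable in Python 3. Note one difference: the :mod:`email` version reports a failed parse as ``('', '')`` rather than ``(None, None)``.

```python
from email.utils import parseaddr

# Both common spellings of the same mailbox give the same pair.
pair_a = parseaddr("jack@cwi.nl (Jack Jansen)")
pair_b = parseaddr("Jack Jansen <jack@cwi.nl>")
```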
87 .. function:: dump_address_pair(pair)
The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
90 email_address)`` and returns the string value suitable for a :mailheader:`To` or
91 :mailheader:`Cc` header. If the first element of *pair* is false, then the
92 second element is returned unmodified.
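The behaviour described above can be sketched as a standalone function. This is an approximation rather than the module's code: in particular, always wrapping the real name in double quotes is an assumption about the output style.

```python
def dump_address_pair(pair):
    """Sketch: format a (realname, email_address) pair for a To or Cc header."""
    realname, email_address = pair
    if realname:
        # Assumed quoting style: the real name is always double-quoted.
        return '"' + realname + '" <' + email_address + ">"
    # A false first element means the address is returned unmodified.
    return email_address


with_name = dump_address_pair(("Jack Jansen", "jack@cwi.nl"))
without_name = dump_address_pair(("", "jack@cwi.nl"))
```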
95 .. function:: parsedate(date)
Attempts to parse a date according to the rules in :rfc:`2822`. However, some
98 mailers don't follow that format as specified, so :func:`parsedate` tries to
99 guess correctly in such cases. *date* is a string containing an :rfc:`2822`
100 date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
101 the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
102 :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
103 7, and 8 of the result tuple are not usable.
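For illustration, the sketch below parses the example date with ``email.utils.parsedate``, the counterpart in the :mod:`email` package, since :mod:`rfc822` is not importable in Python 3. Only the first six fields are inspected, because indexes 6, 7, and 8 are not usable.

```python
from email.utils import parsedate

parsed = parsedate("Mon, 20 Nov 1995 19:12:08 -0500")
# Year, month, day, hour, minute, second.
date_fields = parsed[:6]

# An unparsable string yields None instead of raising an exception.
failed = parsedate("not a date")
```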
106 .. function:: parsedate_tz(date)
108 Performs the same function as :func:`parsedate`, but returns either ``None`` or
109 a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
110 :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
111 (which is the official term for Greenwich Mean Time). (Note that the sign of
112 the timezone offset is the opposite of the sign of the ``time.timezone``
113 variable for the same timezone; the latter variable follows the POSIX standard
114 while this module follows :rfc:`2822`.) If the input string has no timezone,
115 the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
116 8 of the result tuple are not usable.
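A small sketch of the 10-tuple form, again using the :mod:`email` counterpart ``email.utils.parsedate_tz`` as a stand-in. The offset is expressed in seconds, and its sign follows the date string: ``-0500`` gives a negative number.

```python
from email.utils import parsedate_tz

parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")
# The tenth element is the offset from UTC in seconds: -5 hours here.
offset = parsed[9]
```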
119 .. function:: mktime_tz(tuple)
121 Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
the timezone item in the tuple is ``None``, assume local time. Minor
deficiency: this first interprets the first 8 elements as a local time and then
compensates for the timezone difference; this may yield a slight error around
daylight saving time switch dates, which is small enough to ignore for common use.
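The conversion to a UTC timestamp can be illustrated with ``email.utils.mktime_tz``, used here as a stand-in because :mod:`rfc822` is unavailable in Python 3. The comparison value is built independently with :func:`calendar.timegm`.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz

stamp = mktime_tz(parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500"))

# 19:12:08 at UTC-5 is 00:12:08 on the following day in UTC.
expected = calendar.timegm((1995, 11, 21, 0, 12, 8, 0, 0, 0))
```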
.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`rfc822` module.

   Module :mod:`mailbox`
      Classes to read various mailbox formats produced by end-user mail programs.

   Module :mod:`mimetools`
      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.
Message Objects
---------------

A :class:`Message` instance has the following methods:
148 .. method:: Message.rewindbody()
Seek to the start of the message body. This only works if the file object is
seekable.
154 .. method:: Message.isheader(line)
156 Returns a line's canonicalized fieldname (the dictionary key that will be used
157 to index it) if the line is a legal :rfc:`2822` header; otherwise returns
158 ``None`` (implying that parsing should stop here and the line be pushed back on
the input stream). It is sometimes useful to override this method in a
subclass.
163 .. method:: Message.islast(line)
165 Return true if the given line is a delimiter on which Message should stop. The
166 delimiter line is consumed, and the file object's read location positioned
167 immediately after it. By default this method just checks that the line is
168 blank, but you can override it in a subclass.
171 .. method:: Message.iscomment(line)
173 Return ``True`` if the given line should be ignored entirely, just skipped. By
default this is a stub that always returns ``False``, but you can override it in
a subclass.
178 .. method:: Message.getallmatchingheaders(name)
180 Return a list of lines consisting of all headers matching *name*, if any. Each
181 physical line, whether it is a continuation line or not, is a separate list
182 item. Return the empty list if no header matches *name*.
185 .. method:: Message.getfirstmatchingheader(name)
187 Return a list of lines comprising the first header matching *name*, and its
continuation line(s), if any. Return ``None`` if there is no header matching
*name*.
192 .. method:: Message.getrawheader(name)
194 Return a single string consisting of the text after the colon in the first
195 header matching *name*. This includes leading whitespace, the trailing
linefeed, and internal linefeeds and whitespace if any continuation
197 line(s) were present. Return ``None`` if there is no header matching *name*.
200 .. method:: Message.getheader(name[, default])
202 Return a single string consisting of the last header matching *name*,
203 but strip leading and trailing whitespace.
204 Internal whitespace is not stripped. The optional *default* argument can be
205 used to specify a different default to be returned when there is no header
206 matching *name*; it defaults to ``None``.
207 This is the preferred way to get parsed headers.
210 .. method:: Message.get(name[, default])
212 An alias for :meth:`getheader`, to make the interface more compatible with
213 regular dictionaries.
216 .. method:: Message.getaddr(name)
218 Return a pair ``(full name, email address)`` parsed from the string returned by
219 ``getheader(name)``. If no header matching *name* exists, return ``(None,
None)``; otherwise both the full name and the address are (possibly empty)
strings.
223 Example: If *m*'s first :mailheader:`From` header contains the string
224 ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
225 ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
226 <jack@cwi.nl>'`` instead, it would yield the exact same result.
229 .. method:: Message.getaddrlist(name)
This is similar to :meth:`getaddr`, but parses a header containing a list of
232 email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
233 name, email address)`` pairs (even if there was only one address in the header).
234 If there is no header matching *name*, return an empty list.
236 If multiple headers exist that match the named header (e.g. if there are several
237 :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
238 the named headers contain are also parsed.
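The sketch below shows the same idea with the :mod:`email` package, since :mod:`rfc822` cannot be imported in Python 3: ``Message.get_all`` collects every matching header and ``email.utils.getaddresses`` parses them into pairs. The message text and the ``example.org`` addresses are made up.

```python
import email
from email.utils import getaddresses

m = email.message_from_string(
    "To: Jack Jansen <jack@cwi.nl>\n"
    "Cc: guido@example.org (Guido)\n"
    "Cc: barry@example.org\n"
    "\n"
    "body\n"
)

# All Cc headers are parsed; a missing header gives an empty list.
cc_pairs = getaddresses(m.get_all("cc", []))
bcc_pairs = getaddresses(m.get_all("bcc", []))
```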
241 .. method:: Message.getdate(name)
243 Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
244 with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
245 there is no header matching *name*, or it is unparsable, return ``None``.
247 Date parsing appears to be a black art, and not all mailers adhere to the
248 standard. While it has been tested and found correct on a large collection of
249 email from many sources, it is still possible that this function may
250 occasionally yield an incorrect result.
253 .. method:: Message.getdate_tz(name)
255 Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
256 first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
257 10th is a number giving the offset of the date's timezone from UTC. Note that
258 fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
259 no header matching *name*, or it is unparsable, return ``None``.
261 :class:`Message` instances also support a limited mapping interface. In
262 particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
263 if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
265 ``m.setdefault(name[, default])`` act as expected, with the one difference
266 that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the writable mapping interface ``m[name]
268 = value`` and ``del m[name]``. :class:`Message` objects do not support the
269 :meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
270 mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
271 added in Python 2.2.)
273 Finally, :class:`Message` instances have some public instance variables:
276 .. attribute:: Message.headers
278 A list containing the entire set of header lines, in the order in which they
279 were read (except that setitem calls may disturb this order). Each line contains
a trailing newline. The blank line terminating the headers is not contained in
the list.
284 .. attribute:: Message.fp
286 The file or file-like object passed at instantiation time. This can be used to
287 read the message content.
290 .. attribute:: Message.unixfrom
292 The Unix ``From`` line, if the message had one, or an empty string. This is
needed to regenerate the message in some contexts, such as an ``mbox``\ -style
mailbox file.
.. _addresslist-objects:

AddressList Objects
-------------------

An :class:`AddressList` instance has the following methods:
305 .. method:: AddressList.__len__()
307 Return the number of addresses in the address list.
310 .. method:: AddressList.__str__()
312 Return a canonicalized string representation of the address list. Addresses are
rendered in ``"name" <host@domain>`` form, comma-separated.
316 .. method:: AddressList.__add__(alist)
318 Return a new :class:`AddressList` instance that contains all addresses in both
319 :class:`AddressList` operands, with duplicates removed (set union).
322 .. method:: AddressList.__iadd__(alist)
324 In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
325 into the union of itself and the right-hand instance, *alist*.
328 .. method:: AddressList.__sub__(alist)
330 Return a new :class:`AddressList` instance that contains every address in the
331 left-hand :class:`AddressList` operand that is not present in the right-hand
332 address operand (set difference).
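The set-style operations described above can be sketched on plain lists of ``(name, address)`` pairs. The functions ``union`` and ``difference`` are hypothetical helpers written for this illustration, not part of the module, and the ``example.org`` addresses are made up.

```python
def union(left, right):
    """Sketch of __add__: all pairs from both lists, duplicates removed."""
    result = list(left)
    for pair in right:
        if pair not in result:
            result.append(pair)
    return result


def difference(left, right):
    """Sketch of __sub__: pairs in left that do not appear in right."""
    return [pair for pair in left if pair not in right]


a = [("Jack Jansen", "jack@cwi.nl"), ("Guido", "guido@example.org")]
b = [("Guido", "guido@example.org"), ("Barry", "barry@example.org")]

both = union(a, b)
only_a = difference(a, b)
```

Both helpers keep the order of the left-hand list, which a real set would not guarantee.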
335 .. method:: AddressList.__isub__(alist)
In-place version of :meth:`__sub__`, removing addresses in this list which are
also in *alist*.
340 Finally, :class:`AddressList` instances have one public instance variable:
343 .. attribute:: AddressList.addresslist
A list of 2-tuples of strings, one per address. In each member, the first is the
346 canonicalized name part, the second is the actual route-address (``'@'``\
347 -separated username-host.domain pair).
349 .. rubric:: Footnotes
351 .. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
352 :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
353 considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
354 semantics have changed since :rfc:`822`.