:mod:`rfc822` --- Parse RFC 2822 mail headers
=============================================

.. module:: rfc822
   :synopsis: Parse 2822 style mail messages.
   :deprecated:


.. deprecated:: 2.3
   The :mod:`email` package should be used in preference to the :mod:`rfc822`
   module. This module is present only to maintain backward compatibility, and
   has been removed in 3.0.

This module defines a class, :class:`Message`, which represents an "email
message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
consist of a collection of message headers, and a message body. This module
also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
addresses. Please refer to the RFC for information on the specific syntax of
:rfc:`2822` messages.

.. index:: module: mailbox

The :mod:`mailbox` module provides classes to read mailboxes produced by
various end-user mail programs.


.. class:: Message(file[, seekable])

   A :class:`Message` instance is instantiated with an input object as parameter.
   Message relies only on the input object having a :meth:`readline` method; in
   particular, ordinary file objects qualify. Instantiation reads headers from the
   input object up to a delimiter line (normally a blank line) and stores them in
   the instance. The message body, following the headers, is not consumed.

   This class can work with any input object that supports a :meth:`readline`
   method. If the input object has seek and tell capability, the
   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
   onto the input stream. If the input object lacks seek but has an :meth:`unread`
   method that can push back a line of input, :class:`Message` will use that to
   push back illegal lines. Thus this class can be used to parse messages coming
   from a buffered stream.

   The optional *seekable* argument is provided as a workaround for certain stdio
   libraries in which :cfunc:`tell` discards buffered data before discovering that
   the :cfunc:`lseek` system call doesn't work. For maximum portability, you
   should set the seekable argument to zero to prevent that initial :meth:`tell`
   when passing in an unseekable object such as a file object created from a socket
   object.

   Input lines as read from the file may either be terminated by CR-LF or by a
   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
   line is stored.

   All header matching is done independent of upper or lower case; e.g.
   ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
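
The header-reading behaviour described above can be sketched in a few lines.
This is an illustrative stand-in, not the module's implementation:
``read_headers`` is a hypothetical helper, it relies only on ``readline``, and
it simply stops at a non-header line instead of pushing it back.

```python
import io


def read_headers(fp):
    """Hypothetical helper: collect headers from *fp* up to the first blank line."""
    headers = {}  # lower-cased field name -> value of the last matching header
    last_name = None
    while True:
        line = fp.readline()
        if not line:
            break  # end of input before any delimiter line
        if line.endswith("\r\n"):
            line = line[:-2] + "\n"  # a terminating CR-LF becomes a single linefeed
        if line == "\n":
            break  # delimiter consumed; the body stays unread
        if line[0] in " \t" and last_name is not None:
            # Continuation line: append it to the previous header.
            headers[last_name] += "\n " + line.strip()
            continue
        name, colon, value = line.partition(":")
        if not colon:
            break  # not a header line (the real class would push it back)
        last_name = name.strip().lower()
        headers[last_name] = value.strip()
    return headers


fp = io.StringIO("From: jack@cwi.nl\r\nSubject: hello\r\n world\r\n\r\nbody text\r\n")
headers = read_headers(fp)
body = fp.read()  # the body is still available on the input object
```

Lower-casing the field name when storing it is one simple way to obtain the
case-insensitive lookup described above.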


.. class:: AddressList(field)

   You may instantiate the :class:`AddressList` helper class using a single string
   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
   parameter ``None`` yields an empty list.)


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* ends and
   begins with double quotes, they are stripped off. Likewise if *str* ends and
   begins with angle brackets, they are stripped off.
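
As a quick illustration of the two transformations, the sketch below uses
``email.utils.quote`` and ``email.utils.unquote`` from the :mod:`email`
package, which are documented with the same behaviour. Since :mod:`rfc822`
itself cannot be imported on Python 3, treating these functions as stand-ins is
an assumption of this example.

```python
from email.utils import quote, unquote

raw = 'C:\\dir "x"'              # the characters: C:\dir "x"
quoted = quote(raw)              # backslashes doubled, double quotes escaped

bare = unquote('"Jack Jansen"')  # surrounding double quotes stripped
addr = unquote('<jack@cwi.nl>')  # surrounding angle brackets stripped
plain = unquote('jack@cwi.nl')   # nothing to strip, returned unchanged
```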


.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
   "email address" parts. Returns a tuple of that information, unless the parse
   fails, in which case a 2-tuple ``(None, None)`` is returned.


.. function:: dump_address_pair(pair)

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.
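
The round trip between a header value and a ``(realname, email_address)`` pair
can be tried with the :mod:`email` package, which supersedes this module. In
this sketch ``email.utils.parseaddr`` stands in for :func:`parseaddr` and
``email.utils.formataddr`` for :func:`dump_address_pair`; that correspondence
is an assumption of the example. One known difference is that
``email.utils.parseaddr`` reports a failed parse as ``('', '')`` rather than
``(None, None)``.

```python
from email.utils import formataddr, parseaddr

pair = parseaddr("Jack Jansen <jack@cwi.nl>")
same = parseaddr("jack@cwi.nl (Jack Jansen)")   # comment form of the same address

header_value = formataddr(pair)                 # back to a header value
address_only = formataddr(("", "jack@cwi.nl"))  # false first element: address only
```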


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time). (Note that the sign of
   the timezone offset is the opposite of the sign of the ``time.timezone``
   variable for the same timezone; the latter variable follows the POSIX standard
   while this module follows :rfc:`2822`.) If the input string has no timezone,
   the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
   8 of the result tuple are not usable.


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
   the timezone item in the tuple is ``None``, assume local time. Minor
   deficiency: this first interprets the first 8 elements as a local time and then
   compensates for the timezone difference; this may yield a slight error around
   daylight saving time switch dates. The error is not enough to worry about for
   common use.
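
The three date helpers fit together as shown in the sketch below. It uses the
same-named functions from ``email.utils``, on the assumption that they are
suitable stand-ins for the :mod:`rfc822` versions, which are unavailable on
Python 3. The sample date is the one quoted above.

```python
from datetime import datetime, timezone
from email.utils import mktime_tz, parsedate, parsedate_tz

date = "Mon, 20 Nov 1995 19:12:08 -0500"

nine = parsedate(date)      # 9-tuple; indexes 6, 7 and 8 are not usable
ten = parsedate_tz(date)    # same fields plus the UTC offset in seconds
stamp = mktime_tz(ten)      # seconds since the epoch, in UTC

# Convert the timestamp to an aware datetime to check the UTC wall-clock time.
as_utc = datetime.fromtimestamp(stamp, tz=timezone.utc)
```

The offset for ``-0500`` comes back as ``-18000``, the opposite sign from
``time.timezone`` for the same zone, and the resulting UTC time is five hours
later than the local time in the header.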


.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`rfc822` module.

   Module :mod:`mailbox`
      Classes to read various mailbox formats produced by end-user mail programs.

   Module :mod:`mimetools`
      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.


.. _message-objects:

Message Objects
---------------

A :class:`Message` instance has the following methods:


.. method:: Message.rewindbody()

   Seek to the start of the message body. This only works if the file object is
   seekable.


.. method:: Message.isheader(line)

   Returns a line's canonicalized fieldname (the dictionary key that will be used
   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
   ``None`` (implying that parsing should stop here and the line be pushed back on
   the input stream). It is sometimes useful to override this method in a
   subclass.


.. method:: Message.islast(line)

   Return true if the given line is a delimiter on which Message should stop. The
   delimiter line is consumed, and the file object's read location positioned
   immediately after it. By default this method just checks that the line is
   blank, but you can override it in a subclass.


.. method:: Message.iscomment(line)

   Return ``True`` if the given line should be ignored entirely, just skipped. By
   default this is a stub that always returns ``False``, but you can override it in
   a subclass.


.. method:: Message.getallmatchingheaders(name)

   Return a list of lines consisting of all headers matching *name*, if any. Each
   physical line, whether it is a continuation line or not, is a separate list
   item. Return the empty list if no header matches *name*.


.. method:: Message.getfirstmatchingheader(name)

   Return a list of lines comprising the first header matching *name*, and its
   continuation line(s), if any. Return ``None`` if there is no header matching
   *name*.


.. method:: Message.getrawheader(name)

   Return a single string consisting of the text after the colon in the first
   header matching *name*. This includes leading whitespace, the trailing
   linefeed, and internal linefeeds and whitespace if any continuation
   line(s) were present. Return ``None`` if there is no header matching *name*.


.. method:: Message.getheader(name[, default])

   Return a single string consisting of the last header matching *name*,
   but strip leading and trailing whitespace.
   Internal whitespace is not stripped. The optional *default* argument can be
   used to specify a different default to be returned when there is no header
   matching *name*; it defaults to ``None``.
   This is the preferred way to get parsed headers.


.. method:: Message.get(name[, default])

   An alias for :meth:`getheader`, to make the interface more compatible with
   regular dictionaries.


.. method:: Message.getaddr(name)

   Return a pair ``(full name, email address)`` parsed from the string returned by
   ``getheader(name)``. If no header matching *name* exists, return ``(None,
   None)``; otherwise both the full name and the address are (possibly empty)
   strings.

   Example: If *m*'s first :mailheader:`From` header contains the string
   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
   <jack@cwi.nl>'`` instead, it would yield the exact same result.


.. method:: Message.getaddrlist(name)

   This is similar to :meth:`getaddr`, but parses a header containing a list of
   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
   name, email address)`` pairs (even if there was only one address in the header).
   If there is no header matching *name*, return an empty list.

   If multiple headers exist that match the named header (e.g. if there are several
   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
   the named headers contain are also parsed.
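
Collecting addresses across several headers of the same name looks roughly
like this with the :mod:`email` package. The example is a sketch under the
assumption that ``email.message_from_string``, ``Message.get_all`` and
``email.utils.getaddresses`` together are an acceptable stand-in for
:meth:`getaddrlist`; the addresses themselves are made up for illustration.

```python
import email
from email.utils import getaddresses

raw = (
    "From: Jack Jansen <jack@cwi.nl>\n"
    "Cc: alice@example.org, Bob Example <bob@example.org>\n"
    "Cc: carol@example.org\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# get_all() returns one string per matching header; getaddresses() parses
# every address in all of them into (full name, email address) pairs.
cc_pairs = getaddresses(msg.get_all("Cc", []))
```

Both :mailheader:`Cc` headers contribute to the result, and an address without
a name part yields an empty string as the first item of its pair.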


.. method:: Message.getdate(name)

   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
   there is no header matching *name*, or it is unparsable, return ``None``.

   Date parsing appears to be a black art, and not all mailers adhere to the
   standard. While it has been tested and found correct on a large collection of
   email from many sources, it is still possible that this function may
   occasionally yield an incorrect result.


.. method:: Message.getdate_tz(name)

   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
   10th is a number giving the offset of the date's timezone from UTC. Note that
   fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
   no header matching *name*, or it is unparsable, return ``None``.

:class:`Message` instances also support a limited mapping interface. In
particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the mapping writable interface ``m[name]
= value`` and ``del m[name]``. :class:`Message` objects do not support the
:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
added in Python 2.2.)

Finally, :class:`Message` instances have some public instance variables:


.. attribute:: Message.headers

   A list containing the entire set of header lines, in the order in which they
   were read (except that setitem calls may disturb this order). Each line contains
   a trailing newline. The blank line terminating the headers is not contained in
   the list.


.. attribute:: Message.fp

   The file or file-like object passed at instantiation time. This can be used to
   read the message content.


.. attribute:: Message.unixfrom

   The Unix ``From`` line, if the message had one, or an empty string. This is
   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
   mailbox file.


.. _addresslist-objects:

AddressList Objects
-------------------

An :class:`AddressList` instance has the following methods:


.. method:: AddressList.__len__()

   Return the number of addresses in the address list.


.. method:: AddressList.__str__()

   Return a canonicalized string representation of the address list. Addresses are
   rendered in ``"name" <host@domain>`` form, comma-separated.


.. method:: AddressList.__add__(alist)

   Return a new :class:`AddressList` instance that contains all addresses in both
   :class:`AddressList` operands, with duplicates removed (set union).


.. method:: AddressList.__iadd__(alist)

   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
   into the union of itself and the right-hand instance, *alist*.


.. method:: AddressList.__sub__(alist)

   Return a new :class:`AddressList` instance that contains every address in the
   left-hand :class:`AddressList` operand that is not present in the right-hand
   address operand (set difference).


.. method:: AddressList.__isub__(alist)

   In-place version of :meth:`__sub__`, removing addresses in this list which are
   also in *alist*.

Finally, :class:`AddressList` instances have one public instance variable:


.. attribute:: AddressList.addresslist

   A list of string pairs (2-tuples), one per address. In each pair, the first
   item is the canonicalized name part and the second is the actual route-address
   (``'@'``\ -separated username-host.domain pair).
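
The set-like behaviour of the operators above can be pictured with plain lists
of ``(name, address)`` pairs shaped like :attr:`addresslist`. The sketch below
is not the class's implementation: ``union`` and ``difference`` are
hypothetical helpers, and comparing whole pairs for equality is an assumption
made for illustration.

```python
def union(left, right):
    """Addresses from *left*, then those from *right* not already present."""
    result = list(left)
    for pair in right:
        if pair not in result:
            result.append(pair)
    return result


def difference(left, right):
    """Addresses from *left* that do not appear in *right*."""
    return [pair for pair in left if pair not in right]


a = [("Jack Jansen", "jack@cwi.nl"), ("", "alice@example.org")]
b = [("", "alice@example.org"), ("Bob Example", "bob@example.org")]

both = union(a, b)         # analogous to a + b on AddressList instances
only_a = difference(a, b)  # analogous to a - b
```

Keeping the results as ordered lists rather than Python sets preserves the
order in which addresses were first seen.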

.. rubric:: Footnotes

.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
   :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
   semantics have changed since :rfc:`822`.