rfc822
— 剖析 RFC 2822 邮件头
¶
从 2.3 版起弃用:
The
email
package should be used in preference to the
rfc822
module. This module is present only to maintain backward compatibility, and has been removed in Python 3.
此模块定义类,
Message
, which represents an “email message” as defined by the Internet standard
RFC 2822
.
1
Such messages consist of a collection of message headers, and a message body. This module also defines a helper class
AddressList
用于剖析
RFC 2822
addresses. Please refer to the RFC for information on the specific syntax of
RFC 2822
messages.
The
mailbox
module provides classes to read mailboxes produced by various end-user mail programs.
rfc822.
消息
(
file
[
,
seekable
]
)
¶
A
Message
instance is instantiated with an input object as parameter. Message relies only on the input object having a
readline()
method; in particular, ordinary file objects qualify. Instantiation reads headers from the input object up to a delimiter line (normally a blank line) and stores them in the instance. The message body, following the headers, is not consumed.
This class can work with any input object that supports a
readline()
method. If the input object has seek and tell capability, the
rewindbody()
method will work; also, illegal lines will be pushed back onto the input stream. If the input object lacks seek but has an
unread()
method that can push back a line of input,
Message
will use that to push back illegal lines. Thus this class can be used to parse messages coming from a buffered stream.
可选
seekable
argument is provided as a workaround for certain stdio libraries in which
tell()
discards buffered data before discovering that the
lseek()
system call doesn’t work. For maximum portability, you should set the seekable argument to zero to prevent that initial
tell()
when passing in an unseekable object such as a file object created from a socket object.
Input lines as read from the file may either be terminated by CR-LF or by a single linefeed; a terminating CR-LF is replaced by a single linefeed before the line is stored.
All header matching is done independent of upper or lower case; e.g.
m['From']
,
m['from']
and
m['FROM']
all yield the same result.
rfc822.
AddressList
(
field
)
¶
You may instantiate the
AddressList
helper class using a single string parameter, a comma-separated list of
RFC 2822
addresses to be parsed. (The parameter
None
yields an empty list.)
rfc822.
quote
(
str
)
¶
Return a new string with backslashes in str replaced by two backslashes and double quotes replaced by backslash-double quote.
rfc822.
unquote
(
str
)
¶
返回新的字符串,其是 unquoted 版本的 str 。若 str ends and begins with double quotes, they are stripped off. Likewise if str ends and begins with angle brackets, they are stripped off.
rfc822.
parseaddr
(
address
)
¶
剖析
address
, which should be the value of some address-containing field such as
到
or
Cc
, into its constituent “realname” and “email address” parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple
(None, None)
被返回。
rfc822.
dump_address_pair
(
pair
)
¶
逆
parseaddr()
, this takes a 2-tuple of the form
(realname,
email_address)
and returns the string value suitable for a
到
or
Cc
header. If the first element of
pair
is false, then the second element is returned unmodified.
rfc822.
parsedate
(
date
)
¶
Attempts to parse a date according to the rules in
RFC 2822
. however, some mailers don’t follow that format as specified, so
parsedate()
tries to guess correctly in such cases.
date
is a string containing an
RFC 2822
date, such as
'Mon, 20 Nov 1995 19:12:08 -0500'
. If it succeeds in parsing the date,
parsedate()
returns a 9-tuple that can be passed directly to
time.mktime()
;否则
None
will be returned. Note that indexes 6, 7, and 8 of the result tuple are not usable.
rfc822.
parsedate_tz
(
date
)
¶
Performs the same function as
parsedate()
, but returns either
None
or a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
time.mktime()
, and the tenth is the offset of the date’s timezone from UTC (which is the official term for Greenwich Mean Time). (Note that the sign of the timezone offset is the opposite of the sign of the
time.timezone
variable for the same timezone; the latter variable follows the POSIX standard while this module follows
RFC 2822
.) If the input string has no timezone, the last element of the tuple returned is
None
. Note that indexes 6, 7, and 8 of the result tuple are not usable.
rfc822.
mktime_tz
(
tuple
)
¶
Turn a 10-tuple as returned by
parsedate_tz()
into a UTC timestamp. If the timezone item in the tuple is
None
, assume local time. Minor deficiency: this first interprets the first 8 elements as a local time and then compensates for the timezone difference; this may yield a slight error around daylight savings time switch dates. Not enough to worry about for common use.
另请参阅
email
Comprehensive email handling package; supersedes the
rfc822
模块。
mailbox
Classes to read various mailbox formats produced by end-user mail programs.
mimetools
子类化的
rfc822.Message
that handles MIME encoded messages.
A
Message
实例具有下列方法:
Message.
rewindbody
(
)
¶
Seek to the start of the message body. This only works if the file object is seekable.
Message.
isheader
(
line
)
¶
Returns a line’s canonicalized fieldname (the dictionary key that will be used to index it) if the line is a legal
RFC 2822
header; otherwise returns
None
(implying that parsing should stop here and the line be pushed back on the input stream). It is sometimes useful to override this method in a subclass.
Message.
islast
(
line
)
¶
Return true if the given line is a delimiter on which Message should stop. The delimiter line is consumed, and the file object’s read location positioned immediately after it. By default this method just checks that the line is blank, but you can override it in a subclass.
Message.
iscomment
(
line
)
¶
返回
True
if the given line should be ignored entirely, just skipped. By default this is a stub that always returns
False
, but you can override it in a subclass.
Message.
getallmatchingheaders
(
名称
)
¶
Return a list of lines consisting of all headers matching name , if any. Each physical line, whether it is a continuation line or not, is a separate list item. Return the empty list if no header matches name .
Message.
getfirstmatchingheader
(
名称
)
¶
Return a list of lines comprising the first header matching
name
, and its continuation line(s), if any. Return
None
if there is no header matching
name
.
Message.
getrawheader
(
名称
)
¶
Return a single string consisting of the text after the colon in the first header matching
name
. This includes leading whitespace, the trailing linefeed, and internal linefeeds and whitespace if there any continuation line(s) were present. Return
None
if there is no header matching
name
.
Message.
getheader
(
名称
[
,
default
]
)
¶
Return a single string consisting of the last header matching
name
, but strip leading and trailing whitespace. Internal whitespace is not stripped. The optional
default
argument can be used to specify a different default to be returned when there is no header matching
name
;默认为
None
. This is the preferred way to get parsed headers.
Message.
get
(
名称
[
,
default
]
)
¶
别名化的
getheader()
, to make the interface more compatible with regular dictionaries.
Message.
getaddr
(
名称
)
¶
返回一对
(full name, email address)
parsed from the string returned by
getheader(name)
. If no header matching
name
exists, return
(None,
None)
; otherwise both the full name and the address are (possibly empty) strings.
Example: If
m
’s first
从
header contains the string
'jack@cwi.nl (Jack Jansen)'
,那么
m.getaddr('From')
will yield the pair
('Jack Jansen', 'jack@cwi.nl')
. If the header contained
'Jack Jansen
<jack@cwi.nl>'
instead, it would yield the exact same result.
Message.
getaddrlist
(
名称
)
¶
这类似于
getaddr(list)
, but parses a header containing a list of email addresses (e.g. a
到
header) and returns a list of
(full
name, email address)
pairs (even if there was only one address in the header). If there is no header matching
name
, return an empty list.
If multiple headers exist that match the named header (e.g. if there are several Cc headers), all are parsed for addresses. Any continuation lines the named headers contain are also parsed.
Message.
getdate
(
名称
)
¶
Retrieve a header using
getheader()
and parse it into a 9-tuple compatible with
time.mktime()
; note that fields 6, 7, and 8 are not usable. If there is no header matching
name
, or it is unparsable, return
None
.
Date parsing appears to be a black art, and not all mailers adhere to the standard. While it has been tested and found correct on a large collection of email from many sources, it is still possible that this function may occasionally yield an incorrect result.
Message.
getdate_tz
(
名称
)
¶
Retrieve a header using
getheader()
and parse it into a 10-tuple; the first 9 elements will make a tuple compatible with
time.mktime()
, and the 10th is a number giving the offset of the date’s timezone from UTC. Note that fields 6, 7, and 8 are not usable. Similarly to
getdate()
, if there is no header matching
name
, or it is unparsable, return
None
.
Message
instances also support a limited mapping interface. In particular:
m[name]
is like
m.getheader(name)
但引发
KeyError
if there is no matching header; and
len(m)
,
m.get(name[, default])
,
name in m
,
m.keys()
,
m.values()
m.items()
,和
m.setdefault(name[, default])
act as expected, with the one difference that
setdefault()
uses an empty string as the default value.
Message
instances also support the mapping writable interface
m[name]
=
value
and
del m[name]
.
Message
objects do not support the
clear()
,
copy()
,
popitem()
,或
update()
methods of the mapping interface. (Support for
get()
and
setdefault()
was only added in Python 2.2.)
最后,
Message
instances have some public instance variables:
Message.
headers
¶
A list containing the entire set of header lines, in the order in which they were read (except that setitem calls may disturb this order). Each line contains a trailing newline. The blank line terminating the headers is not contained in the list.
Message.
fp
¶
The file or file-like object passed at instantiation time. This can be used to read the message content.
Message.
unixfrom
¶
The Unix
From
line, if the message had one, or an empty string. This is needed to regenerate the message in some contexts, such as an
mbox
-style mailbox file.
An
AddressList
实例具有下列方法:
AddressList.
__len__
(
)
¶
Return the number of addresses in the address list.
AddressList.
__str__
(
)
¶
Return a canonicalized string representation of the address list. Addresses are rendered in “name” < host @ domain > form, comma-separated.
AddressList.
__add__
(
alist
)
¶
返回新的
AddressList
instance that contains all addresses in both
AddressList
operands, with duplicates removed (set union).
AddressList.
__iadd__
(
alist
)
¶
In-place version of
__add__()
; turns this
AddressList
instance into the union of itself and the right-hand instance,
alist
.
AddressList.
__sub__
(
alist
)
¶
返回新的
AddressList
instance that contains every address in the left-hand
AddressList
operand that is not present in the right-hand address operand (set difference).
AddressList.
__isub__
(
alist
)
¶
In-place version of
__sub__()
, removing addresses in this list which are also in
alist
.
最后,
AddressList
instances have one public instance variable:
AddressList.
addresslist
¶
A list of tuple string pairs, one per address. In each member, the first is the canonicalized name part, the second is the actual route-address (
'@'
-separated username-host.domain pair).
脚注