diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst --- a/Doc/library/email.generator.rst +++ b/Doc/library/email.generator.rst @@ -32,8 +32,7 @@ :mod:`email.generator` module: -.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, \ - policy=policy.default) +.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78) The constructor for the :class:`Generator` class takes a :term:`file-like object` called *outfp* for an argument. *outfp* must support the :meth:`write` method @@ -54,16 +53,10 @@ :class:`~email.header.Header` class. Set to zero to disable header wrapping. The default is 78, as recommended (but not required) by :rfc:`2822`. - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the generator's operation. The default policy - maintains backward compatibility. - - .. versionchanged:: 3.3 Added the *policy* keyword. - The other public :class:`Generator` methods are: - .. method:: flatten(msg, unixfrom=False, linesep=None) + .. method:: flatten(msg, unixfrom=False, linesep='\\n') Print the textual representation of the message object structure rooted at *msg* to the output file specified when the :class:`Generator` instance @@ -79,13 +72,12 @@ Note that for subparts, no envelope header is ever printed. Optional *linesep* specifies the line separator character used to - terminate lines in the output. If specified it overrides the value - specified by the ``Generator``\ 's ``policy``. + terminate lines in the output. It defaults to ``\n`` because that is + the most useful value for Python application code (other library packages + expect ``\n`` separated lines). ``linesep=\r\n`` can be used to + generate output with RFC-compliant line separators. - Because strings cannot represent non-ASCII bytes, ``Generator`` ignores - the value of the :attr:`~email.policy.Policy.must_be_7bit` - :mod:`~email.policy` setting and operates as if it were set ``True``. 
- This means that messages parsed with a Bytes parser that have a + Messages parsed with a Bytes parser that have a :mailheader:`Content-Transfer-Encoding` of 8bit will be converted to a use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers will be :rfc:`2047` encoded with a charset of `unknown-8bit`. @@ -111,8 +103,7 @@ formatted string representation of a message object. For more detail, see :mod:`email.message`. -.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \ - policy=policy.default) +.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78) The constructor for the :class:`BytesGenerator` class takes a binary :term:`file-like object` called *outfp* for an argument. *outfp* must @@ -134,31 +125,19 @@ wrapping. The default is 78, as recommended (but not required) by :rfc:`2822`. - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the generator's operation. The default policy - maintains backward compatibility. - - .. versionchanged:: 3.3 Added the *policy* keyword. - The other public :class:`BytesGenerator` methods are: - .. method:: flatten(msg, unixfrom=False, linesep=None) + .. method:: flatten(msg, unixfrom=False, linesep='\n') Print the textual representation of the message object structure rooted at *msg* to the output file specified when the :class:`BytesGenerator` instance was created. Subparts are visited depth-first and the resulting - text will be properly MIME encoded. If the :mod:`~email.policy` option - :attr:`~email.policy.Policy.must_be_7bit` is ``False`` (the default), - then any bytes with the high bit set in the original parsed message that - have not been modified will be copied faithfully to the output. If - ``must_be_7bit`` is true, the bytes will be converted as needed using an - ASCII content-transfer-encoding. 
In particular, RFC-invalid non-ASCII - bytes in headers will be encoded using the MIME ``unknown-8bit`` - character set, thus rendering them RFC-compliant. - - .. XXX: There should be a complimentary option that just does the RFC - compliance transformation but leaves CTE 8bit parts alone. + text will be properly MIME encoded. If the input that created the *msg* + contained bytes with the high bit set and those bytes have not been + modified, they will be copied faithfully to the output, even if doing so + is not strictly RFC compliant. (To produce strictly RFC compliant + output, use the :class:`Generator` class.) Messages parsed with a Bytes parser that have a :mailheader:`Content-Transfer-Encoding` of 8bit will be reconstructed @@ -173,8 +152,10 @@ Note that for subparts, no envelope header is ever printed. Optional *linesep* specifies the line separator character used to - terminate lines in the output. If specified it overrides the value - specified by the ``Generator``\ 's ``policy``. + terminate lines in the output. It defaults to ``\n`` because that is + the most useful value for Python application code (other library packages + expect ``\n`` separated lines). ``linesep=\r\n`` can be used to + generate output with RFC-compliant line separators. .. method:: clone(fp) diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst --- a/Doc/library/email.parser.rst +++ b/Doc/library/email.parser.rst @@ -58,18 +58,12 @@ Here is the API for the :class:`FeedParser`: -.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default) +.. class:: FeedParser(_factory=email.message.Message) Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument callable that will be called whenever a new message object is needed. It defaults to the :class:`email.message.Message` class. - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the parser's operation. 
The default policy maintains - backward compatibility. - - .. versionchanged:: 3.3 Added the *policy* keyword. - .. method:: feed(data) Feed the :class:`FeedParser` some more data. *data* should be a string @@ -108,7 +102,7 @@ class. -.. class:: Parser(_class=email.message.Message, *, policy=policy.default) +.. class:: Parser(_class=email.message.Message) The constructor for the :class:`Parser` class takes an optional argument *_class*. This must be a callable factory (such as a function or a class), and @@ -116,13 +110,8 @@ :class:`~email.message.Message` (see :mod:`email.message`). The factory will be called without arguments. - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the parser's operation. The default policy maintains - backward compatibility. - - .. versionchanged:: 3.3 - Removed the *strict* argument that was deprecated in 2.4. Added the - *policy* keyword. + .. versionchanged:: 3.2 + Removed the *strict* argument that was deprecated in 2.4. The other public :class:`Parser` methods are: @@ -153,18 +142,12 @@ the entire contents of the file. -.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default) +.. class:: BytesParser(_class=email.message.Message, strict=None) This class is exactly parallel to :class:`Parser`, but handles bytes input. The *_class* and *strict* arguments are interpreted in the same way as for - the :class:`Parser` constructor. - - The *policy* keyword specifies a :mod:`~email.policy` object that - controls a number of aspects of the parser's operation. The default - policy maintains backward compatibility. - - .. versionchanged:: 3.3 - Removed the *strict* argument. Added the *policy* keyword. + the :class:`Parser` constructor. *strict* is supported only to make porting + code easier; it is deprecated. .. method:: parse(fp, headeronly=False) @@ -202,15 +185,12 @@ .. currentmodule:: email -.. 
function:: message_from_string(s, _class=email.message.Message, *, \ - policy=policy.default) +.. function:: message_from_string(s, _class=email.message.Message, strict=None) Return a message object structure from a string. This is exactly equivalent to - ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as + ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as with the :class:`Parser` class constructor. - .. versionchanged:: removed *strict*, added *policy* - .. function:: message_from_bytes(s, _class=email.message.Message, strict=None) Return a message object structure from a byte string. This is exactly @@ -218,27 +198,21 @@ *strict* are interpreted as with the :class:`Parser` class constructor. .. versionadded:: 3.2 - .. versionchanged:: 3.3 removed *strict*, added *policy* -.. function:: message_from_file(fp, _class=email.message.Message, *, \ - policy=policy.default) +.. function:: message_from_file(fp, _class=email.message.Message, strict=None) Return a message object structure tree from an open :term:`file object`. - This is exactly equivalent to ``Parser().parse(fp)``. *_class* - and *policy* are interpreted as with the :class:`Parser` class constructor. + This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class* + and *strict* are interpreted as with the :class:`Parser` class constructor. - .. versionchanged:: 3.3 removed *strict*, added *policy* - -.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \ - policy=policy.default) +.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None) Return a message object structure tree from an open binary :term:`file object`. This is exactly equivalent to ``BytesParser().parse(fp)``. - *_class* and *policy* are interpreted as with the :class:`Parser` + Optional *_class* and *strict* are interpreted as with the :class:`Parser` class constructor. .. versionadded:: 3.2 - .. 
versionchanged:: 3.3 removed *strict*, added *policy* Here's an example of how you might use this at an interactive Python prompt:: diff --git a/Doc/library/email.policy.rst b/Doc/library/email.policy.rst deleted file mode 100644 --- a/Doc/library/email.policy.rst +++ /dev/null @@ -1,208 +0,0 @@ -:mod:`email`: Policy Objects ----------------------------- - -.. module:: email.policy - :synopsis: Controlling the parsing and generating of messages - - -The :mod:`email` package's prime focus is the handling of email messages as -described by the various email and MIME RFCs. However, the general format of -email messages (a block of header fields each consisting of a name followed by -a colon followed by a value, the whole block followed by a blank line and an -arbitrary 'body'), is a format that has found utility outside of the realm of -email. Some of these uses hew closely to the main RFCs, some do not. And even -when working with email, there are times when it is desirable to break strict -compliance with the RFCs. - -Policy Objects are the mechanism used to provide the email package with the -flexibility to handle all these disparate use cases, - -A :class:`Policy` object encapsulates a set of attributes and methods that -control the behavior of various components of the email package during use. -:class:`Policy` instances can be passed to various classes and methods in the -email package to alter the default behavior. The settable values and their -defaults are described below. The :mod:`policy` module also provides some -pre-created :class:`Policy` instances. In addition to a :const:`default` -instance, there are instances tailored for certain applications. For example -there is an :const:`SMTP` :class:`Policy` with defaults appropriate for -generating output to be sent to an SMTP server. These are listed :ref:`below -`. - -In general an application will only need to deal with setting the policy at the -input and output boundaries. 
Once parsed, a message is represented by a -:class:`~email.message.Message` object, which is designed to be independent of -the format that the message has "on the wire" when it is received, transmitted, -or displayed. Thus, a :class:`Policy` can be specified when parsing a message -to create a :class:`~email.message.Message`, and again when turning the -:class:`~email.message.Message` into some other representation. While often a -program will use the same :class:`Policy` for both input and output, the two -can be different. - -As an example, the following code could be used to read an email message from a -file on disk and pass it to the system ``sendmail`` program on a ``unix`` -system:: - - >>> from email import msg_from_binary_file - >>> from email.generator import BytesGenerator - >>> import email.policy - >>> from subprocess import Popen, PIPE - >>> msg = msg_from_binary_file(open('mymsg.txt', 'b'), policy=email.policy.mbox) - >>> p = Popen(['sendmail', msg['To][0].address], stdin=PIPE) - >>> g = BytesGenerator(p.stdin, email.policy.policy=SMTP) - >>> g.flatten(msg) - >>> p.stdin.close() - >>> rc = p.wait() - -Many email package methods accept a *policy* keyword argument, allowing the -policy to be overridden for that method. For example, the following code use -the :meth:`email.message.Message.as_string` method to the *msg* object from the -previous example and re-write it to a file using the native line separators for -the platform on which it is running:: - - >>> import os - >>> mypolicy = email.policy.Policy(linesep=os.linesep) - >>> with open('converted.txt', 'wb') as f: - >>> f.write(msg.as_string(policy=mypolicy)) - -Policy instances are immutable, but they are also callable, accepting the same -keyword arguments as the class constructor and returning a new :class:`Policy` -instance that is a copy of the original but with the specified attributes -values changed. 
For example, the following creates an SMTP policy that will -raise any defects detected as errors:: - - >>> strict_SMTP = email.policy.SMTP(raise_on_defect=True) - -Policy objects can also be added together, producing a policy object whose -settings are a combination of the non-default values of the summed objects:: - - >>> strict_SMTP = email.policy.SMTP + email.policy.strict - -This operation is not commutative; that is, the order in which the objects are -added matters. To illustrate:: - - >>> Policy = email.policy.Policy - >>> apolicy = Policy(max_line_length=100) + Policy(max_line_length=80) - >>> apolicy.max_line_length - 80 - >>> apolicy = Policy(max_line_length=80) + Policy(max_line_length=100) - >>> apolicy.max_line_length - 100 - - -.. class:: Policy(**kw) - - The valid constructor keyword arguments are any of the attributes listed - below. - - .. attribute:: max_line_length - - The maximum length of any line in the serialized output, not counting the - end of line character(s). Default is 78, per :rfc:`5322`. A value of - ``0`` or :const:`None` indicates that no line wrapping should be - done at all. - - .. attribute:: linesep - - The string to be used to terminate lines in serialized output. The - default is '\\n' because that's the internal end-of-line discipline used - by Python, though '\\r\\n' is required by the RFCs. See `Policy - Instances`_ for policies that use an RFC conformant linesep. Setting it - to :attr:`os.linesep` may also be useful. - - .. attribute:: must_be_7bit - - If :const:`True`, data output by a bytes generator is limited to ASCII - characters. If :const:`False` (the default), then bytes with the high - bit set are preserved and/or allowed in certain contexts (for example, - where possible a content transfer encoding of ``8bit`` will be used). - String generators act as if ``must_be_7bit`` is `True` regardless of the - policy in effect, since a string cannot represent non-ASCII bytes. - - .. 
attribute:: fallback_decode_charset - - The name of a character encoding to use when decoding parts of a - message that have non-ascii characters in them and no associated charset. - It defaults to ``None``, which means that such characters will be - preserved as is, or tagged as ``unknown-8bit``, depending on the context. - For processing email changing this default is generally a bad idea as it - will more often than not lead to mojibake__. - - __ http://en.wikipedia.org/wiki/Mojibake - - .. attribute:: default_encode_charset - - The name of a character set encoding to use when encoding a message part - for transmission and no charset has otherwise been specified. The - default is ``utf-8``. If the specified character set cannot encode all - of the characters in the part, output methods will raise - ``UnicodeEncodeError``\ s. - - .. attribute:: header_indent - - A string used to prefix all lines after the first when folding message - headers. To be RFC compliant this may be composed only of spaces and/or - tabs. The default is a single space. - - .. XXX: the way this works now is suboptimal. We really need two modes: - strict RFC compliance where we never introduce whitespace of our own, - just fold what exists, and a "defacto" mode where we use tabs for - folding *and remove them when unfolding*. - - .. attribute:: raise_on_defect - - If :const:`True`, any defects encountered will be raised as errors. If - :const:`False` (the default), defects will be passed to the - :meth:`register_defect` method. - - .. attribute:: cte_map - - A mapping from :rfc:`2047` content transfer encoding codes to - objects supporting the CTE API. The default registry provides - mappings for q/quoted-printable and b/base64 CTEs. - - .. XXX:: Need to document the CTE API somewhere. - - .. method:: handle_defect(obj, defect) - - *obj* is the object on which to register the defect. *defect* should be - a subclass of :class:`~email.errors.Defect`. 
If :attr:`raise_on_defect` - is ``True`` the defect is raised as an exception. Otherwise *obj* and - *defect* are passed to :meth:`register_defect`. This method is intended - to be called by parsers when they encounter defects, and will not be - called by code that uses the email library unless that code is - implementing an alternate parser. - - .. method:: register_defect(obj, defect) - - *obj* is the object on which to register the defect. *defect* should be - a subclass of :class:`~email.errors.Defect`. This method is part of the - public API so that custom ``Policy`` subclasses can implement alternate - handling of defects. The default implementation calls the ``append`` - method of the ``defects`` attribute of *obj*. - - -Policy Instances -................ - -The following instances of :class:`Policy` provide defaults suitable for -specific common application domains. - -.. data:: default - - An instance of :class:`Policy` with all defaults unchanged. - -.. data:: SMTP - - Output serialized from a message will conform to the email and SMTP - RFCs. The only changed attribute is :attr:`linesep`, which is set to - ``\r\n``. - -.. data:: HTTP - - Suitable for use when serializing headers for use in HTTP traffic. - :attr:`linesep` is set to ``\r\n``, and :attr:`max_line_length` is set to - :const:`None` (unlimited). - -.. data:: strict - - :attr:`raise_on_defect` is set to :const:`True`. diff --git a/Lib/email/errors.py b/Lib/email/errors.py --- a/Lib/email/errors.py +++ b/Lib/email/errors.py @@ -32,7 +32,7 @@ # These are parsing defects which the parser was able to work around. 
-class MessageDefect(Exception):
+class MessageDefect:
     """Base class for a message defect."""
 
     def __init__(self, line=None):
diff --git a/Lib/email/feedparser.py b/Lib/email/feedparser.py
--- a/Lib/email/feedparser.py
+++ b/Lib/email/feedparser.py
@@ -25,7 +25,6 @@
 
 from email import errors
 from email import message
-from email import policy
 
 NLCRE = re.compile('\r\n|\r|\n')
 NLCRE_bol = re.compile('(\r\n|\r|\n)')
@@ -138,16 +137,9 @@
 class FeedParser:
     """A feed-style parser of email."""
 
-    def __init__(self, _factory=message.Message, *, policy=policy.default):
-        """_factory is called with no arguments to create a new message obj
-
-        The policy keyword specifies a policy object that controls a number of
-        aspects of the parser's operation. The default policy maintains
-        backward compatibility.
-
-        """
+    def __init__(self, _factory=message.Message):
+        """_factory is called with no arguments to create a new message obj"""
         self._factory = _factory
-        self.policy = policy
         self._input = BufferedSubFile()
         self._msgstack = []
         self._parse = self._parsegen().__next__
@@ -179,8 +171,7 @@
         # Look for final set of defects
         if root.get_content_maintype() == 'multipart' \
                and not root.is_multipart():
-            defect = errors.MultipartInvariantViolationDefect()
-            self.policy.handle_defect(root, defect)
+            root.defects.append(errors.MultipartInvariantViolationDefect())
         return root
 
     def _new_message(self):
@@ -293,8 +284,7 @@
                 # defined a boundary. That's a problem which we'll handle by
                 # reading everything until the EOF and marking the message as
                 # defective.
-                defect = errors.NoBoundaryInMultipartDefect()
-                self.policy.handle_defect(self._cur, defect)
+                self._cur.defects.append(errors.NoBoundaryInMultipartDefect())
                 lines = []
                 for line in self._input:
                     if line is NeedMoreData:
@@ -398,8 +388,7 @@
                 # that as a defect and store the captured text as the payload.
                 # Everything from here to the EOF is epilogue.
                 if capturing_preamble:
-                    defect = errors.StartBoundaryNotFoundDefect()
-                    self.policy.handle_defect(self._cur, defect)
+                    self._cur.defects.append(errors.StartBoundaryNotFoundDefect())
                     self._cur.set_payload(EMPTYSTRING.join(preamble))
                     epilogue = []
                     for line in self._input:
@@ -451,7 +440,7 @@
                     # is illegal, so let's note the defect, store the illegal
                     # line, and ignore it for purposes of headers.
                     defect = errors.FirstHeaderLineIsContinuationDefect(line)
-                    self.policy.handle_defect(self._cur, defect)
+                    self._cur.defects.append(defect)
                     continue
                 lastvalue.append(line)
                 continue
diff --git a/Lib/email/generator.py b/Lib/email/generator.py
--- a/Lib/email/generator.py
+++ b/Lib/email/generator.py
@@ -13,10 +13,8 @@
 import warnings
 
 from io import StringIO, BytesIO
-from email import policy
 from email.header import Header
 from email.message import _has_surrogates
-import email.charset as _charset
 
 UNDERSCORE = '_'
 NL = '\n' # XXX: no longer used by the code below.
@@ -35,8 +33,7 @@
     # Public interface
     #
 
-    def __init__(self, outfp, mangle_from_=True, maxheaderlen=78, *,
-                 policy=policy.default):
+    def __init__(self, outfp, mangle_from_=True, maxheaderlen=78):
         """Create the generator for message flattening.
 
         outfp is the output file-like object for writing the message to. It
@@ -52,22 +49,16 @@
         defined in the Header class. Set maxheaderlen to zero to disable
         header wrapping. The default is 78, as recommended (but not
         required) by RFC 2822.
-
-        The policy keyword specifies a policy object that controls a number of
-        aspects of the generator's operation. The default policy maintains
-        backward compatibility.
-
         """
         self._fp = outfp
         self._mangle_from_ = mangle_from_
         self._maxheaderlen = maxheaderlen
-        self.policy = policy
 
     def write(self, s):
         # Just delegate to the file object
         self._fp.write(s)
 
-    def flatten(self, msg, unixfrom=False, linesep=None):
+    def flatten(self, msg, unixfrom=False, linesep='\n'):
         r"""Print the message object tree rooted at msg to the output file
         specified when the Generator instance was created.
 
@@ -79,15 +70,17 @@
         Note that for subobjects, no From_ line is printed.
 
         linesep specifies the characters used to indicate a new line in
-        the output. The default value is determined by the policy.
+        the output. The default value is the most useful for typical
+        Python applications, but it can be set to \r\n to produce RFC-compliant
+        line separators when needed.
 
         """
         # We use the _XXX constants for operating on data that comes directly
         # from the msg, and _encoded_XXX constants for operating on data that
         # has already been converted (to bytes in the BytesGenerator) and
         # inserted into a temporary buffer.
-        self._NL = linesep if linesep is not None else self.policy.linesep
-        self._encoded_NL = self._encode(self._NL)
+        self._NL = linesep
+        self._encoded_NL = self._encode(linesep)
         self._EMPTY = ''
         self._encoded_EMTPY = self._encode('')
         if unixfrom:
@@ -343,10 +336,7 @@
 
     Functionally identical to the base Generator except that the output is
     bytes and not string. When surrogates were used in the input to encode
-    bytes, these are decoded back to bytes for output. If the policy has
-    must_be_7bit set true, then the message is transformed such that the
-    non-ASCII bytes are properly content transfer encoded, using the
-    charset unknown-8bit.
+    bytes, these are decoded back to bytes for output.
 
     The outfp object must accept bytes in its write method.
     """
@@ -369,22 +359,21 @@
         # strings with 8bit bytes.
         for h, v in msg._headers:
             self.write('%s: ' % h)
-            if isinstance(v, str):
-                if _has_surrogates(v):
-                    if not self.policy.must_be_7bit:
-                        # If we have raw 8bit data in a byte string, we have no idea
-                        # what the encoding is. There is no safe way to split this
-                        # string. If it's ascii-subset, then we could do a normal
-                        # ascii split, but if it's multibyte then we could break the
-                        # string. There's no way to know so the least harm seems to
-                        # be to not split the string and risk it being too long.
-                        self.write(v+NL)
-                        continue
-                    h = Header(v, charset=_charset.UNKNOWN8BIT, header_name=h)
-                else:
-                    h = Header(v, header_name=h)
-            self.write(h.encode(linesep=self._NL,
-                                maxlinelen=self._maxheaderlen)+self._NL)
+            if isinstance(v, Header):
+                self.write(v.encode(maxlinelen=self._maxheaderlen)+NL)
+            elif _has_surrogates(v):
+                # If we have raw 8bit data in a byte string, we have no idea
+                # what the encoding is. There is no safe way to split this
+                # string. If it's ascii-subset, then we could do a normal
+                # ascii split, but if it's multibyte then we could break the
+                # string. There's no way to know so the least harm seems to
+                # be to not split the string and risk it being too long.
+                self.write(v+NL)
+            else:
+                # Header's got lots of smarts and this string is safe...
+                header = Header(v, maxlinelen=self._maxheaderlen,
+                                header_name=h)
+                self.write(header.encode(linesep=self._NL)+self._NL)
         # A blank line always separates headers from body
         self.write(self._NL)
 
@@ -393,7 +382,7 @@
         # just write it back out.
         if msg._payload is None:
             return
-        if _has_surrogates(msg._payload) and not self.policy.must_be_7bit:
+        if _has_surrogates(msg._payload):
             self.write(msg._payload)
         else:
             super(BytesGenerator,self)._handle_text(msg)
diff --git a/Lib/email/parser.py b/Lib/email/parser.py
--- a/Lib/email/parser.py
+++ b/Lib/email/parser.py
@@ -11,12 +11,11 @@
 
 from email.feedparser import FeedParser
 from email.message import Message
-from email import policy
 
 
 
 class Parser:
-    def __init__(self, _class=Message, *, policy=policy.default):
+    def __init__(self, _class=Message):
         """Parser of RFC 2822 and MIME email messages.
 
         Creates an in-memory object tree representing the email message, which
@@ -31,14 +30,8 @@
         _class is the class to instantiate for new message objects when they
         must be created. This class must have a constructor that can take
         zero arguments. Default is Message.Message.
-
-        The policy keyword specifies a policy object that controls a number of
-        aspects of the parser's operation. The default policy maintains
-        backward compatibility.
-
         """
         self._class = _class
-        self.policy = policy
 
     def parse(self, fp, headersonly=False):
         """Create a message structure from the data in a file.
@@ -48,7 +41,7 @@
         parsing after reading the headers or not. The default is False,
         meaning it parses the entire contents of the file.
         """
-        feedparser = FeedParser(self._class, policy=self.policy)
+        feedparser = FeedParser(self._class)
         if headersonly:
             feedparser._set_headersonly()
         while True:
diff --git a/Lib/email/policy.py b/Lib/email/policy.py
deleted file mode 100644
--- a/Lib/email/policy.py
+++ /dev/null
@@ -1,197 +0,0 @@
-"""Policy framework for the email package.
-
-Allows fine grained feature control of how the package parses and emits data.
-"""
-
-__all__ = [
-    'Policy',
-    'default',
-    'strict',
-    'SMTP',
-    'HTTP',
-    ]
-
-
-class _PolicyBase:
-
-    """Policy Object basic framework.
-
-    This class is useless unless subclassed.
A subclass should define - class attributes with defaults for any values that are to be - managed by the Policy object. The constructor will then allow - non-default values to be set for these attributes at instance - creation time. The instance will be callable, taking these same - attributes keyword arguments, and returning a new instance - identical to the called instance except for those values changed - by the keyword arguments. Instances may be added, yielding new - instances with any non-default values from the right hand - operand overriding those in the left hand operand. That is, - - A + B == A() - - The repr of an instance can be used to reconstruct the object - if and only if the repr of the values can be used to reconstruct - those values. - - """ - - def __init__(self, **kw): - """Create new Policy, possibly overriding some defaults. - - See class docstring for a list of overridable attributes. - - """ - for name, value in kw.items(): - if hasattr(self, name): - super(_PolicyBase,self).__setattr__(name, value) - else: - raise TypeError( - "{!r} is an invalid keyword argument for {}".format( - name, self.__class__.__name__)) - - def __repr__(self): - args = [ "{}={!r}".format(name, value) - for name, value in self.__dict__.items() ] - return "{}({})".format(self.__class__.__name__, args if args else '') - - def __call__(self, **kw): - """Return a new instance with specified attributes changed. - - The new instance has the same attribute values as the called Policy, - except for the changes passed in as keyword arguments. 
- - """ - for attr, value in self.__dict__.items(): - if attr not in kw: - kw[attr] = value - return self.__class__(**kw) - - def __setattr__(self, name, value): - if hasattr(self, name): - msg = "{!r} object attribute {!r} is read-only" - else: - msg = "{!r} object has no attribute {!r}" - raise AttributeError(msg.format(self.__class__.__name__, name)) - - def __add__(self, other): - """Non-default values from right operand override those from left. - - The object returned is a new instance of the subclass. - - """ - return self(**other.__dict__) - - -class Policy(_PolicyBase): - - """Controls for how messages are interpreted and formatted. - - Most of the classes and many of the methods in the email package - accept Policy objects as parameters. A Policy object contains a set - of values and functions that control how input is interpreted and how - output is rendered. For example, the parameter 'raise_on_defect' - controls whether or not an RFC violation throws an error or not, - while 'max_line_length' controls the maximum length of output lines - when a Message is serialized. - - Any valid attribute may be overridden when a Policy is created by - passing it as a keyword argument to the constructor. Policy - objects are immutable, but a new Policy object can be created - with only certain values changed by calling the Policy instance - with keyword arguments. Policy objects can also be added, - producing a new Policy object in which the non-default attributes - set in the right hand operand overwrite those specified in the - left operand. - - Settable attributes: - - raise_on_defect -- If true, then defects should be raised - as errors. Default False. - - linesep -- string containing the value to use as - separation between output lines. Default '\n'. - - must_be_7bit -- output must contain only 7bit clean data. - Default False. - - header_indent -- string used to indent header continuation - lines. Default ' '. 
- - max_line_length -- maximum length of lines, excluding 'linesep', - during serialization. None means no line - wrapping is done. Default is 78. - - fallback_decode_charset -- charset to use if non-ASCII bytes with no - associated charset are encountered during - parsing. Default None, which means preserve - the bytes as is and if a transformation is - done, label them as unknown-8bit. - - default_encode_charset -- charset to use when encoding string body - parts when no specific charset is specified. - Defaults utf-8. - - cte_map -- Mapping from content transfer encoding names to - objects supporting the CTE API. The default - mapping provides Coders for q/quoted-printable and - b/base64. - - Methods that can be overridden in subclasses: - - register_defect(obj, defect) - defect is a Defect subclass. The default implementation appends defect - to the objs 'defects' attribute. - - """ - - raise_on_defect = False - linesep = '\n' - must_be_7bit = False - header_indent = ' ' - max_line_length = 78 - fallback_decode_charset = None - default_encode_charset = 'utf-8' - cte_map = { - # Punt on this for now. - #'q': email.cte.quoted_printable, - #'quoted-printable': email.cte.quoted_printable, - #'b': email.cte.base64, - #'base64': email.cte.base64, - } - - def handle_defect(self, obj, defect): - """Based on policy, either raise defect or all register_defect. - - handle_defect(obj, defect) - - defect should be a Defect subclass, but in any case must be an - Exception subclass. obj is the object on which the defect should be - registered if it is not raised. If the raise_on_defect is True, the - defect is raised as an error, otherwise the object and the defect are - passed to register_defect. - - This class is intended to be called by parsers that discover defects, - and will not be called from code using the library unless that code is - implementing an alternate parser. 
- - """ - if self.raise_on_defect: - raise defect - self.register_defect(obj, defect) - - def register_defect(self, obj, defect): - """Record 'defect' on 'obj'. - - Called by handle_defect if raise_on_defect is False. This method is - part of the Policy API so that Policy subclasses can implement custom - defect handling. The default implementation calls the append method - of the defects attribute of obj. - - """ - obj.defects.append(defect) - - -default = Policy() -strict = default(raise_on_defect=True) -SMTP = default(linesep='\r\n') -HTTP = default(linesep='\r\n', max_line_length=None) diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py --- a/Lib/test/test_email/test_email.py +++ b/Lib/test/test_email/test_email.py @@ -1556,12 +1556,7 @@ # Test some badly formatted messages -class TestNonConformantBase: - - def _msgobj(self, filename): - with openfile(filename) as fp: - return email.message_from_file(fp, policy=self.policy) - +class TestNonConformant(TestEmailBase): def test_parse_missing_minor_type(self): eq = self.assertEqual msg = self._msgobj('msg_14.txt') @@ -1575,18 +1570,17 @@ # XXX We can probably eventually do better inner = msg.get_payload(0) unless(hasattr(inner, 'defects')) - self.assertEqual(len(self.get_defects(inner)), 1) - unless(isinstance(self.get_defects(inner)[0], + self.assertEqual(len(inner.defects), 1) + unless(isinstance(inner.defects[0], errors.StartBoundaryNotFoundDefect)) def test_multipart_no_boundary(self): unless = self.assertTrue msg = self._msgobj('msg_25.txt') unless(isinstance(msg.get_payload(), str)) - self.assertEqual(len(self.get_defects(msg)), 2) - unless(isinstance(self.get_defects(msg)[0], - errors.NoBoundaryInMultipartDefect)) - unless(isinstance(self.get_defects(msg)[1], + self.assertEqual(len(msg.defects), 2) + unless(isinstance(msg.defects[0], errors.NoBoundaryInMultipartDefect)) + unless(isinstance(msg.defects[1], errors.MultipartInvariantViolationDefect)) def 
test_invalid_content_type(self): @@ -1642,10 +1636,9 @@ unless = self.assertTrue msg = self._msgobj('msg_41.txt') unless(hasattr(msg, 'defects')) - self.assertEqual(len(self.get_defects(msg)), 2) - unless(isinstance(self.get_defects(msg)[0], - errors.NoBoundaryInMultipartDefect)) - unless(isinstance(self.get_defects(msg)[1], + self.assertEqual(len(msg.defects), 2) + unless(isinstance(msg.defects[0], errors.NoBoundaryInMultipartDefect)) + unless(isinstance(msg.defects[1], errors.MultipartInvariantViolationDefect)) def test_missing_start_boundary(self): @@ -1659,71 +1652,21 @@ # # [*] This message is missing its start boundary bad = outer.get_payload(1).get_payload(0) - self.assertEqual(len(self.get_defects(bad)), 1) - self.assertTrue(isinstance(self.get_defects(bad)[0], + self.assertEqual(len(bad.defects), 1) + self.assertTrue(isinstance(bad.defects[0], errors.StartBoundaryNotFoundDefect)) def test_first_line_is_continuation_header(self): eq = self.assertEqual m = ' Line 1\nLine 2\nLine 3' - msg = email.message_from_string(m, policy=self.policy) + msg = email.message_from_string(m) eq(msg.keys(), []) eq(msg.get_payload(), 'Line 2\nLine 3') - eq(len(self.get_defects(msg)), 1) - self.assertTrue(isinstance(self.get_defects(msg)[0], + eq(len(msg.defects), 1) + self.assertTrue(isinstance(msg.defects[0], errors.FirstHeaderLineIsContinuationDefect)) - eq(self.get_defects(msg)[0].line, ' Line 1\n') - - -class TestNonConformant(TestNonConformantBase, TestEmailBase): - - policy=email.policy.default - - def get_defects(self, obj): - return obj.defects - - -class TestNonConformantCapture(TestNonConformantBase, TestEmailBase): - - class CapturePolicy(email.policy.Policy): - captured = None - def register_defect(self, obj, defect): - self.captured.append(defect) - - def setUp(self): - self.policy = self.CapturePolicy(captured=list()) - - def get_defects(self, obj): - return self.policy.captured - - -class TestRaisingDefects(TestEmailBase): - - def _msgobj(self, filename): - with 
openfile(filename) as fp: - return email.message_from_file(fp, policy=email.policy.strict) - - def test_same_boundary_inner_outer(self): - with self.assertRaises(errors.StartBoundaryNotFoundDefect): - self._msgobj('msg_15.txt') - - def test_multipart_no_boundary(self): - with self.assertRaises(errors.NoBoundaryInMultipartDefect): - self._msgobj('msg_25.txt') - - def test_lying_multipart(self): - with self.assertRaises(errors.NoBoundaryInMultipartDefect): - self._msgobj('msg_41.txt') - - - def test_missing_start_boundary(self): - with self.assertRaises(errors.StartBoundaryNotFoundDefect): - self._msgobj('msg_42.txt') - - def test_first_line_is_continuation_header(self): - m = ' Line 1\nLine 2\nLine 3' - with self.assertRaises(errors.FirstHeaderLineIsContinuationDefect): - msg = email.message_from_string(m, policy=email.policy.strict) + eq(msg.defects[0].line, ' Line 1\n') + # Test RFC 2047 header encoding and decoding @@ -2781,25 +2724,6 @@ g.flatten(msg, linesep='\r\n') self.assertEqual(s.getvalue(), text) - def test_crlf_control_via_policy(self): - with openfile('msg_26.txt', newline='\n') as fp: - text = fp.read() - msg = email.message_from_string(text) - s = StringIO() - g = email.generator.Generator(s, policy=email.policy.SMTP) - g.flatten(msg) - self.assertEqual(s.getvalue(), text) - - def test_flatten_linesep_overrides_policy(self): - # msg_27 is lf separated - with openfile('msg_27.txt', newline='\n') as fp: - text = fp.read() - msg = email.message_from_string(text) - s = StringIO() - g = email.generator.Generator(s, policy=email.policy.SMTP) - g.flatten(msg, linesep='\n') - self.assertEqual(s.getvalue(), text) - maxDiff = None def test_multipart_digest_with_extra_mime_headers(self): @@ -3219,45 +3143,6 @@ g = email.generator.BytesGenerator(s) g.flatten(msg, linesep='\r\n') self.assertEqual(s.getvalue(), text) - - def test_crlf_control_via_policy(self): - # msg_26 is crlf terminated - with openfile('msg_26.txt', 'rb') as fp: - text = fp.read() - msg = 
email.message_from_bytes(text) - s = BytesIO() - g = email.generator.BytesGenerator(s, policy=email.policy.SMTP) - g.flatten(msg) - self.assertEqual(s.getvalue(), text) - - def test_flatten_linesep_overrides_policy(self): - # msg_27 is lf separated - with openfile('msg_27.txt', 'rb') as fp: - text = fp.read() - msg = email.message_from_bytes(text) - s = BytesIO() - g = email.generator.BytesGenerator(s, policy=email.policy.SMTP) - g.flatten(msg, linesep='\n') - self.assertEqual(s.getvalue(), text) - - def test_must_be_7bit_handles_unknown_8bit(self): - msg = email.message_from_bytes(self.non_latin_bin_msg) - out = BytesIO() - g = email.generator.BytesGenerator(out, - policy=email.policy.default(must_be_7bit=True)) - g.flatten(msg) - self.assertEqual(out.getvalue(), - self.non_latin_bin_msg_as7bit_wrapped.encode('ascii')) - - def test_must_be_7bit_transforms_8bit_cte(self): - msg = email.message_from_bytes(self.latin_bin_msg) - out = BytesIO() - g = email.generator.BytesGenerator(out, - policy=email.policy.default(must_be_7bit=True)) - g.flatten(msg) - self.assertEqual(out.getvalue(), - self.latin_bin_msg_as7bit.encode('ascii')) - maxDiff = None diff --git a/Lib/test/test_email/test_policy.py b/Lib/test/test_email/test_policy.py deleted file mode 100644 --- a/Lib/test/test_email/test_policy.py +++ /dev/null @@ -1,152 +0,0 @@ -import types -import unittest -import email.policy - -class PolicyAPITests(unittest.TestCase): - - longMessage = True - - # These default values are the ones set on email.policy.default. - # If any of these defaults change, the docs must be updated. - policy_defaults = { - 'max_line_length': 78, - 'linesep': '\n', - 'must_be_7bit': False, - 'header_indent': ' ', - 'raise_on_defect': False, - 'cte_map': {}, # XXX: fix this - 'default_encode_charset': 'utf-8', # XXX: should this be a Charset? 
- 'fallback_decode_charset': None, - } - - # For each policy under test, we give here the values of the attributes - # that are different from the defaults for that policy. - policies = { - email.policy.Policy(): {}, - email.policy.default: {}, - email.policy.SMTP: {'linesep': '\r\n'}, - email.policy.HTTP: {'linesep': '\r\n', 'max_line_length': None}, - email.policy.strict: {'raise_on_defect': True}, - } - - def test_defaults(self): - for policy, changed_defaults in self.policies.items(): - expected = self.policy_defaults.copy() - expected.update(changed_defaults) - for attr, value in expected.items(): - self.assertEqual(getattr(policy, attr), value, - ("change {} docs/docstrins if defaults have " - "changed").format(policy)) - - def test_all_attributes_covered(self): - for attr in dir(email.policy.default): - if (attr.startswith('_') or - isinstance(getattr(email.policy.Policy, attr), - types.FunctionType)): - continue - else: - self.assertIn(attr, self.policy_defaults, - "{} is not fully tested".format(attr)) - - def test_policy_is_immutable(self): - for policy in self.policies.keys(): - for attr in self.policy_defaults: - with self.assertRaisesRegex(AttributeError, attr+".*read-only"): - setattr(policy, attr, None) - with self.assertRaisesRegex(AttributeError, 'no attribute.*foo'): - policy.foo = None - - def test_set_policy_attrs_when_calledl(self): - testattrdict = { attr: None for attr in self.policy_defaults } - for policyclass in self.policies: - policy = policyclass(**testattrdict) - for attr in self.policy_defaults: - self.assertIsNone(getattr(policy, attr)) - - def test_reject_non_policy_keyword_when_called(self): - for policyclass in self.policies: - with self.assertRaises(TypeError): - policyclass(this_keyword_should_not_be_valid=None) - with self.assertRaises(TypeError): - policyclass(newtline=None) - - def test_policy_addition(self): - expected = self.policy_defaults.copy() - p1 = email.policy.default(max_line_length=100) - p2 = 
email.policy.default(max_line_length=50) - added = p1 + p2 - expected.update(max_line_length=50) - for attr, value in expected.items(): - self.assertEqual(getattr(added, attr), value) - added = p2 + p1 - expected.update(max_line_length=100) - for attr, value in expected.items(): - self.assertEqual(getattr(added, attr), value) - added = added + email.policy.default - for attr, value in expected.items(): - self.assertEqual(getattr(added, attr), value) - - def test_register_defect(self): - class Dummy: - def __init__(self): - self.defects = [] - obj = Dummy() - defect = object() - policy = email.policy.Policy() - policy.register_defect(obj, defect) - self.assertEqual(obj.defects, [defect]) - defect2 = object() - policy.register_defect(obj, defect2) - self.assertEqual(obj.defects, [defect, defect2]) - - class MyObj: - def __init__(self): - self.defects = [] - - class MyDefect(Exception): - pass - - def test_handle_defect_raises_on_strict(self): - foo = self.MyObj() - defect = self.MyDefect("the telly is broken") - with self.assertRaisesRegex(self.MyDefect, "the telly is broken"): - email.policy.strict.handle_defect(foo, defect) - - def test_handle_defect_registers_defect(self): - foo = self.MyObj() - defect1 = self.MyDefect("one") - email.policy.default.handle_defect(foo, defect1) - self.assertEqual(foo.defects, [defect1]) - defect2 = self.MyDefect("two") - email.policy.default.handle_defect(foo, defect2) - self.assertEqual(foo.defects, [defect1, defect2]) - - class MyPolicy(email.policy.Policy): - defects = [] - def register_defect(self, obj, defect): - self.defects.append(defect) - - def test_overridden_register_defect_still_raises(self): - foo = self.MyObj() - defect = self.MyDefect("the telly is broken") - with self.assertRaisesRegex(self.MyDefect, "the telly is broken"): - self.MyPolicy(raise_on_defect=True).handle_defect(foo, defect) - - def test_overriden_register_defect_works(self): - foo = self.MyObj() - defect1 = self.MyDefect("one") - my_policy = 
self.MyPolicy() - my_policy.handle_defect(foo, defect1) - self.assertEqual(my_policy.defects, [defect1]) - self.assertEqual(foo.defects, []) - defect2 = self.MyDefect("two") - my_policy.handle_defect(foo, defect2) - self.assertEqual(my_policy.defects, [defect1, defect2]) - self.assertEqual(foo.defects, []) - - # XXX: Need subclassing tests. - # For adding subclassed objects, make sure the usual rules apply (subclass - # wins), but that the order still works (right overrides left). - -if __name__ == '__main__': - unittest.main()
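Note on the removed Policy semantics: the docstring and tests deleted above describe policies as immutable objects that are cloned by calling them, combined with `+` (right operand wins), and used to either raise or record defects. The following is a minimal, self-contained sketch of that behaviour, not the stdlib implementation. `SketchPolicy` and `Part` are hypothetical names, and the `__init__` body is an assumption, since the original constructor is not shown in this excerpt. The `__call__`, `__setattr__`, `__add__`, `handle_defect` and `register_defect` bodies follow the removed lines.

```python
class SketchPolicy:
    """Hypothetical stand-in for the removed Policy class (not the stdlib API)."""

    # Class-level defaults, mirroring a few attributes from the removed docstring.
    raise_on_defect = False
    linesep = '\n'
    max_line_length = 78

    def __init__(self, **kw):
        # Assumption: only names that already exist on the class are accepted.
        for name, value in kw.items():
            if not hasattr(type(self), name):
                raise TypeError("{!r} is an invalid keyword argument for {}"
                                .format(name, type(self).__name__))
            # Bypass the read-only __setattr__ while constructing.
            object.__setattr__(self, name, value)

    def __call__(self, **kw):
        # Clone: keep this instance's non-default values unless overridden.
        for attr, value in self.__dict__.items():
            if attr not in kw:
                kw[attr] = value
        return self.__class__(**kw)

    def __setattr__(self, name, value):
        if hasattr(self, name):
            msg = "{!r} object attribute {!r} is read-only"
        else:
            msg = "{!r} object has no attribute {!r}"
        raise AttributeError(msg.format(self.__class__.__name__, name))

    def __add__(self, other):
        # Non-default values from the right operand override the left.
        return self(**other.__dict__)

    def handle_defect(self, obj, defect):
        # Raise the defect if the policy says so, otherwise record it.
        if self.raise_on_defect:
            raise defect
        self.register_defect(obj, defect)

    def register_defect(self, obj, defect):
        obj.defects.append(defect)


class Part:
    """Hypothetical message-like object that only carries a defects list."""

    def __init__(self):
        self.defects = []


default = SketchPolicy()
strict = default(raise_on_defect=True)
smtp = default(linesep='\r\n')
combined = strict + smtp

p1 = default(max_line_length=100)
p2 = default(max_line_length=50)

part = Part()
defect = ValueError("missing boundary")
default.handle_defect(part, defect)
```

Because only non-default values live in the instance `__dict__`, adding `default` on the right changes nothing, which matches the behaviour checked by the removed `test_policy_addition`.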