
Author r.david.murray
Recipients aikinci, r.david.murray
Date 2012-03-13.19:55:32
Message-id <1331668534.02.0.642658515735.issue14291@psf.upfronthosting.co.za>
Content
In Python2, this works:

    >>> from email.mime.text import MIMEText
    >>> m = MIMEText('abc')
    >>> str(m)
    'From nobody Tue Mar 13 15:44:59 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
    >>> m['Subject'] = u'É test'
    >>> str(m)
    'From nobody Tue Mar 13 15:48:11 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: =?utf-8?q?=C3=89_test?=\n\nabc'

That is, unicode strings automatically get turned into encoded words.
In Python3 this no longer works:

    >>> from email.mime.text import MIMEText
    >>> m = MIMEText('abc')
    >>> str(m)
    'Content-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
    >>> m['Subject'] = u'É test'
    >>> str(m)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/rdmurray/python/p33/Lib/email/message.py", line 154, in __str__
        return self.as_string()
      File "/home/rdmurray/python/p33/Lib/email/message.py", line 168, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/home/rdmurray/python/p33/Lib/email/generator.py", line 99, in flatten
        self._write(msg)
      File "/home/rdmurray/python/p33/Lib/email/generator.py", line 152, in _write
        self._write_headers(msg)
      File "/home/rdmurray/python/p33/Lib/email/generator.py", line 186, in _write_headers
        header_name=h)
      File "/home/rdmurray/python/p33/Lib/email/header.py", line 205, in __init__
        self.append(s, charset, errors)
      File "/home/rdmurray/python/p33/Lib/email/header.py", line 286, in append
        s.encode(output_charset, errors)
    UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 0: ordinal not in range(128)

Presumably the problem is that the Python2 code tests for 'string' and, if
the value isn't a string, handles it by CTE encoding it (that is, by
producing an RFC 2047 encoded word, as in the Subject header above).  In
Python3 everything is a string, so that test no longer singles out
non-ASCII values.  Probably what should happen is that the encoding error
should be caught and the CTE encoding done at that point, following the
model of how Python2 handled unicode strings.
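
Until something like that is done inside the email package, the same idea can be approximated from calling code. The following is only a sketch of the "catch the encoding error, then encode" approach, not the eventual fix: the helper name `set_header` and the choice of utf-8 as the fallback charset are assumptions made for illustration. It relies only on `email.header.Header`, which already knows how to emit encoded words.

```python
from email.header import Header, decode_header, make_header
from email.mime.text import MIMEText


def set_header(msg, name, value, fallback_charset='utf-8'):
    """Set a header, falling back to an encoded word for non-ASCII text.

    Hypothetical helper: the name and the utf-8 default are assumptions.
    """
    try:
        # Plain ASCII values can be stored as ordinary strings.
        value.encode('ascii')
    except UnicodeEncodeError:
        # Non-ASCII: wrap in a Header so flattening emits an encoded word.
        value = Header(value, fallback_charset)
    msg[name] = value


m = MIMEText('abc')
set_header(m, 'Subject', 'É test')
flattened = m.as_string()

# Decode the stored header again to check that the text round-trips.
round_tripped = str(make_header(decode_header(str(m['Subject']))))
```

With this, `flattened` should contain a `Subject:` line holding an encoded word rather than raising `UnicodeEncodeError`, and `round_tripped` should equal the original text.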
History
Date                 User            Action  Args
2012-03-13 19:55:34  r.david.murray  set     recipients: + r.david.murray, aikinci
2012-03-13 19:55:34  r.david.murray  set     messageid: <1331668534.02.0.642658515735.issue14291@psf.upfronthosting.co.za>
2012-03-13 19:55:33  r.david.murray  link    issue14291 messages
2012-03-13 19:55:32  r.david.murray  create