Author beazley
Recipients beazley
Date 2008-12-29.17:21:38
Message-id <1230571301.08.0.30132908931.issue4768@psf.upfronthosting.co.za>
In-reply-to
Content
The email.generator.Generator class does not work correctly with message 
objects created from binary data (MIMEImage, MIMEAudio, MIMEApplication, 
etc.).  For example:

>>> from email.mime.image import MIMEImage
>>> data = open("IMG.jpg","rb").read()
>>> m = MIMEImage(data,'jpeg')
>>> s = m.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/lib/python3.0/email/message.py", line 136, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/tmp/lib/python3.0/email/generator.py", line 76, in flatten
    self._write(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 101, in _write
    self._dispatch(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 127, in _dispatch
    meth(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 155, in _handle_text
    raise TypeError('string payload expected: %s' % type(payload))
TypeError: string payload expected: <class 'bytes'>
>>> 

The source of the problem is rather complicated, but here is the gist of 
it.

1.  Classes such as MIMEAudio and MIMEImage accept raw binary data as 
input.  This data is going to be in the form of bytes.

2.  These classes immediately encode the data using a base64 encoder. 
This encoder uses the library function base64.b64encode().

3. base64.b64encode() takes a byte string as input and returns a byte 
string as output.  So, even after encoding, the payload of the message 
is of type 'bytes'.

4. When messages are generated, the method Generator._dispatch() is 
used.   It looks at the MIME main type and subtype and tries to dispatch 
message processing to a handler method of the form 
'_handle_type_subtype'.    If it can't find such a handler, it defaults 
to a method _writeBody().  For image and audio types, this is what 
happens. 

5. _writeBody() is an alias for _handle_text().

6. _handle_text() crashes because it's not expecting a payload of type 
'bytes'.
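
The chain of events in steps 1-6 can be reproduced without an image file. 
The following is only a rough sketch: SketchGenerator is a hypothetical toy 
class that models the dispatch as described above, not the actual code in 
email.generator, and the raw bytes are a made-up stand-in for image data.

```python
import base64

# Stand-in for the binary data a MIMEImage would receive (not a real JPEG).
raw = b"\xff\xd8\xff\xe0 fake image bytes"

# Step 3: in Python 3, b64encode() takes bytes and returns bytes.
encoded = base64.b64encode(raw)
print(type(encoded))


class SketchGenerator:
    """Toy model of the dispatch described above (hypothetical)."""

    def _dispatch(self, maintype, subtype, payload):
        # Step 4: look for a handler named _handle_<type>_<subtype>,
        # and fall back to _writeBody() if none exists.
        name = "_handle_%s_%s" % (maintype, subtype)
        meth = getattr(self, name, None)
        if meth is None:
            meth = self._writeBody
        return meth(payload)

    def _handle_text(self, payload):
        # Step 6: the text handler only accepts str payloads.
        if not isinstance(payload, str):
            raise TypeError('string payload expected: %s' % type(payload))
        return payload

    # Step 5: _writeBody is an alias for _handle_text.
    _writeBody = _handle_text


try:
    SketchGenerator()._dispatch("image", "jpeg", encoded)
except TypeError as exc:
    print("TypeError:", exc)
```

There is no _handle_image_jpeg method in the sketch, so the bytes payload 
ends up in _handle_text() and the TypeError is raised, mirroring the 
traceback above.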

Suggested fix:

I think the library function base64.b64encode() should return a string, 
not bytes.  The whole point of base64 encoding is to take binary data 
and encode it into characters safe for inclusion in text strings. 
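
Until something like that changes, the same effect can be approximated in 
user code. This is just a sketch: b64encode_str is a hypothetical helper 
name, not something that exists in the base64 module. It relies on the fact 
that base64 output contains only ASCII characters, so decoding it as ASCII 
is safe.

```python
import base64


def b64encode_str(data):
    """Hypothetical helper: base64-encode bytes and return a str."""
    # base64 output is pure ASCII, so this decode cannot fail.
    return base64.b64encode(data).decode("ascii")


sample = b"\x00\x01\x02 binary data"
text = b64encode_str(sample)
print(type(text))

# Round trip back to the original bytes.
assert base64.b64decode(text.encode("ascii")) == sample
```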

Other fixes:

Modify the Generator class in email.generator to properly detect bytes 
and use a different _handle function for it.  For instance, maybe add a 
_handle_binary() method.
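
A very rough sketch of what that second option might look like is below. 
BinaryAwareGenerator is a hypothetical stand-in, not the real Generator 
class, and it assumes the bytes payload has already been base64-encoded 
(and is therefore ASCII-only) by the time it reaches the generator.

```python
import base64


class BinaryAwareGenerator:
    """Hypothetical generator that routes bytes payloads separately."""

    def __init__(self):
        self.parts = []  # collected output, standing in for the output file

    def _dispatch(self, payload):
        # Detect bytes up front and send them to a dedicated handler.
        if isinstance(payload, bytes):
            self._handle_binary(payload)
        else:
            self._handle_text(payload)

    def _handle_text(self, payload):
        if not isinstance(payload, str):
            raise TypeError('string payload expected: %s' % type(payload))
        self.parts.append(payload)

    def _handle_binary(self, payload):
        # Assumption: payload is base64-encoded already, so it is ASCII.
        self.parts.append(payload.decode("ascii"))


g = BinaryAwareGenerator()
g._dispatch(base64.b64encode(b"\xff\xd8 fake image bytes"))
g._dispatch("plain text body")
print(g.parts)
```

A real fix would also have to decide what to do with bytes payloads that 
are not ASCII (for example, an 8bit or binary transfer encoding); the sketch 
ignores that case.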
History
Date                 User     Action  Args
2008-12-29 17:21:41  beazley  set     recipients: + beazley
2008-12-29 17:21:41  beazley  set     messageid: <1230571301.08.0.30132908931.issue4768@psf.upfronthosting.co.za>
2008-12-29 17:21:40  beazley  link    issue4768 messages
2008-12-29 17:21:38  beazley  create