email.generator.Generator object bytes/str crash - b64encode() bug? #49018

Closed
beazley mannequin opened this issue Dec 29, 2008 · 10 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error


@beazley
Mannequin

beazley mannequin commented Dec 29, 2008

BPO 4768
Nosy @warsaw, @vstinner, @merwok, @bitdancer, @anacrolix
Files
  • email_base64_bytes.patch
  • python-email-encoders-base64-str.patch: Patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/bitdancer'
    closed_at = <Date 2010-06-04.16:17:20.301>
    created_at = <Date 2008-12-29.17:21:40.153>
    labels = ['type-bug', 'library']
    title = 'email.generator.Generator object bytes/str crash - b64encode() bug?'
    updated_at = <Date 2013-06-28.19:05:06.521>
    user = 'https://bugs.python.org/beazley'

    bugs.python.org fields:

    activity = <Date 2013-06-28.19:05:06.521>
    actor = 'r.david.murray'
    assignee = 'r.david.murray'
    closed = True
    closed_date = <Date 2010-06-04.16:17:20.301>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2008-12-29.17:21:40.153>
    creator = 'beazley'
    dependencies = []
    files = ['12525', '17551']
    hgrepos = []
    issue_num = 4768
    keywords = ['patch']
    message_count = 10.0
    messages = ['78464', '78744', '103596', '105381', '106546', '107063', '107065', '107073', '107075', '192010']
    nosy_count = 11.0
    nosy_names = ['barry', 'beazley', 'vstinner', 'eric.araujo', 'forest_atq', 'r.david.murray', 'brotchie', 'stac', 'l0nwlf', 'anacrolix', 'garazi111']
    pr_nums = []
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue4768'
    versions = ['Python 3.1', 'Python 3.2']

    @beazley
    Mannequin Author

    beazley mannequin commented Dec 29, 2008

    The email.generator.Generator class does not work correctly with message
    objects created with binary data (MIMEImage, MIMEAudio, MIMEApplication,
    etc.). For example:

    >>> from email.mime.image import MIMEImage
    >>> data = open("IMG.jpg","rb").read()
    >>> m = MIMEImage(data,'jpeg')
    >>> s = m.as_string()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/tmp/lib/python3.0/email/message.py", line 136, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/tmp/lib/python3.0/email/generator.py", line 76, in flatten
        self._write(msg)
      File "/tmp/lib/python3.0/email/generator.py", line 101, in _write
        self._dispatch(msg)
      File "/tmp/lib/python3.0/email/generator.py", line 127, in _dispatch
        meth(msg)
      File "/tmp/lib/python3.0/email/generator.py", line 155, in _handle_text
        raise TypeError('string payload expected: %s' % type(payload))
    TypeError: string payload expected: <class 'bytes'>
    >>> 

    The source of the problem is rather complicated, but here is the gist of
    it.

    1. Classes such as MIMEAudio and MIMEImage accept raw binary data as
      input. This data is going to be in the form of bytes.

    2. These classes immediately encode the data using a base64 encoder.
      This encoder uses the library function base64.b64encode().

    3. base64.b64encode() takes a byte string as input and returns a byte
      string as output. So, even after encoding, the payload of the message
      is of type 'bytes'.

    4. When messages are generated, the method Generator._dispatch() is
      used. It looks at the MIME main type and subtype and tries to dispatch
      message processing to a handler method of the form
      '_handle_type_subtype'. If it can't find such a handler, it defaults
      to a method _writeBody(). For image and audio types, this is what
      happens.

    5. _writeBody() is an alias for _handle_text().

    6. _handle_text() crashes because it's not expecting a payload of type
      'bytes'.
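
A minimal sketch of steps 2, 3 and 6 in isolation, assuming a Python 3 interpreter. The `raw` bytes and the `headers` string are made-up stand-ins, not values taken from the email package:

```python
import base64

# Made-up stand-in for image data read with open(..., "rb").read().
raw = b"\x89PNG\r\n\x1a\n"

# base64.b64encode() takes bytes and hands back bytes, not str.
encoded = base64.b64encode(raw)
print(type(encoded))

# Hypothetical header text; the real headers are built by the email package.
headers = "Content-Transfer-Encoding: base64\n\n"
try:
    headers + encoded  # mixing str and bytes is rejected in Python 3
    concat_failed = False
except TypeError:
    concat_failed = True

# Base64 output is pure ASCII, so decoding it gives a str
# that can be joined with the header text.
text = encoded.decode("ascii")
message_text = headers + text
print(message_text)
```

The explicit `decode("ascii")` step is the conversion that is missing between the encoder and the text-oriented generator.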

    Suggested fix:

    I think the library function base64.b64encode() should return a string,
    not bytes. The whole point of base64 encoding is to take binary data
    and encode it into characters safe for inclusion in text strings.

    Other fixes:

    Modify the Generator class in email.generator to properly detect bytes
    and use a different _handle function for it. For instance, maybe add a
    _handle_binary() method.
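
A rough sketch of what the second option could look like. `ToyGenerator` is a hypothetical toy class that only imitates the dispatch behaviour described above; it is not the real email.generator.Generator, and `_handle_binary()` is the method proposed in this report, not an existing API:

```python
import base64


class ToyGenerator:
    """Hypothetical stand-in for the dispatch logic described in this report."""

    def __init__(self):
        self.parts = []

    def dispatch(self, content_type, payload):
        main, sub = content_type.split("/")
        # Look for a handler named _handle_<type>_<subtype>.
        name = "_handle_%s_%s" % (main, sub.replace("-", "_"))
        handler = getattr(self, name, None)
        if handler is None:
            # Proposed change: route bytes payloads to a binary handler
            # instead of always falling back to the text handler.
            if isinstance(payload, bytes):
                handler = self._handle_binary
            else:
                handler = self._handle_text
        handler(payload)

    def _handle_text(self, payload):
        if not isinstance(payload, str):
            raise TypeError("string payload expected: %s" % type(payload))
        self.parts.append(payload)

    def _handle_binary(self, payload):
        # Assumes the payload is already base64-encoded, hence ASCII-only.
        self.parts.append(payload.decode("ascii"))


payload = base64.b64encode(b"\xff\xd8\xff")  # made-up binary data
gen = ToyGenerator()
gen.dispatch("image/jpeg", payload)
print(gen.parts)
```

The trade-off is that this treats the symptom in the generator, while the payload stored on the message is still bytes for any other code that reads it.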

    @beazley beazley mannequin added type-crash A hard crash of the interpreter, possibly with a core dump stdlib Python modules in the Lib dir labels Dec 29, 2008
    @vstinner
    Member

    vstinner commented Jan 2, 2009

    > I think the library function base64.b64encode() should return
    > a string, not bytes.

    Yes, in the email module, the payload is a Unicode string, not a
    bytes string. We have to be able to concatenate headers
    (eg. "Content-Type: image/fish\nMIME-Version:
    1.0\nContent-Transfer-Encoding: base64\n") and encoded data
    (eg. "R0lGO").

    Attached patch implements this fix: encode_base64() returns str (and
    not bytes). The patch fixes the unit tests and adds a new regression
    test for MIMEImage.as_string().
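
A regression check along those lines might look like the sketch below. It is not the test from the attached patch; the `data` bytes are an arbitrary stand-in for a real image, and the checks assume an interpreter that already contains the fix:

```python
from email.mime.image import MIMEImage

# Arbitrary binary payload, not a real JPEG.
data = bytes(range(256))

# The subtype is passed explicitly, so no image type detection is needed.
m = MIMEImage(data, "jpeg")

# This call is the one that raised TypeError in the report.
s = m.as_string()

# After the fix the stored payload is text, and it still decodes
# back to the original bytes.
assert isinstance(m.get_payload(), str)
assert m["Content-Transfer-Encoding"] == "base64"
assert m.get_payload(decode=True) == data
```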

    @stac
    Mannequin

    stac mannequin commented Apr 19, 2010

    Hello,

    This patch has never been committed. I tested today with the 3.1 branch (and checked the library code). Is there a better way to attach images to an email?

    Thanks in advance for your help,

    Regards,
    Stac

    @bitdancer bitdancer assigned bitdancer and unassigned warsaw Apr 23, 2010
    @bitdancer bitdancer added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Apr 23, 2010
    @garazi111
    Mannequin

    garazi111 mannequin commented May 9, 2010

    Hi,

    I think the bug is also present in the function encode_quopri, which should look like this:

    def encode_quopri(msg):
        """Encode the message's payload in quoted-printable.

        Also, add an appropriate Content-Transfer-Encoding header.
        """
        orig = msg.get_payload()
        encdata = _qencode(orig)
        data = str(encdata, "ASCII")
        msg.set_payload(data)
        msg['Content-Transfer-Encoding'] = 'quoted-printable'
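
The conversion this suggestion relies on can be tried on its own with the standard quopri module. This is only a sketch of the bytes-to-str step; it uses quopri.encodestring() directly instead of the private _qencode helper, and `raw` is a made-up input:

```python
import quopri

# Made-up Latin-1 encoded input containing a non-ASCII byte and an "=" sign.
raw = b"caf\xe9 = ok"

# Like the base64 encoder, quopri.encodestring() returns bytes.
encoded = quopri.encodestring(raw)
print(type(encoded))

# Quoted-printable output is ASCII-only, so it converts cleanly to str.
text = str(encoded, "ascii")
print(text)
```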
    

    @vstinner
    Member

    I wrote a patch for base64.b64encode() to accept str (str is encoded to utf-8): patch attached to bpo-4768. It should fix this issue, but we can add the tests of email_base64_bytes.patch.

    @forestatq
    Mannequin

    forestatq mannequin commented Jun 4, 2010

    Attaching patch from reported duplicate bpo-8896.

    @forestatq
    Mannequin

    forestatq mannequin commented Jun 4, 2010

    Note that my patch is roughly the same as the original posted by haypo.

    @bitdancer
    Member

    Yes, but yours was better formatted, so I used it :) Thanks for the patch. Applied in r81685 to py3k, and r81686.

    @bitdancer
    Member

    @garazi111: if you have an example where quopri fails, please open a new issue for it. I suspect you are right that there is a problem there.

    @bitdancer
    Member

    For the record, encode_quopri was fixed in bpo-14360.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022