Message 78466 - Python tracker


Author	beazley
Recipients	beazley
Date	2008-12-29.17:35:51
Message-id	<1230572153.22.0.954048343235.issue4769@psf.upfronthosting.co.za>
In-reply-to

Content

The whole point of base64 encoding is to safely encode binary data into 
text characters.  Thus, the base64.b64decode() function should equally 
accept text strings or binary strings as input. For example, there is a 
reasonable expectation that something like this should work:

>>> import base64
>>> x = 'SGVsbG8='
>>> base64.b64decode(x)
b'Hello'

In Python 3, however, you get this exception:

>>> base64.b64decode(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/lib/python3.0/base64.py", line 80, in b64decode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str
>>> 

I realize that there are encoding issues with Unicode strings, but 
base64 encodes everything into a small subset of the ASCII characters 
(letters, digits, '+', '/', and '=' for padding).  If the 
input to b64decode is a str, just do an encode('ascii') operation on it 
and proceed.  If that fails, it wasn't valid Base64 to begin with.
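The proposed behavior can be sketched roughly as a wrapper around the existing function. This is only an illustration, not a patch to base64.py: the name b64decode_any and the choice to raise ValueError with this particular message are assumptions made for the example.

```python
import base64

def b64decode_any(s):
    """Hypothetical helper: decode Base64 given as str or bytes.

    The result is always bytes, whatever the input type.
    """
    if isinstance(s, str):
        try:
            # Valid Base64 text is pure ASCII, so this cannot lose data.
            s = s.encode('ascii')
        except UnicodeEncodeError:
            # Non-ASCII characters mean the input was never valid Base64.
            raise ValueError("Base64 input must contain only ASCII characters")
    return base64.b64decode(s)

print(b64decode_any('SGVsbG8='))    # text input
print(b64decode_any(b'SGVsbG8='))   # bytes input
```

Both calls return the same bytes object, so callers never have to care which type they were handed.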

I can't think of any real negative impact to making this change as long 
as the result is still always bytes.   The main benefit is just 
simplifying the decoding process for end-users.

See issue 4768.

History
Date	User	Action	Args
2008-12-29 17:35:53	beazley	set	recipients: + beazley
2008-12-29 17:35:53	beazley	set	messageid: <1230572153.22.0.954048343235.issue4769@psf.upfronthosting.co.za>
2008-12-29 17:35:52	beazley	link	issue4769 messages
2008-12-29 17:35:51	beazley	create