Message 241800 - Python tracker


Author	lemburg
Recipients	ezio.melotti, lemburg, vstinner
Date	2015-04-22.13:23:31
Message-id	<1429709012.06.0.232941400287.issue24025@psf.upfronthosting.co.za>

Content
In Python 2, the unicode() constructor decodes byte strings with the default ASCII codec, so it does not accept non-ASCII bytes arguments unless an encoding argument is given:

>>> unicode(u'abcäöü'.encode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

In Python 3, the str() constructor masks this programming error by returning the repr() of the bytes object:

>>> str('abcäöü'.encode('utf-8'))
"b'abc\\xc3\\xa4\\xc3\\xb6\\xc3\\xbc'"

I think it would be more helpful to raise an error that points the programmer to the encoding argument, which is most probably what is missing.
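To make the proposed behaviour concrete, here is a minimal sketch of a wrapper that raises instead of returning the repr(). The name strict_str and its error message are hypothetical and only illustrate the idea; this is not an existing Python API.

```python
def strict_str(obj, encoding=None, errors="strict"):
    """Hypothetical helper: behaves like str(), but refuses bytes without an encoding."""
    if isinstance(obj, (bytes, bytearray)):
        if encoding is None:
            # This is the case that str() currently answers with the repr().
            raise TypeError(
                "bytes-like object passed without an encoding argument; "
                "did you mean strict_str(obj, 'utf-8')?"
            )
        # With an encoding, defer to the normal decoding form of str().
        return str(obj, encoding, errors)
    # Non-bytes objects are converted exactly as str() would convert them.
    return str(obj)
```

With this sketch, strict_str('abcäöü'.encode('utf-8')) raises TypeError, while strict_str('abcäöü'.encode('utf-8'), 'utf-8') returns the decoded text.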

Also note that you get a different output when the encoding argument is set:

>>> str('abcäöü'.encode('utf-8'), 'utf-8')
'abcäöü'

I know this is documented, but it is still not very helpful and can easily hide errors.
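One existing way to surface such hidden conversions, not mentioned in the message above, is the interpreter's -b option: given once it emits a BytesWarning for str() on a bytes object, and given twice (-bb) it turns that warning into an error. The following is a rough sketch that runs a snippet in a child interpreter; the helper name run_with_bytes_errors is an assumption made for illustration.

```python
import subprocess
import sys


def run_with_bytes_errors(snippet):
    """Hypothetical helper: run a snippet in a child interpreter started with -bb."""
    # -bb makes BytesWarning (e.g. str() on a bytes instance) an error.
    return subprocess.run(
        [sys.executable, "-bb", "-c", snippet],
        capture_output=True,
        text=True,
    )


# str() on bytes without an encoding fails under -bb ...
failing = run_with_bytes_errors("str(b'abc')")
# ... while the explicit decoding form is unaffected.
passing = run_with_bytes_errors("str(b'abc', 'utf-8')")
```

Here failing.returncode is non-zero and failing.stderr mentions BytesWarning, whereas passing.returncode is 0. Running a test suite under -bb is a common way to catch these cases without changing str() itself.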

History
Date	User	Action	Args
2015-04-22 13:23:32	lemburg	set	recipients: + lemburg, vstinner, ezio.melotti
2015-04-22 13:23:32	lemburg	set	messageid: <1429709012.06.0.232941400287.issue24025@psf.upfronthosting.co.za>
2015-04-22 13:23:32	lemburg	link	issue24025 messages
2015-04-22 13:23:31	lemburg	create