Message 230274 - Python tracker


Author	Tom.Christie
Recipients	Tom.Christie
Date	2014-10-30.16:35:50
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1414686950.25.0.801159105863.issue22767@psf.upfronthosting.co.za>
In-reply-to

Content

This is one of those behavioural issues that is a borderline bug.

The `separators` argument to `json.dumps()` behaves differently across Python 2 and 3.

* In Python 2 it should be provided as a bytestring, and can cause a `UnicodeDecodeError` otherwise.
* In Python 3 it should be provided as unicode, and can cause a `TypeError` otherwise.

Examples:

    Python 2.7.2
    >>> print json.dumps({'snowman': '☃'}, separators=(':', ','), ensure_ascii=False)
    {"snowman","☃"}
    >>> print json.dumps({'snowman': '☃'}, separators=(u':', u','), ensure_ascii=False)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1: ordinal not in range(128)

And:

    Python 3.4.0
    >>> print(json.dumps({'snowman': '☃'}, separators=(':', ','), ensure_ascii=False))
    {"snowman","☃"}
    >>> print(json.dumps({'snowman': '☃'}, separators=(b':', b','), ensure_ascii=False))
    <...>
    TypeError: sequence item 2: expected str instance, bytes found

Technically this isn't out of line with the documentation: in both cases it uses `separators=(':', ',')`, which is indeed the correct type in both v2 and v3. However, it's unexpected behaviour that the required type changes between versions without this being called out.

Working on a codebase with `from __future__ import unicode_literals`, this is particularly unexpected: plain string literals, including the separators, become unicode on Python 2, so we get a `UnicodeDecodeError` when running code that otherwise looks correct.

It's also slightly awkward to fix in calling code, because it needs a somewhat odd branch on the Python version just to pick the type of the separators.

The fix would probably be to forcibly coerce it to the correct type regardless of whether it is supplied as unicode or a bytestring, or at least to do so for Python 2.7.
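
As a rough illustration of that coercion, here is a minimal sketch written as a caller-side helper rather than as a patch to the `json` module. The name `coerce_separators` and the `encoding` parameter are hypothetical and not part of the standard library. It assumes the separators are plain ASCII and that converting them to the interpreter's native `str` type is enough to avoid the errors shown above.

```python
import json
import sys

PY2 = sys.version_info[0] == 2


def coerce_separators(separators, encoding="ascii"):
    """Hypothetical helper: return the separators as the native `str` type.

    On Python 2 the native `str` is a bytestring, so unicode is encoded.
    On Python 3 the native `str` is unicode, so bytes are decoded.
    """
    coerced = []
    for sep in separators:
        if PY2 and not isinstance(sep, str):
            # Python 2: unicode separator -> bytestring
            sep = sep.encode(encoding)
        elif not PY2 and isinstance(sep, bytes):
            # Python 3: bytes separator -> str
            sep = sep.decode(encoding)
        coerced.append(sep)
    return tuple(coerced)


# Usage: the caller no longer needs to branch on the Python version.
print(json.dumps({"a": 1}, separators=coerce_separators((b",", b":"))))
```

Whether the coercion belongs inside `json.dumps()` itself or in calling code is a separate question; the sketch only shows that the branch can be isolated in one place.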

Possibly related to http://bugs.python.org/issue22701, but I wasn't able to tell whether that ticket was in fact a different user error.

History
Date	User	Action	Args
2014-10-30 16:35:50	Tom.Christie	set	recipients: + Tom.Christie
2014-10-30 16:35:50	Tom.Christie	set	messageid: <1414686950.25.0.801159105863.issue22767@psf.upfronthosting.co.za>
2014-10-30 16:35:50	Tom.Christie	link	issue22767 messages
2014-10-30 16:35:50	Tom.Christie	create