
classification
Title: json.dumps with ensure_ascii=False doesn't escape control characters
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.4
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: akira, ezio.melotti, pitrou, rhettinger, weeble
Priority: normal
Keywords:

Created on 2014-04-10 09:31 by weeble, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (3)
msg215868 - Author: Weeble (weeble) Date: 2014-04-10 09:31
The JSON spec (http://www.json.org/) does not allow unescaped control characters. (See the railroad diagram for strings and the grammar on the right.) If json.dumps is called with ensure_ascii=False, it fails to escape control codes in the range U+007F to U+009F. Here's an example:

>>> import json
>>> import unicodedata
>>> for i in range(256):
...     jsonstring = json.dumps(chr(i), ensure_ascii=False)
...     if any(unicodedata.category(ch) == 'Cc' for ch in jsonstring):
...         print("Fail:",repr(chr(i)))
Fail: '\x7f'
Fail: '\x80'
Fail: '\x81'
Fail: '\x82'
Fail: '\x83'
Fail: '\x84'
Fail: '\x85'
Fail: '\x86'
Fail: '\x87'
Fail: '\x88'
Fail: '\x89'
Fail: '\x8a'
Fail: '\x8b'
Fail: '\x8c'
Fail: '\x8d'
Fail: '\x8e'
Fail: '\x8f'
Fail: '\x90'
Fail: '\x91'
Fail: '\x92'
Fail: '\x93'
Fail: '\x94'
Fail: '\x95'
Fail: '\x96'
Fail: '\x97'
Fail: '\x98'
Fail: '\x99'
Fail: '\x9a'
Fail: '\x9b'
Fail: '\x9c'
Fail: '\x9d'
Fail: '\x9e'
Fail: '\x9f'
msg215898 - Author: Akira Li (akira) Date: 2014-04-10 18:40
json.dumps works correctly in this case.

Both the application/json RFC [1] and the ECMA JSON standard [2] say:

> All characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters (U+0000 through U+001F).

i.e., only a subset of the control characters (U+0000 through U+001F) must be escaped in a JSON string. The characters U+007F through U+009F may appear unescaped.

[1]: https://tools.ietf.org/html/rfc7159#section-7
[2]: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
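
The rule quoted above can be checked directly against json.dumps. The following is a minimal sketch, not part of the original report: the names MUST_ESCAPE and is_escaped are hypothetical helpers introduced here for illustration, and the check only covers code points below 256, as in the original example.

```python
import json

# Characters that the quoted rule says must be escaped in a JSON string:
# quotation mark, reverse solidus, and U+0000 through U+001F.
# (MUST_ESCAPE is a hypothetical name used only in this sketch.)
MUST_ESCAPE = {'"', '\\'} | {chr(i) for i in range(0x20)}


def is_escaped(ch):
    """Return True if json.dumps(..., ensure_ascii=False) emits ch as an escape."""
    encoded = json.dumps(ch, ensure_ascii=False)
    # Drop the surrounding quotation marks; an unescaped character is unchanged.
    return encoded[1:-1] != ch


# Collect every code point below 256 that json.dumps actually escapes.
escaped = {chr(i) for i in range(256) if is_escaped(chr(i))}

# The escaped set matches the set required by the quoted rule.
print(escaped == MUST_ESCAPE)

# U+007F through U+009F pass through unescaped and still round-trip.
c1_range = ''.join(chr(i) for i in range(0x7F, 0xA0))
print(json.loads(json.dumps(c1_range, ensure_ascii=False)) == c1_range)

# With the default ensure_ascii=True, those characters are escaped as well.
print(json.dumps('\x7f\x9f'))
```

If escaped output for U+007F through U+009F is wanted, leaving ensure_ascii at its default of True is the simplest option, at the cost of also escaping all other non-ASCII characters.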
msg215923 - Author: Weeble (weeble) Date: 2014-04-11 10:18
Ah, sorry for the confusion.
History
Date                 User          Action  Args
2022-04-11 14:58:01  admin         set     github: 65393
2014-04-11 16:20:21  ned.deily     set     status: open -> closed; resolution: not a bug; stage: resolved
2014-04-11 10:18:06  weeble        set     messages: + msg215923
2014-04-10 18:40:58  akira         set     nosy: + akira; messages: + msg215898
2014-04-10 10:46:28  ezio.melotti  set     nosy: + rhettinger, pitrou, ezio.melotti
2014-04-10 09:31:10  weeble        create