Title: json encoder does not support JSONP/JavaScript safe escaping
Type: enhancement Stage: committed/rejected
Components: Library (Lib) Versions:
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Ztane, ezio.melotti, pitrou, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2013-06-24 04:46 by Ztane, last changed 2013-11-20 11:49 by serhiy.storchaka. This issue is now closed.

Messages (5)
msg191742 - (view) Author: Antti Haapala (Ztane) * Date: 2013-06-24 04:46
JSON is not a strict subset of JavaScript. However, certain web technologies use JSON values as a part of JavaScript code (JSONP, inline <script> tags)... The Python json module, however, by default does not escape \u2028 or \u2029 when ensure_ascii is false. Furthermore, the / -> \/ escape is not supported by any switch.

Strictly speaking, the JSON specification only requires that " be escaped to \" and \ to \\; all other escaping is optional. The whitespace escapes exist only to aid handwriting and embedding values in HTML/code. Thus it can be argued that the choice of escapes used by the json encoder is ill-advised.

In an inline HTML <script></script> tag, a < cannot be escaped; however, only the string '</script>' (or sometimes </) is interpreted as the "end of script". Thus a non-trivial XSS attack can be made by having a JSON stream {"key":"</script><script src=''></script>"} embedded in inline JavaScript. The only correct way to escape such content in inline HTML is to escape all / into \/.
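A minimal sketch of the / -> \/ re-escape described above, assuming Python 3. The helper name `escape_solidus` is hypothetical and not part of the json module. It relies on two assumptions: the input comes straight from `json.dumps` (which never emits `\/` itself), and `/` can only occur inside JSON string literals, so a plain replace is enough.

```python
import json


def escape_solidus(encoded: str) -> str:
    """Re-escape every '/' in json.dumps output as '\\/' (hypothetical helper)."""
    # '\/' is a legal JSON escape, so any JSON parser decodes it back to '/'.
    return encoded.replace("/", "\\/")


payload = {"key": "</script><script src=''></script>"}

raw = json.dumps(payload)    # still contains the literal text '</script>'
safe = escape_solidus(raw)   # every '/' is now written as '\/'

print(raw)
print(safe)
```

Because `\/` decodes to `/`, `json.loads(safe)` gives back the original payload, while the text `</` no longer appears anywhere in the encoded form.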

The \u2028, \u2029 problem is more subtle and can break not only inline JavaScript but also JSONP. Thus an incorrect value injected into the database by a malicious or unwitting user might break the entire protocol.
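To make the behaviour concrete, here is a small sketch (assuming Python 3; the sample value is made up) comparing the default encoder output with `ensure_ascii=False`. JavaScript engines of that era treated U+2028 and U+2029 as line terminators, so a raw one inside a string literal was a syntax error even though it is valid JSON.

```python
import json

# Hypothetical user-supplied value containing both separator characters.
value = {"note": "line\u2028separator\u2029here"}

default_out = json.dumps(value)                      # ensure_ascii=True (default)
unicode_out = json.dumps(value, ensure_ascii=False)  # non-ASCII passed through

# With the default, both characters are written as \uXXXX escapes.
print(default_out)

# With ensure_ascii=False, the raw characters survive in the output.
print("\u2028" in unicode_out, "\u2029" in unicode_out)
```

Both outputs decode to the same value; only the second one is unsafe to paste into JavaScript source as-is.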

The current solution is to re-escape everything that comes out of the JSON encoder. The best solution for Python would be to make these 3 escapes default in the Python json module (notice again that the current set of default escapes when ensure_ascii=False is chosen arbitrarily), or if not default, then at least they could be enabled by a switch. Furthermore, the documentation should be updated appropriately to explain why such escaping is needed.
msg191744 - (view) Author: Antti Haapala (Ztane) * Date: 2013-06-24 04:57
My mistake in writing: JSON of course does specify that "control characters" be escaped. It also needs to be pointed out that the json module does NOT currently escape \u007f-\u009f, as, strictly speaking, it perhaps should:

>>> import json, unicodedata
>>> unicodedata.category('\u007f')
'Cc'
>>> json.dumps({'a': '\u007f'}, ensure_ascii=False)
'{"a": "\x7f"}'
msg194537 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-06 12:29
I think this is not a JSON issue. If you need escaping of some domain-specific characters, do it yourself, e.g.:

    json.dumps(...).replace('\u2028', r'\u2028').replace('\u2029', r'\u2029').replace('</', r'\u003c\u002f')
msg194581 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-06 21:25
On the one hand, supporting JSONP is a valid request for the json module. On the other hand, according to Wikipedia, "There have been some criticisms raised about JSONP. Cross-origin resource sharing (CORS) is a more recent method of getting data from a server in a different domain, which addresses some of those criticisms". Therefore, supporting JSONP might not really be worth it.
msg194648 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-08 07:27
Embedding JSON inside a <script> tag doesn't differ from embedding any string in some other format (e.g. JSON in a Python string, Python sources in HTML, or XML in a shell script). We just escape the characters which have special meaning.

I propose closing this issue because embedding JSON (like any other generated code) in inline JavaScript can be done very easily with a sequence of string replaces. This has no relation to the json module.
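For reference, the sequence of string replaces suggested in this thread could be wrapped up roughly as follows. This is only a sketch assuming Python 3; `dumps_for_script` and the `handleData` callback are hypothetical names, not anything provided by the json module.

```python
import json


def dumps_for_script(obj, **kwargs) -> str:
    """Encode obj as JSON, then re-escape text that is unsafe in inline JS.

    Hypothetical helper; kwargs are passed through to json.dumps unchanged.
    """
    text = json.dumps(obj, **kwargs)
    return (
        text
        .replace("\u2028", r"\u2028")     # raw LINE SEPARATOR -> \u2028 escape
        .replace("\u2029", r"\u2029")     # raw PARAGRAPH SEPARATOR -> \u2029 escape
        .replace("</", r"\u003c\u002f")   # hide '</' so '</script>' cannot appear
    )


data = {"html": "</script><script src=''></script>", "sep": "a\u2028b\u2029c"}
encoded = dumps_for_script(data, ensure_ascii=False)

# JSONP-style response body; 'handleData' is a made-up callback name.
jsonp_body = "handleData(" + encoded + ");"
print(jsonp_body)
```

All three replacements produce ordinary JSON \uXXXX escapes, so `json.loads(encoded)` still returns the original data.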
Date                 User              Action  Args
2013-11-20 11:49:11  serhiy.storchaka  set     status: pending -> closed; stage: committed/rejected
2013-08-08 07:27:16  serhiy.storchaka  set     status: open -> pending; messages: + msg194648
2013-08-06 21:25:49  pitrou            set     status: pending -> open; messages: + msg194581
2013-08-06 12:29:26  serhiy.storchaka  set     status: open -> pending; resolution: not a bug; messages: + msg194537
2013-06-24 08:08:20  serhiy.storchaka  set     nosy: + rhettinger, pitrou, ezio.melotti, serhiy.storchaka
2013-06-24 04:57:24  Ztane             set     messages: + msg191744
2013-06-24 04:46:19  Ztane             create