Title: json.tool: parsing fails for multibyte characters, and output encodes them as \xxx
Type: behavior Stage: patch review
Components: Unicode Versions: Python 3.8
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, vstinner, zhou.ronghua
Priority: normal Keywords: patch

Created on 2018-05-29 14:29 by zhou.ronghua, last changed 2018-05-31 14:45 by python-dev.

Pull Requests
URL Status Linked Edit
PR 7203 closed python-dev, 2018-05-29 14:31
PR 7286 closed python-dev, 2018-05-31 14:45
Messages (1)
msg318039 - Author: zhou.ronghua (zhou.ronghua) Date: 2018-05-29 14:29
When running this command on Windows (XP or Windows 7, the behavior is the same):
python -m json.tool xxx.txt xxx.json
and xxx.txt contains Chinese (or other multibyte) characters:
if xxx.txt is encoded in ANSI, xxx.json contains the Chinese characters as \xxx escapes, which makes them very hard to read;
if xxx.txt is encoded in UTF-8 (usually without a BOM), json.tool assumes the file is ANSI-encoded because there is no BOM, and decoding fails.
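
The escaping symptom can be reproduced with the json module alone. This is a minimal sketch, assuming the escapes in xxx.json come from the default ensure_ascii=True of json.dumps (the report does not say so explicitly); the sample data is made up for illustration.

```python
import json

# Hypothetical sample data; any non-ASCII text behaves the same way.
sample = {"city": "北京"}

# Default behavior: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(sample)

# With ensure_ascii=False the characters are kept as-is in the output string.
readable = json.dumps(sample, ensure_ascii=False)

# Both forms are valid JSON and decode to the same object.
same_object = json.loads(escaped) == json.loads(readable)
```

The escaped form is still valid JSON, so this half of the report concerns readability rather than data loss.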

Since UTF-8 is now widely used, json.tool should default to UTF-8 in most cases when automatic encoding detection fails.
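
One way to read this proposal is "try UTF-8 first, then fall back to the ANSI code page". The sketch below illustrates that idea only; it is not the code from the linked pull requests. The helper name load_json_bytes and the choice of "gbk" as the fallback code page are assumptions made for the example.

```python
import json


def load_json_bytes(raw: bytes, fallback_encoding: str = "gbk"):
    """Hypothetical helper: decode as UTF-8 (BOM optional), else fall back.

    fallback_encoding stands in for the system ANSI code page; "gbk" is an
    assumption that matches a Chinese-locale Windows installation.
    """
    try:
        # "utf-8-sig" accepts UTF-8 both with and without a leading BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding)
    return json.loads(text)


sample = {"name": "中文"}
json_text = json.dumps(sample, ensure_ascii=False)

utf8_bytes = json_text.encode("utf-8")          # UTF-8 without BOM
utf8_bom_bytes = b"\xef\xbb\xbf" + utf8_bytes   # UTF-8 with BOM
ansi_bytes = json_text.encode("gbk")            # "ANSI" on a Chinese locale

results = [load_json_bytes(b) for b in (utf8_bytes, utf8_bom_bytes, ansi_bytes)]
```

Trying UTF-8 first is reasonably safe because text in a legacy code page rarely happens to be valid UTF-8, although the fallback can still misread a file whose bytes are valid in both encodings.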
Date User Action Args
2018-05-31 14:45:43 python-dev set pull_requests: + pull_request6912
2018-05-29 14:44:15 zhou.ronghua set nosy: + ezio.melotti, vstinner; type: behavior; components: + Unicode; versions: + Python 3.8
2018-05-29 14:31:13 python-dev set keywords: + patch; stage: patch review; pull_requests: + pull_request6838
2018-05-29 14:29:39 zhou.ronghua create