classification
Title: parse failed for multibyte characters, encode will show in \xxx
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, inada.naoki, miss-islington, vstinner, zhou.ronghua
Priority: normal Keywords: patch

Created on 2018-05-29 14:29 by zhou.ronghua, last changed 2019-12-04 10:27 by inada.naoki. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 7203 closed python-dev, 2018-05-29 14:31
PR 7286 closed python-dev, 2018-05-31 14:45
PR 17460 merged inada.naoki, 2019-12-04 06:39
PR 17464 merged miss-islington, 2019-12-04 09:39
PR 17465 merged inada.naoki, 2019-12-04 10:05
Messages (4)
msg318039 - (view) Author: zhou.ronghua (zhou.ronghua) * Date: 2018-05-29 14:29
When running this command on Windows (XP or Windows 7, the behavior is the same):
python -m json.tool xxx.txt xxx.json
and xxx.txt contains Chinese (or other multibyte) characters:
if xxx.txt is encoded in ANSI, xxx.json escapes the Chinese characters as \xxx, which makes them very hard to read;
if xxx.txt is encoded in UTF-8 (usually without a BOM), json.tool assumes the file is ANSI because there is no BOM, and decoding fails.

Since UTF-8 is now widely used, the default should be UTF-8 whenever encoding auto-detection fails.
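
The two symptoms in this report can be reproduced without json.tool itself. The sketch below is only an illustration: the sample document, the helper name `try_decode`, and the choice of cp1252 as a stand-in for the Windows "ANSI" code page are assumptions, not taken from the report.

```python
import json

# Sample JSON text with two Chinese characters (U+4E2D, U+534E).
text = '{"name": "\u4e2d\u534e"}'
utf8_bytes = text.encode("utf-8")  # UTF-8 without a BOM

# Symptom 1: json.dumps escapes non-ASCII characters by default
# (ensure_ascii=True), so the output is ASCII-only but hard to read.
escaped = json.dumps(json.loads(text))
readable = json.dumps(json.loads(text), ensure_ascii=False)


def try_decode(data, encoding):
    """Hypothetical helper: decode bytes, reporting failure as a string."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"decode failed: {exc.reason}"


# Symptom 2: the same bytes decode cleanly as UTF-8 but not as cp1252,
# because byte 0x8D (part of U+534E in UTF-8) is unmapped in cp1252.
as_utf8 = try_decode(utf8_bytes, "utf-8")
as_ansi = try_decode(utf8_bytes, "cp1252")

print(escaped)
print(readable)
print(as_ansi)
```

Other legacy code pages may decode the bytes without raising and silently produce wrong characters instead, which is arguably worse than a hard failure.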
msg357786 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-12-04 09:39
New changeset 808769f3a4cbdc47cf1a5708dd61b1787bb192d4 by Inada Naoki in branch 'master':
bpo-33684: json.tool: Use utf-8 for infile and outfile. (GH-17460)
https://github.com/python/cpython/commit/808769f3a4cbdc47cf1a5708dd61b1787bb192d4
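
A minimal sketch of the idea named in the changeset title, reading the input and writing the output as UTF-8 regardless of the platform default. This is not the code from the commit: the function name `reformat_json` and its `ensure_ascii` parameter are hypothetical and exist only for illustration.

```python
import json


def reformat_json(src_path, dst_path, ensure_ascii=True):
    """Hypothetical helper: pretty-print a JSON file using UTF-8 for I/O."""
    # Explicit encoding avoids falling back to the locale (ANSI) code page.
    with open(src_path, encoding="utf-8") as infile:
        obj = json.load(infile)
    with open(dst_path, "w", encoding="utf-8") as outfile:
        # With ensure_ascii=True, non-ASCII characters are still written as
        # \uXXXX escapes; pass False to keep them readable in the output.
        json.dump(obj, outfile, indent=4, ensure_ascii=ensure_ascii)
        outfile.write("\n")
```

Opening both files with an explicit encoding makes the result independent of the machine's locale, which is the property the original report asks for.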
msg357789 - (view) Author: miss-islington (miss-islington) Date: 2019-12-04 09:57
New changeset a75cad440ab50d823af5f06e51dfed3a319f1e8c by Miss Islington (bot) in branch '3.8':
bpo-33684: json.tool: Use utf-8 for infile and outfile. (GH-17460)
https://github.com/python/cpython/commit/a75cad440ab50d823af5f06e51dfed3a319f1e8c
msg357791 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-12-04 10:26
New changeset e0f148e6635480521036415bd782c3424fe6c619 by Inada Naoki in branch '3.7':
bpo-33684: json.tool: Use utf-8 for infile and outfile. (GH-17460)
https://github.com/python/cpython/commit/e0f148e6635480521036415bd782c3424fe6c619
History
Date User Action Args
2019-12-04 10:27:17 inada.naoki set components: + Library (Lib), - Unicode
2019-12-04 10:27:07 inada.naoki set status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: + Python 3.7, Python 3.9
2019-12-04 10:26:29 inada.naoki set messages: + msg357791
2019-12-04 10:05:53 inada.naoki set pull_requests: + pull_request16944
2019-12-04 09:57:59 miss-islington set nosy: + miss-islington; messages: + msg357789
2019-12-04 09:39:52 miss-islington set pull_requests: + pull_request16943
2019-12-04 09:39:37 inada.naoki set nosy: + inada.naoki; messages: + msg357786
2019-12-04 06:39:59 inada.naoki set pull_requests: + pull_request16941
2018-05-31 14:45:43 python-dev set pull_requests: + pull_request6912
2018-05-29 14:44:15 zhou.ronghua set nosy: + ezio.melotti, vstinner; type: behavior; components: + Unicode; versions: + Python 3.8
2018-05-29 14:31:13 python-dev set keywords: + patch; stage: patch review; pull_requests: + pull_request6838
2018-05-29 14:29:39 zhou.ronghua create