classification
Title: Fix incremental decoder and stream reader in the "raw-unicode-escape" codec
Type: behavior Stage: resolved
Components: Unicode Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: 45461 Superseder:
Assigned To: Nosy List: ezio.melotti, lemburg, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2021-10-14 10:29 by serhiy.storchaka, last changed 2021-10-14 19:17 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 28944 merged serhiy.storchaka, 2021-10-14 10:36
PR 28952 merged serhiy.storchaka, 2021-10-14 17:30
PR 28953 merged serhiy.storchaka, 2021-10-14 17:31
Messages (5)
msg403893 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-10-14 10:29
Similar to 45461, but with "raw-unicode-escape".

When an incremental decoder receives only part of an escape sequence (\uXXXX or \UXXXXXXXX), it raises an exception, or returns a bare "\" if the backslash was the only part received, instead of keeping the partial sequence until the rest arrives. The problem is exposed in text files (io.TextIOWrapper) when reads from the underlying binary stream split an escape sequence between blocks. There is a similar issue with stream readers (codecs.StreamReader).
msg403921 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-10-14 17:04
New changeset 39aa98346d5dd8ac591a7cafb467af21c53f1e5d by Serhiy Storchaka in branch 'main':
bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944)
https://github.com/python/cpython/commit/39aa98346d5dd8ac591a7cafb467af21c53f1e5d
msg403927 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-10-14 18:23
New changeset 4641afef661e6a22bc64194bd334b161c95edfe2 by Serhiy Storchaka in branch '3.10':
[3.10] bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) (GH-28952)
https://github.com/python/cpython/commit/4641afef661e6a22bc64194bd334b161c95edfe2
msg403928 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-10-14 18:23
New changeset 684860280687561f6312e206c4ccfbe4baa17e89 by Serhiy Storchaka in branch '3.9':
bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) (GH-28953)
https://github.com/python/cpython/commit/684860280687561f6312e206c4ccfbe4baa17e89
msg403935 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-14 19:06
Serhiy: I suppose that this issue can now be closed?
History
Date User Action Args
2021-10-14 19:17:23 serhiy.storchaka set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-10-14 19:06:32 vstinner set messages: + msg403935
2021-10-14 18:23:56 serhiy.storchaka set messages: + msg403928
2021-10-14 18:23:50 serhiy.storchaka set messages: + msg403927
2021-10-14 17:31:20 serhiy.storchaka set pull_requests: + pull_request27241
2021-10-14 17:30:20 serhiy.storchaka set pull_requests: + pull_request27240
2021-10-14 17:04:23 serhiy.storchaka set messages: + msg403921
2021-10-14 10:36:55 serhiy.storchaka set keywords: + patch
stage: patch review
pull_requests: + pull_request27232
2021-10-14 10:30:09 serhiy.storchaka set dependencies: + UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string
2021-10-14 10:29:46 serhiy.storchaka create