
classification
Title: Code page decoder incorrectly handles input >2GiB
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed
Resolution: fixed
Dependencies: (none)
Superseder: (none)
Assigned To: serhiy.storchaka
Nosy List: miss-islington, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2018-12-01 17:17 by serhiy.storchaka, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL       Status  Linked
PR 10848  merged  serhiy.storchaka, 2018-12-01 17:33
PR 10859  merged  miss-islington, 2018-12-03 08:36
PR 10860  merged  miss-islington, 2018-12-03 08:37
Messages (6)
msg330855 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-01 17:17
>>> import codecs
>>> b = b'a'*(2**31-2)+b'\xff'*2
>>> x, y = codecs.code_page_decode(932, b, 'replace', True)
>>> len(x)
2
>>> x, y
('aa', 2147483648)
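The session above reports all 2147483648 bytes as consumed while returning only two characters. As a rough illustration of how large inputs can be decoded without passing one huge buffer to a single call, here is a minimal sketch that feeds bounded chunks to an incremental decoder. It is not the change made in GH-10848: the helper name `decode_in_chunks` and the default `chunk_size` are hypothetical, and it uses Python's portable `cp932` codec instead of the Windows-only `codecs.code_page_decode`.

```python
import codecs


def decode_in_chunks(data, encoding="cp932", errors="replace", chunk_size=1 << 20):
    """Decode `data` piece by piece so that no single call sees a huge buffer.

    Hypothetical helper for illustration only; not CPython's implementation.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors)
    parts = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Only the last chunk is final; before that, an incomplete multibyte
        # sequence at the end of a chunk is kept pending for the next call.
        final = start + chunk_size >= len(data)
        parts.append(decoder.decode(chunk, final))
    return "".join(parts)
```

With a deliberately small `chunk_size`, the result can be compared against a one-shot `bytes.decode` call on a short input to check that chunk boundaries falling inside a multibyte character are handled.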
msg330912 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-03 08:36
New changeset 4013c179117754b039957db4730880bf3285919d by Serhiy Storchaka in branch 'master':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
https://github.com/python/cpython/commit/4013c179117754b039957db4730880bf3285919d
msg330918 - Author: miss-islington (miss-islington) Date: 2018-12-03 09:09
New changeset bdeb56cd21ef3f4f086c93045d80f2a753823379 by Miss Islington (bot) in branch '3.7':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
https://github.com/python/cpython/commit/bdeb56cd21ef3f4f086c93045d80f2a753823379
msg330921 - Author: miss-islington (miss-islington) Date: 2018-12-03 09:15
New changeset 0f9b6687eb8b26dd804abcc6efd4d6430ae16f24 by Miss Islington (bot) in branch '3.6':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
https://github.com/python/cpython/commit/0f9b6687eb8b26dd804abcc6efd4d6430ae16f24
msg330936 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-03 11:27
Thanks for the fix ;-) I guess that nobody tried this code with a string longer than 2 GiB before you :-)
msg330939 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-03 11:43
Decoding a 2 GiB string takes > 80 seconds on my computer and needs around 14 GiB of memory.
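Figures like these depend on the machine and the build. One way to take comparable measurements on a much smaller input is sketched below, assuming the portable `cp932` codec as a stand-in for the Windows code page decoder. The helper name `measure_decode` is hypothetical, and `tracemalloc` only counts allocations made through Python's memory allocators, so its peak is not the same thing as the total process memory quoted above.

```python
import time
import tracemalloc


def measure_decode(data, encoding="cp932", errors="replace"):
    """Return (decoded length, elapsed seconds, peak traced bytes) for one decode.

    Hypothetical helper for illustration; scale the input to what the machine can hold.
    """
    tracemalloc.start()
    started = time.perf_counter()
    text = data.decode(encoding, errors)
    elapsed = time.perf_counter() - started
    # get_traced_memory() returns (current, peak) in bytes since start().
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(text), elapsed, peak
```

For example, `measure_decode(b'a' * 10**7)` exercises the same kind of call on roughly 10 MB of input.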
History
Date                 User              Action  Args
2022-04-11 14:59:08  admin             set     github: 79553
2018-12-03 11:43:05  serhiy.storchaka  set     messages: + msg330939
2018-12-03 11:27:04  vstinner          set     messages: + msg330936
2018-12-03 10:40:17  serhiy.storchaka  set     status: open -> closed
                                               resolution: fixed
                                               stage: patch review -> resolved
2018-12-03 09:15:05  miss-islington    set     messages: + msg330921
2018-12-03 09:09:17  miss-islington    set     nosy: + miss-islington
                                               messages: + msg330918
2018-12-03 08:37:05  miss-islington    set     pull_requests: + pull_request10095
2018-12-03 08:36:56  miss-islington    set     pull_requests: + pull_request10094
2018-12-03 08:36:47  serhiy.storchaka  set     messages: + msg330912
2018-12-01 17:52:57  serhiy.storchaka  link    issue35365 dependencies
2018-12-01 17:33:53  serhiy.storchaka  set     keywords: + patch
                                               stage: patch review
                                               pull_requests: + pull_request10083
2018-12-01 17:17:56  serhiy.storchaka  create