Title: Code page decoder incorrectly handles input >2GiB
Components: Interpreter Core
Versions: Python 3.8, Python 3.7, Python 3.6
Created on 2018-12-01 17:17 by serhiy.storchaka, last changed 2022-04-11 14:59 by admin.

Messages (6)
msg330855 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-01 17:17
>>> import codecs
>>> b = b'a'*(2**31-2)+b'\xff'*2
>>> x, y = codecs.code_page_decode(932, b, 'replace', True)
>>> len(x)
2
>>> x, y
('aa', 2147483648)
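For scale, the input built above is exactly 2**31 bytes long, one byte more than the largest value a signed 32-bit integer can hold. The sketch below only works through that arithmetic. That a 32-bit length limit is what trips the decoder is an assumption drawn from the issue title, not something stated in this report, and `input_len` and `wrapped` are illustrative names.

```python
import ctypes

# Length of the reproducer input, computed without allocating 2 GiB.
input_len = (2**31 - 2) + 2

INT32_MAX = 2**31 - 1  # largest signed 32-bit value

print(input_len == 2**31)     # True
print(input_len > INT32_MAX)  # True

# Assumption (not from the report): if such a length were narrowed to a
# signed 32-bit integer, it would wrap around to a negative number.
wrapped = ctypes.c_int32(input_len).value
print(wrapped)                # -2147483648
```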
msg330912 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-03 08:36
New changeset 4013c179117754b039957db4730880bf3285919d by Serhiy Storchaka in branch 'master':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
msg330918 - Author: miss-islington (miss-islington) Date: 2018-12-03 09:09
New changeset bdeb56cd21ef3f4f086c93045d80f2a753823379 by Miss Islington (bot) in branch '3.7':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
msg330921 - Author: miss-islington (miss-islington) Date: 2018-12-03 09:15
New changeset 0f9b6687eb8b26dd804abcc6efd4d6430ae16f24 by Miss Islington (bot) in branch '3.6':
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
msg330936 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-03 11:27
Thanks for the fix ;-) I guess that nobody tried this code with a string longer than 2 GiB before you :-)
msg330939 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-03 11:43
Decoding a 2 GiB string takes > 80 seconds on my computer and needs around 14 GiB of memory.
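Given those costs, one option for callers is to feed a decoder smaller pieces so that no single call has to handle the whole input. The following is a minimal sketch using the standard `codecs.getincrementaldecoder` API with the portable `cp932` codec, rather than the Windows-only `codecs.code_page_decode` used in the report. `decode_in_chunks` and `chunk_size` are hypothetical names, and the sketch is not meant to describe the committed fix.

```python
import codecs


def decode_in_chunks(data, encoding="cp932", errors="replace", chunk_size=1 << 20):
    """Yield decoded text piece by piece instead of decoding `data` in one call."""
    decoder = codecs.getincrementaldecoder(encoding)(errors)
    for start in range(0, len(data), chunk_size):
        # The incremental decoder buffers an incomplete multi-byte
        # sequence at a chunk boundary and completes it on the next call.
        text = decoder.decode(data[start:start + chunk_size])
        if text:
            yield text
    # final=True flushes any bytes still buffered in the decoder.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

A caller can write each yielded piece to a file as it arrives rather than joining them all into one string.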
