Author steve.dower
Recipients doerwalter, lemburg, paul.moore, serhiy.storchaka, steve.dower, terry.reedy, tim.golden, zach.ware
Date 2019-08-02.21:34:20
Message-id <>
If we reduce our chunk size below INT_MAX, then we avoid the issue entirely. Our logic for handling a chunk boundary that falls in the middle of a multibyte character is fine (perhaps fixed since this issue was opened?); there's just a weird edge case at 2 GiB in the API call.

As a bonus, smaller chunks seem to have a performance benefit too. INT_MAX/4 looks like the sweet spot: my 2 GiB test case took about a quarter of the time it took with INT_MAX (and we're measuring in tens of seconds here, so I'm pretty comfortable with the direction of the result). INT_MAX/2 and INT_MAX/8 were both slower than INT_MAX/4.
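
To illustrate the idea only (this is not the C code in question), here is a rough Python sketch of decoding a large buffer in chunks smaller than INT_MAX while still coping with a multibyte character that straddles a chunk boundary. The function name decode_in_chunks and the constant CHUNK_SIZE are made up for this example, INT_MAX is assumed to be 2**31 - 1, and the stdlib incremental decoder stands in for the stateful Windows API call.

```python
import codecs

# Assumption: C's INT_MAX on the platforms discussed here (32-bit int).
INT_MAX = 2**31 - 1
# Hypothetical chunk size, using the INT_MAX/4 value mentioned above.
CHUNK_SIZE = INT_MAX // 4


def decode_in_chunks(data, encoding="utf-8", chunk_size=CHUNK_SIZE):
    """Decode bytes in fixed-size chunks, each smaller than INT_MAX.

    The incremental decoder buffers a trailing partial character and
    completes it with the next chunk, so a boundary in the middle of a
    multibyte sequence does not cause an error.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # final=False: more input may follow, so incomplete bytes are kept.
        parts.append(decoder.decode(chunk, final=False))
    # Flush: raises if the input ended in the middle of a character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)
```

With a tiny chunk_size such as 1 or 2, every multibyte character in a short test string gets split across chunks, which makes the boundary handling easy to check without a 2 GiB input.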
