Message348921
If we reduce our chunk size below INT_MAX, then we avoid the issue entirely. Our logic for hitting the middle of a multibyte character is fine (perhaps fixed since this issue was opened?); there's just a weird edge case at 2 GiB in the API call.
As a bonus, smaller chunks seem to have a performance benefit too. INT_MAX/4 looks like the sweet spot: my 2 GiB test case took about a quarter of the time it took with INT_MAX chunks (and we're measuring in tens of seconds here, so I'm pretty comfortable with the direction of the result). INT_MAX/2 and INT_MAX/8 were both slower than INT_MAX/4.
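To illustrate the idea (this is only a rough Python sketch, not the actual C change): decode the input in pieces that are each well below INT_MAX, and carry any incomplete multibyte sequence over to the next piece. The name decode_in_chunks, the use of UTF-8 as a stand-in encoding, and the hard-coded INT_MAX value are all assumptions made for the example; the real decoder works through the platform API call rather than the codecs module.

```python
import codecs

# Assumption: C's INT_MAX on common platforms (32-bit int).
INT_MAX = 2**31 - 1
# Chunk size the message above reports as fastest in one test case.
CHUNK_SIZE = INT_MAX // 4


def decode_in_chunks(data, encoding="utf-8", chunk_size=CHUNK_SIZE):
    """Decode bytes in pieces of at most chunk_size bytes (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # The incremental decoder buffers a trailing partial character
    # and completes it when the next chunk arrives.
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Tell the decoder when this is the last chunk so that a
        # truncated character at the very end is reported as an error.
        final = start + chunk_size >= len(data)
        parts.append(decoder.decode(chunk, final))
    return "".join(parts)
```

Calling it with a tiny chunk_size (say 1 or 2 bytes) on text containing non-ASCII characters forces chunk boundaries into the middle of multibyte sequences, which is the case the chunking logic has to get right.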
Date | User | Action | Args
2019-08-02 21:34:20 | steve.dower | set | recipients: + steve.dower, lemburg, doerwalter, terry.reedy, paul.moore, tim.golden, zach.ware, serhiy.storchaka
2019-08-02 21:34:20 | steve.dower | set | messageid: <1564781660.58.0.735421605814.issue36311@roundup.psfhosted.org>
2019-08-02 21:34:20 | steve.dower | link | issue36311 messages
2019-08-02 21:34:20 | steve.dower | create |