
Author serhiy.storchaka
Recipients benjamin.peterson, ezio.melotti, lemburg, methane, serhiy.storchaka, terry.reedy, vstinner, xiang.zhang
Date 2017-09-16.08:14:41
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1505549681.77.0.497304254315.issue31484@psf.upfronthosting.co.za>
In-reply-to
Content
Initially I used 2 x 128 slots. That is enough for single-block alphabetic languages, but it caused a significant slowdown for Chinese. Increasing the size to 2 x 256 compensates for the overhead and restores the performance for Chinese. If it is acceptable that the optimization benefits only languages with small alphabets and keeps the performance for Chinese, Japanese and Korean roughly unchanged (plus or minus a few percent), this size is enough. If we also want to optimize processing of texts with Chinese characters, it can be increased to 2 x 512 or 2 x 1024. Further increases have a smaller effect.
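
To make the trade-off concrete, here is a minimal sketch of the kind of cache being discussed: a 2-way set of 256 buckets indexed by the low bits of the code point. The function name get_cached_char, the eviction policy, and the exact layout are my illustrative assumptions, not the actual patch.

#include <Python.h>

#define CHAR_CACHE_BUCKETS 256   /* the "2 x 256" variant */

/* Hypothetical cache of single-character str objects: two ways
   per bucket, bucket chosen by the low bits of the code point. */
static PyObject *char_cache[2][CHAR_CACHE_BUCKETS];

static PyObject *
get_cached_char(Py_UCS4 ch)
{
    Py_ssize_t i = ch & (CHAR_CACHE_BUCKETS - 1);

    /* Probe both ways of the bucket. */
    for (int way = 0; way < 2; way++) {
        PyObject *s = char_cache[way][i];
        if (s != NULL && PyUnicode_READ_CHAR(s, 0) == ch) {
            Py_INCREF(s);
            return s;            /* hit: reuse the cached object */
        }
    }

    /* Miss: build the 1-character string and evict the older entry. */
    PyObject *s = PyUnicode_FromOrdinal(ch);
    if (s == NULL)
        return NULL;
    Py_XDECREF(char_cache[1][i]);
    char_cache[1][i] = char_cache[0][i];
    Py_INCREF(s);
    char_cache[0][i] = s;
    return s;
}

With this indexing, an alphabet that fits in one 256-character block never collides with itself, while CJK text touches thousands of distinct code points and keeps evicting entries; that is why the bucket count matters so much for Chinese.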

A cache of 2 x 256 slots can increase memory consumption by 50 KiB in the worst case; 2 x 1024 slots, by 200 KiB.
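
As a rough sanity check of those figures (my arithmetic; the per-object size is an assumption): 2 x 256 = 512 slots, and 50 KiB / 512 = 100 bytes, so the worst case corresponds to on the order of 100 bytes per cached single-character str object (the exact size depends on the string kind: ASCII, UCS1, UCS2 or UCS4). Likewise 2 x 1024 = 2048 slots x ~100 bytes ~ 200 KiB; the pointer arrays themselves add only a few KiB on top.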
History
Date User Action Args
2017-09-16 08:14:41  serhiy.storchaka  set     recipients: + serhiy.storchaka, lemburg, terry.reedy, vstinner, benjamin.peterson, ezio.melotti, methane, xiang.zhang
2017-09-16 08:14:41  serhiy.storchaka  set     messageid: <1505549681.77.0.497304254315.issue31484@psf.upfronthosting.co.za>
2017-09-16 08:14:41  serhiy.storchaka  link    issue31484 messages
2017-09-16 08:14:41  serhiy.storchaka  create