
Author serhiy.storchaka
Recipients benjamin.peterson, eryksun, ezio.melotti, larry, lemburg, pitrou, random832, serhiy.storchaka, steven.daprano, terry.reedy, vstinner, Árpád Kósa
Date 2015-11-24.09:07:54
Message-id <1448356074.5.0.156865369935.issue25709@psf.upfronthosting.co.za>
Content
> Why do strings cache their UTF-8 encoding?

Mainly for compatibility with the existing C API. The common way to parse arguments in a function implemented in C is to use the special argument-parsing API: PyArg_ParseTuple, PyArg_ParseTupleAndKeywords, or PyArg_Parse. Most format codes for Unicode strings return a C pointer to a char array. For that to work, the encoded form of the string has to be kept somewhere at least for as long as the C function is executing. Since the PyArg_Parse* functions don't allow the caller to supply storage for the encoded string, it has to be saved in the Unicode object itself. This is not new in PEP 393 or Python 3: in Python 2, Unicode objects also keep a cached encoded version.
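The caching can be observed from Python itself. The following is only a rough sketch, assuming CPython 3.3 or later, where ctypes.pythonapi is available and PyUnicode_AsUTF8 is exported (as far as I know, that function is what backs the "s" format code). The names as_utf8, same_buffer and decoded are illustrative only. Two calls on the same string should hand back the same pointer, because the encoded buffer is stored in the Unicode object and not allocated per call.

```python
import ctypes

# PyUnicode_AsUTF8 returns a pointer to the UTF-8 form cached inside the
# str object; the caller does not own or free that buffer.
as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8
as_utf8.argtypes = [ctypes.py_object]
as_utf8.restype = ctypes.c_void_p  # keep the raw address, do not copy

s = "caf\u00e9 \u20ac"  # non-ASCII, so the UTF-8 form differs from the internal one

p1 = as_utf8(s)
p2 = as_utf8(s)

# Same address both times: the second call reuses the cached buffer.
same_buffer = (p1 == p2)

# The buffer is NUL-terminated, so string_at can read it like a C string.
decoded = ctypes.string_at(p1)

print(same_buffer)
print(decoded == s.encode("utf-8"))
```

The pointer stays valid only as long as the string object is alive, which is exactly why the encoded form has to live in the object and not in some temporary.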