
Author lemburg
Recipients belopolsky, eric.araujo, ezio.melotti, jcea, lemburg, sdaoden, vstinner
Date 2011-02-24.16:35:37
Message-id <4D6688D8.1020500@egenix.com>
In-reply-to <AANLkTika1J3zUHpfxGVHU_mx81tH6tpQBu+Y9bJt-GAm@mail.gmail.com>
Content
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
> ..
>> On this ticket, we're discussing just one application area: that
>> of the builtin shortcuts.
>>
> Fair enough.  I was hoping to close this ticket by simply committing
> the posted patch, but it looks like people want to do more.  I don't
> think we'll get measurable performance gains but may improve code
> understandability.
> 
>> To have more encoding name variants benefit from the optimization,
>> we might want to enhance that particular normalization function
>> to avoid having to compare against "utf8" and "utf-8" in the
>> encode/decode functions.
> 
> Which function are you talking about?
> 
> 1. normalize_encoding() in unicodeobject.c
> 2. normalizestring() in codecs.c

The first one, since that's being used by the shortcuts.

> The first is s.lower().replace('-', '_') and the second is

It does this: s.lower().replace('_', '-')

> s.lower().replace(' ', '_'). (Note space vs. dash difference.)
> 
> Why do we need both?  And why should they be different?

Because the first is specifically used for the shortcuts
(which can do more without breaking anything, since it's
only used internally) and the second prepares the encoding
names for lookup in the codec registry (whose behavior is defined
by PEP 100 and cannot easily be changed).
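
To make the difference concrete, here is a rough Python sketch of the
two normalizations as described in this thread. The function names
below are hypothetical stand-ins; the real functions are written in C
and deal with details (buffer sizes, non-ASCII input) that this sketch
ignores.

```python
def shortcut_normalize(name):
    """Sketch of normalize_encoding() in unicodeobject.c as described
    above: lower-case the name and turn underscores into dashes.
    Hypothetical pure-Python stand-in, not the actual C code."""
    return name.lower().replace('_', '-')


def registry_normalize(name):
    """Sketch of normalizestring() in codecs.c as described above:
    lower-case the name and turn spaces into underscores.
    Hypothetical pure-Python stand-in, not the actual C code."""
    return name.lower().replace(' ', '_')


for raw in ("UTF_8", "UTF-8", "Latin 1"):
    print(repr(raw), shortcut_normalize(raw), registry_normalize(raw))
```

With the first form, both "UTF_8" and "UTF-8" collapse to the same
spelling, so the shortcut code would only need to compare against one
dashed variant; a name such as "utf8" would still need its own check.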
History
Date User Action Args
2011-02-24 16:35:39 lemburg set recipients: + lemburg, jcea, belopolsky, vstinner, ezio.melotti, eric.araujo, sdaoden
2011-02-24 16:35:37 lemburg link issue11303 messages
2011-02-24 16:35:37 lemburg create