Message 129283 - Python tracker

Author	belopolsky
Recipients	belopolsky, eric.araujo, ezio.melotti, jcea, lemburg, sdaoden, vstinner
Date	2011-02-24.16:30:02
Message-id	<AANLkTika1J3zUHpfxGVHU_mx81tH6tpQBu+Y9bJt-GAm@mail.gmail.com>
In-reply-to	<4D6680ED.5090305@egenix.com>

Content

On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
..
> On this ticket, we're discussing just one application area: that
> of the builtin short cuts.
>
Fair enough.  I was hoping to close this ticket by simply committing
the posted patch, but it looks like people want to do more.  I don't
think we'll get measurable performance gains, but we may improve code
readability.

> To have more encoding name variants benefit from the optimization,
> we might want to enhance that particular normalization function
> to avoid having to compare against "utf8" and "utf-8" in the
> encode/decode functions.

Which function are you talking about?

1. normalize_encoding() in unicodeobject.c
2. normalizestring() in codecs.c

The first is s.lower().replace('-', '_') and the second is
s.lower().replace(' ', '_'). (Note space vs. dash difference.)

Why do we need both?  And why should they be different?
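
To make the space vs. dash difference concrete, here is a minimal Python
sketch of the two normalizations exactly as they are described above.  The
names normalize_dash() and normalize_space() are hypothetical stand-ins for
illustration only; they are not the C functions themselves, and the sketch
assumes the one-line descriptions above capture the behavior.

```python
# Illustration only: these mirror the two one-line descriptions in the
# message, not the actual C code in unicodeobject.c or codecs.c.

def normalize_dash(name: str) -> str:
    """First variant as described: lowercase, then '-' becomes '_'."""
    return name.lower().replace('-', '_')


def normalize_space(name: str) -> str:
    """Second variant as described: lowercase, then ' ' becomes '_'."""
    return name.lower().replace(' ', '_')


if __name__ == "__main__":
    # The same spelling can normalize differently under each variant.
    for name in ("UTF-8", "utf 8", "Latin-1", "ISO 8859-1"):
        print(f"{name!r}: dash variant -> {normalize_dash(name)!r}, "
              f"space variant -> {normalize_space(name)!r}")
```

Under these assumptions the two variants disagree on a spelling such as
"UTF-8": one maps the dash to an underscore and the other leaves it alone,
so a shortcut comparison written for one normalized form would not match
the other.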

History
Date	User	Action	Args
2011-02-24 16:30:03	belopolsky	set	recipients: + belopolsky, lemburg, jcea, vstinner, ezio.melotti, eric.araujo, sdaoden
2011-02-24 16:30:02	belopolsky	link	issue11303 messages
2011-02-24 16:30:02	belopolsky	create