Message 66418 - Python tracker

Author	rsc
Recipients	belopolsky, benjamin.peterson, donlorenzo, rsc, zanella
Date	2008-05-08.14:36:06
Message-id	<20080508143906.D71FF1E8C55@holo.morphisms.net>
In-reply-to	<1210255709.4.0.944861854389.issue2650@psf.upfronthosting.co.za>

Content

> Lorenz's patch uses a set, not a list for special characters.  Set 
> lookup is as fast as dict lookup, but a set takes less memory because it 
> does not have to store dummy values.  More importantly, use of frozenset 
> instead of dict makes the code clearer.  On the other hand, I would 
> simply use a string.  For a dozen entries, hash lookup does not buy you 
> much.
> 
> Another nit: why use "\\%c" % (c) instead of obvious "\\" + c?
> 
> Finally, you can eliminate use of index and a temporary list altogether 
> by using a generator expression:
> 
> ''.join(("\\" + c if c in _special else '\\000' if c == "\000" else c)
>         for c in pattern)
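
For reference, the quoted generator-expression suggestion can be sketched as a standalone function. This is only an illustration: the contents of `_special` below are an assumption (the patch's actual character set is not shown in this message), and the function name `escape` is hypothetical.

```python
# Assumed set of regex metacharacters; the real patch's set is not shown here.
_special = frozenset(".^$*+?{}[]\\|()")


def escape(pattern):
    """Backslash-escape characters in _special; NUL becomes the text \\000."""
    return "".join(
        # Chained conditional: special char, then NUL, then everything else.
        "\\" + c if c in _special else "\\000" if c == "\000" else c
        for c in pattern
    )


print(escape("a_b.c"))  # prints a_b\.c (underscore is left alone)
```

Under that assumed set, the underscore passes through unescaped, which is the behaviour this issue asks for.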

The title of this issue (#2650) is "re.escape should not escape underscore",
not "re.escape is too slow and too easy to read".

If you have an actual, measured performance problem with re.escape,
please open a new issue with numbers to back it up. 
That's not what this one is about.
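
One way to gather such numbers is the standard `timeit` module. The sketch below is an assumption about how a measurement might be set up, not something from this thread; the sample pattern and repeat counts are arbitrary.

```python
import re
import timeit

# Arbitrary sample input, chosen only for illustration.
pattern = "a_b.c*d" * 100

# Time 1000 calls of re.escape, repeat three times, and keep the best run.
best = min(timeit.repeat(lambda: re.escape(pattern), number=1000, repeat=3))
print("re.escape: %.6f s per 1000 calls" % best)
```

Running the same harness against a candidate replacement would give figures that can be compared directly.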

Thanks.
Russ

History
Date	User	Action	Args
2008-05-08 14:36:11	rsc	set	spambayes_score: 0.00134075 -> 0.001340745 recipients: + rsc, belopolsky, benjamin.peterson, zanella, donlorenzo
2008-05-08 14:36:09	rsc	link	issue2650 messages
2008-05-08 14:36:06	rsc	create