Message 110704 - Python tracker

Author	vbr
Recipients	akitada, akuchling, amaury.forgeotdarc, brian.curtin, collinwinter, ezio.melotti, georg.brandl, gregory.p.smith, jaylogan, jhalcrow, jimjjewett, loewis, mark, moreati, mrabarnett, nneonneo, pitrou, r.david.murray, rsc, sjmachin, timehorse, vbr
Date	2010-07-19.01:37:20
Message-id	<1279503442.5.0.702153192921.issue2636@psf.upfronthosting.co.za>
In-reply-to

Content

Thanks for the update.
Just a small observation about character ranges combined with ignorecase; it is probably minor, but it is a difference from the current re module anyway:

>>> zero2z = u"0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz"

>>> re.findall("(?i)[X-d]", zero2z)
[]

>>> regex.findall("(?i)[X-d]", zero2z)
[u'A', u'B', u'C', u'D', u'X', u'Y', u'Z', u'[', u'\\', u']', u'^', u'_', u'`', u'a', u'b', u'c', u'd', u'x', u'y', u'z']

>>> re.findall("(?i)[B-d]", zero2z)
[u'B', u'C', u'D', u'b', u'c', u'd']

>>> regex.findall("(?i)[B-d]", zero2z)
[u'A', u'B', u'C', u'D', u'E', u'F', u'G', u'H', u'I', u'J', u'K', u'L', u'M', u'N', u'O', u'P', u'Q', u'R', u'S', u'T', u'U', u'V', u'W', u'X', u'Y', u'Z', u'[', u'\\', u']', u'^', u'_', u'`', u'a', u'b', u'c', u'd', u'e', u'f', u'g', u'h', u'i', u'j', u'k', u'l', u'm', u'n', u'o', u'p', u'q', u'r', u's', u't', u'u', u'v', u'w', u'x', u'y', u'z']

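It seems that the re module builds the character set using a case-insensitive "alphabet" in some way: for [X-d] it matches nothing at all, and for [B-d] it matches only the letters B to D in both cases, dropping E to Z and the punctuation that lies between Z and a.

I guess the behaviour of re is buggy here, while regex is OK (tested on Python 2.7, Win XP).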
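To make the expected result concrete, here is a minimal sketch of how I read a case-insensitive range: a character matches [lo-hi] if the character itself, its lowercase form, or its uppercase form falls inside the range by code point. The helper name ci_range_findall is hypothetical and only for illustration; it is not part of re or regex, and it ignores multi-character case mappings.

```python
def ci_range_findall(lo, hi, text):
    """Return the characters of `text` that a case-insensitive [lo-hi] should match.

    Hypothetical reference helper: a character matches if it, its lowercase
    form, or its uppercase form lies in the code point range lo..hi.
    """
    def in_range(ch):
        return lo <= ch <= hi

    return [ch for ch in text
            if in_range(ch) or in_range(ch.lower()) or in_range(ch.upper())]


# Same test string as above: every character from "0" to "z".
zero2z = "".join(chr(cp) for cp in range(ord("0"), ord("z") + 1))

print("".join(ci_range_findall("X", "d", zero2z)))
# ABCDXYZ[\]^_`abcdxyz

print("".join(ci_range_findall("B", "d", zero2z)))
# ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
```

Under this reading, both lists agree with the regex results shown above, and the re results are missing characters.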

vbr

History
Date	User	Action	Args
2010-07-19 01:37:23	vbr	set	recipients: + vbr, loewis, akuchling, georg.brandl, collinwinter, gregory.p.smith, jimjjewett, sjmachin, amaury.forgeotdarc, pitrou, nneonneo, rsc, timehorse, mark, ezio.melotti, mrabarnett, jaylogan, akitada, moreati, r.david.murray, brian.curtin, jhalcrow
2010-07-19 01:37:22	vbr	set	messageid: <1279503442.5.0.702153192921.issue2636@psf.upfronthosting.co.za>
2010-07-19 01:37:21	vbr	link	issue2636 messages
2010-07-19 01:37:20	vbr	create