Author vbr
Recipients akitada, akuchling, amaury.forgeotdarc, brian.curtin, collinwinter, ezio.melotti, georg.brandl, gregory.p.smith, jaylogan, jhalcrow, jimjjewett, loewis, mark, moreati, mrabarnett, nneonneo, pitrou, r.david.murray, rsc, sjmachin, timehorse, vbr
Date 2010-07-19.01:37:20
Message-id <1279503442.5.0.702153192921.issue2636@psf.upfronthosting.co.za>
Content
Thanks for the update.
Just a small observation about character ranges combined with ignorecase; it is probably irrelevant, but regex differs from the current re here:

>>> import re, regex
>>> zero2z = u"0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz"

>>> re.findall("(?i)[X-d]", zero2z)
[]

>>> regex.findall("(?i)[X-d]", zero2z)
[u'A', u'B', u'C', u'D', u'X', u'Y', u'Z', u'[', u'\\', u']', u'^', u'_', u'`', u'a', u'b', u'c', u'd', u'x', u'y', u'z']

>>> re.findall("(?i)[B-d]", zero2z)
[u'B', u'C', u'D', u'b', u'c', u'd']

>>> regex.findall("(?i)[B-d]", zero2z)
[u'A', u'B', u'C', u'D', u'E', u'F', u'G', u'H', u'I', u'J', u'K', u'L', u'M', u'N', u'O', u'P', u'Q', u'R', u'S', u'T', u'U', u'V', u'W', u'X', u'Y', u'Z', u'[', u'\\', u']', u'^', u'_', u'`', u'a', u'b', u'c', u'd', u'e', u'f', u'g', u'h', u'i', u'j', u'k', u'l', u'm', u'n', u'o', u'p', u'q', u'r', u's', u't', u'u', u'v', u'w', u'x', u'y', u'z']

It seems that the re module is somehow building the character set from a case-insensitive "alphabet".

I guess the behaviour of re is buggy here, while regex is OK (tested on Python 2.7, Win XPp).
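
To make the difference concrete, here is a minimal sketch of two possible readings of a case-insensitive range. Both functions are hypothetical models written for illustration only; they do not use re or regex, and they are not taken from either implementation. `match_any_case` accepts a character if any of its case variants lies between the two endpoints, and `match_lowercased_bounds` is only a guess at the "alphabet" idea: it lowercases both endpoints and each character before comparing.

```python
# Test string from the message: every ASCII character from "0" to "z".
ZERO2Z = "".join(chr(c) for c in range(ord("0"), ord("z") + 1))


def match_any_case(lo, hi, text):
    """Hypothetical model: keep c if any case variant of c lies in [lo, hi]."""
    return [c for c in text
            if any(lo <= v <= hi for v in (c, c.lower(), c.upper()))]


def match_lowercased_bounds(lo, hi, text):
    """Hypothetical model: lowercase both endpoints and each character first."""
    lo, hi = lo.lower(), hi.lower()
    # If the lowercased endpoints are reversed ("x" > "d"), nothing can match.
    return [c for c in text if lo <= c.lower() <= hi]


if __name__ == "__main__":
    for lo, hi in (("X", "d"), ("B", "d")):
        print(lo, hi, match_lowercased_bounds(lo, hi, ZERO2Z))
        print(lo, hi, match_any_case(lo, hi, ZERO2Z))
```

With these assumptions, `match_lowercased_bounds` returns the same lists as the re results above (an empty list for X-d, and only B, C, D, b, c, d for B-d), while `match_any_case` returns the same lists as the regex results.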

vbr