
Author johnsonm
Recipients adi, akuchling, effbot, ezio.melotti, gpolo, greg@gregdetre.co.uk, gvanrossum, johnsonm, kristall, mathieu.clabaut, ostkamp, pitrou, rsc, timehorse
Date 2009-11-16.01:13:24
Message-id <1258334008.56.0.180475284719.issue1160@psf.upfronthosting.co.za>
Content
The test case at the top of this issue reproduces just fine; if you are
looking for a different test case you'll have to specify what you don't
like about it so that it's clear what you are looking for.

I don't think there's any mystery about this issue; it seems perfectly
well understood.  I commented merely to suggest sets as an alternative
for others who hit the same case I did: using a regular expression to
match a candidate string against a large set of exact matches.

I was using a regular expression because the interface I was working
with was originally designed for small, hand-written regular
expressions.  It was later wrapped in code that generates regular
expressions automatically, though it is still also used with
hand-written ones.  That's why the interface was not built around sets,
and why it was not appropriate to simply change it to use sets.
However, my interface also accepts a callable that resolves the object
at the time of use, so I merely passed a reference to a method that
returns an object derived from set which implements the match and
search methods.

If you REALLY want a simpler reproducer, this does it for me in the
restricted case (i.e., not using UCS4 encoding):
 import re
 # Alternation of 7000 exact-match branches: '0|1|2|...|6999'
 r = re.compile('|'.join('%d' % x for x in range(7000)))

But I really don't think that additional test cases are a barrier here.

Again, my goal was merely to suggest an easy way to use sets as a
replacement for machine-generated regexps that are intended to match
against exact strings: subclass set and add the necessary methods, such
as search and/or match.
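
A minimal sketch of that suggestion might look like the following.  It is
not the code I actually used; the class name ExactMatchSet is made up for
illustration, and it assumes callers only test the result of match or
search for truth and that "match" means the whole candidate string equals
one of the members (a real alternation regexp would also match prefixes).

```python
class ExactMatchSet(set):
    """A set of exact strings that mimics a small part of the
    compiled-pattern interface (hypothetical name, illustration only)."""

    def match(self, candidate):
        # Whole-string membership test.  Returns True or None rather
        # than a real match object, so group() and friends are not
        # available to callers.
        return True if candidate in self else None

    # Under whole-string semantics, search is the same test as match.
    search = match


# Same 7000 alternatives as the reproducer above, but held in a set
# instead of being compiled into one large regular expression.
names = ExactMatchSet('%d' % x for x in range(7000))

if names.match('6999'):
    print('6999 is one of the exact matches')
```

Membership in a set is a hash lookup, so this also avoids building a very
large pattern at all, at the cost of not supporting anything beyond exact
matches.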
History
Date  User  Action  Args
2009-11-16 01:13:29  johnsonm  set  recipients: + johnsonm, gvanrossum, effbot, akuchling, pitrou, ostkamp, rsc, timehorse, mathieu.clabaut, gpolo, ezio.melotti, greg@gregdetre.co.uk, adi, kristall
2009-11-16 01:13:28  johnsonm  set  messageid: <1258334008.56.0.180475284719.issue1160@psf.upfronthosting.co.za>
2009-11-16 01:13:25  johnsonm  link  issue1160 messages
2009-11-16 01:13:24  johnsonm  create