
Author mrabarnett
Recipients Arfrever, ezio.melotti, mrabarnett, pitrou, pyos, serhiy.storchaka, vstinner
Date 2012-12-16.00:33:54
Message-id <1355618034.21.0.0729645092194.issue16688@psf.upfronthosting.co.za>
In-reply-to
Content
I found another bug while looking through the source.

On line 495 in function SRE_COUNT:

    if (maxcount < end - ptr && maxcount != 65535)
        end = ptr + maxcount*state->charsize;

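where 'end' and 'ptr' are of type 'char*'. That means that 'end - ptr' is the length in _bytes_, not characters, while 'maxcount' is a number of characters. When state->charsize is greater than 1 the comparison can succeed even though fewer than 'maxcount' characters remain, and 'end' is then moved past the real end of the string.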

If the byte after the end of the string is 0 then you get this:

>>> # Good:
>>> re.search(r"\x00{1,3}", "a\x00\x00").span()
(1, 3)
>>> # Bad:
>>> re.search(r"\x00{1,3}", "\u0100\x00\x00").span()
(1, 4)
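
To make the arithmetic behind the two searches above concrete, here is a minimal Python model of the quoted clamp, working in byte offsets the way the 'char*' pointers do. It is only a sketch: the function names and the UNBOUNDED constant are hypothetical, it assumes "\u0100\x00\x00" is stored with 2 bytes per character, and the second function is one possible correction rather than a committed patch.

```python
# Hypothetical model of the clamp in SRE_COUNT. All offsets are in bytes,
# mirroring the char* arithmetic. This is not the real _sre.c code.
UNBOUNDED = 65535  # illustrative name for the literal in the quoted condition


def clamp_end_buggy(ptr, end, maxcount, charsize):
    """Mirror the quoted code: a character count is compared to a byte length."""
    if maxcount < end - ptr and maxcount != UNBOUNDED:
        end = ptr + maxcount * charsize
    return end


def clamp_end_in_chars(ptr, end, maxcount, charsize):
    """Assumed correction: convert the remaining length to characters first."""
    if maxcount < (end - ptr) // charsize and maxcount != UNBOUNDED:
        end = ptr + maxcount * charsize
    return end


# "a\x00\x00": 1 byte per character, 3-byte buffer, scan starts at character 1.
good = clamp_end_buggy(ptr=1, end=3, maxcount=3, charsize=1)

# "\u0100\x00\x00": assumed 2 bytes per character, 6-byte buffer,
# scan starts at character 1 (byte offset 2).
bad = clamp_end_buggy(ptr=2, end=6, maxcount=3, charsize=2)
fixed = clamp_end_in_chars(ptr=2, end=6, maxcount=3, charsize=2)

print(good)   # 3: end unchanged, still inside the buffer
print(bad)    # 8: two bytes (one character) past the 6-byte buffer
print(fixed)  # 6: end unchanged
```

In the wide case the buggy clamp puts 'end' one character beyond the string, so a zero there is counted as a third match, which is consistent with the span (1, 4).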

I'll keep looking before submitting a patch.
History
Date                 User        Action  Args
2012-12-16 00:33:54  mrabarnett  set     recipients: + mrabarnett, pitrou, vstinner, ezio.melotti, Arfrever, serhiy.storchaka, pyos
2012-12-16 00:33:54  mrabarnett  set     messageid: <1355618034.21.0.0729645092194.issue16688@psf.upfronthosting.co.za>
2012-12-16 00:33:54  mrabarnett  link    issue16688 messages
2012-12-16 00:33:54  mrabarnett  create