Message177573
I found another bug while looking through the source.
On line 495 in function SRE_COUNT:
    if (maxcount < end - ptr && maxcount != 65535)
        end = ptr + maxcount*state->charsize;
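where 'end' and 'ptr' are of type 'char*'. That means that 'end - ptr' is the length in _bytes_, not characters, while 'maxcount' is a count of characters, so the comparison mixes units whenever charsize is greater than 1.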
If the byte after the end of the string is 0 then you get this:
>>> # Good:
>>> re.search(r"\x00{1,3}", "a\x00\x00").span()
(1, 3)
>>> # Bad:
>>> re.search(r"\x00{1,3}", "\u0100\x00\x00").span()
(1, 4)
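The arithmetic behind the two results can be modelled in a few lines of Python. This is only a sketch of the clamp, not the actual _sre code: the function names and the integer byte offsets standing in for the 'char*' pointers are hypothetical, and it assumes "\u0100\x00\x00" is stored with 2 bytes per character. The "fixed" variant is one possible correction (converting the remaining length to characters before comparing), not necessarily the patch that will be submitted.

```python
def clamp_end_buggy(ptr, end, maxcount, charsize):
    """Model of the quoted check: compares a character count to a byte length.

    ptr and end are byte offsets standing in for the C char* pointers
    (hypothetical model, not the real _sre data structures).
    """
    if maxcount < end - ptr and maxcount != 65535:
        end = ptr + maxcount * charsize
    return end


def clamp_end_fixed(ptr, end, maxcount, charsize):
    """Possible correction (an assumption): compare in characters, not bytes."""
    if maxcount < (end - ptr) // charsize and maxcount != 65535:
        end = ptr + maxcount * charsize
    return end


# "a\x00\x00": 1 byte per character, matching starts at character 1.
good_buggy = clamp_end_buggy(ptr=1, end=3, maxcount=3, charsize=1)

# "\u0100\x00\x00": assumed 2 bytes per character, so the buffer is 6 bytes
# long and matching starts at byte offset 2 (character 1).
bad_buggy = clamp_end_buggy(ptr=2, end=6, maxcount=3, charsize=2)
bad_fixed = clamp_end_fixed(ptr=2, end=6, maxcount=3, charsize=2)

print(good_buggy, bad_buggy, bad_fixed)
```

Under these assumptions the 1-byte case leaves 'end' at 3, while the 2-byte case moves 'end' from 6 to 8, i.e. one character past the end of the buffer, which is consistent with the (1, 4) span above. Comparing in characters leaves 'end' at 6.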
I'll keep looking before submitting a patch.
Date | User | Action | Args
2012-12-16 00:24:21 | mrabarnett | set | recipients: + mrabarnett, pitrou, vstinner, ezio.melotti, Arfrever, serhiy.storchaka, pyos
2012-12-16 00:24:21 | mrabarnett | set | messageid: <1355617461.61.0.11376930095.issue16688@psf.upfronthosting.co.za>
2012-12-16 00:24:21 | mrabarnett | link | issue16688 messages
2012-12-16 00:24:21 | mrabarnett | create |