classification
Title: re.sub replaces only first 32 matches with re.U flag
Type: security Stage:
Components: Regular Expressions Versions: Python 2.6
process
Status: closed Resolution: invalid
Dependencies: Superseder:
Assigned To: Nosy List: Eugene.Morozov, SilentGhost
Priority: normal Keywords:

Created on 2011-02-20 22:28 by Eugene.Morozov, last changed 2011-02-20 23:01 by SilentGhost. This issue is now closed.

Messages (2)
msg128923 - Author: Eugene Morozov (Eugene.Morozov) Date: 2011-02-20 22:28
There's a peculiar and difficult-to-find bug in the re.sub function. Try the following example:
>>> import re
>>> text = 'X'*4096
>>> nt = re.sub(u"XX", u".", text, re.U)
>>> nt
u'............XXXXXXXXXXXXXXXXXXX' (only 32 dots, the rest of the string is not changed).

If I first compile the regexp and then call compiled_regexp.sub, everything seems to work correctly.
msg128925 - Author: SilentGhost (SilentGhost) Date: 2011-02-20 23:01
If you read the docs carefully, you'll notice that re.sub doesn't accept a flags argument. Its 4th argument is count, and the numerical value of re.U is 32, so the call above replaces at most 32 matches.

Closing as invalid. There are some duplicates too, I'm sure.
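
For readers landing here, the explanation above can be illustrated with a minimal sketch. It assumes a Python version where re.sub accepts a flags keyword (2.7+ or 3.1+, which is newer than the 2.6 this report was filed against). The variable names are illustrative only, and count is passed by keyword here to make the equivalence with the positional call explicit.

```python
import re

text = "X" * 4096

# re.U (re.UNICODE) is an integer flag; its numeric value is 32.
flag_value = int(re.U)

# Passing re.U as the 4th positional argument of re.sub is therefore
# the same as asking for at most 32 replacements:
limited = re.sub("XX", ".", text, count=flag_value)

# Workaround from the report: compile with the flag, then call .sub().
compiled = re.compile("XX", re.U)
full = compiled.sub(".", text)

# Assumption: Python 2.7+ / 3.1+, where re.sub takes a flags keyword.
full_kw = re.sub("XX", ".", text, flags=re.U)

print(flag_value, limited.count("."), full.count("."), full_kw.count("."))
```

Passing flags by keyword, or compiling the pattern first, avoids the flag being silently interpreted as count.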
History
Date User Action Args
2011-02-20 23:01:03 SilentGhost set status: open -> closed; nosy: + SilentGhost; messages: + msg128925; resolution: invalid
2011-02-20 22:28:09 Eugene.Morozov create