classification
Title: re.sub replaces only first 32 matches with re.U flag
Type: security Stage:
Components: Regular Expressions Versions: Python 2.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Eugene.Morozov, SilentGhost
Priority: normal Keywords:

Created on 2011-02-20 22:28 by Eugene.Morozov, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (2)
msg128923 - Author: Eugene Morozov (Eugene.Morozov) Date: 2011-02-20 22:28
There's a peculiar and hard-to-find bug in the re.sub function. Try the following example:
>>> import re
>>> text = 'X'*4096
>>> nt = re.sub(u"XX", u".", text, re.U)
>>> nt
u'............XXXXXXXXXXXXXXXXXXX' (only 32 dots, the rest of the string is not changed).

If I first compile the regexp and then call compiled_regexp.sub, everything seems to work correctly.
msg128925 - Author: SilentGhost (SilentGhost) * (Python triager) Date: 2011-02-20 23:01
If you read the docs carefully, you'll notice that re.sub doesn't accept a flags argument. Its fourth argument is count, and the numerical value of re.U is 32.

Closing as invalid. There are some duplicates too, I'm sure.
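
The explanation above can be checked with a short sketch. It is written for Python 3 rather than the Python 2.6 of the report, and it assumes a version where re.sub accepts a flags keyword (to my knowledge, 2.7 and 3.1 onwards). The variable names are illustrative only.

```python
import re

text = "X" * 4096

# re.U is an integer flag; its numeric value is 32.
flag_value = int(re.U)

# What the reported call effectively did: the flag landed in the `count`
# slot, so only the first 32 matches were replaced.
capped = re.sub("XX", ".", text, count=flag_value)

# Passing the flag by keyword replaces every match.
full = re.sub("XX", ".", text, flags=re.U)

# Compiling first also works, as the reporter observed, because the flag
# is given to re.compile and not to sub.
compiled = re.compile("XX", re.U).sub(".", text)

print(flag_value, capped.count("."), full.count("."))
```

Naming the arguments (count=..., flags=...) avoids this class of mistake, since a flag can no longer slip silently into the count position.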
History
Date                 User            Action  Args
2022-04-11 14:57:13  admin           set     github: 55471
2011-02-20 23:01:03  SilentGhost     set     status: open -> closed
                                             nosy: + SilentGhost
                                             messages: + msg128925
                                             resolution: not a bug
2011-02-20 22:28:09  Eugene.Morozov  create