classification
Title: Bad parsing of compiling regex with re.MULTILINE
Type: behavior
Stage:
Components: Regular Expressions
Versions: Python 2.4, Python 2.5

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: misha, pitrou
Priority: normal
Keywords:

Created on 2008-08-18 12:02 by misha, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg71323 - Author: Misha Seltzer (misha) Date: 2008-08-18 12:02
>>> import re
>>> regex = r"[\w]+"

# Normal behaviour:
>>> re.findall(regex, "hello world", re.M)
['hello', 'world']
>>> re.compile(regex).findall("hello world")
['hello', 'world']

# Bug behaviour:
>>> re.compile(regex).findall("hello world", re.M)
['rld']
msg71326 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-08-18 13:25
The re.M flag is an attribute of the compiled pattern, and as such it
must be passed to compile(), not to findall(). 

These all work:

>>> re.compile(r"[a-z]+").findall("hello world")
['hello', 'world']
>>> re.compile(r"[a-z]+", re.M).findall("hello world")
['hello', 'world']
>>> re.compile(r"(?m)[a-z]+").findall("hello world")
['hello', 'world']
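
The pattern and string above contain no anchors or newlines, so re.M makes no visible difference there. As a rough sketch of what the flag changes once it is passed to compile(), here is a variant with a `^` anchor and a two-line string (both are assumptions added for illustration, not part of the original report):

```python
import re

# Hypothetical two-line input, chosen only to make the flag's effect visible.
text = "hello\nworld"

# Without re.M, ^ matches only at the very start of the string.
default = re.compile(r"^[a-z]+").findall(text)

# With re.M passed to compile(), ^ also matches right after each newline.
multiline = re.compile(r"^[a-z]+", re.M).findall(text)

# The inline (?m) form should behave the same as passing re.M to compile().
inline = re.compile(r"(?m)^[a-z]+").findall(text)

print(default)    # ['hello']
print(multiline)  # ['hello', 'world']
print(inline)     # ['hello', 'world']
```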

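The second argument to the findall() method of compiled pattern objects
is pos, the index in the string at which the search starts (see
http://docs.python.org/lib/re-objects.html). It is not a flags argument,
so re.M is interpreted as a plain integer. This explains the behaviour
you are witnessing: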

>>> re.M
8
>>> re.compile(r"[a-z]+").findall("hello world", 8)
['rld']
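
The explanation above can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and the comparison with a sliced string holds for this simple pattern but not in general for patterns that use anchors or lookbehind.

```python
import re

pattern = re.compile(r"[a-z]+")
text = "hello world"

# The second positional argument of a compiled pattern's findall() is pos.
from_pos_8 = pattern.findall(text, 8)

# re.M has the integer value 8, so passing it here is taken as pos=8.
from_flag = pattern.findall(text, int(re.M))

# For this anchor-free pattern, starting at pos=8 gives the same result
# as searching the tail of the string.
from_slice = pattern.findall(text[8:])

# Flags belong in compile(); findall() then scans the whole string.
with_flag = re.compile(r"[a-z]+", re.M).findall(text)

print(from_pos_8)  # ['rld']
print(from_flag)   # ['rld']
print(from_slice)  # ['rld']
print(with_flag)   # ['hello', 'world']
```

The module-level re.findall(pattern, string, flags) does take flags as its third argument, which is why the first call in the original report worked as expected.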
History
Date                 User    Action  Args
2022-04-11 14:56:37  admin   set     github: 47837
2008-08-18 13:25:09  pitrou  set     status: open -> closed; resolution: not a bug; messages: + msg71326; nosy: + pitrou
2008-08-18 12:02:22  misha   create