Title: Problem with regular expression
Type: behavior Stage:
Components: Regular Expressions Versions: Python 2.5
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: carlosklock, vstinner
Priority: normal Keywords:

Created on 2008-10-28 15:46 by carlosklock, last changed 2008-10-28 16:55 by georg.brandl. This issue is now closed.

Messages (3)
msg75286 - Author: Carlos Eduardo Klock (carlosklock) Date: 2008-10-28 15:46

I am having a weird problem with a regex. I am trying to get the tokens
that match the pattern below, but it fails in one specific case. I run it
over many lines of text and it works, except for the string
'1214578800'.

Any idea what is happening? Is it a problem in my code or a bug in
regular expressions?

Thanks for any help,


import re
r =
text =
timestamps = r.findall(text)
print timestamps

Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on Trabalho15, Standard
>>> ['1211641200', '1214662622']
msg75287 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-28 15:55
It's not a Python bug: your regex is wrong. When the regex 
finds '1211641200', it reads >,'1211641200',< including the last 
comma. So the cursor will be at the apostrophe before 1214578800, and 
the comma your pattern needs in front of that token has already been 
consumed. That is why '1214578800' is skipped.

You have to change your regex so that it does not consume the comma, or 
use zero-width lookaround assertions like (?<=,) and (?=[,)]).
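
A minimal sketch of the difference, under stated assumptions: the original pattern and text are not shown in this report, so the sample text and the "consuming" pattern below are hypothetical stand-ins with the same shape as the description (a leading comma and a trailing [,|)] that are both consumed).

```python
import re

# Hypothetical input; the real text from the report is not available.
text = "(42,'1211641200','1214578800','1214662622')"

# Assumed shape of the reported pattern: the comma before the token and the
# delimiter after it are both consumed by each match.
consuming = re.compile(r",'(\d+)'[,|)]")

# Lookaround version: the delimiters are checked but not consumed, so the
# comma after one token is still available as the comma before the next.
lookaround = re.compile(r"(?<=,)'(\d+)'(?=[,)])")

skipped = consuming.findall(text)
fixed = lookaround.findall(text)

print(skipped)  # the middle timestamp is missing
print(fixed)    # all three timestamps are found
```

With this sample text, the consuming pattern drops '1214578800' in the same way as the output in the report, while the lookaround pattern returns all three tokens.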

Note: you're using [,|)] which matches >,<, >|<, and >)<. I guess that 
you wanted [,)].
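
A short check of that note, as a sketch: inside a character class the | is a literal pipe, not an alternation operator. The sample string "a,b|c)" is made up for illustration.

```python
import re

# Inside [...] the pipe is an ordinary character, so this class has three members.
with_pipe = re.compile(r"[,|)]")
# This class matches only a comma or a closing parenthesis.
without_pipe = re.compile(r"[,)]")

sample = "a,b|c)"  # hypothetical sample string
pipe_hits = with_pipe.findall(sample)
plain_hits = without_pipe.findall(sample)

print(pipe_hits)   # [',', '|', ')']
print(plain_hits)  # [',', ')']
```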
msg75288 - Author: Carlos Eduardo Klock (carlosklock) Date: 2008-10-28 16:04
Sorry, it is really a problem with the comma.

Thanks for helping! :)
Date                 User          Action  Args
2008-10-28 16:55:53  georg.brandl  set     status: open -> closed
2008-10-28 16:04:39  carlosklock   set     messages: + msg75288
2008-10-28 15:55:08  vstinner      set     resolution: not a bug; messages: + msg75287; nosy: + vstinner
2008-10-28 15:46:12  carlosklock   create