Problem with regular expression #48469

Closed
carlosklock mannequin opened this issue Oct 28, 2008 · 3 comments
Labels
topic-regex type-bug An unexpected behavior, bug, or error

Comments

carlosklock mannequin commented Oct 28, 2008

BPO 4219
Nosy @vstinner

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2008-10-28.16:55:53.830>
created_at = <Date 2008-10-28.15:46:12.597>
labels = ['expert-regex', 'type-bug', 'invalid']
title = 'Problem with regular expression'
updated_at = <Date 2008-10-28.16:55:53.828>
user = 'https://bugs.python.org/carlosklock'

bugs.python.org fields:

activity = <Date 2008-10-28.16:55:53.828>
actor = 'georg.brandl'
assignee = 'none'
closed = True
closed_date = <Date 2008-10-28.16:55:53.830>
closer = 'georg.brandl'
components = ['Regular Expressions']
creation = <Date 2008-10-28.15:46:12.597>
creator = 'carlosklock'
dependencies = []
files = []
hgrepos = []
issue_num = 4219
keywords = []
message_count = 3.0
messages = ['75286', '75287', '75288']
nosy_count = 2.0
nosy_names = ['vstinner', 'carlosklock']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue4219'
versions = ['Python 2.5']

carlosklock mannequin commented Oct 28, 2008

Hello,

I am having a weird problem with a regex. I am trying to get the tokens
that match the pattern below. It works for many lines of text, but it
fails in one specific case: the string '1214578800' is not returned.

Any idea what is happening? Is it a problem with my code or a bug in
regular expressions?

Thanks for any help,

Carlos.

import re
r = re.compile(",'([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])'[,|)]")
text = "('25','2','3','2','0','1','0','0/350','30','21','5','','1211641200','1214578800','0','2','1214662622');"
timestamps = r.findall(text)
print timestamps
OUTPUT:
Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on Trabalho15, Standard
>>> ['1211641200', '1214662622']

@carlosklock carlosklock mannequin added topic-regex type-bug An unexpected behavior, bug, or error labels Oct 28, 2008
@vstinner
Member

It's not a Python bug: your regex does not do what you expect. When the
regex finds '1211641200', it consumes >,'1211641200',< including the
trailing comma. So the next search starts at the apostrophe before
1214578800, and the leading comma that the pattern requires is already gone:
...200','121457...
--------^
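
To make the cursor position concrete, here is a minimal sketch (Python 3 syntax, whereas the report used Python 2.5) that prints the span of each match using the pattern and text from the report. The names `pattern`, `matches`, `first` and `next_char` are illustrative only.

```python
import re

# Pattern and text copied from the report above.
pattern = re.compile(",'([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])'[,|)]")
text = "('25','2','3','2','0','1','0','0/350','30','21','5','','1211641200','1214578800','0','2','1214662622');"

# finditer, like findall, returns non-overlapping matches: each search
# resumes where the previous match ended.
matches = list(pattern.finditer(text))
for m in matches:
    print(m.span(), repr(m.group(0)))

# The first match includes the trailing comma, so the character right
# after it is the apostrophe that opens '1214578800'.
first = matches[0]
next_char = text[first.end()]
print(repr(next_char))
```

Because the comma separating the two timestamps is part of the first match, the second timestamp has no leading comma left to match against.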

You have to change your regex so that it does not consume the comma, or
use zero-width lookaround assertions like (?<=,) and (?=[,)]).
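
A minimal sketch of the lookaround variant, written in Python 3 syntax. The `[0-9]{10}` shorthand and the name `fixed` are my own choices for brevity, not something from the report; the text is copied from it.

```python
import re

text = "('25','2','3','2','0','1','0','0/350','30','21','5','','1211641200','1214578800','0','2','1214662622');"

# (?<=,) requires a comma before the opening apostrophe and (?=[,)])
# requires a comma or closing parenthesis after the closing apostrophe,
# but neither assertion consumes the character it checks.
fixed = re.compile(r"(?<=,)'([0-9]{10})'(?=[,)])")

timestamps = fixed.findall(text)
print(timestamps)
```

Since the separators are only inspected and never consumed, adjacent timestamps can both match.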

Note: you're using [,|)] which matches >,<, >|<, and >)<. Inside a
character class the | is a literal character, not an alternation. I guess
that you wanted [,)].
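
A small illustration of the character-class point, in Python 3 syntax. The sample string and variable names are made up for this sketch.

```python
import re

# Hypothetical sample containing each candidate character once.
sample = ",|)x"

# Inside [...] the pipe is an ordinary character, so it is matched too.
with_pipe = re.compile("[,|)]").findall(sample)

# Without the pipe, only the comma and the closing parenthesis match.
without_pipe = re.compile("[,)]").findall(sample)

print(with_pipe)
print(without_pipe)
```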

carlosklock mannequin commented Oct 28, 2008

Sorry, it is really a problem with the comma.

Thanks for helping! :)

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022