classification
Title: time.strptime does not allow same format directive twice
Type: behavior
Stage: needs patch
Components:
Versions: Python 2.5
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: brett.cannon, mrabarnett, sil
Priority: low
Keywords:

Created on 2008-11-25 16:36 by sil, last changed 2010-08-04 23:30 by terry.reedy. This issue is now closed.

Messages (3)
msg76417 - Author: sil Date: 2008-11-25 16:36
$ python -c "import time; print time.strptime('25/11/2008 25/11/2008','%d/%m/%y %d/%m/%y')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.5/_strptime.py", line 311, in strptime
    format_regex = _TimeRE_cache.compile(format)
  File "/usr/lib/python2.5/_strptime.py", line 267, in compile
    return re_compile(self.pattern(format), IGNORECASE)
  File "/usr/lib/python2.5/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.5/re.py", line 241, in _compile
    raise error, v # invalid expression
sre_constants.error: redefinition of group name 'd' as group 4; was group 1

If a format directive is repeated in time.strptime's format string, it
raises an error, but it should not. Subversion, for example, repeats
date parts in its svn log output ("2008-09-26 16:20:59 +0100 (Fri, 26
Sep 2008)"), which repeats both the day (%d) and the year (%Y).
msg76418 - Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2008-11-25 18:36
Subversion is formatting a string from a time (strftime), so a repeated
placeholder is OK.

You're trying to _parse_ a time from a string (strptime). If you're
telling it that 2 different parts of the string are the date, what
should it do? The Pythonic thing to do is raise an exception.

(I suppose an alternative would be to raise an exception only if they
give different results.)
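
A minimal sketch of that alternative, assuming the caller has already
split the input into separate pieces. `parse_consistent` is a
hypothetical helper, not part of the time module; it parses each piece
with its own format and raises only when the dates disagree.

```python
import time


def parse_consistent(pairs):
    """Parse each (text, format) pair and require that all dates agree.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if not pairs:
        raise ValueError("at least one (text, format) pair is required")
    results = [time.strptime(text, fmt) for text, fmt in pairs]
    first = results[0]
    for other in results[1:]:
        # Compare only (year, month, day); other fields may be absent.
        if other[:3] != first[:3]:
            raise ValueError(
                "conflicting dates: %r vs %r" % (first[:3], other[:3])
            )
    return first
```

Two copies of the same date are accepted; two different dates raise
ValueError, which is the behaviour suggested above.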
msg76421 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2008-11-25 19:28
The reason this occurs is that, in order to have a portable and sane
implementation, time.strptime() uses the re module to parse dates. Each
directive becomes a named group, so when the same directive is specified
twice the re module complains that two named groups share the same name,
leading to a conflict.
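
The conflict can be reproduced with the re module alone. The pattern
below is a simplified stand-in for what a format with two %d and two %m
directives would turn into; it is an assumption for illustration, not
the exact pattern that _strptime.py generates.

```python
import re

# Simplified stand-in: the group names 'd' and 'm' each appear twice,
# as they would for a format such as '%d/%m %d/%m'.
pattern = r"(?P<d>\d{1,2})/(?P<m>\d{1,2}) (?P<d>\d{1,2})/(?P<m>\d{1,2})"

try:
    re.compile(pattern)
except re.error as exc:
    # The message begins with "redefinition of group name 'd' ..."
    message = str(exc)
else:
    message = None

print(message)
```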

About the only solution I can think of that doesn't require some massive
rewrite is to drop named group usage from time.strptime() and move to
positional groups: keep track of the order in which the directives appear
in the format string, then zip that order together with the matched groups.
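
A rough sketch of the positional-group idea, under heavy assumptions:
only %d, %m and %Y are handled, and `DIRECTIVE_PATTERNS`,
`compile_positional` and `parse_positional` are hypothetical names that
do not exist in _strptime.py. It only shows how directive order and
match results could be zipped together.

```python
import re

# Hypothetical, deliberately tiny directive table (unnamed groups only).
DIRECTIVE_PATTERNS = {
    "d": r"(\d{1,2})",
    "m": r"(\d{1,2})",
    "Y": r"(\d{4})",
}


def compile_positional(fmt):
    """Return (compiled regex, directive letters in group order)."""
    order = []
    parts = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            directive = fmt[i + 1]
            if directive == "%":
                parts.append("%")  # '%%' stands for a literal percent sign
            else:
                # Raises KeyError for directives this sketch does not know.
                parts.append(DIRECTIVE_PATTERNS[directive])
                order.append(directive)
            i += 2
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts) + r"\Z"), order


def parse_positional(text, fmt):
    """Return a list of (directive, int value) pairs in format order."""
    regex, order = compile_positional(fmt)
    match = regex.match(text)
    if match is None:
        raise ValueError("%r does not match format %r" % (text, fmt))
    # Zip the recorded directive order with the positional groups.
    return list(zip(order, (int(group) for group in match.groups())))
```

With this shape a repeated directive simply shows up twice in the
result, and deciding what to do with the duplicates (take the first,
or raise if they differ) is left to the caller.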

But I don't plan on doing that personally as that would require writing
a parser for format strings as well. But if someone manages to get it to
work I would be willing to review the patch.

Setting the priority to low, as this is easy to work around: parse just
one set of date information instead of two, since the second is redundant.
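
For the svn log line quoted in the report, the workaround might look
like the sketch below. `parse_svn_date` is a hypothetical helper; it
assumes the line always starts with 'YYYY-MM-DD HH:MM:SS' and it
ignores the UTC offset as well as the parenthesised copy of the date.

```python
import time


def parse_svn_date(text):
    """Parse only the leading date and time of an svn log date string.

    Hypothetical helper; the UTC offset and the redundant
    human-readable date in parentheses are discarded.
    """
    leading = text.split(" (", 1)[0]           # drop "(Fri, 26 Sep 2008)"
    date_part, clock_part = leading.split()[:2]  # drop the "+0100" offset
    return time.strptime(date_part + " " + clock_part, "%Y-%m-%d %H:%M:%S")


parsed = parse_svn_date("2008-09-26 16:20:59 +0100 (Fri, 26 Sep 2008)")
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```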
History
Date                 User          Action  Args
2010-08-04 23:30:09  terry.reedy   set     status: open -> closed; resolution: wont fix
2008-11-25 19:28:52  brett.cannon  set     priority: low; nosy: + brett.cannon; messages: + msg76421; stage: needs patch
2008-11-25 18:36:56  mrabarnett    set     nosy: + mrabarnett; messages: + msg76418
2008-11-25 16:36:41  sil           create