classification
Title: strptime() gives inconsistent exceptions
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3, Python 3.4
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: belopolsky
Nosy List: belopolsky, ezio.melotti, gruszczy, ryles
Priority: normal
Keywords:

Created on 2009-05-09 19:14 by ryles, last changed 2014-09-29 16:31 by berker.peksag. This issue is now closed.

Messages (3)
msg87505 - Author: Ryan Leslie (ryles) Date: 2009-05-09 19:14
e.g.

>>> from datetime import datetime
>>>
>>> datetime.strptime("19951001", "%Y%m%d")
datetime.datetime(1995, 10, 1, 0, 0)
>>>
>>> datetime.strptime("19951000", "%Y%m%d") # day = 0, month < 11
 ...
ValueError: time data '19951000' does not match format '%Y%m%d'
>>>
>>> datetime.strptime("19951100", "%Y%m%d") # day = 0, month >= 11
 ...
ValueError: unconverted data remains: 0
>>>

The exception messages are not really a serious issue, but note that the
latter one can be confusing for users.

However, there seem to be some underlying issues with the choice to
recognize single-digit months alongside double-digit days, which can
make date strings ambiguous:

Consider "19951100" from above with the last '0' removed.

>>> datetime.strptime("1995110", "%Y%m%d")
datetime.datetime(1995, 1, 10, 0, 0)
>>>

In this case, strptime has treated the middle '1' as the month,
resulting in 1995-01-10. This hints at why the second exception from
above gives a strange message: with the extra '0', the day portion of
"19951100" (00) is invalid, so strptime falls back on parsing the first
7 characters as above and then fails because of the remaining '0'.
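
The fallback described above can be reproduced with a small regular expression. This is only a sketch: the pattern is a simplified stand-in for whatever strptime builds from "%Y%m%d" (an assumption, not the stdlib's exact pattern), and the helper name parse_ymd is made up for illustration.

```python
import re

# Simplified stand-in (an assumption, not the stdlib's exact pattern):
# month and day each accept either one or two digits.
YMD = re.compile(
    r"(?P<Y>\d\d\d\d)"
    r"(?P<m>1[0-2]|0[1-9]|[1-9])"
    r"(?P<d>3[01]|[12]\d|0[1-9]|[1-9])"
)


def parse_ymd(text):
    """Return (year, month, day), or raise ValueError much as strptime does."""
    # match() only anchors at the start, so the regex engine is free to
    # backtrack to a one-digit month and leave trailing characters unread.
    found = YMD.match(text)
    if found is None:
        raise ValueError("time data %r does not match format" % text)
    if found.end() != len(text):
        raise ValueError("unconverted data remains: %s" % text[found.end():])
    return tuple(int(found.group(name)) for name in "Ymd")
```

With this sketch, "19951000" fails to match at all, "19951100" matches only its first 7 characters and reports the leftover '0', and "1995110" is read as (1995, 1, 10).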

This seems a little dangerous. For instance:
timestamp = "19951101-23:20:18"
datestamp = timestamp[:7]  # Oops, needed to use :8 to get up to index 7.
reallyImportantButWrongDate = datetime.strptime(datestamp, "%Y%m%d")

Note that in this case strptime() from glibc would instead result in an
error, which IMHO is the right thing to do.

I do realize, though, that changing the behavior of strptime() could
devastate some existing code.
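
Code that wants the stricter behavior without any change to strptime() could check the field width itself before parsing. A minimal sketch, assuming the input is always meant to be exactly YYYYMMDD; the helper name strict_ymd is hypothetical and not part of the datetime module.

```python
import re
from datetime import datetime


def strict_ymd(text):
    """Parse YYYYMMDD, rejecting anything that is not exactly 8 digits."""
    # Hypothetical helper: the width check runs first, so a truncated
    # string such as "1995110" is rejected instead of being reinterpreted.
    if re.fullmatch(r"[0-9]{8}", text) is None:
        raise ValueError("expected exactly 8 digits (YYYYMMDD), got %r" % text)
    return datetime.strptime(text, "%Y%m%d")
```

Here strict_ymd("19951101") returns datetime(1995, 11, 1, 0, 0), while the accidentally truncated "1995110" raises ValueError.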
msg107173 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-06-06 02:07
Looks like a bug to me:

>>> datetime.strptime("1", "%d")
datetime.datetime(1900, 1, 1, 0, 0)

>>> datetime.strptime('1', '%m')
datetime.datetime(1900, 1, 1, 0, 0)

Both %m and %d accept single digits, but they should not.

>>> datetime.strptime('123', '%m%d')
datetime.datetime(1900, 12, 3, 0, 0)

>>> import this
..
In the face of ambiguity, refuse the temptation to guess.
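
To make the ambiguity concrete, the sketch below lists every way a digit string can be split into a month and a day when each field may be one or two digits wide. The function name month_day_readings is made up for illustration, and the default year of 1900 is an assumption chosen to mirror the outputs above.

```python
from datetime import date


def month_day_readings(digits, year=1900):
    """Return every (month, day) reading of digits as a %m%d string."""
    readings = []
    for cut in (1, 2):  # the month takes either one or two digits
        month_text, day_text = digits[:cut], digits[cut:]
        if not (month_text and 1 <= len(day_text) <= 2):
            continue
        try:
            # date() rejects out-of-range months and days for us
            date(year, int(month_text), int(day_text))
        except ValueError:
            continue
        readings.append((int(month_text), int(day_text)))
    return readings
```

For '123' this gives [(1, 23), (12, 3)], so two valid dates fit the same input, whereas the zero-padded '1203' has only the reading (12, 3).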
msg129663 - Author: Filip Gruszczyński (gruszczy) Date: 2011-02-27 22:24
But this is exactly how strptime works in C. Consider this:

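#define _XOPEN_SOURCE 700  /* the glibc man page asks for this before using strptime() */
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char buf[255];
    struct tm tm;

    /* Zero the struct so fields that strptime() does not set are defined. */
    memset(&tm, 0, sizeof(tm));
    if (strptime("123", "%m%d", &tm) == NULL) {
        fprintf(stderr, "strptime failed\n");
        return 1;
    }
    strftime(buf, sizeof(buf), "%d %b %Y %H:%M", &tm);
    printf("%s\n", buf);

    return 0;
}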

This produces output:

03 Dec 1900 00:00


Shouldn't it stay consistent with how the C function works?
History
Date User Action Args
2014-09-29 16:31:06 berker.peksag set stage: needs patch -> resolved
2014-09-29 16:01:03 belopolsky set status: open -> closed; resolution: wont fix
2013-02-23 06:22:11 ezio.melotti set versions: + Python 3.4
2011-02-27 22:24:20 gruszczy set nosy: + gruszczy; messages: + msg129663
2011-01-10 23:45:03 belopolsky set nosy: belopolsky, ezio.melotti, ryles; stage: test needed -> needs patch; versions: + Python 3.3, - Python 3.2
2010-06-06 02:08:18 belopolsky set versions: + Python 3.2, - Python 2.6, Python 2.5, Python 2.4, Python 3.0, Python 3.1, Python 2.7
2010-06-06 02:07:42 belopolsky set nosy: + belopolsky; messages: + msg107173; assignee: belopolsky; stage: test needed
2009-05-10 15:43:05 ezio.melotti set nosy: + ezio.melotti
2009-05-09 19:14:09 ryles create