classification
Title: datetime.strptime without a year fails on Feb 29
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Sriram Rajagopalan, belopolsky, gerardw@alum.mit.edu, gregory.p.smith, nickzoic, p-ganssle, polymorphm, xtreak
Priority: normal Keywords:

Created on 2016-02-29 18:02 by Sriram Rajagopalan, last changed 2020-03-03 17:16 by nickzoic.

Messages (13)
msg261014 - (view) Author: Sriram Rajagopalan (Sriram Rajagopalan) Date: 2016-02-29 18:02
$ python
    Python 3.5.1 (default, Dec  7 2015, 12:58:09) 
    [GCC 5.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import time
    >>> time.strptime("Feb 29", "%b %d")
    time.struct_time(tm_year=1900, tm_mon=2, tm_mday=29, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=60, tm_isdst=-1)
    >>> import datetime
    >>> datetime.datetime.strptime("Feb 29", "%b %d")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.5/_strptime.py", line 511, in _strptime_datetime
        return cls(*args)
    ValueError: day is out of range for month

The same issue is seen in all versions of Python
msg261024 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2016-02-29 21:26
Python's time.strptime() behavior is consistent with that of glibc 2.19:

======= strptime_c.c =======
#define _XOPEN_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int
main(void)
{
  struct tm tm;
  char buf[255];

  memset(&tm, 0, sizeof(struct tm));
  strptime("Feb 29", "%b %d", &tm);
  strftime(buf, sizeof(buf), "%d %b %Y %H:%M", &tm);
  puts(buf);
  exit(EXIT_SUCCESS);
}
=======

$ gcc strptime_c.c 
$ ./a.out
29 Feb 1900 00:00


I'm not saying that the behavior is a good API, but given the unfortunate API at hand, parsing a date without specifying what year it is using strptime is a bad idea.
msg261027 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2016-02-29 22:51
This is no more a bug than

>>> from datetime import *
>>> datetime.strptime('0228', '%m%d')
datetime.datetime(1900, 2, 28, 0, 0)

Naturally, as long as datetime.strptime('0228', '%m%d') is the same as datetime.strptime('19000228', '%Y%m%d'), datetime.strptime('0229', '%m%d') should raise a ValueError, just as datetime.strptime('19000229', '%Y%m%d') does.

The only improvement I can think of in this situation is to point the user to time.strptime() in the error message.  The time.strptime method works just fine in recent versions (see issue 14157).

>>> time.strptime('0229', '%m%d')[1:3]
(2, 29)
msg261028 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2016-02-29 22:54
> Python's time.strptime() behavior is consistent with that of glibc 2.19

Gregory,

I believe the OP is complaining about the way datetime.datetime.strptime() behaves, not time.strptime(), which is mentioned as a (preferred?) alternative.


See msg261015  in issue 14157 for context.
msg261033 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2016-02-29 23:44
time.strptime() is "working" (not raising an exception) as it appears not to validate the day of the month when a year is not specified, yet the return value from either of these APIs is a date which has no concept of an ambiguous year.

## Via the admittedly old Python 2.7.6 from Ubuntu 14.04: ##
# 1900 was not a leap year: century years are leap years only if divisible by 400.
>>> time.strptime("1900 Feb 29", "%Y %b %d")
ValueError: day is out of range for month
>>> time.strptime("Feb 29", "%b %d")
time.struct_time(tm_year=1900, tm_mon=2, tm_mday=29, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=60, tm_isdst=-1)

So what should the validation behavior be?

>>> datetime.datetime.strptime("Feb 29", "%b %d")
ValueError: day is out of range for month
>>> datetime.datetime.strptime("2016 Feb 29", "%Y %b %d")
datetime.datetime(2016, 2, 29, 0, 0)
>>> datetime.datetime.strptime("1900 Feb 29", "%Y %b %d")
ValueError: day is out of range for month
>>> datetime.datetime(year=1900, month=2, day=29)
ValueError: day is out of range for month

datetime objects cannot be constructed with the invalid date (as the time.strptime return value allows).
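
The leap-year rule behind these results can be checked directly. This is only a small sketch using the standard `calendar` module; `can_build_feb29` is a hypothetical helper name introduced for illustration, not part of any library.

```python
import calendar
from datetime import datetime

# Gregorian rule: a year divisible by 4 is a leap year, except century
# years, which are leap years only if they are divisible by 400.
for year in (1900, 1904, 2000, 2016):
    print(year, calendar.isleap(year))


def can_build_feb29(year):
    """Return True if datetime(year, 2, 29) is a valid date (hypothetical helper)."""
    try:
        datetime(year, 2, 29)
    except ValueError:
        return False
    return True
```

Under this rule 1900 is rejected while 1904, 2000 and 2016 are accepted, which is why a year-less "Feb 29" fails once the default year of 1900 is filled in.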

Changing the API to assume the current year, or a year within +/- 6 months of "now", when no year is parsed is likely to break existing code.
msg343085 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-05-21 19:08
See also issue19376. This behavior is now documented with https://github.com/python/cpython/commit/56027ccd6b9dab4a090e4fef8574933fb9a36ff2
msg363123 - (view) Author: Nick Moore (nickzoic) Date: 2020-03-02 06:17
I suspect this is going to come up about this time of every leap year :-/

The workaround is prepending "%Y " to the pattern and, e.g., "2020 " to the date string, but that's not very nice.
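
For reference, that workaround might look roughly like the sketch below. `parse_with_year` is a made-up helper name, and it assumes the caller's format contains no year directive of its own.

```python
from datetime import datetime


def parse_with_year(date_string, fmt, year):
    """Parse a year-less date string by prepending an explicit year.

    Hypothetical helper: assumes `fmt` has no year directive and that
    `year` is a four-digit year, as required by %Y.
    """
    return datetime.strptime(f"{year} {date_string}", "%Y " + fmt)


print(parse_with_year("Feb 29", "%b %d", 2020))
```

The caller still has to pick the year, so Feb 29 only parses when the chosen year is a leap year.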

Would adding a kwarg "default_year" be an acceptable solution?
I can't think of any situation other than leap years where this is going to come up.  If both "default_year" and "%Y" are present, throw an exception (so maybe just call the kwarg "year").

In the weird case where you want to do date maths involving the month as well, you can always use a safe choice like "default_year=2020" and then fix the year up afterwards:

```
    dt = datetime.strptime(date_str, "%b %d", default_year=2020)
    dt = dt.replace(year=2021 if dt.month > 6 else 2022)
```
msg363202 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-03-02 19:37
I don't think adding a default_year parameter is the right solution here.

The actual problem is that `time.strptime`, and by extension `datetime.strptime`, has a strange and confusing interface. What should happen is either that `year` is set to None or some other marker of a missing value, or that datetime.strptime raises an exception when asked to construct something that does not contain a year.

Since there is no concept of a partial datetime, I think our best option would be to throw an exception, except that this has been baked into the library for ages and would start to throw exceptions even when the person has correctly handled the Feb 29th case.

I think one possible "solution" to this would be to raise a warning any time someone tries to use `datetime.strptime` without requesting a year to warn them that the thing they're doing only exists for backwards compatibility reasons. We could possibly eventually make that an exception, but I'm not sure it's totally worth a break in backwards compatibility when a warning should put people on notice.
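
To make the warning idea concrete, here is a rough user-level sketch, not a proposed patch to `_strptime`. The names `format_has_year` and `strptime_warn_no_year` are hypothetical, and the set of directives treated as supplying a year (`%Y`, `%y`, `%G`, `%c`, `%x`) is my assumption about what should count.

```python
import re
import warnings
from datetime import datetime

# Assumption: these directives are the ones that supply a year.
_YEAR_DIRECTIVES = {"Y", "y", "G", "c", "x"}


def format_has_year(fmt):
    """Return True if the format string contains a year directive."""
    # The regex consumes "%%" as one token, so a literal "%%Y" is not
    # mistaken for the "%Y" directive.
    return bool(_YEAR_DIRECTIVES & set(re.findall(r"%(.)", fmt)))


def strptime_warn_no_year(date_string, fmt):
    """Warn about year-less formats, then delegate to datetime.strptime."""
    if not format_has_year(fmt):
        warnings.warn(
            "format has no year directive; parsing Feb 29 may fail",
            UserWarning,
            stacklevel=2,
        )
    return datetime.strptime(date_string, fmt)
```

Existing callers keep working, but anyone parsing without a year is told about it before the leap day arrives.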
msg363215 - (view) Author: Nick Moore (nickzoic) Date: 2020-03-02 23:14
Not disagreeing with you that "%b %d" timestamps with no "%Y" are execrable, but they're fairly common in the *nix world unfortunately.

People need to parse them, and the simple and obvious way to do this breaks every four years.

I like the idea of having a warning for not including %Y *and* not setting a default_year kwarg though.
msg363217 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-03-02 23:33
I _doubt_ there is code expecting the default year when unspecified to actually be 1900.

Change that default to any old leap year (1904?) and it'll still (a) stand out as a special year that can be looked up should it wind up being _used_ as the year in code somewhere and (b) not fail every four years for people just parsing to extract Month + Day values.
msg363223 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2020-03-03 00:12
> On Mar 2, 2020, at 6:33 PM, Gregory P. Smith <report@bugs.python.org> wrote:
> 
> Change that default to any old year with a leap year (1904?)

In the 21st century, the year 2000 default makes much more sense than 1900. Luckily 2000 is also a leap year.
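
Code that only needs the month and day can approximate a leap-year default today by supplying a placeholder year itself. A minimal sketch follows; `parse_month_day` and `PLACEHOLDER_LEAP_YEAR` are hypothetical names, and the choice of 2000 is just the year suggested above.

```python
from datetime import datetime

# Assumption: any leap year works as a placeholder; 2000 is one option.
PLACEHOLDER_LEAP_YEAR = 2000


def parse_month_day(date_string, fmt):
    """Return (month, day) from a year-less date string.

    Hypothetical helper: parses against a placeholder leap year so that
    Feb 29 is accepted, then discards the year.
    """
    parsed = datetime.strptime(
        f"{PLACEHOLDER_LEAP_YEAR} {date_string}", "%Y " + fmt
    )
    return parsed.month, parsed.day


print(parse_month_day("Feb 29", "%b %d"))
```

The placeholder year never leaves the helper, so callers cannot accidentally use it as a real year.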
msg363257 - (view) Author: Gerard C Weatherby (gerardw@alum.mit.edu) Date: 2020-03-03 13:17
Yes, code that has been working for my organization the past two years just broke this weekend.

Meaning depends on context. The straightforward solution is that if no year is specified, the return value should default to the current year.
msg363280 - (view) Author: Nick Moore (nickzoic) Date: 2020-03-03 17:16
It's kind of funny that there's already consideration of this in _strptime._strptime(), which returns a tuple used by datetime.datetime.strptime() to construct the new datetime.
Search for `leap_year_fix`.

I think the concern though is if we changed the default year that might possibly break someone's existing code: thus my suggestion to allow the programmer to explicitly change the default.

However, I can also see that if their code is parsing dates in this way it is already wrong, and that if we're causing users pain now when they upgrade Python we're at least saving them pain at 2024-02-29 00:00:01.

Taking that approach, perhaps parsing dates with no year should just throw an exception, forcing the programmer to do it right the first time.  In this case though, I'd rather have a "year" kwarg to prevent the programmer having to do horrible string hacks like my code currently does.

I'm not sure: is it useful for me to produce a PR so we have something specific to consider?
History
Date                 User                  Action  Args
2020-03-03 17:16:20  nickzoic              set     messages: + msg363280
2020-03-03 13:17:39  gerardw@alum.mit.edu  set     nosy: + gerardw@alum.mit.edu; messages: + msg363257
2020-03-03 00:12:17  belopolsky            set     messages: + msg363223
2020-03-02 23:33:48  gregory.p.smith       set     messages: + msg363217
2020-03-02 23:14:59  nickzoic              set     messages: + msg363215
2020-03-02 19:37:17  p-ganssle             set     nosy: + p-ganssle; messages: + msg363202
2020-03-02 06:17:00  nickzoic              set     nosy: + nickzoic; messages: + msg363123; versions: + Python 3.7, Python 3.8
2019-05-21 19:08:34  xtreak                set     nosy: + xtreak; messages: + msg343085
2016-03-02 20:43:07  polymorphm            set     nosy: + polymorphm
2016-02-29 23:44:22  gregory.p.smith       set     messages: + msg261033
2016-02-29 22:54:41  belopolsky            set     messages: + msg261028
2016-02-29 22:51:30  belopolsky            set     versions: - Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5; nosy: + belopolsky; messages: + msg261027; type: behavior -> enhancement
2016-02-29 21:26:33  gregory.p.smith       set     nosy: + gregory.p.smith; messages: + msg261024
2016-02-29 18:02:35  Sriram Rajagopalan    create