Author p-ganssle
Recipients Arfrever, Martin.Morrison, Matthew.Earl, belopolsky, bradengroom, brett.cannon, docs@python, fbidu, hynek, p-ganssle, pconnell, pitrou, swalker, taleinat, vstinner
Date 2018-10-30.14:39:55
Message-id <1540910395.64.0.788709270274.issue19376@psf.upfronthosting.co.za>
In-reply-to
Content
@Victor: You mean a PR to fix the *issue* or a PR to add this to the docs?

The current behavior is pretty counter-intuitive, particularly because the failure hinges on the (relatively) little-known fact that 1900 is not a leap year: it is evenly divisible by 100 but not by 400.
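
For reference, here is a minimal sketch of the Gregorian leap-year rule, checked against the standard library's `calendar.isleap`. The `is_leap` name is just for illustration.

```python
import calendar

def is_leap(year):
    # Divisible by 4, except century years that are not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# 1900 is a century year not divisible by 400, so it is not a leap year;
# 1904 and 2000 both are.
for year in (1900, 1904, 2000):
    assert is_leap(year) == calendar.isleap(year)
```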

I think it's pretty simple for end-users to work around this:

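import time
from datetime import datetime

def strptime_smarter(dtstr, fmt):
    try:
        return datetime.strptime(dtstr, fmt)
    except ValueError:
        # time.strptime returns a struct_time and tolerates Feb 29
        # in the default year 1900
        tt = time.strptime(dtstr, fmt)
        if tt[0:3] == (1900, 2, 29):
            # Substitute 1904, a leap year, so the date is valid
            return datetime(1904, *tt[1:6])
        raise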


But this is largely a problem that arises because we don't have any concept of a "partial datetime"; see this dateutil issue: https://github.com/dateutil/dateutil/issues/449

What users want when they do `datetime.strptime("Feb 29", "%b %d")` is something like `(None, 2, 29)`, but we're both specifying an arbitrary default year *and* enforcing that the final date be legal. I think the best solution would be to change the default year to 2000 for *all* dates, but for historical reasons that is just not feasible. :(

Another option is that we could allow specifying a "default date" from which missing values would be drawn. We have done this in dateutil.parser.parse: https://dateutil.readthedocs.io/en/stable/parser.html#dateutil.parser.parse

The biggest problem in dateutil is that the default value for "default date" is the *current date*, which causes many problems with reproducibility. For `datetime.strptime`, the default value would be `datetime(1900, 1, 1)`, which has none of those same problems.
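
To make the idea concrete, here is a rough sketch of what a "default date" could look like as a wrapper around `datetime.strptime`. Everything here is an assumption for illustration: `strptime_with_default`, `YEAR_DIRECTIVES`, and the `default` parameter are hypothetical names, not an existing or proposed API, and the sketch only handles a missing year (a literal `%%Y` in the format would also fool the check).

```python
from datetime import datetime

# Directives that supply a year (assumed list; %c and %x are locale formats
# that include one).
YEAR_DIRECTIVES = ("%Y", "%y", "%G", "%c", "%x")

def strptime_with_default(dtstr, fmt, default=datetime(1900, 1, 1)):
    """Hypothetical helper: take the year from `default` when `fmt` has none."""
    if any(directive in fmt for directive in YEAR_DIRECTIVES):
        return datetime.strptime(dtstr, fmt)
    # Parse against a known leap year so "Feb 29" is accepted, then swap in
    # the caller's year. replace() raises ValueError if the result is not a
    # real date, e.g. Feb 29 with a non-leap default year.
    parsed = datetime.strptime("2000 " + dtstr, "%Y " + fmt)
    return parsed.replace(year=default.year)

leap_day = strptime_with_default("Feb 29", "%b %d", default=datetime(2004, 1, 1))
```

With the default of `datetime(1900, 1, 1)` this still raises `ValueError` for "Feb 29", so it only moves the choice of year to the caller; it does not make the failure go away.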

Still, adding such a parameter to `datetime.strptime` seems like a lot of effort to go through just to make it *easier* for people to work around this bug in `strptime`, particularly since in this case you can *kinda* do the same thing with:

    strptime('1904 ' + dtstr, '%Y %b %d')

Long-winded carping on about datetime issues aside, I think my final vote is for leaving the behavior as-is and documenting it. Looking at the documentation, the only documentation I see for what happens when you don't have %Y, %m or %d is:

    For time objects, the format codes for year, month, and day should not be used,
    as time objects have no such values. If they’re used anyway, 1900 is substituted
    for the year, and 1 for the month and day.

This only makes sense in the context of `strftime`. I think for starters we should document the behavior of strptime when no year, month or day are specified. As part of that documentation, we can add a footnote about Feb 29th.
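
For anyone picking this up, a short illustration of the defaults that such documentation would describe (the input strings are arbitrary examples):

```python
from datetime import datetime

# No year, month, or day in the format: strptime fills in 1900-01-01.
time_only = datetime.strptime("14:39", "%H:%M")
print(time_only)   # 1900-01-01 14:39:00

# Month only: the year defaults to 1900 and the day to 1.
month_only = datetime.strptime("Oct", "%b")
print(month_only)  # 1900-10-01 00:00:00
```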

I can make a PR for this, but as Tal mentions, I think this is a good issue for a first-time contributor, so I'd like to give someone else an opportunity to take a crack at this.