
Author p-ganssle
Recipients Catherine.Devlin, Steve Yeung, p-ganssle, r.david.murray
Date 2021-05-18.20:25:06
Message-id <1621369506.43.0.990129064819.issue24929@roundup.psfhosted.org>
Content
I also commented on GH-26215 ( https://github.com/python/cpython/pull/26215 ), but for posterity, I'll note a few things:

1. It seems (and this may have changed since 2015) that `_strptime._strptime` now has a stage that (unconditionally?) constructs a temporary `datetime_date`, which means this particular validation already happens in both `time.strptime` and `datetime.strptime`. That said, both flavors of `strptime` are *way* slower than I'd like them to be, and constructing an unnecessary `date`/`datetime` is a pretty good way to slow down your function, so if we ever get around to optimizing this function, that may be one of the first bits on the chopping block.

2. The logic for `strptime` is very complicated, and it's very hard to test the full input space of the function (particularly since we're not using property tests (yet)...). This makes me somewhat uneasy about moving the validation stage from the beginning of the function (where the regular expression is matched) to the very *end* of the function (in the datetime constructor).

It's *probably* safe to do so, but it may also be worth exploring the possibility of validating this directly in `_strptime` (possibly immediately after the string is parsed by the regex), and raising a cleaner error message on failure.
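
To make the two points above concrete, here is a rough sketch. The first part checks that, for one simple format, both `time.strptime` and `datetime.strptime` reject an impossible calendar date; it says nothing about whether that holds for every format. The second part, `check_day_of_month`, is a hypothetical helper (it does not exist in `_strptime`) showing what validating the parsed fields right after the regex match, with a more descriptive message, might look like. The wording of the error messages is my own assumption.

```python
import calendar
import time
from datetime import datetime


def raises_value_error(func, *args):
    """Return True if func(*args) raises ValueError, else False."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


def check_day_of_month(year, month, day):
    """Hypothetical early check (not part of _strptime) on parsed fields."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    if not 1 <= day <= days_in_month:
        raise ValueError(
            f"day {day} is out of range for {year:04d}-{month:02d} "
            f"(valid range: 1..{days_in_month})"
        )


# Both flavors reject February 30th for this particular format.
both_reject = (
    raises_value_error(time.strptime, "2021-02-30", "%Y-%m-%d")
    and raises_value_error(datetime.strptime, "2021-02-30", "%Y-%m-%d")
)
```

A real implementation would also have to deal with the cases `_strptime` already special-cases (a missing year, day-of-year and week-based directives), which this sketch ignores.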

It's probably not worth spending a ton of time on that compared to improving the testing around this, so that we can feel confident making changes under the hood. `.strptime` is really quite slow, and I wouldn't be surprised if we pulled out its guts and replaced most of the regex stuff with a fast C parser at some point in the future. Having good tests will both give us confidence to make this change (and confidence that it won't lead to regressions in the future) and help with any future project to replace `_strptime._strptime` with a faster version, so I'd say that's the most important thing to do here.
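
As a rough idea of the kind of test that would help, here is a sketch using only the standard library. It is exhaustive over one year and one format rather than a true property test (a property-testing library would be the natural tool, but I'm not assuming one is available), and the helper names are made up for illustration. It checks that every real date round-trips through `strftime`/`strptime`, and that no impossible day-of-month is silently accepted.

```python
from datetime import date, datetime, timedelta

FMT = "%Y-%m-%d"


def roundtrip_failures(year):
    """Return every date in `year` that does not survive strftime -> strptime."""
    failures = []
    day = date(year, 1, 1)
    while day.year == year:
        parsed = datetime.strptime(day.strftime(FMT), FMT).date()
        if parsed != day:
            failures.append(day)
        day += timedelta(days=1)
    return failures


def impossible_dates_accepted(year):
    """Return (month, day) pairs that strptime accepts but `date` rejects."""
    accepted = []
    for month in range(1, 13):
        for dom in range(1, 32):
            try:
                date(year, month, dom)
                continue  # a real date; not interesting here
            except ValueError:
                pass
            try:
                datetime.strptime(f"{year:04d}-{month:02d}-{dom:02d}", FMT)
            except ValueError:
                continue  # correctly rejected
            accepted.append((month, dom))
    return accepted
```

Both functions should return an empty list before and after any change to where the validation happens, which is what makes them useful as a regression check for a refactor or a rewritten parser.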