
classification
Title: datetime.strptime: Support for parsing offsets with a colon
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: belopolsky, mariocj89, martin.panter, p-ganssle, pablogsal
Priority: normal
Keywords: patch

Created on 2017-10-16 22:51 by mariocj89, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
PR 4015: merged (linked by mariocj89, 2017-10-16 22:54)
Messages (9)
msg304486 - Author: Mario Corchero (mariocj89) (Python triager) Date: 2017-10-16 22:51
Currently, datetime.strptime does not support parsing UTC offsets that include a colon. "+0000" is parsed without issues, whilst "+00:00" fails.

"+NN:NN" is not only ISO8601 valid but also the way the offset is presented to the user when using .isoformat on a datetime with a timezone/offset.

This leads users to rely on external libraries like dateutil or iso8601 just to parse the strings that "datetime" itself produces.

Even if the long-term goal is to provide a way to parse any ISO-formatted string, this issue only aims to address the problem with %z parsing. That alone unblocks users who need to parse datetime objects serialized with isoformat.

With this change, the following will just work:

>>> import datetime as dt
>>> iso_fmt = '%Y-%m-%dT%H:%M:%S%z'
>>> d = dt.datetime.strptime('2004-01-01T10:10:10+05:00', iso_fmt)

*'2004-01-01T10:10:10+05:00' is a sample string generated via datetime.isoformat()
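
To make the round trip described above concrete, here is a minimal sketch. It assumes Python 3.7 or later, where %z accepts a colon in the offset; the variable names are illustrative only.

```python
import datetime as dt

iso_fmt = '%Y-%m-%dT%H:%M:%S%z'

# Build an aware datetime with a fixed +05:00 offset.
original = dt.datetime(2004, 1, 1, 10, 10, 10,
                       tzinfo=dt.timezone(dt.timedelta(hours=5)))

# isoformat() renders the offset with a colon.
text = original.isoformat()

# With the change proposed here (assumed: Python 3.7+), %z parses it back.
parsed = dt.datetime.strptime(text, iso_fmt)

print(text)
print(parsed == original)
```

On interpreters without this change, the strptime call is expected to raise ValueError instead.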

Other options, like adding a new %:z directive, were proposed, but having just %z seems much simpler for the user.



Note: There have already been conversations about adding support in datetime for parsing any ISO-formatted string. This is a simpler approach. We might be able to get there after this patch, but this aims just to unblock us.

Related:
http://www.loc.gov/standards/datetime/iso-tc154-wg5_n0039_iso_wd_8601-2_2016-02-16.pdf
https://mail.python.org/pipermail/python-ideas/2014-March/027018.html
https://bugs.python.org/issue15873
msg304489 - Author: Martin Panter (martin.panter) (Python committer) Date: 2017-10-17 04:12
FWIW it looks like “strptime” in glibc, and Open and Free BSD, supports parsing this and even more formats (RFC 822 and RFC 3339; includes “Z”, U.S. time zones, ±HH). Also, there is Issue 24954 for adding “%:z” like GNU “date”.
msg304490 - Author: Martin Panter (martin.panter) (Python committer) Date: 2017-10-17 04:15
Sorry, I meant NetBSD, not FreeBSD.
msg304510 - Author: Mario Corchero (mariocj89) (Python triager) Date: 2017-10-17 14:07
Yep, http://man7.org/linux/man-pages/man3/strptime.3.html does support it, even if it might look asymmetrical.

Example:

           struct tm tm;
           char buf[255];

           memset(&tm, 0, sizeof(struct tm));
           strptime("+00:00", "%z", &tm);
           strftime(buf, sizeof(buf), "%z", &tm);
           puts(buf); // Will print +0000
           exit(EXIT_SUCCESS);

Martin, do you want me to clean up the PR, add docs, a news entry, etc.?
msg304620 - Author: Paul Ganssle (p-ganssle) (Python committer) Date: 2017-10-19 13:56
This seems very useful to me. I very frequently advise people *against* using dateutil.parser (despite my conflict of interest as maintainer of dateutil) for well-known formats, but the question frequently comes up: "what should I do when I have a date created by isoformat()?", to which there's no clean, satisfying answer other than, "use dateutil.parser even though you know the format."

I think the strptime page that Mario linked to is evidence that the %z directive is *intended* to match against -HH:MM, and so that might be the most "standard" solution.

That said, I somewhat prefer the granularity of the GNU date extensions %z, %:z and %::z, since this allows downstream users to be stricter about what they are willing to accept. I think either approach is defensible, but that *something* should be done soon, preferably for the 3.7 release.
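
For callers who want the stricter behaviour without separate %:z directives, one option is to validate the offset before handing the string to strptime. This is only a sketch: parse_with_colon_offset is a hypothetical helper, not part of the standard library, and it assumes Python 3.7+ and a format string that ends in %z.

```python
import datetime as dt
import re

# Hypothetical helper, not a standard-library API.
# Matches a trailing offset written strictly as +HH:MM or -HH:MM.
_COLON_OFFSET = re.compile(r'[+-]\d{2}:\d{2}$')


def parse_with_colon_offset(text, fmt='%Y-%m-%dT%H:%M:%S%z'):
    """Parse text, accepting only offsets of the form +HH:MM or -HH:MM.

    Assumes fmt ends with %z, so the offset is the last thing in text.
    """
    if not _COLON_OFFSET.search(text):
        raise ValueError('expected a trailing +HH:MM offset: %r' % (text,))
    return dt.datetime.strptime(text, fmt)


strict = parse_with_colon_offset('2004-01-01T10:10:10+05:00')
print(strict.utcoffset())
```

The pre-check rejects "+0500" and "Z" even though %z alone would accept them on 3.7+, which is roughly the granularity the GNU extensions offer.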
msg304644 - Author: Mario Corchero (mariocj89) (Python triager) Date: 2017-10-19 22:15
As a note:

It seems support for the ":" was added to glibc in 2015:
http://code.metager.de/source/xref/gnu/glibc/time/strptime_l.c#765

Commit e952e1df

Before that, it basically just ignored the minutes.
msg304645 - Author: Mario Corchero (mariocj89) (Python triager) Date: 2017-10-19 22:24
I have a patch to add 'Z' support as well, if we are interested in matching what glibc does (it supports 'Z' too).
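
As an illustration of what 'Z' support looks like from the caller's side, here is a short sketch. It assumes Python 3.7 or later, where the documentation for %z states that 'Z' is treated the same as '+00:00'.

```python
import datetime as dt

iso_fmt = '%Y-%m-%dT%H:%M:%S%z'

# Assumes Python 3.7+: 'Z' is accepted by %z and means a zero UTC offset.
zulu = dt.datetime.strptime('2004-01-01T10:10:10Z', iso_fmt)
explicit = dt.datetime.strptime('2004-01-01T10:10:10+00:00', iso_fmt)

print(zulu.utcoffset())
print(zulu == explicit)
```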
msg304836 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2017-10-23 19:49
Note that #5288 relaxed the whole-number-of-minutes restriction on UTC offsets. Since the goal is to be able to parse the output of .isoformat(), I think %z should accept sub-minute offsets.
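
A small sketch of the sub-minute case, assuming Python 3.7 or later (earlier versions reject offsets that are not a whole number of minutes). The offset value is arbitrary and chosen only for illustration.

```python
import datetime as dt

iso_fmt = '%Y-%m-%dT%H:%M:%S%z'

# An offset of one hour and thirty seconds (assumed: Python 3.7+).
odd_tz = dt.timezone(dt.timedelta(hours=1, seconds=30))
original = dt.datetime(2004, 1, 1, 10, 10, 10, tzinfo=odd_tz)

# isoformat() includes the seconds part of the offset when it is non-zero.
text = original.isoformat()

# %z should accept that output so the round trip is lossless.
parsed = dt.datetime.strptime(text, iso_fmt)

print(text)
print(parsed.utcoffset())
```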
msg305017 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2017-10-26 00:35
New changeset 32318930da70ff03320ec50813b843e7db6fbc2e by Alexander Belopolsky (Mario Corchero) in branch 'master':
Closes bpo-31800: Support for colon when parsing time offsets (#4015)
https://github.com/python/cpython/commit/32318930da70ff03320ec50813b843e7db6fbc2e
History
Date                 User           Action  Args
2022-04-11 14:58:53  admin          set     github: 75981
2017-10-26 00:35:43  belopolsky     set     status: open -> closed; resolution: fixed; messages: + msg305017; stage: patch review -> resolved
2017-10-26 00:10:21  belopolsky     link    issue24954 dependencies
2017-10-23 19:49:11  belopolsky     set     nosy: + belopolsky; messages: + msg304836
2017-10-19 22:24:09  mariocj89      set     messages: + msg304645
2017-10-19 22:15:10  mariocj89      set     messages: + msg304644
2017-10-19 13:56:11  p-ganssle      set     nosy: + p-ganssle; messages: + msg304620
2017-10-17 14:07:22  mariocj89      set     messages: + msg304510
2017-10-17 04:15:00  martin.panter  set     messages: + msg304490
2017-10-17 04:12:30  martin.panter  set     nosy: + martin.panter; messages: + msg304489
2017-10-17 01:37:07  pablogsal      set     nosy: + pablogsal
2017-10-16 22:54:18  mariocj89      set     keywords: + patch; stage: patch review; pull_requests: + pull_request3989
2017-10-16 22:51:49  mariocj89      create