Message381102
I am trying to parse ISO8601-formatted datetime strings with timezones.
This works fine when there is a colon separating the hour and minute digits:
>>> import datetime
>>> datetime.datetime.fromisoformat('2020-11-16T11:00:00+00:00')
datetime.datetime(2020, 11, 16, 11, 0, tzinfo=datetime.timezone.utc)
However, this fails when there is no colon between the hour and minute digits of the offset:
>>> import datetime
>>> datetime.datetime.fromisoformat('2020-11-16T11:00:00+0000')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid isoformat string: '2020-11-16T11:00:00+0000'
This behavior is unexpected, because the ISO 8601 standard allows omitting the colon and writing the UTC offset as "<time>±hhmm":
https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC
As a workaround, I normalized the timezone suffixes before parsing:
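>>> def normalize_utc_suffix(iso8601_string):  # function name chosen for illustration
...     # Rewrite colon-less UTC suffixes to '+00:00', which fromisoformat() accepts.
...     if iso8601_string.endswith('+0000'):
...         return iso8601_string[:-len('+0000')] + '+00:00'
...     if iso8601_string.endswith('+00'):
...         return iso8601_string[:-len('+00')] + '+00:00'
...     if iso8601_string.endswith('-0000'):
...         return iso8601_string[:-len('-0000')] + '+00:00'
...     if iso8601_string.endswith('-00'):
...         return iso8601_string[:-len('-00')] + '+00:00'
...     # Anything else is passed through unchanged.
...     return iso8601_string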
This only works for the UTC timezone. It would be nice to have a more general solution that can handle any timezone.
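One possible shape for such a general workaround is sketched below. It is only an illustration, not a proposed patch: the helper names `normalize_offset` and `parse_iso8601` are hypothetical, and the sketch assumes the extended date format (`YYYY-MM-DD`, ten characters) so that the hyphens in the date are never mistaken for an offset sign.

```python
import re
from datetime import datetime, timedelta

# Trailing UTC offset written as ±hh, ±hhmm, or ±hh:mm.
_OFFSET_RE = re.compile(r"([+-])(\d{2})(?::?(\d{2}))?$")


def normalize_offset(iso8601_string):
    """Rewrite a trailing ±hh or ±hhmm offset as ±hh:mm (hypothetical helper)."""
    # Assumption: the first ten characters are the YYYY-MM-DD date.
    # Searching only the remainder keeps date hyphens out of the match.
    head, tail = iso8601_string[:10], iso8601_string[10:]
    match = _OFFSET_RE.search(tail)
    if match is None:
        # No offset at all (naive datetime or date only): leave unchanged.
        return iso8601_string
    sign, hours, minutes = match.groups()
    return f"{head}{tail[:match.start()]}{sign}{hours}:{minutes or '00'}"


def parse_iso8601(iso8601_string):
    """Normalize the offset, then delegate to datetime.fromisoformat()."""
    return datetime.fromisoformat(normalize_offset(iso8601_string))


print(parse_iso8601("2020-11-16T11:00:00+0000").utcoffset() == timedelta(0))
print(parse_iso8601("2020-11-16T11:00:00-0530").isoformat())
```

Strings that already use `±hh:mm` pass through unchanged, so the helper can be applied unconditionally before parsing.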
I tested this with CPython 3.8. `.fromisoformat()` was added in 3.7, so earlier versions should not be affected by this:
https://docs.python.org/3/library/datetime.html#datetime.date.fromisoformat
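For comparison, `datetime.strptime()` with the `%z` directive accepts the colon-less offset, and since Python 3.7 also the form with a colon. The sketch below assumes a fixed layout with a `T` separator and whole seconds; the function name `parse_with_strptime` is hypothetical.

```python
from datetime import datetime, timedelta

# Assumed layout: extended date, 'T' separator, whole seconds, numeric offset.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_with_strptime(iso8601_string):
    """Parse via strptime(); %z handles both '+0000' and '+00:00' on 3.7+."""
    return datetime.strptime(iso8601_string, ISO_FORMAT)


without_colon = parse_with_strptime("2020-11-16T11:00:00+0000")
with_colon = parse_with_strptime("2020-11-16T11:00:00+00:00")
print(without_colon == with_colon)
```

This is less flexible than `fromisoformat()` because the format string has to match the input exactly, for example with respect to fractional seconds.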
Date | User | Action | Args
2020-11-16 15:00:17 | Bengt.Lüers | set | recipients: + Bengt.Lüers
2020-11-16 15:00:17 | Bengt.Lüers | set | messageid: <1605538817.4.0.335687964648.issue42371@roundup.psfhosted.org>
2020-11-16 15:00:17 | Bengt.Lüers | link | issue42371 messages
2020-11-16 15:00:16 | Bengt.Lüers | create |