msg169941 - (view) |
Author: John Nagle (nagle) |
Date: 2012-09-06 21:08 |
The datetime module has support for output to a string of dates and times in ISO 8601 format ("2012-09-09T18:00:00-07:00"), with the object method "isoformat([sep])". But there's no support for parsing such strings. A string to datetime class method should be provided, one capable of parsing at least the RFC 3339 subset of ISO 8601.
The problem is parsing time zone information correctly.
The allowed formats for time zone are
empty - no TZ, date/time is "naive" in the datetime sense
Z - zero, or Zulu time, i.e. UTC.
[+-]nn:nn - offset from UTC
"strptime" does not understand timezone offsets. The "datetime" documentation suggests that the "z" format directive handles time zone info, but that's not actually implemented for input.
PyPI has four modules for parsing ISO 8601 dates. Each has at least one major
problem in time zone handling:
iso8601 0.1.4
Abandonware. Mishandles time zone when time zone is "Z" and
the default time zone is specified.
iso8601.py 0.1dev
Always returns a "naive" datetime object, even if zone specified.
iso8601plus 0.1.6
Fork of abandonware version above. Same bug.
zc.iso8601 0.2.0
Zope version. Imports the pytz module with the full Olson time zone
database, but doesn't actually use that database.
Thus, nothing in PyPI provides a good alternative.
It would be appropriate to handle this in the datetime module. One small, correct, tested function would be better than the existing four bad alternatives.
|
msg169952 - (view) |
Author: Alexander Belopolsky (Alexander.Belopolsky) |
Date: 2012-09-06 23:08 |
%z format is supported, but it cannot accept colon in TZ offset. It can parse offsets like -0600 just fine. What OP is looking for is the GNU date %:z format which datetime does not support.
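The colon-free case can be checked directly. This is a minimal sketch, assuming Python 3.2 or later (where %z is implemented for parsing); the sample timestamp is illustrative only.

```python
from datetime import datetime, timedelta

# Offset written without a colon, the form that %z is said to accept.
stamp = "2012-09-09T18:00:00-0700"
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")

# The result is an "aware" datetime carrying a fixed UTC offset.
offset = parsed.utcoffset()

# isoformat() writes the offset back with a colon, which is the
# asymmetry this issue is about.
text = parsed.isoformat()
print(offset, text)
```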
For ISO 8601 compliance, however, I think we need a way to specify a parser that will accept any valid 8601 format: with T or space separator, with or without : in time and timezone, and with or without dashes in date.
I would very much like such a promiscuous parser to be implemented in datetime.__new__, so that we can create datetime objects from strings the way we do with numbers.
|
msg169966 - (view) |
Author: John Nagle (nagle) |
Date: 2012-09-07 01:51 |
Re: "%z format is supported".
That's platform-specific; the actual parsing is delegated to the C library. It's not in Python 2.7 / Win32:
ValueError: 'z' is a bad directive in format '%Y-%m-%dT%H:%M:%S%z'
It really shouldn't be platform-specific; the underlying platform is irrelevant to this task. That's more of a documentation error; the features not common to all supported Python platforms should not be mentioned in the documentation.
Re: "I would very much like such promiscuous parser to be implemented in datetime.__new__. "
For string input, it's probably better to do this conversion in a specific class-level function. Full ISO 8601 dates/times generally come from computer-generated data via a file or API. If invalid text shows up, it should be detected as an error, not be heuristically interpreted as a date. There's already "fromtimestamp" and "fromordinal",
and "isoformat" as an instance method, so "fromisoformat" seems reasonable.
I'd also suggest providing a standard subclass of tzinfo in datetime for fixed offsets. That's needed to express the time zone information in an ISO 8601 date. The new "fromisoformat" would convert an ISO 8601 date/time to a time-zone "aware" datetime object. If converted back to an ISO 8601 string with .isoformat(), the round trip should preserve the original data, including time zone offset.
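To make the round-trip requirement concrete, here is a rough sketch. It assumes Python 3.2 or later, where datetime.timezone provides a fixed-offset tzinfo, and it uses a colon-stripping workaround with strptime() as a stand-in for the proposed parser. The workaround is an assumption for illustration, not the proposed API.

```python
from datetime import datetime, timedelta, timezone

# A fixed-offset tzinfo, seven hours behind UTC.
fixed_offset = timezone(timedelta(hours=-7))
original = datetime(2012, 9, 9, 18, 0, 0, tzinfo=fixed_offset)

text = original.isoformat()  # offset is written as -07:00

# Stand-in parser: drop the colon in the offset so that %z can read it.
without_colon = text[:-3] + text[-2:]
restored = datetime.strptime(without_colon, "%Y-%m-%dT%H:%M:%S%z")

# A correct round trip preserves both the instant and the offset.
print(restored == original, restored.utcoffset() == original.utcoffset())
```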
(Several more implementations of this conversion have turned up. In addition to the four already mentioned, there was one in xml.util, and one in feedparser. There are probably more yet to be found.)
|
msg169968 - (view) |
Author: Alexander Belopolsky (Alexander.Belopolsky) |
Date: 2012-09-07 02:36 |
On Thu, Sep 6, 2012 at 9:51 PM, John Nagle <report@bugs.python.org> wrote:
> It's not in Python 2.7 / Win32.
Python 2.x series is closed and cannot accept new features. Both %z
and fixed offset tzinfo subclass are implemented in 3.2.
|
msg169970 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2012-09-07 03:06 |
I am attaching a quick python only prototype for the proposed feature. My goal is to make date/time objects behave like numeric types for which constructors accept strings produced by str(). Since str() format is ISO 8601, it is natural to accept ISO 8601 formats in constructors.
|
msg170098 - (view) |
Author: Roy Smith (roysmith) |
Date: 2012-09-09 12:15 |
We need to define the scope of what input strings will be accepted. ISO-8601 defines a lot of stuff which we may not wish to accept.
Do we want to accept both basic format (YYYYMMDD) and extended format (YYYY-MM-DD)?
Do we want to accept things like "1985-W15-5", which is (if I understand this correctly) the 5th day of the 15th week of 1985 [section 4.1.4.2]?
Do we want to accept [section 4.2.2.4], "23:20,8", which is 23 hours, 20 minutes, 8 tenths of a minute?
I suspect most people who have been following the recent thread (https://groups.google.com/d/topic/comp.lang.python/Q2w4R89Nq1w/discussion) would say none of the above are needed. All that's needed is if you have an existing datetime object, d1, you can do:
s = str(d1)
d2 = datetime.datetime(s)
assert d1 == d2
for all values of d1.
But, let's at least agree on that. Or, in the alternative, agree on something else. Then we know what we're shooting for.
|
msg170104 - (view) |
Author: Alexander Belopolsky (Alexander.Belopolsky) |
Date: 2012-09-09 14:38 |
On Sep 9, 2012, at 8:15 AM, Roy Smith <report@bugs.python.org> wrote:
> We need to define the scope of what input strings will be accepted.
Since it is easier to widen the domain of acceptable arguments than to narrow it in the future, I would say let's start by accepting str(x) only where x is date, time, timezone or datetime. I would leave out timedelta for now because its str(x) does not resemble ISO at all.
Either that or full ISO 8601. Anything in between is just too hard to explain.
|
msg170109 - (view) |
Author: Roy Smith (roysmith) |
Date: 2012-09-09 15:06 |
I see I mis-stated my example. When I wrote:
s = str(d1)
d2 = datetime.datetime(s)
assert d1 == d2
what I really meant was:
s = d1.isoformat()
d2 = datetime.datetime(s)
assert d1 == d2
But, now I realize that while that is certainly an absolute lower bound, it's almost certainly not sufficient. The most common use case I see on a daily basis is parsing strings that look like "2012-09-07T23:59:59+00:00". This is also John Nagle's original use case from the cited mailing list thread:
> I want to parse standard ISO date/time strings such as
> 2012-09-09T18:00:00-07:00
Datetime.isoformat() returns something that matches the beginning of that, but doesn't have the time zone offset. And it's the offset that makes strptime() not usable as a solution, because "%z" isn't portable.
If we don't satisfy the "2012-09-07T23:59:59+00:00" case, then we won't have really done anything useful.
|
msg170112 - (view) |
Author: John Nagle (nagle) |
Date: 2012-09-09 16:06 |
For what parts of ISO 8601 to accept, there's a standard: RFC3339, "Date and Time on the Internet: Timestamps". See section 5.6:
date-fullyear = 4DIGIT
date-month = 2DIGIT ; 01-12
date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on
; month/year
time-hour = 2DIGIT ; 00-23
time-minute = 2DIGIT ; 00-59
time-second = 2DIGIT ; 00-58, 00-59, 00-60 based on leap second
; rules
time-secfrac = "." 1*DIGIT
time-numoffset = ("+" / "-") time-hour ":" time-minute
time-offset = "Z" / time-numoffset
partial-time = time-hour ":" time-minute ":" time-second
[time-secfrac]
full-date = date-fullyear "-" date-month "-" date-mday
full-time = partial-time time-offset
date-time = full-date "T" full-time
NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
syntax may alternatively be lower case "t" or "z" respectively.
ISO 8601 defines date and time separated by "T".
Applications using this syntax may choose, for the sake of
readability, to specify a full-date and full-time separated by
(say) a space character.
That's straightforward, and can be expressed as a regular expression.
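As a rough illustration of that claim, the grammar above can be transcribed into a regular expression. This is a sketch under stated assumptions, not a proposed patch: the names _RFC3339 and parse_rfc3339 are hypothetical, fractional digits beyond the sixth are truncated (an arbitrary choice), a space is accepted as the separator as the RFC note permits, and a seconds value of 60 raises ValueError because datetime has no leap second support.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern transcribed from the RFC 3339 section 5.6 grammar.
_RFC3339 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"     # full-date
    r"[Tt ]"                       # "T", "t", or a space
    r"(\d{2}):(\d{2}):(\d{2})"     # time-hour ":" time-minute ":" time-second
    r"(?:\.(\d+))?"                # optional time-secfrac, one or more digits
    r"([Zz]|[+-]\d{2}:\d{2})",     # time-offset
    re.ASCII,                      # restrict \d to 0-9
)

def parse_rfc3339(text):
    """Return an aware datetime, or raise ValueError for anything else."""
    match = _RFC3339.fullmatch(text)
    if match is None:
        raise ValueError(f"not an RFC 3339 date-time: {text!r}")
    fields = [int(g) for g in match.group(1, 2, 3, 4, 5, 6)]
    fraction = match.group(7)
    # Pad short fractions, truncate long ones, to exactly six digits.
    micro = int(fraction[:6].ljust(6, "0")) if fraction else 0
    offset = match.group(8)
    if offset in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = -1 if offset[0] == "-" else 1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tz = timezone(sign * delta)
    return datetime(*fields, micro, tzinfo=tz)

print(parse_rfc3339("2012-09-09T18:00:00-07:00"))
```

Range checks (month 01-12, hour 00-23 and so on) are left to the datetime constructor, which raises ValueError on its own.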
|
msg170114 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2012-09-09 16:11 |
> I realize that while that is certainly an absolute lower bound,
> it's almost certainly not sufficient. The most common use case
> I see on a daily basis is parsing strings that look like
> "2012-09-07T23:59:59+00:00".
This is exactly what isoformat() of an aware datetime looks like:
>>> datetime.now(timezone.utc).isoformat()
'2012-09-09T16:09:46.165886+00:00'
str() is the same up to T replaced by space:
>>> print(datetime.now(timezone.utc))
2012-09-09 15:19:12.567692+00:00
|
msg170116 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2012-09-09 16:25 |
> For what parts of ISO 8601 to accept, there's a standard: RFC3339
This is almost indistinguishable from the idea of accepting .isoformat() and str() results. From what I see the only difference is that 't' is accepted for date/time separator and 'z' is accepted as a timezone.
Let's start with this.
As an ultimate solution, I would like to see something like codec registry so that we can do things like datetime(.., format='rfc3339') or date(.., format='gnu') for GNU parse_datetime. I think this will look more pythonic than strptime(). Of course, strptime format can also be accepted as the value for the format keyword.
|
msg170180 - (view) |
Author: Roy Smith (roysmith) |
Date: 2012-09-10 12:06 |
I've started collecting some test cases. I'll keep adding to the collection. I'm going to start trolling ISO 8601:2004(E) for more. Let me know if there are other sources I should be considering.
|
msg170181 - (view) |
Author: Roy Smith (roysmith) |
Date: 2012-09-10 12:07 |
Ooops, clicked the wrong button.
|
msg174339 - (view) |
Author: (flying sheep) * |
Date: 2012-10-31 17:33 |
there is a module that parses those strings pretty nicely, it’s called pyiso8601: http://code.google.com/p/pyiso8601/
in the context of writing a better plistlib, i also needed the capability to parse those strings, and decided not to use the sucky incomplete implementation of plistlib, but the one mentioned above.
i py3ified it, eliminating quite some code, and the result is pretty terse, check it out: https://github.com/flying-sheep/plist/blob/master/iso8601.py
note that that implementation returns utc-datetimes for timezoneless strings, instead of naive ones. (l.30)
|
msg183672 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2013-03-07 15:37 |
I've written a parser for ISO 8601: https://github.com/boxed/iso8601
Some basic tests are included and it supports most of the standard. Haven't gotten around to the more obscure parts like durations and intervals, but those are trivial to add...
|
msg183743 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2013-03-08 15:14 |
Are you offering the module for inclusion in the stdlib?
|
msg183809 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2013-03-09 10:22 |
Éric Araujo: absolutely. Although I think my code can be improved (speed wise, elegance, etc) since I just wrote it quickly over a weekend :)
|
msg183921 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2013-03-11 04:21 |
John listed four modules with issues in the first message, and now we have proposals for two more modules. Could you work together to make a unified patch?
Alexander, do you think there is a need to check python-ideas or python-dev before working on this?
(I changed the title to clarify scope: ISO 8601 is huge and not easily accessible whereas W3CDTF/RFC 3339 is narrower in scope and freely accessible.)
|
msg183931 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2013-03-11 05:06 |
Éric> do you think there is a need to check python-ideas or python-dev before working on this?
Yes, I think this is python-ideas material. IMHO, what should be added to datetime module in 3.4 is ability to construct date/time objects from their str() representation:
assert time(str(t)) == t
assert date(str(d)) == d
assert datetime(str(dt)) == dt
I am not sure the same is needed for timedelta, but this can be discussed.
An implementation of any standard external to Python should be vetted on PyPI first. There may be a reason why there is no rfc3339.py module on PyPI.
|
msg221829 - (view) |
Author: karl (karlcow) * |
Date: 2014-06-29 03:58 |
I had the issue today. I needed to parse a date with the following format.
2014-04-04T23:59:00+09:00
and could not with strptime.
I see a discussion in March 2014 http://code.activestate.com/lists/python-ideas/26883/ but no followup.
For references:
http://www.w3.org/TR/NOTE-datetime
http://tools.ietf.org/html/rfc3339
|
msg221830 - (view) |
Author: karl (karlcow) * |
Date: 2014-06-29 04:36 |
On closer inspection, Anders Hovmöller's proposal doesn't work.
https://github.com/boxed/iso8601
At least for the microseconds part.
In http://tools.ietf.org/html/rfc3339#section-5.6, the microsecond part is defined as:
time-secfrac = "." 1*DIGIT
In http://www.w3.org/TR/NOTE-datetime, same thing:
s = one or more digits representing a decimal fraction of a second
Anders considers it to be only six digits. It can be more or it can be less. :)
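A sketch of what "more or less than six digits" means for a parser follows. The helper name secfrac_to_microseconds is hypothetical, and truncating the excess digits is only one possible policy.

```python
def secfrac_to_microseconds(digits):
    """Convert the digits after the "." of time-secfrac to microseconds.

    Short fractions are padded on the right with zeros; digits beyond
    the sixth are dropped (truncation, chosen here for simplicity).
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected one or more ASCII digits, got {digits!r}")
    return int(digits[:6].ljust(6, "0"))

# One digit, exactly six digits, and more than six digits.
for sample in ("5", "165886", "16588638674"):
    print(sample, secfrac_to_microseconds(sample))
```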
Will comment on github too.
|
msg221831 - (view) |
Author: karl (karlcow) * |
Date: 2014-06-29 05:30 |
Noticed some people doing the same thing
https://github.com/tonyg/python-rfc3339
http://home.blarg.net/~steveha/pyfeed.html
https://wiki.python.org/moin/WorkingWithTime
|
msg221903 - (view) |
Author: karl (karlcow) * |
Date: 2014-06-29 21:23 |
After inspection, the best library for parsing RFC 3339 style dates is definitely:
https://github.com/tonyg/python-rfc3339/
Main code at
https://github.com/tonyg/python-rfc3339/blob/master/rfc3339.py
|
msg260099 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-11 11:58 |
So, shall we include it? Otherwise, py8601 (https://bitbucket.org/micktwomey/pyiso8601/) looks pretty popular and well maintained (various committers, started in 2012, last commit in 2016).
I think we should hurry; it's a great shame that Python has long been able to generate an ISO 8601 datetime but cannot parse it back.
|
msg260100 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-11 12:04 |
I'm working on the OpenStack project and iso8601 is heavily used.
> Otherwise, py8601 (https://bitbucket.org/micktwomey/pyiso8601/) looks pretty popular and well maintained (various committers, started in 2012, last commit in 2016).
I don't think that we should add the iso8601 module to the stdlib, but merge iso8601 "features" into the datetime module.
The iso8601 module supports Python 2.7 and so has to implement its own timezone classes. The datetime module now has datetime.timezone since Python 3.2 for fixed timezone.
The iso8601 module provides functions. I would prefer datetime.datetime *methods*.
Would you mind trying to implement that? It would be kind to contact the iso8601 author first.
The important part is also unit tests.
|
msg260150 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-12 00:46 |
See also #12006 for ISO 8601: "The datetime.strftime() and date.strftime() methods now support ISO 8601 date directives %G, %u and %V. (Contributed by Ashley Anderson in issue 12006.)".
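For reference, those directives cover the week-date form that Roy asked about earlier ("1985-W15-5"). This is a small sketch, assuming Python 3.6 or later and a C library whose strftime() supports %G, %V and %u.

```python
from datetime import date

day = date(1985, 4, 12)  # a Friday

# isocalendar() gives ISO year, ISO week number and ISO weekday.
iso_year, iso_week, iso_weekday = day.isocalendar()

# The same three values rendered with the ISO 8601 directives.
label = day.strftime("%G-W%V-%u")
print(iso_year, iso_week, iso_weekday, label)
```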
|
msg260266 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-14 08:17 |
#12006 will unfortunately be of no use for this one.
Actually, I realized that the best implementation of parsing rfc 3339 is in django dateparse utils. To me it's the finest, the most elegant, and no other one can claim to be more robust since it's probably the #1 iso parsing functions used in python. Have a look at https://docs.djangoproject.com/en/1.9/_modules/django/utils/dateparse/#parse_datetime
Alexander, I won't start before I have your opinion. What do you think?
|
msg260276 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-14 11:50 |
Here is the PoC with code taken from django.utils.parse_datetime and adapted for the datetime module (I didn't ask for their agreement yet). Of course tests pass. For me it's the most elegant solution.
(I think date and time also need their "fromisotimestamp" counterpart).
|
msg260280 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-14 12:30 |
(slightly improved version, better use of timedelta)
|
msg260282 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-14 14:17 |
Is the django license compatible with the Python license?
|
msg260292 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-15 02:35 |
I don't know. The code taken is very small and modified; it is not much more than an implementation you might have seen a while ago and recoded from memory, without remembering where you first saw it. Do you think that's really an issue?
|
msg260293 - (view) |
Author: karl (karlcow) * |
Date: 2016-02-15 02:54 |
From https://www.djangoproject.com/foundation/cla/faq/
> Am I giving away the copyright to my contributions?
>
> No. This is a pure license agreement, not a copyright assignment. You
> still maintain the full copyright for your contributions. You are
> only providing a license to the DSF to distribute your code without
> further restrictions. This is not the case for all CLA's, but it is
> the case for the one we are using.
|
msg260294 - (view) |
Author: karl (karlcow) * |
Date: 2016-02-15 03:00 |
About
> Actually, I realized that the best implementation of parsing rfc 3339
> is in django dateparse utils. To me it's the finest, the most
> elegant, and no other one can claim to be more robust since it's
> probably the #1 iso parsing functions used in python. Have a look at
> https://docs.djangoproject.com/en/1.9/_modules/django/utils/dateparse/#parse_datetime
How does it parse this date:
2016-02-15T11:59:46.16588638674+09:00
|
msg260295 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-15 04:02 |
By discarding the microsecond digits after the 6th.
|
msg260298 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-15 06:35 |
slightly improved + addresses issues stated here : https://bugs.python.org/review/15873/diff/16581/Lib/datetime.py#newcode1418Lib/datetime.py:1418
|
msg260303 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-15 09:42 |
> How does it parse this date:
> 2016-02-15T11:59:46.16588638674+09:00
Mathieu Dupuy added the comment:
> discarding the microseconds digits after the 6th.
Hum, you should use the same rounding method as
datetime.datetime.fromtimestamp(): ROUND_HALF_UP, as round().
In practice, you can for example pass a floating point number as
microseconds to datetime.datetime constructor.
Since datetime is implemented in C, I'm not sure that using the re is
the best choice. Since the regex looks simple enough, we may parse the
string without the re module. Well, maybe only for the C
implementation.
What is the behaviour if there are spaces before/after the string?
What if there are other characters like letters before/after? You
should add a unit test for that. I expect an error when parsing
"t=2012-04-23T09:15:00" for example.
Your regex ends with $ but doesn't start with ^. Using re.match(), ^
and $ are probably not needed, but I'm not confident when I use regex
:-)
|
msg260309 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-15 11:41 |
> Hum, you should use the same rounding method as
> datetime.datetime.fromtimestamp(): ROUND_HALF_UP, as round().
> In practice, you can for example pass a floating point number as
> microseconds to datetime.datetime constructor.
Unfortunately, you're confusing it with the timedelta constructor; datetime's only takes ints :(
But I figured out what is, in my opinion, an elegant way to cope with that.
> Since datetime is implemented in C, I'm not sure that using the re is
> the best choice. Since the regex looks simple enough, we may parse the
> string without the re module. Well, maybe only for the C
> implementation.
No regex available at all in CPython? Otherwise, yeah, if I have to, I can do it with strptime.
> What is the behaviour if there are spaces before/after the string?
> What if there are other characters like letters before/after? You
> should add a unit test for that. I expect an error when parsing
> "t=2012-04-23T09:15:00" for example.
> Your regex ends with $ but doesn't start with ^. Using re.match(), ^
> and $ are probably not needed, but I'm not confident when I use regex
> :-)
re.match only looks at the beginning of the string, so there is no need for '^'. Therefore, the case
you mention is already handled :)
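The anchoring question can be checked with a small experiment. The pattern below is a simplified stand-in, not the regex from the patch. It shows that re.match() rejects leading junk or whitespace, while a trailing "$" still tolerates one final newline; re.fullmatch() is the stricter option.

```python
import re

# Simplified stand-in pattern, ending with $ but without a leading ^.
pattern = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$", re.ASCII)

# re.match() only tries position 0, so leading characters fail.
leading_junk = pattern.match("t=2012-04-23T09:15:00")
leading_space = pattern.match(" 2012-04-23T09:15:00")

# "$" matches at the very end or just before a final newline.
trailing_newline = pattern.match("2012-04-23T09:15:00\n")

# fullmatch() requires the whole string to match, newline included.
strict_newline = pattern.fullmatch("2012-04-23T09:15:00\n")
strict_clean = pattern.fullmatch("2012-04-23T09:15:00")

print(leading_junk, leading_space, trailing_newline, strict_newline, strict_clean)
```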
Attached to this mail is the latest revision of the feature, with correct rounding, more tests and one useless
line removed. Maybe this is the good one :) ?
|
msg260318 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-15 16:31 |
> No regex available at all in CPython?
It's not really convenient to use the re module in C.
> Otherwise, yeah, if I have to, I can do it with strptime.
I suggest to parse directly the string with C code, since the format looks quite simple (numbers and a few separators).
|
msg260337 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-15 22:54 |
> I suggest to parse directly the string with C code, since the format looks quite simple (numbers and a few separators).
But some of them are optional. And I would really like to mimic the same implementation logic in C.
Now I think the Python version is fairly ready. What next?
|
msg260342 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-02-16 03:23 |
It looks to me like you copied a lot of code, doc strings, tests, etc from <https://github.com/django/django/commit/9b1cb75#diff-4db1d116f25f482278090b122e3b0028> and <https://github.com/django/django/commit/2f59e94>. I wouldn’t call it trivial. There is a BSD license for Django. Or do we have to get the relevant authors to do the Python CLA thing?
The current patch seems to allow a timezone without a colon, or even without minutes (+1100 and +11 as well as the RFC’s +11:00). Is this needed? The colon was made optional in Django in <https://code.djangoproject.com/ticket/18728>; the argument given for this just seems to be ISO 8601 alignment, nothing practical. According to <https://code.djangoproject.com/ticket/22814> PostgreSQL outputs time zones without the minutes field, but I don’t know if Python should go out of its way to support this obscure format.
RFC 3339 does not specify single digits in many places (e.g. 2016-2-1 1:0:0 is not specified). Should we also be stricter, at least for the minutes and seconds fields?
Also, is it necessary to allow the seconds field to be omitted, as in "2016-02-01 01:21"?
It seems that the “datetime” module does not support leap seconds, so if we mention RFC 3339 we should point out this inconsistency.
Victor: From my limited experiments, datetime.fromtimestamp() seems to use the round-to-even rule (not always rounding half up). Can you confirm? Maybe we should use that for consistency if it is practical. Otherwise, truncation towards zero would be the simplest.
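The three candidate policies (truncation, round half up, round half to even) differ only on exact ties and near the top of the range. Here is a sketch using the decimal module to compare them; the helper name fraction_to_microseconds is hypothetical, and the default decimal context (28 significant digits) is assumed to be enough for the inputs shown. It makes no claim about what fromtimestamp() actually does.

```python
from datetime import datetime, timedelta
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

def fraction_to_microseconds(digits, rounding):
    """Convert the digits after the "." to whole microseconds."""
    exact = Decimal("0." + digits) * 1_000_000
    return int(exact.quantize(Decimal(1), rounding=rounding))

# An exact tie: 0.5 microseconds.
tie = "0000005"
for mode in (ROUND_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN):
    print(mode, fraction_to_microseconds(tie, mode))

# Rounding can produce 1000000, which no longer fits the microsecond
# field, so add it as a timedelta and let the carry propagate.
base = datetime(2016, 2, 15, 11, 59, 59)
carried = base + timedelta(
    microseconds=fraction_to_microseconds("9999995", ROUND_HALF_EVEN)
)
print(carried)
```

Truncation never needs the carry step, which is one practical argument for it.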
As well as adding datetime.fromisoformat(), I think we should add similar methods to the separate date and time classes. One can parse the RFC’s full-date format fairly easily with strptime(), but not so for partial-time because of the fractional seconds.
|
msg260343 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-16 04:26 |
The real question is: should we accept whatever ISO 8601 format is commonly found on the internet, or just be able to consume back the string issued by isoformat? The answers to the questions you're asking follow from that: don't accept single digits, nor second-less datetimes, ...
I don't really mind what the answer is. I'm OK with stricter acceptance. I would like us to ask ourselves: does a simpler, stricter implementation fulfill people's needs? If it's OK for you, it's OK for me.
By taking the Django version, I deviated a bit from the author's original need, which was just to be able to parse back datetime isoformat. The limitations he raised for not using strptime are gone now (strptime understands timezones), but it still can't understand microseconds or optional parts (T or space for separator, optional microseconds). Even for a much simpler, stricter implementation, I'd like to stick with regex.
I'll do a date & time version; I'm just waiting until we all agree on the whole datetime thing.
Whether we change to simpler code or keep it this way, I can rewrite the tests & docstring.
|
msg260344 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-02-16 04:37 |
The regular expression r"\d" matches any digit in Unicode I think, not just ASCII digits 0-9. Perhaps we should limit it to ASCII digits. Or is it intended to allow non-ASCII digits like in "२०१६-०२-१६ ०१:२१:१४"?
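This is easy to confirm. The sketch below uses a date-only stand-in pattern. It also shows why the issue matters in practice: int() accepts the same non-ASCII decimal digits, so such input would be converted silently rather than rejected.

```python
import re

devanagari = "२०१६-०२-१६"

# For str patterns, \d matches any Unicode decimal digit by default.
unicode_digits = re.compile(r"\d{4}-\d{2}-\d{2}")
# With re.ASCII, \d is limited to 0-9.
ascii_digits = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)

accepted_by_default = unicode_digits.fullmatch(devanagari) is not None
accepted_with_ascii = ascii_digits.fullmatch(devanagari) is not None

# int() also understands these digits, so nothing downstream complains.
year = int(devanagari[:4])
print(accepted_by_default, accepted_with_ascii, year)
```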
|
msg260345 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-16 04:41 |
Oh my god, you're right. Thankfully there is the re.ASCII flag.
|
msg260347 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-16 05:32 |
simpler version using a simpler, stricter regex
|
msg260350 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-16 07:44 |
(OK, I said something stupid: datetime's strptime handles microseconds. But time's doesn't.)
|
msg260356 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-16 11:24 |
OK, I know I post a lot, but this one should be the good one:
* recoded from scratch. Apart from the algorithm, nothing comes from Django anymore.
* help me fill in the docstring, I'm not inspired
* datetime has few tests since it uses the implementation of time._parse_time, which is heavily tested in the time unit tests.
* I now handle lowercase T and Z. (I know I could do "if tzinfo[0] in ('Z', 'z')", but to me it feels like imposing a micro performance penalty on implementations correctly sending an uppercase Z)
I'm impatient to receive your feedback.
|
msg260382 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-17 09:15 |
Crap, I just checked my spam folder today and almost all the mails from the review board had landed in it! So I made a new patch addressing all concerns:
* regex moved closer to where they're used
* regex globals start with an _
* case insensitive regex + handling (already handled in the previous revision)
* correct rounding + case suggested by Martin added as a test case
* more precise docstrings specifying that only a subset of ISO 8601 is accepted
bonus:
* useless non-capturing groups removed in regex, thus shorter and simpler
I still have a doubt, though, about where I moved the regex. Tell me.
|
msg260420 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-02-18 01:46 |
Hi Aymeric Augustin. I am guessing you are the original author of the code and tests in Django for parsing datetime strings (https://bugs.python.org/issue15873#msg260342). If so, would you be happy for it to be incorporated into Python?
Mathieu: I left a couple quick review comments. (Normally I leave a message in the main bug thread, but I forgot the other time.)
Doc strings: Generally I think we keep the doc strings to a minimum, and leave the main documentation for the RST files. For the RST documentation, I would suggest including a rough summary of the format. E.g. for time.fromisoformat(), something like “The string should be in the RFC’s ‘full-time’ format, which looks like HH:MM:SS[.mmmmmm][Z|±HH:MM].”
Now that you added the two new regex strings, I can see that it might be useful to keep them together, rather than next to each class. Or you could even make them class attributes. No strong opinions either way; whatever works for you I think.
|
msg260426 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-18 04:22 |
New patch with all your concerns addressed (martin.panter + SilentGhost) EXCEPT the single dispatch dictionary thing.
|
msg260427 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-18 04:32 |
SilentGhost: the dictionary single dispatch thing is attached (apply on top of the last, fromisoformat_new3).
I mind the performance penalty for date-only parsing users, but the code is definitely shorter and more elegant.
But we have a major problem: tests fail because what is used in tests is a subclass of the datetime classes (Subclass[Date|Time|DateTime]) and thus the dispatch breaks with a KeyError: class.SubDate[...]. I have no idea how to mitigate that. Do you?
|
msg260440 - (view) |
Author: Aymeric Augustin (aymeric.augustin) * |
Date: 2016-02-18 08:23 |
martin.panter: of course, I'm fine with integrating that code into Python.
deronnax: could you create a ticket on https://code.djangoproject.com/ highlighting the differences between Django's original implementation and the improved version that you worked on?
I'd like to use the stdlib implementation when it's available and align Django's current implementation to whatever's getting into the stdlib (to prepare the transition, even though we aren't going to drop support for Python 3.5 soon).
|
msg260441 - (view) |
Author: SilentGhost (SilentGhost) * |
Date: 2016-02-18 08:37 |
Mathieu, nothing was attached. The penalty's worth only a few if statements, I wouldn't worry too much about it. Besides, a C version is going to be provided as well, right?
Perhaps the following approach might solve the subclasses problem:
regex = dispatch.get(cls)
if not regex:
    classes = datetime, date, time
    cls = next((c for c in classes if issubclass(cls, c)), None)
    if cls is None:
        raise TypeError
    regex = dispatch[cls]
Perhaps, TypeError is unnecessary there and just propagating StopIteration would do. In that case the if clause would look like:
classes = datetime, date, time
cls = next(c for c in classes if issubclass(cls, c))
regex = dispatch[cls]
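A self-contained version of the first variant is sketched below. The patterns are simplified placeholders rather than the ones in the patch, and lookup_regex is a hypothetical helper name. Note that the order of the fallback tuple matters: datetime is itself a subclass of date, so it has to be tested first.

```python
import re
from datetime import date, datetime, time

# Placeholder patterns; the real ones live in the patch under review.
_DATE = r"\d{4}-\d{2}-\d{2}"
_TIME = r"\d{2}:\d{2}:\d{2}"
dispatch = {
    datetime: re.compile(_DATE + "[T ]" + _TIME, re.ASCII),
    date: re.compile(_DATE, re.ASCII),
    time: re.compile(_TIME, re.ASCII),
}

def lookup_regex(cls):
    """Find the pattern for cls, falling back to its datetime base class."""
    regex = dispatch.get(cls)
    if regex is None:
        # datetime before date: issubclass(datetime, date) is True.
        for base in (datetime, date, time):
            if issubclass(cls, base):
                return dispatch[base]
        raise TypeError(f"no ISO pattern registered for {cls!r}")
    return regex

class SubclassDate(date):
    pass

class SubclassDatetime(datetime):
    pass

print(lookup_regex(SubclassDate) is dispatch[date])
print(lookup_regex(SubclassDatetime) is dispatch[datetime])
```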
|
msg260442 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-18 08:52 |
crap, here is the attachment.
Yeah, but I really would like to use regex in the C version (unless you strongly advise against it), so we will have the same logic and the same problem. And I have never made a patch for the C interpreter itself, so the C equivalent is not going to be here soon. (btw, do you have a starting point to recommend?)
I definitely do not like this fix; it destroys the elegance and simplicity of the "single-dispatch" solution. And it introduces a lot of noisy code for a very rare case: people subclassing datetime.* classes.
Maybe make the regex dictionary have string keys instead of classes, and pass the correct string from the calling function, like:
def fromisoformat(string):
    _parse_isodatetime('time', string)
or maybe functools.singledispatch handles this case?
|
msg260445 - (view) |
Author: SilentGhost (SilentGhost) * |
Date: 2016-02-18 09:34 |
Probably the only other solution I see is to add a third argument, an actual class, e.g.:
_parse_isodatetime(cls, string, datetime)
|
msg260449 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-02-18 11:26 |
Did you see my class attributes suggestion a couple messages back? That might solve your dispatch problem:
def _parse_isodatetime(cls, string):
    match = cls._iso_regex.match(...)
class time:
    _iso_regex = re.compile(...)
# or
time._iso_regex = re.compile(...)
date._iso_regex = ...
For the C module, start by looking at Modules/_datetimemodule.c. Maybe you can think of a similar function implemented in C to copy from. It looks like strptime() defers to a Python-only implementation; maybe that is another option.
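A rough sketch of the class-attribute idea, assuming a simple HH:MM:SS pattern. The names `ISOTime`, `MyTime` and `parse_iso` are hypothetical; a subclass is used here only because attributes cannot be assigned on the C-implemented `datetime.time` type from Python code.

```python
import re
from datetime import time


class ISOTime(time):
    # Hypothetical placeholder pattern; the real patch may use a different regex.
    _iso_regex = re.compile(r"(\d{2}):(\d{2}):(\d{2})")

    @classmethod
    def parse_iso(cls, string):
        # The lookup goes through cls, so subclasses inherit (or override) the regex.
        match = cls._iso_regex.fullmatch(string)
        if match is None:
            raise ValueError(f"invalid ISO time: {string!r}")
        hour, minute, second = (int(group) for group in match.groups())
        return cls(hour, minute, second)


class MyTime(ISOTime):
    """Subclass that works without any dispatch table."""
```

Because the regex is found through `cls`, no dictionary keyed on exact classes is needed, which is the point of the suggestion.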
|
msg260989 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-29 01:53 |
What I really want is to use regexes in the C part as I did for the Python one. It's the best approach by far.
I need to figure out how to use regexes in CPython internals.
If I defer the actual processing to the Python part, what's the point of doing a C part?
|
msg260990 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-02-29 02:19 |
> If I defer the actual processing to the Python part, what's the point of doing a C part ?
Currently, the "datetime" module is fully implemented in C; in practice it's the _datetime module (accessed through the datetime module namespace).
|
msg260991 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-02-29 02:25 |
I know. Martin was suggesting to defer the processing to an actual Python implementation, hence my answer.
|
msg263867 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-04-21 01:08 |
If you decide to only do a Python implementation, the main reason to write a wrapper in C would be because the datetime class is defined in C. Similar to how the datetime_strptime() function in Modules/_datetimemodule.c imports and calls _strptime._strptime_datetime().
An alternative might be to subclass the C classes _datetime.datetime, etc, and define the methods directly in those subclasses. Similar to how the C-defined class _socket.socket is subclassed in Lib/socket.py and more methods are added.
|
msg269714 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-07-02 14:28 |
Hi.
I'm back, and willing to move forward on this issue. With the new code layout, the compiled regexes now live in the datetime classes as class attributes. Will it be possible to import date, time and datetime from datetime.py in _datetime.c without a problem?
By the way, I just discovered that the way we treat microseconds differs from strptime: we read every digit and round to six, whereas strptime doesn't go that far and just *truncates* to six. Should we go that way, for consistency with strptime?
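To make the two behaviours concrete, here is a small sketch that converts the digits after the decimal point to whole microseconds, either by truncating or by rounding half to even. The helper name `fraction_to_microseconds` is made up for this illustration and does not appear in the patch.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def fraction_to_microseconds(digits, rounding=ROUND_DOWN):
    """Convert the digits after the decimal point into whole microseconds."""
    scaled = Decimal("0." + digits) * 1_000_000
    # quantize(Decimal(1)) drops the remaining fraction using the given rounding mode.
    return int(scaled.quantize(Decimal(1), rounding=rounding))


truncated = fraction_to_microseconds("1234567")                 # seventh digit is dropped
rounded = fraction_to_microseconds("1234567", ROUND_HALF_EVEN)  # seventh digit rounds up
```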
|
msg269722 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2016-07-02 15:52 |
> By the way, I just discovered that the way we treat microseconds differs from strptime: we read every digit and round to six, whereas strptime doesn't go that far and just *truncates* to six. Should we go that way, for consistency with strptime?
I'm strongly against silently throwing away data and calling it even. If we want compatibility with strptime then it should be a separate flag like silent_truncate_milliseconds=True.
On another matter: does the latest proposed code pass the tests in my ISO8601 implementation that I mentioned in 2013 (! time flies)?
|
msg270529 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2016-07-16 01:41 |
I would very much like to see this ready before the feature cut-off for Python 3.6. Could someone post a summary on python-ideas to get a show of hands on some of the remaining wrinkles?
I would not worry about a C implementation at this point. We can put python implementation in _strptime.py and call it from C as we do for the strptime method.
|
msg270828 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2016-07-19 14:07 |
The tests attached to this ticket seem pretty bare. Issues that I can spot directly:
- only tests for datetimes, not times or dates
- only tests for zulu and "-8:00" timezones
- no tests for invalid input (parsing a valid date as a datetime for example)
- only tests for YYYY-MM-DDTHH:MM:SSZ, but ISO8601 supports:
  - Naive times
  - Timezone information (specified as offsets or as Z for 0 offset)
  - Year
  - Year-month
  - Year-month-date
  - Year-week
  - Year-week-weekday
  - Year-ordinal day
  - Hour
  - Hour-minute
  - Hour-minute-second
  - Hour-minute-second-microsecond
  - All combinations of the three "families" above!
(the above list is a copy paste from my project that implements all ISO8601 that fits into native python: https://github.com/boxed/iso8601)
This is a more reasonable test suite: https://github.com/boxed/iso8601/blob/master/iso8601.py#L166 although it lacks the tests for bogus inputs.
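For readers unfamiliar with the less common date forms in that list, the sketch below reads a year-week-weekday string and a year-ordinal-day string that denote the same day. It relies on `strptime` with explicit format codes (`%G`, `%V` and `%u` need Python 3.6 or later) and says nothing about what the proposed `fromisoformat` accepts.

```python
from datetime import date, datetime

# Year-week-weekday: ISO year 2016, week 29, weekday 2 (Tuesday)
from_week = datetime.strptime("2016-W29-2", "%G-W%V-%u").date()

# Year-ordinal day: day 201 of 2016
from_ordinal = datetime.strptime("2016-201", "%Y-%j").date()

# Both spellings are expected to denote the same calendar day.
expected = date(2016, 7, 19)
```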
|
msg270829 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-07-19 14:20 |
Because it limits itself to supporting only the RFC 3339 subset, as explained at the beginning of the discussion.
|
msg270831 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2016-07-19 15:05 |
Hmm, ok. I guess I was confused by "dates and times" part of the subject. Ok, so only datetimes. My other comments still apply though.
|
msg270899 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-07-21 01:21 |
Mathieu: Maybe you haven’t seen some of the comments on your older patches. E.g. my comment on fromisoformat4.patch about improper use of “with self.assertRaises(...)” still stands.
Also, adding some documentation to the patch might help the likes of Anders figure out the scope of the change. I think we decided to parse RFC 3339’s “internet date and time format” profile of ISO 8601 with the date, time, and datetime classes, including tolerating arbitrary resolutions of fractions of seconds in the time, and parsing time zones.
I don’t think we need to test every combination of the other ISO 8601 formats. There are already a couple of negative tests. Are there any in particular you think are important to add?
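As an illustration of what a negative test could look like, here is a sketch written against the `fromisoformat` methods as they exist in Python 3.7 and later. The review comment does not spell out the exact misuse, so the pitfall shown in the comment (several calls inside one `assertRaises` block) is only an assumption about what was meant; the class and test names are made up.

```python
import unittest
from datetime import date, datetime, time


class NegativeIsoTests(unittest.TestCase):
    def test_rejects_invalid_strings(self):
        cases = [
            (date, "2016-13-01"),      # month out of range
            (time, "25:00:00"),        # hour out of range
            (datetime, "not a date"),  # not an ISO 8601 string at all
        ]
        for cls, text in cases:
            with self.subTest(cls=cls.__name__, text=text):
                # Exactly one call inside the block, so a pass means *this* call raised.
                with self.assertRaises(ValueError):
                    cls.fromisoformat(text)
```

Putting several calls inside one `with self.assertRaises(...)` block would only ever exercise the first one, since the block exits as soon as an exception is raised.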
|
msg272021 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-08-05 09:37 |
I'm back on the issue. I'm currently stuck on the design. We need to store the regexes somewhere, and that is what causes the problem: I can't really find a good place to store them. We basically have two possible designs:
* Single-dispatch kind: class-type dictionary lookup for regexes, stored in _strptime.py. It's minimally invasive, allows a very simple C implementation, and lets us avoid adding an 're' import to datetime.py. Problem: it breaks when the given class is not exactly date, time or datetime. And it currently breaks the tests, because the tests do exactly this, testing with subclasses. We could rely on "isinstance", but do we want this?
* Regexes stored as class attributes. More robust, more invasive, 're' import in datetime.py, allows subclassing, passes the tests. C implementation not done yet. Since it requires a better understanding of the C API, I will do it only once we are sure that's the way to go.
I'm posting both versions of the implementation as patches here. These address all the concerns expressed before (Martin). If we can't decide, I will post a mail to the mailing list Martin suggested, python-ideas. By the way, are you sure it's the right one to ask? Wouldn't python-dev be more appropriate?
|
msg272026 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2016-08-05 12:50 |
updated version with SilentGhost's concerns addressed.
|
msg273609 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2016-08-24 22:55 |
Please move _parse_isotime to _strptime so that it can be called from C implementation. Also, the new method should be documented.
|
msg291822 - (view) |
Author: larsonreever (larsonreever) |
Date: 2017-04-18 05:37 |
Otherwise, pyiso8601 (https://bitbucket.org/micktwomey/pyiso8601/) looks pretty popular and well maintained (various committers, started in 2012, last commit in 2016). I don't think that we should add the iso8601 module to the stdlib, but merge iso8601 "features" into the datetime module. The iso8601 module supports Python 2.7 and so has to implement its own timezone classes. The datetime module now has datetime.timezone since Python 3.2 for fixed timezone. To me it's the finest, the most elegant, and no other one can claim to be more robust since it's probably the #1 iso parsing functions used in python. Have a look at https://docs.djangoproject.com/en/1.9/_modules/django/utils/dateparse/#parse_datetime.
|
msg291831 - (view) |
Author: Anders Hovmöller (Anders.Hovmöller) * |
Date: 2017-04-18 09:42 |
@larsonreever That lib is pretty limited, in that it doesn't handle dates or deltas. Again: my lib that is linked above does and has comprehensive tests.
|
msg304950 - (view) |
Author: Elvis Pranskevichus (Elvis.Pranskevichus) * |
Date: 2017-10-24 23:44 |
I think that both the pyiso8601 and boxed/iso8601 implementations parse ISO 8601 strings incorrectly. The standard explicitly says that all truncated datetime strings are *reduced accuracy timestamps*. In other words, "2017-10" is *not* equal to "2017-10-01". Instead, "2017-10" represents the whole month of October 2017. Same thing with hours. Earlier versions of ISO 8601 even allowed dropping the year: "--10-01", which meant October 1st of _any year_. They dropped this from more recent revisions of the standard.
The only place where the truncated representation means "default to zero" is the timezone offset, so "10:10:00+4" and "10:10:00+04:00" mean the same thing.
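One way to picture the "reduced accuracy" reading is to map a year-month string to the interval it covers instead of to a single day. The helper below is a hypothetical sketch (the name `month_interval` and the half-open interval convention are my assumptions), not something any of the libraries mentioned here is known to provide.

```python
from datetime import date, timedelta


def month_interval(text):
    """Interpret 'YYYY-MM' as the half-open interval [first day, first day of next month)."""
    year, month = (int(part) for part in text.split("-"))
    start = date(year, month, 1)
    # Jump safely into the next month, then snap back to its first day.
    end = (start + timedelta(days=31)).replace(day=1)
    return start, end


october = month_interval("2017-10")
```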
|
msg307603 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2017-12-04 22:39 |
P-ganssle seems to be proposing to limit parsing to exactly what “datetime.isoformat” produces; i.e. whole number of seconds, milliseconds or microseconds. Personally I would prefer it without this limitation, like in Mathieu’s patches. But P-ganssle has done some documentation, so perhaps we can combine the work of each?
|
msg307604 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2017-12-04 22:41 |
The other difference is that Mathieu guarantees ValueError for invalid input strings, which I think is good.
|
msg307605 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2017-12-04 22:45 |
The better is the enemy of the good here. Given the history of this issue, I would rather accept a well documented restrictive parser than wait for a more general code to be written. Note that we can always relax the parsing rules in the future.
|
msg307606 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2017-12-04 22:47 |
I'm now available again to work on this issue. I'll submit a pull request within a week with all issues addressed.
|
msg307607 - (view) |
Author: Paul Ganssle (p-ganssle) * |
Date: 2017-12-05 00:33 |
> The better is the enemy of the good here. Given the history of this issue, I would rather accept a well documented restrictive parser than wait for a more general code to be written. Note that we can always relax the parsing rules in the future.
This is in fact the exact reason why I wrote the isoformat parser like I did, because ISO 8601 is actually a quite expansive standard, and this is the least controversial subset of the features. In fact, I spent quite a bit of time on adapting the general purpose ISO8601 parser I wrote for dateutil *into* one that only accepts the output of isoformat() because it places a minimum burden on ongoing support, so it's not really a matter of waiting for a more general parser to be written.
I suggest that for Python 3.7 we *only* support output of isoformat(). Many general iso8601 parsers exist, including the one I have already implemented for python-dateutil (which will be part of the dateutil 2.7.0 release). We can have further discussion later about what exactly should be supported in Python 3.8, but even in the pre-release discussions I'm already seeing pushback about some of the more unusual 8601 formats, and it's a *lot* easier to explain (in documentation) that `fromisoformat()` is intended to be the inverse of `isoformat()` than it is to explain which variations of ISO 8601 are and are not supported (fractional minutes? if you're following the standard, the separator has to be a T, so what other variations of the standard are allowed?).
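The "inverse of isoformat()" contract can be stated as a round trip. The sketch below checks it for a few sample values, assuming Python 3.7 or later where `datetime.fromisoformat` exists; the sample values themselves are arbitrary.

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=-7))
samples = [
    datetime(2012, 9, 9, 18, 0, 0),
    datetime(2012, 9, 9, 18, 0, 0, 123456),
    datetime(2012, 9, 9, 18, 0, 0, tzinfo=tz),
]

# Each value should survive isoformat() followed by fromisoformat() unchanged.
round_trips = [datetime.fromisoformat(dt.isoformat()) == dt for dt in samples]

# isoformat() allows a custom separator, and the parser accepts it back.
space_separated = datetime.fromisoformat(samples[0].isoformat(sep=" "))
```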
|
msg307610 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2017-12-05 00:41 |
+1 on what Paul said.
Mathieu, the goal for 3.7 will be to get Paul's PR merged. It will be great if you could help in reviewing it. We can return to the features in your PR during the 3.8 development cycle.
|
msg307616 - (view) |
Author: Paul Ganssle (p-ganssle) * |
Date: 2017-12-05 02:17 |
> The other difference is that Mathieu guarantees ValueError for invalid input strings, which I think is good.
I forgot to address this - but I don't think this is a difference in approaches. If you pass `None` or an int or something, the problem is with the type, not the value, so at a minimum you're looking at TypeError and ValueError - and those are the only exceptions raised in my patch.
(I'll note that my patch does not accept bytes, though this is something of an artificial limitation, since the patch makes use of the fact that all valid isoformat() strings will contain at most 1 non-ASCII character, at position 10, so we could easily work around this, but I think the trend for CPython is to avoid blurring the lines between bytes and str rather than encouraging their interchangeable use.)
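The split between the two exception types can be checked directly against the method as shipped in Python 3.7 and later. The helper `classify` is just a convenience for this sketch.

```python
from datetime import datetime


def classify(value):
    """Return the name of the exception raised by datetime.fromisoformat, or None."""
    try:
        datetime.fromisoformat(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


outcomes = {
    "none": classify(None),            # wrong type
    "int": classify(20171205),         # wrong type
    "bytes": classify(b"2017-12-05"),  # bytes are not accepted
    "garbage": classify("garbage"),    # right type, invalid value
    "valid": classify("2017-12-05T02:17"),
}
```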
|
msg308214 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2017-12-13 16:38 |
I finally released my work. It looks like Paul's work is more comprehensive, but if you want to pick one thing or two in mine, feel free.
|
msg308505 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2017-12-18 01:37 |
Regarding Mathieu’s RFC 3339 parser, Victor wanted to use the round-half-to-even rule to get a whole number of microseconds. But considering the “time” class cannot represent 24:00, how do you round up in the extreme case past 23:59?
time.fromisoformat("23:59:59.9999995")
Perhaps it is better to always truncate to zero, only support 6 digits (rejecting fractions of a microsecond), or add Anders’s truncate_microseconds=True option.
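The extreme case can be reproduced without any parser. In this sketch the seven fractional digits are rounded with Python's built-in `round()`, which rounds halves to even, and the result no longer fits in a `time` object.

```python
from datetime import time

# "23:59:59.9999995": seven fractional digits, rounded to whole microseconds.
micro = round(9_999_995 / 10)  # 999999.5 rounds half-to-even, up to 1000000

try:
    time(23, 59, 59, micro)
    overflowed = False
except ValueError:
    # microsecond must be in 0..999999, and time cannot carry into a next day
    overflowed = True
```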
|
msg308507 - (view) |
Author: Paul Ganssle (p-ganssle) * |
Date: 2017-12-18 02:19 |
@martin.panter I don't see the problem here? Wouldn't 23:59:59.9999995 round up to 00:00?
|
msg308510 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2017-12-18 04:01 |
Not if the time is associated with a particular day. Imagine implementing datetime.fromisoformat by separately calling date.fromisoformat and time.fromisoformat. The date will be off by one day if you naively rounded 2017-12-18 23:59 “up” to 2017-12-18 00:00.
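The off-by-one-day effect can be shown with plain datetime arithmetic. This is only a sketch of the scenario described above, with the rounded time expressed as a `timedelta`; it does not reflect how any patch combines the parts.

```python
from datetime import date, datetime, time, timedelta

day = date(2017, 12, 18)
# 23:59:59 plus a fraction that rounded up to 1_000_000 microseconds
rounded = timedelta(hours=23, minutes=59, seconds=59, microseconds=1_000_000)

# Naive approach: wrap the time of day around midnight and keep the parsed date.
wrapped = (datetime.min + rounded).time()
naive = datetime.combine(day, wrapped)

# Carrying approach: add the rounded duration to midnight of the parsed date.
carried = datetime.combine(day, time()) + rounded
```

The naive result lands on midnight at the start of 2017-12-18, a full day earlier than the carried result.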
|
msg308569 - (view) |
Author: Paul Ganssle (p-ganssle) * |
Date: 2017-12-18 14:59 |
> Not if the time is associated with a particular day. Imagine implementing datetime.fromisoformat by separately calling date.fromisoformat and time.fromisoformat. The date will be off by one day if you naively rounded 2017-12-18 23:59 “up” to 2017-12-18 00:00.
Yes, I suppose this is a problem if you implement it that way. Seems like a somewhat moot point, but I think any decision about rounding should probably be driven by what people are expecting more than by how it is implemented.
That said, I can see a good case for truncation *and* for rounding up with something like '2016-12-31T23:59:59.999999999'. Rounding up to '2017-01-01' certainly gives the closest whole microsecond, *but* often people writing "23:59:59.9999999" are actually trying to express "the last possible moment *before* 00:00".
|
msg308637 - (view) |
Author: Daniel Holmes (jaitaiwan) |
Date: 2017-12-19 13:10 |
I wanted to note here... I've been trying to get strptime to work with the types of dates specified in this request and came across a documentation bug here: https://docs.python.org/3.5/library/time.html#time.strptime
You can see that the examples given for the %z directive have colons in them, while the format specified is +HHMM rather than the +HH:MM that the examples allude to.
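For reference, the colon-free +HHMM form is the one `datetime.strptime` is known to parse with `%z`, as noted earlier in this thread. The sketch below shows only that form; whether a colon is also accepted depends on the Python version, so it is left out.

```python
from datetime import datetime, timedelta

# Offset written as -0700 (no colon), matching the documented +HHMM format.
parsed = datetime.strptime("2012-09-09T18:00:00-0700", "%Y-%m-%dT%H:%M:%S%z")
offset = parsed.utcoffset()
```

Note that `isoformat()` on the result writes the offset back with a colon, which is the asymmetry this issue is about.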
|
msg308851 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2017-12-21 05:33 |
New changeset 09dc2f508c8513e0466a759cc27a09108c1e55c2 by Alexander Belopolsky (Paul Ganssle) in branch 'master':
bpo-15873: Implement [date][time].fromisoformat (#4699)
https://github.com/python/cpython/commit/09dc2f508c8513e0466a759cc27a09108c1e55c2
|
msg309168 - (view) |
Author: Mathieu Dupuy (deronnax) * |
Date: 2017-12-29 10:57 |
Maybe it's worth adding an entry to the Python 3.7 "What's New"? I think it was a very long-awaited feature.
The opposite of isoformat() is a very frequent question from Python newcomers.
|
msg309175 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2017-12-29 14:50 |
Correct, a new feature should always get a what's new entry. You could submit a PR for it :)
|
msg311703 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2018-02-06 02:28 |
New changeset 22864bc8e4a076bbac748ccda6c27f1ec41b53e7 by Alexander Belopolsky (Paul Ganssle) in branch 'master':
Add What's new entry for datetime.fromisoformat (#5559)
https://github.com/python/cpython/commit/22864bc8e4a076bbac748ccda6c27f1ec41b53e7
|
msg313105 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2018-03-01 18:48 |
New changeset 0e06be836ca0d578cf9fc0c68979eb682c00f89c by Alexander Belopolsky (Miss Islington (bot)) in branch '3.7':
Add What's new entry for datetime.fromisoformat (GH-5559) (GH-5939)
https://github.com/python/cpython/commit/0e06be836ca0d578cf9fc0c68979eb682c00f89c
|