classification
Title: Enhancements/fixes to pure-python datetime module
Type: behavior Stage: resolved
Components: Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: belopolsky Nosy List: bdkearns, belopolsky, benjamin.peterson, josh.r, lemburg, python-dev, tim.peters
Priority: normal Keywords: patch

Created on 2014-03-06 18:37 by bdkearns, last changed 2014-09-29 14:56 by berker.peksag. This issue is now closed.

Files
File name Uploaded Description
datetime-py34.patch bdkearns, 2014-03-06 18:37
datetime-py33.patch bdkearns, 2014-03-06 18:37
datetime-py33-v2.patch bdkearns, 2014-03-06 20:13
datetime-py34-v2.patch bdkearns, 2014-03-06 20:13
datetime-py33-v3.patch bdkearns, 2014-03-07 01:47
datetime-py34-v3.patch bdkearns, 2014-03-07 01:47
datetime-py35.patch bdkearns, 2014-07-07 20:52
Messages (18)
msg212832 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-06 18:37
This patch brings the pure-Python datetime module more in line with the C module. We have been running these modifications in the PyPy2 stdlib for more than a year with no issues.

Includes:
- General PEP8/cleanups
- Better testing of argument types passed to constructors
- Removal of duplicate operations (in some paths values were checked twice for validity)
- Optimization of timedelta creation (brings it from 8-9 usec to ~6 usec on CPython 3.3 on my local machine)
- Enhancements/bug fixes in tests
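A timing like the one quoted for timedelta creation can be reproduced roughly as follows. This is only a sketch: the helper name `time_timedelta_creation` and the chosen arguments are assumptions, and on CPython a plain `import datetime` normally picks up the C accelerator, so the pure-Python module would have to be imported separately to measure it.

```python
import timeit


def time_timedelta_creation(number=100_000):
    """Return the average cost of one timedelta() call, in microseconds."""
    total_seconds = timeit.timeit(
        "timedelta(days=1, seconds=2, microseconds=3)",
        setup="from datetime import timedelta",
        number=number,
    )
    # timeit returns the total time for `number` runs, in seconds.
    return total_seconds / number * 1e6


if __name__ == "__main__":
    print(f"{time_timedelta_creation():.2f} usec per timedelta()")
```

Absolute numbers depend on the machine and interpreter, so only the relative change before and after a patch is meaningful.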
msg212833 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-06 18:39
Also includes bug fixes/tests for certain rounding cases (doesn't apply to the 3.4 version).
msg212840 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-06 20:14
Updated patch to v2 with another test/fix for type checking of __format__ argument to match the C module.
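The `__format__` type check mentioned above can be illustrated with a short session. This is a sketch of the behavior being matched, not the patch itself: a `str` format spec is accepted, while a non-`str` argument is expected to raise `TypeError`.

```python
from datetime import date

d = date(2014, 3, 6)

# A str format spec is passed through to strftime().
formatted = d.__format__("%Y/%m/%d")

# A non-str format spec should be rejected with TypeError.
try:
    d.__format__(123)
except TypeError:
    rejected = True
else:
    rejected = False

print(formatted, rejected)
```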
msg212846 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-03-06 22:17
_check_int_field seems needlessly complex. When you want a value that is logically an integer (not merely capable of being coerced to an integer), you want object.__index__, per PEP 357, or to avoid explicit calls to special methods, use operator.index. Any reason to not just use operator.index directly?
msg212850 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-06 22:52
The C datetime module uses the 'i' code for parsing these args, not 'n' (which would correspond to operator.index). Using operator.index fails a test case I added (cases for classes like decimal.Decimal, implementing __int__ but not __index__).
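The difference between the two approaches can be sketched as below. The helper `check_int_field` is a hypothetical stand-in written for illustration (it is not the `_check_int_field` from the patch); it assumes the 'i'-style rule described above: accept ints, reject floats, and otherwise fall back to `__int__`.

```python
import operator
from decimal import Decimal


def check_int_field(value):
    """Hypothetical helper: accept ints and __int__-only types, reject floats."""
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        raise TypeError("integer argument expected, got float")
    try:
        result = value.__int__()
    except AttributeError:
        raise TypeError(
            "an integer is required (got type %s)" % type(value).__name__
        ) from None
    if not isinstance(result, int):
        raise TypeError(
            "__int__ returned non-int (type %s)" % type(result).__name__
        )
    return result


# Decimal defines __int__ but not __index__, so the two approaches disagree.
accepted = check_int_field(Decimal("7"))

try:
    operator.index(Decimal("7"))
except TypeError:
    index_rejects_decimal = True
else:
    index_rejects_decimal = False

print(accepted, index_rejects_decimal)
```

With `operator.index` alone, the `Decimal` case fails, which is why the added test case does not pass with that approach.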
msg212852 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-03-06 23:26
That's actually an argument to fix the C datetime implementation. Right now, you get:

    >>> from decimal import Decimal as d
    >>> from datetime import datetime
    >>> datetime(d("2000.5"), 1, 2)
    datetime.datetime(2000, 1, 2, 0, 0)

This is wildly inconsistent; if you passed 2000.0, it would raise an exception because floats (even those exactly equal to an int value) are forbidden. But the logically equivalent Decimal type will work just fine, silently truncating. Basically any user-defined type with integer coercion (but not integer equivalence) would have the same problem; str doesn't, because str is special-cased (it doesn't actually have __int__), but any user-defined str-like class that defined int coercion would work as a datetime arg in a way str does not.
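
The inconsistency can be checked with a small probe. This is a sketch with a hypothetical helper `try_year`; the float case is rejected with `TypeError`, while the outcome of the Decimal case depends on the interpreter version (the session above shows it being accepted and truncated, and later Python versions may reject it), so the code only reports what happens.

```python
from datetime import datetime
from decimal import Decimal


def try_year(year):
    """Return the constructed datetime, or the TypeError raised for `year`."""
    try:
        return datetime(year, 1, 2)
    except TypeError as exc:
        return exc


float_result = try_year(2000.0)                # floats are rejected
decimal_result = try_year(Decimal("2000.5"))   # version-dependent outcome

print(repr(float_result))
print(repr(decimal_result))
```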

You've just given me an excuse to open my first bug. Thanks! :-)
msg212854 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-06 23:51
Right, that's the behavior as it stands, so I hope this patch can be considered independently of that issue (and if such a change is made to the C implementation, then a corresponding change could be made in the python implementation).
msg212855 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-03-07 00:05
Oh, definitely. No reason to delay this just because I have my knickers in a twist on a tangential matter.
msg212856 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-03-07 00:09
I would like to hear from PyPy developers before we decide what to do with this effort. The pure-Python implementation is not used by CPython, but I am afraid that people who actually use it will not appreciate the code churn.
msg212857 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-03-07 00:11
Oh - I did not realize that this originated in PyPy.
msg212858 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-03-07 00:12
Yes, I am the PyPy developer who worked on these datetime improvements there -- just finally got around to pushing them upstream.
msg221907 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-06-29 21:59
Brian,

I would like to apply your changes for 3.5.  Do you have any updates?
msg222515 - (view) Author: Brian Kearns (bdkearns) * Date: 2014-07-07 20:52
Updated patch, now it also caches the result of __hash__ like the C accelerator.
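Hash caching of this kind can be sketched on a small immutable value type. The class `Duration` and its attribute names are hypothetical and only illustrate the technique: the hash is computed on first use and stored, with -1 as the "not computed yet" marker.

```python
class Duration:
    """Hypothetical immutable value type that caches its hash."""

    __slots__ = ("_seconds", "_microseconds", "_hashcode")

    def __init__(self, seconds=0, microseconds=0):
        self._seconds = seconds
        self._microseconds = microseconds
        self._hashcode = -1  # sentinel: hash not computed yet

    def _getstate(self):
        return (self._seconds, self._microseconds)

    def __eq__(self, other):
        if isinstance(other, Duration):
            return self._getstate() == other._getstate()
        return NotImplemented

    def __hash__(self):
        # Compute the hash once, then reuse the stored value.
        if self._hashcode == -1:
            self._hashcode = hash(self._getstate())
        return self._hashcode


a = Duration(1, 2)
b = Duration(1, 2)
print(hash(a) == hash(b), len({a, b}))
```

In CPython, `hash()` never returns -1, so the sentinel cannot collide with a real hash value; even if it did, the only cost would be recomputing the hash. Caching is safe here only because the state is never mutated after construction.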
msg222942 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-13 15:35
Brian,

Could you please update the summary of your changes from your first post?  For example, you did not mention caching of the timedelta hashes.

This particular change seems to call for a discussion.

Do we cache timedelta hashes in the C implementation?  What is the general wisdom on this technique?  AFAICR, such hashing is done in integer objects, but I vaguely remember an old discussion on whether the same should be done for tuples.  Can you remind me what was the outcome for tuples?
msg222944 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-13 15:41
> Updated patch, now it also caches the result of __hash__ like the C accelerator.

I should read your notes!  Sorry, Brian.  You already answered my questions.  Too bad that the latest notes are so far from the entry area.

Still, it would be helpful if you could provide a self-contained description that I can copy to the NEWS file.  (Don't put it in the patch  - NEWS file gets out of date very quickly - put it in a tracker comment.)

Also, with your patch, are we in sync with PyPy?  If so, with what version?
msg222946 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-13 15:46
[Josh Rosenberg]
> You've just given me an excuse to open my first bug.

Did you open an issue for that?  (Use "n" code in date/datetime constructors.)
msg222947 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-13 15:53
Here is the tuple hash caching thread that I mentioned above:

https://mail.python.org/pipermail/python-dev/2003-August/037416.html

Since the C code uses caching already, I don't think we need to discuss it any further.  And the thread on tuples does not give any good reason not to cache anyway.
msg227780 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-09-28 23:12
New changeset 5313b4c0bb6c by Alexander Belopolsky in branch 'default':
Closes issue #20858: Enhancements/fixes to pure-python datetime module
https://hg.python.org/cpython/rev/5313b4c0bb6c
History
Date User Action Args
2014-09-29 14:56:35 berker.peksag set stage: commit review -> resolved
2014-09-29 01:58:29 benjamin.peterson set status: open -> closed; resolution: fixed
2014-09-28 23:12:07 python-dev set nosy: + python-dev; messages: + msg227780
2014-07-13 15:53:56 belopolsky set messages: + msg222947
2014-07-13 15:46:55 belopolsky set messages: + msg222946
2014-07-13 15:41:05 belopolsky set messages: + msg222944
2014-07-13 15:35:33 belopolsky set messages: + msg222942
2014-07-07 20:53:02 bdkearns set files: + datetime-py35.patch; messages: + msg222515
2014-06-29 21:59:50 belopolsky set stage: patch review -> commit review; messages: + msg221907; versions: + Python 3.5, - Python 3.3, Python 3.4
2014-03-07 01:47:58 bdkearns set files: + datetime-py34-v3.patch
2014-03-07 01:47:46 bdkearns set files: + datetime-py33-v3.patch
2014-03-07 00:12:55 bdkearns set messages: + msg212858
2014-03-07 00:11:06 belopolsky set messages: + msg212857
2014-03-07 00:09:31 belopolsky set nosy: + benjamin.peterson; messages: + msg212856
2014-03-07 00:05:14 josh.r set messages: + msg212855
2014-03-06 23:51:10 bdkearns set messages: + msg212854
2014-03-06 23:26:26 josh.r set messages: + msg212852
2014-03-06 22:52:10 bdkearns set messages: + msg212850
2014-03-06 22:17:29 josh.r set nosy: + josh.r; messages: + msg212846
2014-03-06 21:44:09 belopolsky set assignee: belopolsky
2014-03-06 21:15:10 ned.deily set nosy: + lemburg, tim.peters, belopolsky; stage: patch review; versions: + Python 3.3, Python 3.4
2014-03-06 20:14:57 bdkearns set messages: + msg212840
2014-03-06 20:13:52 bdkearns set files: + datetime-py34-v2.patch
2014-03-06 20:13:39 bdkearns set files: + datetime-py33-v2.patch
2014-03-06 18:39:16 bdkearns set messages: + msg212833
2014-03-06 18:37:58 bdkearns set files: + datetime-py33.patch
2014-03-06 18:37:38 bdkearns create