Author p-ganssle
Recipients belopolsky, berker.peksag, bmispelon, corona10, p-ganssle, rhettinger, serhiy.storchaka, taleinat, tim.peters, vstinner
Date 2019-09-08.11:37:44
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1567942664.84.0.290669117642.issue24416@roundup.psfhosted.org>
In-reply-to
Content
I haven't had time to try this with an optimized build, I have done a few more benchmarks using a standard build, I'm seeing almost a 50% regression on isocalendar() calls, but the picture is a little rosier if you consider the fact that you need to construct a `date` or `datetime` object in the first place and anyone concerned with performance will likely be making exactly one `isocalendar()` call per datetime object. The common case is that they construct the datetime with a datetime literal or something equivalent. The "worst case scenario" is that they construct a single "seed" datetime and then construct many new datetimes with faster operations like adding a timedelta.

I have compared the following cases:

call only: -s 'import datetime; dt = datetime.datetime(2019, 1, 1)' 'dt.isocalendar()'
constructor + call: -s 'import datetime' 'dt = datetime.datetime(2019, 1, 1); dt.isocalendar()'
timedelta + call: -s 'import datetime; dt = datetime.datetime(2019, 1, 1); td = timedelta(days=1)' '(dt + td).isocalendar()'


The results are summarized below, the most likely real practical impact on performance-sensitive "hot loops" would be a 29% performance regression *if* isocalendar() is the bottleneck:


       benchmark                    |  master (ns)  |  PR 15633 (ns)  |  Δ (%)
------------------------------------+---------------+-----------------+----------
call only (datetime)                |    349 (±14)  |     511 (±22)   |   46
constructor + call (datetime)       |    989 (±48)  |    1130 (±50)   |   14
timedelta + call (datetime)         |    550 (±14)  |     711 (±22)   |   29


The numbers for `datetime.date` are basically similar:

       benchmark                    |  master (ns)  |  PR 15633 (ns)  |  Δ (%)
------------------------------------+---------------+-----------------+----------
call only (date)                    |    360 (±18)  |     530 (±41)   |   47
constructor + call (date)           |    824 (±17)  |     975 (±29)   |   18
timedelta + call (datetime)         |    534 (±20)  |     685 (±24)   |   28


This is a bit disheartening, but I think the fact that these performance sensitive situations are rare and the "grab a single component" use is overwhelmingly the common case, I think it's worth it in the end.

If there are significant complaints about the performance regression, we may be able to create a variant method similar to the way that `chain` has `chain.from_iterables`, something like `dt.isocalendar.as_tuple()` for the performance-sensitive folks. That said, that's very YAGNI, we should not do that unless someone comes forward with a real use case that will be adversely affected here.
History
Date User Action Args
2019-09-08 11:37:44p-gansslesetrecipients: + p-ganssle, tim.peters, rhettinger, belopolsky, vstinner, taleinat, berker.peksag, serhiy.storchaka, bmispelon, corona10
2019-09-08 11:37:44p-gansslesetmessageid: <1567942664.84.0.290669117642.issue24416@roundup.psfhosted.org>
2019-09-08 11:37:44p-gansslelinkissue24416 messages
2019-09-08 11:37:44p-gansslecreate