classification
Title: Datetime NoneType after calling Py_Finalize and Py_Initialize
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 3.7, Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: belopolsky
Nosy List: Chi Hsuan Yen, Denny Weinberg, Jim.Jewett, Nathan Jensen, Roman.Evstifeev, ammar2, belopolsky, christian.heimes, cschramm, josh.r, ncoghlan, palm.kevin, steve.dower
Priority: normal
Keywords: patch

Created on 2016-06-27 13:49 by Denny Weinberg, last changed 2017-03-15 20:50 by Nathan Jensen.

Files
File name          Uploaded
issue27400.patch   belopolsky, 2016-09-15 22:06
27400.patch        cschramm, 2017-02-01 12:11
Messages (15)
msg269379 - Author: Denny Weinberg (Denny Weinberg) Date: 2016-06-27 13:49
After calling Py_Finalize and Py_Initialize I get the message "attribute of type 'NoneType' is not callable" on the datetime.strptime method.

Example:
from datetime import datetime
s = '20160505 160000'
refdatim = datetime.strptime(s, '%Y%m%d %H%M%S')

The first call works fine, but the same call fails after the reinitialization.

Workaround:
from datetime import datetime
s = '20160505 160000'
try:
    refdatim = datetime.strptime(s, '%Y%m%d %H%M%S')
except TypeError:
    import time
    refdatim = datetime.fromtimestamp(time.mktime(time.strptime(s, '%Y%m%d %H%M%S')))
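
The workaround above can also be wrapped in a helper so that call sites stay unchanged. This is only a minimal sketch: the name parse_datetime is hypothetical, and the fallback builds the datetime directly from the fields returned by time.strptime instead of going through time.mktime, which avoids a round trip through local time.

```python
import time
from datetime import datetime


def parse_datetime(s, fmt):
    """Parse s with datetime.strptime, falling back to time.strptime.

    parse_datetime is a hypothetical helper name, not part of the stdlib.
    """
    try:
        return datetime.strptime(s, fmt)
    except TypeError:
        # Raised after interpreter reinitialization, when the cached
        # _strptime module is stale (the bug reported in this issue).
        # The first six struct_time fields are year..second.
        return datetime(*time.strptime(s, fmt)[:6])


refdatim = parse_datetime('20160505 160000', '%Y%m%d %H%M%S')
print(refdatim)
```

On an unaffected interpreter the fallback branch is never taken, so the helper behaves exactly like datetime.strptime.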

Related Issue: Issue17408 ("second python execution fails when embedding")
msg269381 - Author: Denny Weinberg (Denny Weinberg) Date: 2016-06-27 14:10
Just to be clear:

The error happens after these steps:

1. Call strptime
2. Call the CPython functions "Py_Finalize" and "Py_Initialize"
3. Call strptime again

Now we get the error "attribute of type 'NoneType' is not callable"
msg269751 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2016-07-03 04:31
Thanks for the report, Denny. Looking at https://hg.python.org/cpython/file/30099abdb3a4/Modules/datetimemodule.c#l3929, there's a problematic caching of the "_strptime" module that is almost certainly the cause of the problem: it will attempt to call _strptime._strptime from the already finalized interpreter rather than the new one.

It should be possible to adjust that logic to permit a check for _strptime._strptime being set to None, and reimporting _strptime in that case.
msg269914 - Author: Ammar Askar (ammar2) Date: 2016-07-07 00:18
Is there any particular reason that datetime.strptime caches the imported module like that? 

From a quick search, these two other examples don't bother with any caching: 

https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Modules/timemodule.c#L709

https://github.com/python/cpython/blob/64fe35c9fee088f7fec4dd2d760cb0026ac54ec8/Python/traceback.c#L277
msg269916 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2016-07-07 02:27
Aye, skipping the caching entirely would be an even simpler solution - the only thing it is saving in the typical case is a dictionary lookup in the modules cache.
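
For illustration, the Python-level equivalent of skipping the cache is to ask the import system on every call and let the modules cache (sys.modules) answer repeated requests. The function name get_strptime_module is hypothetical and exists only for this sketch.

```python
import importlib
import sys


def get_strptime_module():
    """Look the module up on every call instead of caching it in a global."""
    return importlib.import_module("_strptime")


first = get_strptime_module()
second = get_strptime_module()

# Both calls return the single module object held in sys.modules.
same_object = first is second and sys.modules["_strptime"] is first
```

Because nothing is stored outside the interpreter's own module table, there is no reference left to go stale when the interpreter is torn down and recreated.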
msg276598 - Author: Josh Rosenberg (josh.r) Date: 2016-09-15 19:16
Nick: It looks like quite a bit more work than just a dict lookup. The PyImport_ImportModuleNoBlock call devolves to a PyImport_Import call ( https://hg.python.org/cpython/file/3.5/Python/import.c#l1743 ), which unconditionally does basically everything involved in the import process aside from actually reading and parsing the file, presumably because of how weird __import__ overrides can be. (The NoBlock call itself seems odd: its implementation just wraps the blocking function. I guess we don't allow non-blocking imports anymore and the name is kept only to avoid changing all the callers elsewhere?)

While it's not a perfect comparison, compare:

>>> import _strptime  # It's now cached

# Cache globals dict for fair comparison without globals() call overhead
>>> g = globals()     

# Reimport (this might be *more* expensive at C layer, see notes below)
>>> %timeit -r5 import _strptime
1000000 loops, best of 5: 351 ns per loop

# Dict lookup (should be at least a bit cheaper at C layer if done equivalently, using GetAttrId to avoid temporary str)
>>> %timeit -r5 g['_strptime']
10000000 loops, best of 5: 33.1 ns per loop

# Cached reference (should be *much* cheaper at C layer)
>>> %timeit -r5 _strptime
100000000 loops, best of 5: 19.1 ns per loop
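
The same comparison can be approximated outside IPython with the standard timeit module. This is a rough sketch: best_ns is a hypothetical helper, the dict g is a stand-in for the cached globals dict, and absolute numbers will differ by machine and Python version.

```python
import timeit

import _strptime  # make sure the module is already in sys.modules

# Stand-in for the cached globals dict used in the session above.
g = {"_strptime": _strptime}
namespace = {"g": g, "_strptime": _strptime}


def best_ns(stmt, number=100_000, repeat=5):
    """Return the best per-loop time in nanoseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat)
    return min(timings) / number * 1e9


results = {
    "reimport": best_ns("import _strptime"),
    "dict lookup": best_ns("g['_strptime']"),
    "cached reference": best_ns("_strptime"),
}
for label, ns in results.items():
    print(f"{label}: {ns:.1f} ns per loop")
```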

Note: I'm a little unclear on whether a Python function implemented in C has its own globals or whether they are simulated as part of the C module initialization. If it lacks globals, then the work done for PyImport_Import looks like it roughly doubles (it has to do all sorts of work to simulate globals and the like), so that 351 ns per re-import might actually be costlier in C.

Either way, reimporting costs more than 10x as much as a dict lookup and about 18x as much as using a cached reference (and, as I said, I think the cheaper options would cost much less in C, so the real multipliers are higher). Admittedly, in tests, empty-string calls to `_strptime._strptime` take around 7.4 microseconds (with realistic calls taking 8.5-13.5 microseconds), so caching saves maybe a third of a microsecond of overhead, or roughly 2.5%-4.5% of the work involved in the strptime call.
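
The ratios and percentages quoted above follow directly from the measured timings. A short worked check, using only the numbers from the session (variable names are illustrative):

```python
# Timings quoted in the benchmark session, in nanoseconds.
reimport_ns = 351.0
dict_lookup_ns = 33.1
cached_ref_ns = 19.1

# Cost of a reimport relative to the cheaper options.
ratio_vs_dict = reimport_ns / dict_lookup_ns
ratio_vs_cached = reimport_ns / cached_ref_ns

# Overhead saved by caching, in microseconds.
saved_us = (reimport_ns - cached_ref_ns) / 1000.0

# Share of a strptime call (7.4 us for an empty string, up to 13.5 us).
share_of_fast_call = saved_us / 7.4 * 100
share_of_slow_call = saved_us / 13.5 * 100

print(f"{ratio_vs_dict:.1f}x vs dict lookup, {ratio_vs_cached:.1f}x vs cached reference")
print(f"saved {saved_us:.2f} us: {share_of_slow_call:.1f}%-{share_of_fast_call:.1f}% of a call")
```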
msg276604 - Author: Josh Rosenberg (josh.r) Date: 2016-09-15 19:28
Hmm... On checking down some of the code paths and realizing there were some issues in 3.5 (redundant code, and what looked like two memory leaks), I checked tip (to avoid opening bugs on stale code), and discovered that #22557 rewrote the import code, reducing the cost of top level reimport by ~60%, so my microbenchmarks (run on Python 3.5.0) are already out of date for 3.6's faster re-import. Even so, caching wasn't a wholly unreasonable optimization before now, and undoing it now still has a cost, if a smaller one.
msg276622 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2016-09-15 21:34
I am not sure this is possible to fix without refactoring the datetime module according to PEP 3121. See #15390.
msg276626 - Author: Christian Heimes (christian.heimes) (Python committer) Date: 2016-09-15 21:46
PEP 3121 is a big change. Can we use PyModuleDef->m_clear() for a clever hack?
msg276630 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2016-09-15 22:06
Yes, I think something like the attached patch may do the trick.
msg276632 - Author: Christian Heimes (christian.heimes) (Python committer) Date: 2016-09-15 22:26
Wouldn't it clear strptime_module when a subinterpreter shuts down, too? It's not a big deal because it can't cause a crash.
msg277184 - Author: Jim Jewett (Jim.Jewett) Date: 2016-09-21 21:29
Having to (re-)fill the cache once per interpreter seems like a reasonable price to pay.

Why is 3.5 not included?  Did this not cause problems before the import change, or is it just that this bug is small enough that maybe it isn't worth backporting?
msg285560 - Author: Denny Weinberg (Denny Weinberg) Date: 2017-01-16 12:30
Any news here? 3.6.0 is also affected by this bug.
msg286618 - Author: Christopher Schramm (cschramm) Date: 2017-02-01 12:11
This issue should have a much higher priority, as it basically breaks Python embedding unless the user either does not re-initialize the interpreter or avoids the use of _datetime.strptime.

We're currently testing with a patch based on Christian and Alexander's idea, but using m_free instead: using m_clear and Py_CLEAR neither made sense to me nor worked in conjunction with Py_Finalize when we tested it. A version matching the current tip is attached. We have not run into any issues so far (the only one I can think of is clearing the static variable too often and thus causing some extra imports).
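
The idea of resetting the cached module from a module teardown hook can be modeled in plain Python. This is a rough sketch only: the attached patches are C code using PyModuleDef slots, whereas ToyDatetimeModule, its free() method, the parse helper and the import counter below are hypothetical stand-ins.

```python
import types


class ToyDatetimeModule:
    """Hypothetical model of a module with a cached helper and a free hook."""

    _strptime_module = None   # plays the role of the C static variable
    import_count = 0          # how many times the helper was (re)imported

    @classmethod
    def _import_helper(cls):
        cls.import_count += 1
        helper = types.ModuleType("_strptime")
        helper.parse = lambda s: s.split()
        return helper

    @classmethod
    def strptime(cls, s):
        if cls._strptime_module is None:
            cls._strptime_module = cls._import_helper()
        return cls._strptime_module.parse(s)

    @classmethod
    def free(cls):
        # Teardown hook: drop the cache so the next lifetime re-imports.
        cls._strptime_module = None


ToyDatetimeModule.strptime("20160505 160000")
ToyDatetimeModule.strptime("20160505 160000")   # served from the cache
ToyDatetimeModule.free()                        # interpreter goes away
ToyDatetimeModule.free()                        # clearing again is harmless
parsed = ToyDatetimeModule.strptime("20160505 160000")
```

In this model, clearing the cache more often than strictly necessary never produces a stale reference; it only costs an additional import on the next call.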
msg286636 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2017-02-01 14:19
I've added Steve Dower to the nosy list, as he's done some work recently on making the Windows builds more embedding-friendly, and I believe at least some of that time may have been funded work.

Unfortunately, we don't currently have anyone I'm aware of that's being specifically paid to improve or maintain CPython's embedding support in general, so getting attention for these kinds of interpreter reinitialization bugs can be a bit hit-or-miss.
History
Date                 User              Action  Args
2017-03-15 20:50:37  Nathan Jensen     set     nosy: + Nathan Jensen
2017-02-01 14:19:30  ncoghlan          set     nosy: + steve.dower; messages: + msg286636
2017-02-01 12:11:01  cschramm          set     files: + 27400.patch; messages: + msg286618
2017-01-31 16:12:40  cschramm          set     nosy: + cschramm
2017-01-16 12:30:08  Denny Weinberg    set     messages: + msg285560
2016-12-15 10:37:09  Chi Hsuan Yen     set     nosy: + Chi Hsuan Yen
2016-09-21 21:29:42  Jim.Jewett        set     nosy: + Jim.Jewett; messages: + msg277184
2016-09-15 22:26:24  christian.heimes  set     messages: + msg276632
2016-09-15 22:06:09  belopolsky        set     files: + issue27400.patch; versions: + Python 3.6, Python 3.7, - Python 3.5; messages: + msg276630; assignee: belopolsky; keywords: + patch
2016-09-15 21:46:01  christian.heimes  set     nosy: + christian.heimes; messages: + msg276626
2016-09-15 21:34:48  belopolsky        set     messages: + msg276622
2016-09-15 19:28:36  josh.r            set     messages: + msg276604
2016-09-15 19:16:43  josh.r            set     nosy: + josh.r; messages: + msg276598
2016-09-15 11:13:59  Roman.Evstifeev   set     nosy: + Roman.Evstifeev
2016-07-07 02:27:52  ncoghlan          set     messages: + msg269916
2016-07-07 00:18:32  ammar2            set     nosy: + ammar2; messages: + msg269914
2016-07-03 04:31:30  ncoghlan          set     nosy: + belopolsky, ncoghlan; messages: + msg269751
2016-06-27 14:10:52  Denny Weinberg    set     messages: + msg269381
2016-06-27 13:49:18  Denny Weinberg    create