This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PEP 3121, 384 refactoring applied to hashopenssl module
Type: resource usage Stage: resolved
Components: Extension Modules Versions: Python 3.4
process
Status: closed Resolution: duplicate
Dependencies: Superseder: Py_Finalize() doesn't clear all Python objects at exit
View: 1635741
Assigned To: Nosy List: Robin.Schreiber, asvetlov, gregory.p.smith, loewis, pitrou, vstinner
Priority: normal Keywords: pep3121

Created on 2012-08-14 18:42 by Robin.Schreiber, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
_hashopenssl_pep3121-384_v0.patch Robin.Schreiber, 2012-08-14 18:42
Messages (13)
msg168220 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-08-14 18:42
Changes proposed in PEP3121 and PEP384 have now been applied to the hashopenssl module!
msg168224 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 18:59
+    m = PyState_FindModule(&_hashlibmodule);
+    if(!m){
+        m = PyModule_Create(&_hashlibmodule);
+        if (m == NULL)
+            return NULL;
+    } else {
+        Py_INCREF(m);
+        return m;
+    }

Why is this dance needed?

+    if((void *)type->tp_dealloc == (void *)EVP_dealloc) {
+        Py_DECREF(type);
+    }

Why?
msg168225 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 19:01
Something else:

+#define _hashlibstate(o) ((_hashlibstate *)PyModule_GetState(o))

It is really bad style to #define a symbol that shadows another symbol.
Since it's a #define, I would expect to be named something like STATE(o).
msg168234 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-08-14 19:43
Regarding the macro definition, I would be fine with changing it to _hashlib_state.

The dance you have found inside the Init, makes shure that the very same module is returned if Init is called twice or multiple times, before the Module is unloaded. A month back, when I created this patch, I had statements such as test.import.import_fresh_module(...) call the Init-method multiple times, before a module was unloaded. This was apparently a bug, as I can no longer reproduce this behavior, but at that time I thought it was the expected behavior :-)

The last code snipped verifies, that we only dereference the type if the dealloc function is not being called from inside the subtype_dealloc function. This is necessary because the subtype_dealloc function itself contains a decref of the respective type object. Without this check, we would then end up decrefing the type too many times.
msg168235 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 19:49
> The last code snipped verifies, that we only dereference the type if
> the dealloc function is not being called from inside the
> subtype_dealloc function. This is necessary because the
> subtype_dealloc function itself contains a decref of the respective
> type object. Without this check, we would then end up decrefing the
> type too many times.

I still don't understand why it is required. You shouldn't have to
decref the type at all. Otherwise, it is a bug somewhere in Python
(typeobject.c perhaps).
msg168237 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-08-14 19:55
Well, as I have changed the static type to a HeapType (as I am now using the stable ABI from PEP 384 for this type), we have to start perfoming proper reference counting with this object. This includes increfing the type in case a new object of that type is created, and decrefing if such an object is being deallocated.
As of now, I do not know of HeapTypes being excluded from refcounting.
msg168239 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 20:04
Le mardi 14 août 2012 à 19:55 +0000, Robin Schreiber a écrit :
> As of now, I do not know of HeapTypes being excluded from refcounting.

No, but see http://bugs.python.org/issue15142
It's not obvious to me that the type should be explicitly decref'ed from the tp_dealloc function. This will make things more complicated for people willing to migrate their extensions to the stable ABI.

Also, all this is not documented at all :-(
msg168240 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-08-14 20:16
As subtype_dealloc decref'ed the HeapType I thought the dealloc method was the most appropriate place to decrement the refcount of the type.
However you still agree that these types need to be recounted properly, don't you? In that case, where would you place the decref of the type?
I have talked with Martin v. Löwis about this issue, and I think he was quite comfortable with placing the decref inside the dealloc method.
msg168241 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 20:21
Le mardi 14 août 2012 à 20:16 +0000, Robin Schreiber a écrit :
> However you still agree that these types need to be recounted
> properly, don't you?

Yes, of course.

> In that case, where would you place the decref of the type?

That's a good question. Perhaps the DECREF mechanism / typeobject.c
should do it. Or perhaps PyType_FromSpec should place a stub which would
call the actual dealloc.
msg168449 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-08-17 17:30
Antoine: Py_DECREF calls tp_dealloc directly, so the type needs to be DECREFed in the course of tp_dealloc. I don't think there is any alternative to that.

One may wonder why regular extension types don't do that: this is because of a hack that excludes static (non-heap) types from being reference counted in their instances. Heap types do refcount their types, consequently, subtype_dealloc also DECREFs the type.

I certainly agree that this is muddy, in particular when it comes to subtyping where the derived subtype calls the base tp_dealloc. In an ideal world, object_dealloc would decref the type, and subtypes would be required to call the base type's dealloc. However, I feel that this cannot be changed before Python 4.

So I'd propose that it is actually the leaf subtype which decrefs ob_type. The check whether you are the leaf type is then done by checking whether tp_dealloc is the one you are "in" right now.
msg168527 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-18 20:02
> So I'd propose that it is actually the leaf subtype which decrefs
> ob_type. The check whether you are the leaf type is then done by
> checking whether tp_dealloc is the one you are "in" right now.

By leaf, you mean the derived subtype? That sounds a bit quirky (you
need the aforementioned "if"), how about the base heap type instead?

Speaking of which, what does this refactoring bring exactly? The type
declarations are slightly less verbose, but other than that, is there a
point? (using the stable ABI doesn't seem to provide any benefit for
standard extension modules; besides, using PyType_FromSpec is only a
small part of complying with the stable ABI)
msg168538 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-08-18 23:00
> By leaf, you mean the derived subtype? That sounds a bit quirky (you
> need the aforementioned "if"), how about the base heap type instead?

I think this fails (currently) because a subtype defined in Python using
a "class" statement will automatically get subtype_dealloc as its dealloc
function, which will in turn unconditionally decrefs the type (after
calling the tp_dealloc function from its nearest ancestor which doesn't
use subtype_dealloc).

> Speaking of which, what does this refactoring bring exactly? The type
> declarations are slightly less verbose, but other than that, is there a
> point?

It's really about the PEP 3121 changes: modules shouldn't have any
global variables shared across interpreters, so that module cleanup
can work properly. Ultimately, this can lead to gc-based shutdown,
module unloading, and better separation of interpreters.

In addition, it further reduces the differences between "extension
types" and "classes" (which supposedly were removed by dropping
old-style classes, yet some inconsistencies remain where the
interpreter offers some features only to heaptypes).
msg381430 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-19 14:14
commit 46f59ebd01e22cc6a56fd0691217318c1d801a49
Author: Christian Heimes <christian@python.org>
Date:   Wed Nov 18 16:12:13 2020 +0100

    bpo-1635741: Port _hashlib to multiphase initialization (GH-23358)
    
    Signed-off-by: Christian Heimes <christian@python.org>


See bpo-41111 "Convert a few stdlib extensions to the limited C API (PEP 384)".
History
Date User Action Args
2022-04-11 14:57:34adminsetgithub: 59858
2020-11-19 14:14:40vstinnersetstatus: open -> closed

superseder: Py_Finalize() doesn't clear all Python objects at exit

nosy: + vstinner
messages: + msg381430
resolution: duplicate
stage: patch review -> resolved
2012-11-08 13:28:05Robin.Schreibersetkeywords: + pep3121, - patch
2012-08-27 03:41:57belopolskylinkissue15787 dependencies
2012-08-18 23:13:32gsteinsetnosy: - gstein
2012-08-18 23:00:23loewissetmessages: + msg168538
2012-08-18 20:02:49pitrousetmessages: + msg168527
2012-08-17 17:30:29loewissetmessages: + msg168449
2012-08-17 16:37:05asvetlovsetnosy: + asvetlov
2012-08-14 20:21:34pitrousetmessages: + msg168241
2012-08-14 20:16:13Robin.Schreibersetmessages: + msg168240
2012-08-14 20:04:19pitrousetmessages: + msg168239
2012-08-14 19:55:54Robin.Schreibersetmessages: + msg168237
2012-08-14 19:49:27pitrousetmessages: + msg168235
2012-08-14 19:43:58Robin.Schreibersetmessages: + msg168234
2012-08-14 19:01:21pitrousetmessages: + msg168225
2012-08-14 18:59:41pitrousetstage: patch review
2012-08-14 18:59:26pitrousetnosy: + gregory.p.smith, loewis, pitrou
messages: + msg168224
2012-08-14 18:42:13Robin.Schreibercreate