Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 3121, 384 refactoring applied to hashopenssl module #59858

Closed
RobinSchreiber mannequin opened this issue Aug 14, 2012 · 13 comments
Closed

PEP 3121, 384 refactoring applied to hashopenssl module #59858

RobinSchreiber mannequin opened this issue Aug 14, 2012 · 13 comments
Labels
extension-modules C modules in the Modules dir performance Performance or resource usage

Comments

@RobinSchreiber
Copy link
Mannequin

RobinSchreiber mannequin commented Aug 14, 2012

BPO 15653
Nosy @loewis, @gpshead, @pitrou, @vstinner, @asvetlov
Superseder
  • bpo-1635741: Py_Finalize() doesn't clear all Python objects at exit
  • Files
  • _hashopenssl_pep3121-384_v0.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-11-19.14:14:40.589>
    created_at = <Date 2012-08-14.18:42:13.261>
    labels = ['extension-modules', 'performance']
    title = 'PEP 3121, 384 refactoring applied to hashopenssl module'
    updated_at = <Date 2020-11-19.14:14:40.588>
    user = 'https://bugs.python.org/RobinSchreiber'

    bugs.python.org fields:

    activity = <Date 2020-11-19.14:14:40.588>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-11-19.14:14:40.589>
    closer = 'vstinner'
    components = ['Extension Modules']
    creation = <Date 2012-08-14.18:42:13.261>
    creator = 'Robin.Schreiber'
    dependencies = []
    files = ['26807']
    hgrepos = []
    issue_num = 15653
    keywords = ['pep3121']
    message_count = 13.0
    messages = ['168220', '168224', '168225', '168234', '168235', '168237', '168239', '168240', '168241', '168449', '168527', '168538', '381430']
    nosy_count = 6.0
    nosy_names = ['loewis', 'gregory.p.smith', 'pitrou', 'vstinner', 'asvetlov', 'Robin.Schreiber']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '1635741'
    type = 'resource usage'
    url = 'https://bugs.python.org/issue15653'
    versions = ['Python 3.4']

    @RobinSchreiber
    Copy link
    Mannequin Author

    RobinSchreiber mannequin commented Aug 14, 2012

    Changes proposed in PEP-3121 and PEP-384 have now been applied to the hashopenssl module!

    @RobinSchreiber RobinSchreiber mannequin added extension-modules C modules in the Modules dir performance Performance or resource usage labels Aug 14, 2012
    @pitrou
    Copy link
    Member

    pitrou commented Aug 14, 2012

    + m = PyState_FindModule(&_hashlibmodule);
    + if(!m){
    + m = PyModule_Create(&_hashlibmodule);
    + if (m == NULL)
    + return NULL;
    + } else {
    + Py_INCREF(m);
    + return m;
    + }

    Why is this dance needed?

    + if((void *)type->tp_dealloc == (void *)EVP_dealloc) {
    + Py_DECREF(type);
    + }

    Why?

    @pitrou
    Copy link
    Member

    pitrou commented Aug 14, 2012

    Something else:

    +#define _hashlibstate(o) ((_hashlibstate *)PyModule_GetState(o))

    It is really bad style to #define a symbol that shadows another symbol.
    Since it's a #define, I would expect to be named something like STATE(o).

    @RobinSchreiber
    Copy link
    Mannequin Author

    RobinSchreiber mannequin commented Aug 14, 2012

    Regarding the macro definition, I would be fine with changing it to _hashlib_state.

    The dance you have found inside the Init, makes shure that the very same module is returned if Init is called twice or multiple times, before the Module is unloaded. A month back, when I created this patch, I had statements such as test.import.import_fresh_module(...) call the Init-method multiple times, before a module was unloaded. This was apparently a bug, as I can no longer reproduce this behavior, but at that time I thought it was the expected behavior :-)

    The last code snipped verifies, that we only dereference the type if the dealloc function is not being called from inside the subtype_dealloc function. This is necessary because the subtype_dealloc function itself contains a decref of the respective type object. Without this check, we would then end up decrefing the type too many times.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 14, 2012

    The last code snipped verifies, that we only dereference the type if
    the dealloc function is not being called from inside the
    subtype_dealloc function. This is necessary because the
    subtype_dealloc function itself contains a decref of the respective
    type object. Without this check, we would then end up decrefing the
    type too many times.

    I still don't understand why it is required. You shouldn't have to
    decref the type at all. Otherwise, it is a bug somewhere in Python
    (typeobject.c perhaps).

    @RobinSchreiber
    Copy link
    Mannequin Author

    RobinSchreiber mannequin commented Aug 14, 2012

    Well, as I have changed the static type to a HeapType (as I am now using the stable ABI from PEP-384 for this type), we have to start perfoming proper reference counting with this object. This includes increfing the type in case a new object of that type is created, and decrefing if such an object is being deallocated.
    As of now, I do not know of HeapTypes being excluded from refcounting.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 14, 2012

    Le mardi 14 août 2012 à 19:55 +0000, Robin Schreiber a écrit :

    As of now, I do not know of HeapTypes being excluded from refcounting.

    No, but see http://bugs.python.org/issue15142
    It's not obvious to me that the type should be explicitly decref'ed from the tp_dealloc function. This will make things more complicated for people willing to migrate their extensions to the stable ABI.

    Also, all this is not documented at all :-(

    @RobinSchreiber
    Copy link
    Mannequin Author

    RobinSchreiber mannequin commented Aug 14, 2012

    As subtype_dealloc decref'ed the HeapType I thought the dealloc method was the most appropriate place to decrement the refcount of the type.
    However you still agree that these types need to be recounted properly, don't you? In that case, where would you place the decref of the type?
    I have talked with Martin v. Löwis about this issue, and I think he was quite comfortable with placing the decref inside the dealloc method.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 14, 2012

    Le mardi 14 août 2012 à 20:16 +0000, Robin Schreiber a écrit :

    However you still agree that these types need to be recounted
    properly, don't you?

    Yes, of course.

    In that case, where would you place the decref of the type?

    That's a good question. Perhaps the DECREF mechanism / typeobject.c
    should do it. Or perhaps PyType_FromSpec should place a stub which would
    call the actual dealloc.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Aug 17, 2012

    Antoine: Py_DECREF calls tp_dealloc directly, so the type needs to be DECREFed in the course of tp_dealloc. I don't think there is any alternative to that.

    One may wonder why regular extension types don't do that: this is because of a hack that excludes static (non-heap) types from being reference counted in their instances. Heap types do refcount their types, consequently, subtype_dealloc also DECREFs the type.

    I certainly agree that this is muddy, in particular when it comes to subtyping where the derived subtype calls the base tp_dealloc. In an ideal world, object_dealloc would decref the type, and subtypes would be required to call the base type's dealloc. However, I feel that this cannot be changed before Python 4.

    So I'd propose that it is actually the leaf subtype which decrefs ob_type. The check whether you are the leaf type is then done by checking whether tp_dealloc is the one you are "in" right now.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 18, 2012

    So I'd propose that it is actually the leaf subtype which decrefs
    ob_type. The check whether you are the leaf type is then done by
    checking whether tp_dealloc is the one you are "in" right now.

    By leaf, you mean the derived subtype? That sounds a bit quirky (you
    need the aforementioned "if"), how about the base heap type instead?

    Speaking of which, what does this refactoring bring exactly? The type
    declarations are slightly less verbose, but other than that, is there a
    point? (using the stable ABI doesn't seem to provide any benefit for
    standard extension modules; besides, using PyType_FromSpec is only a
    small part of complying with the stable ABI)

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Aug 18, 2012

    By leaf, you mean the derived subtype? That sounds a bit quirky (you
    need the aforementioned "if"), how about the base heap type instead?

    I think this fails (currently) because a subtype defined in Python using
    a "class" statement will automatically get subtype_dealloc as its dealloc
    function, which will in turn unconditionally decrefs the type (after
    calling the tp_dealloc function from its nearest ancestor which doesn't
    use subtype_dealloc).

    Speaking of which, what does this refactoring bring exactly? The type
    declarations are slightly less verbose, but other than that, is there a
    point?

    It's really about the PEP-3121 changes: modules shouldn't have any
    global variables shared across interpreters, so that module cleanup
    can work properly. Ultimately, this can lead to gc-based shutdown,
    module unloading, and better separation of interpreters.

    In addition, it further reduces the differences between "extension
    types" and "classes" (which supposedly were removed by dropping
    old-style classes, yet some inconsistencies remain where the
    interpreter offers some features only to heaptypes).

    @vstinner
    Copy link
    Member

    commit 46f59eb
    Author: Christian Heimes <christian@python.org>
    Date: Wed Nov 18 16:12:13 2020 +0100

    bpo-1635741: Port _hashlib to multiphase initialization (GH-23358)
    
    Signed-off-by: Christian Heimes <christian@python.org>
    

    See bpo-41111 "Convert a few stdlib extensions to the limited C API (PEP-384)".

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants