Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace uuid ctypes usage with an extension module. #64718

Closed
gustavo mannequin opened this issue Feb 5, 2014 · 17 comments
Closed

Replace uuid ctypes usage with an extension module. #64718

gustavo mannequin opened this issue Feb 5, 2014 · 17 comments
Labels
3.7 (EOL) end of life topic-ctypes type-feature A feature request or enhancement

Comments

@gustavo
Copy link
Mannequin

gustavo mannequin commented Feb 5, 2014

BPO 20519
Nosy @amauryfa, @abalkin, @pitrou, @vstinner, @ned-deily, @bitdancer, @meadori, @serhiy-storchaka
PRs
  • bpo-11063, bpo-20519: avoid ctypes and improve import time for uuid #3796
  • Superseder
  • bpo-11063: Rework uuid module: lazy initialization and add a new C extension
  • Files
  • gc.py: test program
  • uuid-no-ctypes.diff
  • issue20519_10941v2.diff
  • uuid.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-09-28.13:29:10.292>
    created_at = <Date 2014-02-05.11:52:23.220>
    labels = ['ctypes', 'type-feature', '3.7']
    title = 'Replace uuid ctypes usage with an extension module.'
    updated_at = <Date 2017-09-28.21:03:08.306>
    user = 'https://bugs.python.org/gustavo'

    bugs.python.org fields:

    activity = <Date 2017-09-28.21:03:08.306>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-09-28.13:29:10.292>
    closer = 'vstinner'
    components = ['ctypes']
    creation = <Date 2014-02-05.11:52:23.220>
    creator = 'gustavo'
    dependencies = []
    files = ['33925', '33932', '40865', '40874']
    hgrepos = []
    issue_num = 20519
    keywords = ['patch']
    message_count = 17.0
    messages = ['210306', '210316', '210320', '210322', '210328', '210331', '210335', '210346', '251509', '251510', '253493', '253494', '253511', '253598', '303219', '303233', '303280']
    nosy_count = 9.0
    nosy_names = ['amaury.forgeotdarc', 'belopolsky', 'gustavo', 'pitrou', 'vstinner', 'ned.deily', 'r.david.murray', 'meador.inge', 'serhiy.storchaka']
    pr_nums = ['3796']
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '11063'
    type = 'enhancement'
    url = 'https://bugs.python.org/issue20519'
    versions = ['Python 3.7']

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Feb 5, 2014

    If you try the attached program, you will find that for every iteration the uuid.uuid4() call generates objects that contain reference cycles and need the help of the garbage collector. This is not nice. If I make the ctypes module not able to import, then no garbage is generated.

    This problem appears in 2.7, 3.3, and 3.4, at least.

    @gustavo gustavo mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Feb 5, 2014
    @bitdancer
    Copy link
    Member

    I'm not sure that this is a real problem, but have you identified what the cycle is?

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Feb 5, 2014

    Well, this isn't a big problem, but I have an application that needs to run with the GC disabled, since it causes pauses of a couple of seconds each time a full collection runs (we have a few million objects allocated). I will run the GC only once every 3 hours. So it would be nice if this uuid call didn't generate cycles.

    No, I didn't identify the cycle. All I know is that the garbage below is produced. If the ctypes module is unavailable, uuid still works but doesn't generate the garbage.

    >>> gc.garbage
    [(<type '_ctypes.Array'>,), {'raw': <attribute 'raw' of 'c_char_Array_16' objects>, '__module__': 'ctypes', '__dict__': <attribute '__dict__' of 'c_char_Array_16' objects>, '__weakref__': <attribute '__weakref__' of 'c_char_Array_16' objects>, '_length_': 16, '_type_': <class 'ctypes.c_char'>, '__doc__': None, 'value': <attribute 'value' of 'c_char_Array_16' objects>}, <class 'ctypes.c_char_Array_16'>, <attribute '__dict__' of 'c_char_Array_16' objects>, <attribute '__weakref__' of 'c_char_Array_16' objects>, (<class 'ctypes.c_char_Array_16'>, <type '_ctypes.Array'>, <type '_ctypes._CData'>, <type 'object'>), <attribute 'raw' of 'c_char_Array_16' objects>, <attribute 'value' of 'c_char_Array_16' objects>, (<type '_ctypes.Array'>,), {'raw': <attribute 'raw' of 'c_char_Array_16' objects>, '__module__': 'ctypes', '__dict__': <attribute '__dict__' of 'c_char_Array_16' objects>, '__weakref__': <attribute '__weakref__' of 'c_char_Array_16' objects>, '_length_': 16, '_type_': <class 'ctypes.c_char'>, '__doc__': None, 'value': <attribute 'value' of 'c_char_Array_16' objects>}, <class 'ctypes.c_char_Array_16'>, <attribute '__dict__' of 'c_char_Array_16' objects>, <attribute '__weakref__' of 'c_char_Array_16' objects>, (<class 'ctypes.c_char_Array_16'>, <type '_ctypes.Array'>, <type '_ctypes._CData'>, <type 'object'>), <attribute 'raw' of 'c_char_Array_16' objects>, <attribute 'value' of 'c_char_Array_16' objects>, (<type '_ctypes.Array'>,), {'raw': <attribute 'raw' of 'c_char_Array_16' objects>, '__module__': 'ctypes', '__dict__': <attribute '__dict__' of 'c_char_Array_16' objects>, '__weakref__': <attribute '__weakref__' of 'c_char_Array_16' objects>, '_length_': 16, '_type_': <class 'ctypes.c_char'>, '__doc__': None, 'value': <attribute 'value' of 'c_char_Array_16' objects>}, <class 'ctypes.c_char_Array_16'>, <attribute '__dict__' of 'c_char_Array_16' objects>, <attribute '__weakref__' of 'c_char_Array_16' objects>, (<class 'ctypes.c_char_Array_16'>, <type '_ctypes.Array'>, <type '_ctypes._CData'>, <type 'object'>), <attribute 'raw' of 'c_char_Array_16' objects>, <attribute 'value' of 'c_char_Array_16' objects>, (<type '_ctypes.Array'>,), {'raw': <attribute 'raw' of 'c_char_Array_16' objects>, '__module__': 'ctypes', '__dict__': <attribute '__dict__' of 'c_char_Array_16' objects>, '__weakref__': <attribute '__weakref__' of 'c_char_Array_16' objects>, '_length_': 16, '_type_': <class 'ctypes.c_char'>, '__doc__': None, 'value': <attribute 'value' of 'c_char_Array_16' objects>}, <class 'ctypes.c_char_Array_16'>, <attribute '__dict__' of 'c_char_Array_16' objects>, <attribute '__weakref__' of 'c_char_Array_16' objects>, (<class 'ctypes.c_char_Array_16'>, <type '_ctypes.Array'>, <type '_ctypes._CData'>, <type 'object'>), <attribute 'raw' of 'c_char_Array_16' objects>, <attribute 'value' of 'c_char_Array_16' objects>]

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Feb 5, 2014

    I have narrowed it down to one line of code:
    ctypes.create_string_buffer(16)

    That is enough to create 7 objects that have reference cycles.

    [<class 'ctypes.c_char_Array_16'>, {'__module__': 'ctypes', '__doc__': None, '__weakref__': <attribute '__weakref__' of 'c_char_Array_16' objects>, 'raw': <attribute 'raw' of 'c_char_Array_16' objects>, '_length_': 16, '_type_': <class 'ctypes.c_char'>, 'value': <attribute 'value' of 'c_char_Array_16' objects>, '__dict__': <attribute '__dict__' of 'c_char_Array_16' objects>}, (<class 'ctypes.c_char_Array_16'>, <class '_ctypes.Array'>, <class '_ctypes._CData'>, <class 'object'>), <attribute '__weakref__' of 'c_char_Array_16' objects>, <attribute 'raw' of 'c_char_Array_16' objects>, <attribute 'value' of 'c_char_Array_16' objects>, <attribute '__dict__' of 'c_char_Array_16' objects>]

    So maybe the bug is in ctypes itself, not the uuid module.

    @bitdancer
    Copy link
    Member

    Yes, I was pretty sure it was in cytpes, from looking at the UUID code.

    @bitdancer bitdancer added topic-ctypes and removed stdlib Python modules in the Lib dir labels Feb 5, 2014
    @bitdancer bitdancer changed the title uuid.uuid4().hex generates garbage when ctypes available ctypes.create_string_buffer creates reference cycles Feb 5, 2014
    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Feb 5, 2014

    Regardless, if you don't mind, take this patch for Python 3.5 to avoid ctypes, at least in the Linux case (I don't have Windows to test). Creating a proper extension module is safer and really not that hard...

    @bitdancer
    Copy link
    Member

    Thanks. This looks like a good idea, but I'll leave it to someone with more experience with this module to review it.

    Let's change this issue to an enhancement request for uuid instead. If someone wants to work on the cycle-in-ctypes problem they can open a new issue.

    @bitdancer bitdancer changed the title ctypes.create_string_buffer creates reference cycles Replace uuid ctypes usage with an extension module. Feb 5, 2014
    @bitdancer bitdancer added type-feature A feature request or enhancement and removed performance Performance or resource usage labels Feb 5, 2014
    @serhiy-storchaka
    Copy link
    Member

    See also bpo-5885.

    @vstinner
    Copy link
    Member

    In the Python stdlib, we try to avoid ctypes to use instead a C extension module. So I like the idea of a new optional _uuid module.

    I reviewed the attached patch.

    @vstinner
    Copy link
    Member

    @Gustavo: Can you please take my remarks in account and rebase your change on the default branch?

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Oct 26, 2015

    This patch fixes the Mac OS X issue @Haypo pointed out.

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Oct 26, 2015

    One issue of note is regarding generate_time(). Originally I found ctypes bindings for this function, so I wrapped it as well in the extension module. However, it doesn't appear to be used...

    @vstinner
    Copy link
    Member

    New review (question for Windows).

    @gustavo
    Copy link
    Mannequin Author

    gustavo mannequin commented Oct 28, 2015

    New patch that:

    1. adds assert(sizeof(uuid_t) == 16); to the extension module;

    2. fixes the code path when ctypes has to be used instead of the extension module (needed a bit more refactoring, apologies if it makes the diff harder to read);

    3. Adjusts the uuid module tests to account for the possibility of ctypes not being imported.

    @pitrou pitrou added the 3.7 (EOL) end of life label Sep 28, 2017
    @pitrou
    Copy link
    Member

    pitrou commented Sep 28, 2017

    #3796 updates the patch for 3.7 and improves on it a bit by making initialization lazy.

    @vstinner
    Copy link
    Member

    There are too many uuid open issues proposing similar changes. I mark this issue as a duplicate of bpo-11063 to avoid splitted discussions. Please continue the discussion there!

    @pitrou
    Copy link
    Member

    pitrou commented Sep 28, 2017

    New changeset a106aec by Antoine Pitrou in branch 'master':
    bpo-11063, bpo-20519: avoid ctypes and improve import time for uuid (bpo-3796)
    a106aec

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life topic-ctypes type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants