Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deques to adopt the standard clearing procedure for mutable objects #69322

Closed
rhettinger opened this issue Sep 15, 2015 · 15 comments
Closed

Deques to adopt the standard clearing procedure for mutable objects #69322

rhettinger opened this issue Sep 15, 2015 · 15 comments
Assignees
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@rhettinger
Copy link
Contributor

BPO 25135
Nosy @tim-one, @rhettinger, @vstinner, @serhiy-storchaka
Files
  • deque_nonreentrant_clear.diff: Make the deque empty before clearing blocks.
  • deque_nonreentrant_clear2.diff: Fix final freeblock() call
  • deque_nonreentrant_clear3.diff: Don't create new block and don't use alternate method
  • deque_nonreentrant_clear4.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2015-09-26.07:54:18.746>
    created_at = <Date 2015-09-15.23:12:56.641>
    labels = ['extension-modules', 'type-bug']
    title = 'Deques to adopt the standard clearing procedure for mutable objects'
    updated_at = <Date 2015-09-26.07:54:18.744>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2015-09-26.07:54:18.744>
    actor = 'rhettinger'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2015-09-26.07:54:18.746>
    closer = 'rhettinger'
    components = ['Extension Modules']
    creation = <Date 2015-09-15.23:12:56.641>
    creator = 'rhettinger'
    dependencies = []
    files = ['40477', '40486', '40494', '40572']
    hgrepos = []
    issue_num = 25135
    keywords = ['patch']
    message_count = 15.0
    messages = ['250811', '250876', '250882', '250883', '250890', '250913', '250915', '251577', '251578', '251579', '251592', '251595', '251596', '251638', '251640']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'rhettinger', 'vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25135'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @rhettinger
    Copy link
    Contributor Author

    The clear method for deques is possibly reentrant. Use the safer technique used in listobject.c, setobject.c, and dictobject.c.

    @rhettinger rhettinger self-assigned this Sep 15, 2015
    @rhettinger rhettinger added the extension-modules C modules in the Modules dir label Sep 15, 2015
    @serhiy-storchaka
    Copy link
    Member

    I left comments on Rietveld. Perhaps you missed Rietveld messages in the Spam folder.

    @rhettinger
    Copy link
    Contributor Author

    How does newblock() mutate the deque? Does PyMem_Malloc() do anything other than call malloc()? We hold the GIL so there doesn't seem to be a need to use PyMem_RawMalloc().

    @vstinner
    Copy link
    Member

    "Does PyMem_Malloc() do anything other than call malloc()? We hold the GIL so there doesn't seem to be a need to use PyMem_RawMalloc()."

    Well, the difference between PyMem_Malloc() and PyMem_RawMalloc() is subtle. Right now, it should only impact debug tools hooking on the memory allocators.

    But one day I would like to try to modify PyMem_Malloc() to reuse PyObject_Malloc(). I don't see why PyMem_Malloc() shouldn't be optimized for small applications. But it's hard to benchmark such change, and it depends on the platform. I already had to use different config for memory allocators depending on the platform :-/ The implementation of tracemalloc uses a different overallocation factor on Windows and on other platforms.

    @serhiy-storchaka
    Copy link
    Member

    Here is a patch that doesn't need newblock().

    @rhettinger
    Copy link
    Contributor Author

    Serhiy contends that calling newblock() can mutate the deque (its only external call is PyMem_Malloc()). I'm trying understand how that could be possible if it is just a thin wrapper around malloc.

    Before gumming-up the code in an effort to avoid calling newblock(), I want to understand whether there is an actual issue here and if so whether PyMem_RawMalloc() would solve the problem directly.

    @serhiy-storchaka
    Copy link
    Member

    I perhaps were overcautious. There is a difference between the case of deque and the case of lru_cache and OrderedDict. The latter creates Python object (PyDict_New) that can trigger garbage collecting, and the former calls only PyMem_Malloc. The latter can cause a crash in common use, and the former only with special memory allocation hooks (Victor, perhaps we should disable GC for executing hooks).

    But my version of the patch solves not only this issue. It eliminates the use of possibly re-entrant alternate method.

    @rhettinger
    Copy link
    Contributor Author

    Victor, I was thinking of switching to PyMem_RawMalloc instead of PyMem_Malloc. The advantages are that there are guaranteed to be no side-effects, the GIL doesn't need to be held, and there is less overhead. Are there any disadvantage I should know about?

    Serhiy, I'm still considering your patch. I have a strong aversion to putting a big block on the stack and don't like having to do a full block memcpy every time a deque is cleared on deallocation. If you don't see an actual bug in patch 2, I'm inclined to use that approach (especially since the popping fallback approach will likely never be called in practice).

    @vstinner
    Copy link
    Member

    "Victor, I was thinking of switching to PyMem_RawMalloc instead of PyMem_Malloc. The advantages are that there are guaranteed to be no side-effects, the GIL doesn't need to be held, and there is less overhead. Are there any disadvantage I should know about?"

    For me, PyMem_Malloc() can be faster than PyMem_RawMalloc() because the GIL is held. In practice... both function are very thin wrapper to malloc(), so it's exactly the same :-) In Objects/obmalloc.c, you have:

    #define PYRAW_FUNCS _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree
    ...
    #define PYMEM_FUNCS PYRAW_FUNCS

    So PyMem_Malloc() and PyMem_RawMalloc() are *currently* exactly the same.

    I'm not sure that you understand what you are trying to do. If you expect better performances, you should try to use PyObject_Malloc() instead, to benefit from the fast Python allocator for small allocations (512 bytes and less).

    @serhiy-storchaka
    Copy link
    Member

    Here is a patch that copies only used part of the block.

    I don't see an actual bug in patch 2 besides that it fallbacks to current popping behavior.

    @rhettinger
    Copy link
    Contributor Author

    I'm not sure that you understand what you are trying to do.

    The goal is to avoid possible re-entrancy both now and in the future. Since PyMem_RawMalloc is thread-safe and doesn't require the GIL to be held, it is guaranteed that there won't be a callback into regular Python that could mutate any of the data structures.

    @tim-one
    Copy link
    Member

    tim-one commented Sep 25, 2015

    The only way to be certain you're never going to face re-entrancy issues in the future is to call malloc() directly - and hope nobody redefines that too with some goofy macro ;-)

    In the meantime, stick to PyMem_Malloc(). That's the intended way for code holding the GIL to do lowest-level memory allocation. Uses of PyMem_RawMalloc() should be extremely rare, typically only in contexts where Python internals _know_ the GIL isn't being held, and can't reasonably try to acquire the GIL. It's already being used in a few contexts where it probably shouldn't be, and each such use needlessly complicates future changes.

    If PyMem_Malloc() does grow re-entrancy issues in the future, deques will be the least of our problems. I'd strongly oppose it before then. It's _intended_ to be as low level as possible (it was created to begin with just to worm around cross-platform insanity when called with a 0 argument).

    @rhettinger
    Copy link
    Contributor Author

    Thanks Timmy!

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 26, 2015

    New changeset 005590a4a005 by Raymond Hettinger in branch '3.5':
    Issue bpo-25135: Avoid possible reentrancy issues in deque_clear.
    https://hg.python.org/cpython/rev/005590a4a005

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 26, 2015

    New changeset fc6d62db8d42 by Raymond Hettinger in branch '2.7':
    Issue bpo-25135: Avoid possible reentrancy issues in deque_clear.
    https://hg.python.org/cpython/rev/fc6d62db8d42

    @rhettinger rhettinger added the type-bug An unexpected behavior, bug, or error label Sep 26, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants