Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release GIL on mmap dealloc #80320

Closed
sorcio mannequin opened this issue Feb 27, 2019 · 18 comments
Closed

release GIL on mmap dealloc #80320

sorcio mannequin opened this issue Feb 27, 2019 · 18 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@sorcio
Copy link
Mannequin

sorcio mannequin commented Feb 27, 2019

BPO 36139
Nosy @vstinner, @tiran, @benjaminp, @ezio-melotti, @methane, @sorcio, @applio
PRs
  • bpo-36139: release GIL around munmap() #12073
  • Revert "closes bpo-36139: release GIL around munmap(). (GH-12073)" #12198
  • bpo-36139: release GIL around munmap() #12199
  • Files
  • unmap.c
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-03-06.17:09:28.233>
    created_at = <Date 2019-02-27.18:00:41.829>
    labels = ['3.8', '3.7', 'library', 'performance']
    title = 'release GIL on mmap dealloc'
    updated_at = <Date 2019-03-07.11:41:30.871>
    user = 'https://github.com/sorcio'

    bugs.python.org fields:

    activity = <Date 2019-03-07.11:41:30.871>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-03-06.17:09:28.233>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2019-02-27.18:00:41.829>
    creator = 'davide.rizzo'
    dependencies = []
    files = ['48177']
    hgrepos = []
    issue_num = 36139
    keywords = ['patch']
    message_count = 18.0
    messages = ['336775', '336813', '336817', '336819', '336827', '336829', '337317', '337320', '337322', '337323', '337324', '337325', '337326', '337333', '337334', '337355', '337386', '337387']
    nosy_count = 7.0
    nosy_names = ['vstinner', 'christian.heimes', 'benjamin.peterson', 'ezio.melotti', 'methane', 'davide.rizzo', 'davin']
    pr_nums = ['12073', '12198', '12199']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue36139'
    versions = ['Python 3.7', 'Python 3.8']

    @sorcio
    Copy link
    Mannequin Author

    sorcio mannequin commented Feb 27, 2019

    munmap() can take a long time. I think mmap_object_dealloc can trivially release the GIL around this operation. Something similar was already mentioned in https://bugs.python.org/issue1572968 but a general patch was never provided. The dealloc case alone is significant enough to deserve fixing.

    @sorcio sorcio mannequin added 3.8 only security fixes stdlib Python modules in the Lib dir labels Feb 27, 2019
    @methane
    Copy link
    Member

    methane commented Feb 28, 2019

    Your patch is modifying obmalloc.
    It uses MAP_PRIVATE|MAP_ANONYMOUS mmap.
    Does it really take log time? How?

    bpo-1572968 is different. It is mmap module. It can use file-backed mmap. It is I/O.

    @methane
    Copy link
    Member

    methane commented Feb 28, 2019

    Sorry, I messed up. Your PR is changing mmap module, not obmalloc.

    @sorcio
    Copy link
    Mannequin Author

    sorcio mannequin commented Feb 28, 2019

    Yes, this is mmap module. I found this to be slow for posix-shm-backed mmaps. Several milliseconds, like 20ms for a 128 MB object. Maybe the same can happen with private|anon mmaps? I will follow up with more numbers.

    @sorcio
    Copy link
    Mannequin Author

    sorcio mannequin commented Feb 28, 2019

    munmap() of private maps is usually pretty fast but not negligible (2 ms for 1GB). Shared maps are much slower. For some reason, shared maps explicitly backed by POSIX shared memory stand in between but are still pretty slow.

    If someone cares about file-backed mmaps I can test those too. I thought this is already significant to justify releasing the GIL.

    This is on Linux 4.4:

    shared anon mmap 1048576 bytes
    mmap time 6,393 ns
    write time 449,062 ns
    munmap time 100,205 ns

    private anon mmap 1048576 bytes
    mmap time 2,168 ns
    write time 308,966 ns
    munmap time 36,930 ns

    posix shm + mmap 1048576 bytes
    mmap time 13,299 ns
    write time 369,305 ns
    close time 1,545 ns
    munmap time 26,759 ns

    shared anon mmap 134217728 bytes
    mmap time 4,641 ns
    write time 64,508,536 ns
    munmap time 13,592,556 ns

    private anon mmap 134217728 bytes
    mmap time 6,116 ns
    write time 25,402,084 ns
    munmap time 388,976 ns

    posix shm + mmap 134217728 bytes
    mmap time 29,034 ns
    write time 66,826,645 ns
    close time 3,707 ns
    munmap time 3,475,977 ns

    shared anon mmap 1073741824 bytes
    mmap time 11,127 ns
    write time 508,227,373 ns
    munmap time 94,885,306 ns

    private anon mmap 1073741824 bytes
    mmap time 7,133 ns
    write time 199,933,903 ns
    munmap time 2,361,036 ns

    posix shm + mmap 1073741824 bytes
    mmap time 24,868 ns
    write time 527,566,819 ns
    close time 4,015 ns
    munmap time 21,179,674 ns

    @tiran
    Copy link
    Member

    tiran commented Feb 28, 2019

    The change sounds like a good idea and should be backported, too.

    IIRC mmap() performance also depends on MMU and TLB speed. In the past I have seen paravirtualized systems with poor MMU performance that caused fork() to be slow and Redis to hang.

    @tiran tiran added the 3.7 (EOL) end of life label Feb 28, 2019
    @benjaminp
    Copy link
    Contributor

    New changeset bb9593a by Benjamin Peterson (Davide Rizzo) in branch 'master':
    closes bpo-36139: release GIL around munmap(). (GH-12073)
    bb9593a

    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    This change broke the Windows 7 buildbot:
    https://buildbot.python.org/all/#/builders/40/builds/1745

    Example:

    0:01:34 [ 66/420/1] test_mmap crashed (Exit code 2147483651) -- running: test_tools (1 min 22 sec)
    Fatal Python error: Python memory allocator called without holding the GIL

    Current thread 0x00000de4 (most recent call first):
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_mmap.py", line 645 in test_crasher_on_windows
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 642 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 702 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\runner.py", line 176 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\init.py", line 1968 in _run_suite
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\init.py", line 2064 in run_unittest
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 178 in test_runner
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 182 in runtest_inner
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 127 in runtest
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest_mp.py", line 68 in run_tests_worker
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 600 in _main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 586 in main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 640 in main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\regrtest.py", line 46 in _main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\regrtest.py", line 50 in <module>
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\runpy.py", line 85 in _run_code
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\runpy.py", line 192 in _run_module_as_main
    Windows fatal exception: code 0x80000003

    Current thread 0x00000de4 (most recent call first):
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_mmap.py", line 645 in test_crasher_on_windows
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 642 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py", line 702 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 122 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py", line 84 in __call__
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\runner.py", line 176 in run
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\init.py", line 1968 in _run_suite
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\init.py", line 2064 in run_unittest
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 178 in test_runner
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 182 in runtest_inner
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest.py", line 127 in runtest
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\runtest_mp.py", line 68 in run_tests_worker
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 600 in _main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 586 in main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\libregrtest\main.py", line 640 in main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\regrtest.py", line 46 in _main
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\regrtest.py", line 50 in <module>
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\runpy.py", line 85 in _run_code
    File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\runpy.py", line 192 in _run_module_as_main

    @vstinner vstinner reopened this Mar 6, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    The bug:
    https://github.com/python/cpython/pull/12073/files#r263026496
    (correct link, sorry)

    --

    If nobody comes up quickly with a fix, I will revert the change to repair Windows buildbots: PR 12198, as part of the buildbot policy ("revert on failure"). I don't have the bandwidth to dig into this failure.
    https://pythondev.readthedocs.io/ci.html#revert-on-fail

    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    """
    r263026496">https://github.com/python/cpython/pull/12073/files#r263026496
    """

    Oh wow, that's really strange. I'm sure that I wrote "https://..." URL but my URL became "r263026496">https://..." !?

    3rd attempt to post the link:
    https://github.com/python/cpython/pull/12073/files#r263026496

    Anyway, my link points to my comment:

    """

    PyMem_Free(m_obj->tagname) is called below without holding the GIL: that's illegal. Other move the call to free when the GIL is hold again, or use PyMem_RawFree().

    "Warning: The GIL must be held when using these functions. "
    https://docs.python.org/dev/c-api/memory.html#memory-interface

    Note: You can use PYTHONMALLOC=debug or -X dev to reproduce the issue on a Python compiled in release mode.
    """

    On the #ifdef MS_WINDOWS path of mmap_object_dealloc().

    @sorcio
    Copy link
    Mannequin Author

    sorcio mannequin commented Mar 6, 2019

    Thanks Victor. I think this fixes it #12199

    @sorcio
    Copy link
    Mannequin Author

    sorcio mannequin commented Mar 6, 2019

    BTW should this be backported to 3.7?

    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    New changeset dc07894 by Victor Stinner (Davide Rizzo) in branch 'master':
    bpo-36139: Fix mmap_object_dealloc(): hold the GIL to call PyMem_Free() (GH-12199)
    dc07894

    @vstinner
    Copy link
    Member

    vstinner commented Mar 6, 2019

    The second commit does fix the regression (I tested manually on Windows with Python compiled in debug mode).

    @vstinner vstinner closed this as completed Mar 6, 2019
    @ezio-melotti
    Copy link
    Member

    Oh wow, that's really strange. I'm sure that I wrote "https://..." URL but my URL became "r263026496">https://..." !?

    The links are now fixed (Roundup was getting confused by the rNNNNNN, since it looks like a SVN revision).

    @ezio-melotti ezio-melotti added the performance Performance or resource usage label Mar 7, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Mar 7, 2019

    4th attempt (sorry for the spam, it's just to check the Roundup fix!):
    https://github.com/python/cpython/pull/12073/files#r263026496

    @vstinner
    Copy link
    Member

    vstinner commented Mar 7, 2019

    Yeah! Roundup is fixed, thanks Ezio!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants