Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbounded memory growth resizing split-table dicts #72334

Closed
minrk mannequin opened this issue Sep 14, 2016 · 51 comments
Closed

Unbounded memory growth resizing split-table dicts #72334

minrk mannequin opened this issue Sep 14, 2016 · 51 comments
Labels
3.7 (EOL) end of life deferred-blocker interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@minrk
Copy link
Mannequin

minrk mannequin commented Sep 14, 2016

BPO 28147
Nosy @vstinner, @larryhastings, @ned-deily, @methane, @berkerpeksag, @serhiy-storchaka, @minrk, @jamadden, @zhangyangyu
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • test-dict-pop.py: test illustrating dict resize memory leak
  • 0001-Avoid-unbounded-growth-in-dict_resize.patch: patch to fix memory growth with regression test
  • fix-28147.patch
  • fix-28147.patch: Add missing file
  • fix-28147-py35.patch
  • fix28147-py36.patch
  • fix28147-py36-2.patch
  • fix28147-py36-3.patch
  • fix28147-comment-update.patch
  • fix-28147-py35-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-12-20.01:05:08.308>
    created_at = <Date 2016-09-14.11:44:13.094>
    labels = ['interpreter-core', 'deferred-blocker', '3.7', 'type-crash']
    title = 'Unbounded memory growth resizing split-table dicts'
    updated_at = <Date 2017-03-31.16:36:26.473>
    user = 'https://github.com/minrk'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:26.473>
    actor = 'dstufft'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-12-20.01:05:08.308>
    closer = 'methane'
    components = ['Interpreter Core']
    creation = <Date 2016-09-14.11:44:13.094>
    creator = 'minrk'
    dependencies = []
    files = ['44658', '44659', '44665', '44666', '44696', '45796', '45797', '45910', '45911', '45968']
    hgrepos = []
    issue_num = 28147
    keywords = ['patch']
    message_count = 51.0
    messages = ['276418', '276419', '276420', '276425', '276428', '276434', '276437', '276450', '276451', '276452', '276456', '276477', '276481', '276482', '276483', '276502', '276726', '276895', '276901', '276947', '277614', '282635', '282642', '282643', '282648', '282650', '282672', '282682', '282683', '282699', '282702', '282743', '283246', '283259', '283264', '283267', '283268', '283270', '283283', '283334', '283335', '283337', '283339', '283374', '283386', '283392', '283395', '283468', '283478', '283644', '283659']
    nosy_count = 11.0
    nosy_names = ['vstinner', 'larry', 'ned.deily', 'methane', 'python-dev', 'berker.peksag', 'serhiy.storchaka', 'minrk', 'jmadden', 'xiang.zhang', 'wenzel']
    pr_nums = ['552']
    priority = 'deferred blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue28147'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    There is a memory leak in the new dictionary resizing in 3.6, which can cause memory exhaustion in just a few iterations.

    I don't fully understand the details of the bug, but it happens when resizing a dict with a split table several times. The only way that I have found to trigger this is by popping items off of an object's __dict__ repeatedly.

    I've attached a script to illustrate the issue. Be careful with it, because it will eat up all your memory if you don't interrupt it.

    @minrk minrk mannequin added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Sep 14, 2016
    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    This patch fixes the memory leak in split-dict resizing.

    Each time dict_resize is called, it gets a new, larger size > minused. If this is triggered many times, it will keep growing in size by a factor of two each time, as the previous size is passed as minused for the next call.

    Set the lower bound at minused (inclusive), rather than exclusive, so that the size does not continue to increase for repeated calls.

    A test is added to test_dict.py based on the earlier test script, but if someone has a simpler way to trigger the split-dict resize events, I'd be happy to see it.

    @minrk minrk mannequin changed the title Memory leak in dictionary resize Memory leak in new 3.6 dictionary resize Sep 14, 2016
    @berkerpeksag
    Copy link
    Member

    Can you wrap the test with @support.cpython_only decorator? The patch fixes the memory leak demonstrated in test-dict-pop.py.

    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    I can add the cpython_only decorator, but I'm not sure it is the right thing to do. I would expect the code in the test to pass on any Python implementation, which would suggest that it should not be cpython_only, right? If you still think so, I'll add it.

    @methane
    Copy link
    Member

    methane commented Sep 14, 2016

    I don't understand the leak yet.

    Each time dict_resize is called, it gets a new, larger size > minused. If this is triggered many times, it will keep growing in size by a factor of two each time, as the previous size is passed as minused for the next call.

    dictresize() is called for converting split table to combined table.
    How is it triggered many times?

    In your test code, which loop cause leak? new instance loop or re-use instance loop?

    @vstinner
    Copy link
    Member

    The CPython test suite uses a counter on memory allocations. Please add an unit test which triggers the memory leak, but you don't need many iterations. One iteartion should be enough to be catched by the unit test. Try: ./python -m test -R 3:3 test_dict.

    The memory allocation counter:
    https://docs.python.org/dev/library/sys.html#sys.getallocatedblocks

    Note: The tracemalloc module can also be used to track memory leak, but it's harder to use it to write unit tests because Python uses a lot of internal caches and tracemalloc really tracks every single bytes.

    @methane
    Copy link
    Member

    methane commented Sep 14, 2016

    Ah, is the leak happen in 3.6b1?
    dict.pop() in 3.6b1 doesn't combine split table. It was a bug and
    fixed in master branch already.

    Regardless the leak, I agree that grow dict when popping is bad idea.
    I'll improve your patch.

    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    dictresize() is called for converting split table to combined table.
    How is it triggered many times?

    every self.__dict__.pop triggers a resize. According to https://www.python.org/dev/peps/pep-0412/#split-table-dictionaries obj.__dict__ is always a split-table dict. I do not understand the dict implementation enough to say precisely why, but pop forces a recombine via resize because split-table dicts don't support deletion. In dict_resize, due to a <=minused condition, the size is guaranteed to at least double every time dict_resize is called. It would appear that after this, __dict__ is again forced to be a split-table dict, though I'm not sure how or where this happens, but good old-fashioned printf debugging shows that dict_resize is called for every __dict__.pop because _PyDict_HasSplitTable is true every time pop is called.

    In your test code, which loop cause leak? new instance loop or re-use instance loop?

    Both loops cause the leak. If the pop_attr() is not in __init__, then only the re-used instance has the leak. if pop_attr is in __init__, then it happens across instances as well. I will try to add more comments in the code to make this clearer.

    Does anyone have a handy way to create a split-table dict other than on obj.__dict__?

    Please add an unit test which triggers the memory leak

    I should not have used the term memory leak, and have updated the title to be more precise. It is not memory allocated without a corresponding free, instead it is unbounded growth of the memory owned by a split-table dict. Cleaning up the object does indeed clean up the memory associated with it. The included test exercises the bug with several iterations. Running the test several times with only one iteration would not exercise the bug.

    @vstinner
    Copy link
    Member

    According to https://www.python.org/dev/peps/pep-0412/#split-table-dictionaries obj.__dict__ is always a split-table dict.

    Hum, this PEP is now probably outdated since Python 3.6 beta 1 :-)

    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    Ah, is the leak happen in 3.6b1?

    The leak happens in 3.6b1 and master as of an hour ago (git: 3c06edf, hg:r103797)

    @minrk minrk mannequin changed the title Memory leak in new 3.6 dictionary resize Unbounded memory growth resizing split-table dicts Sep 14, 2016
    @minrk
    Copy link
    Mannequin Author

    minrk mannequin commented Sep 14, 2016

    I pulled just now and saw changes in dictobject.c, and just wanted to confirm the memory growth bug is still in changeset 56294e03ad89 (I think I used the right hash, this time).

    @methane
    Copy link
    Member

    methane commented Sep 14, 2016

    I confirmed and investigated it. Thanks!
    I'll post patch including more test in 24 hours.

    @zhangyangyu
    Copy link
    Member

    I confirmed and investigated it. Thanks!
    I'll post patch including more test in 24 hours.

    Please wait a second.

    The cause of this problem is that attribute setting causes a combined table to be splitted (the code path is [0]->[1]->[2]->[3]). So every time you set an attribute and then pop, it will do dictresize(right now it will increase memory usage). So even if you make pop() not increase memory usage, the example Min gives will still do much work not wanted (dictresize and make_keys_shared).

    Actually I think memory usage increasing is not a problem if other things are right. popitem() before does the same thing and should never cause a problem.

    [0] https://hg.python.org/cpython/file/tip/Python/ceval.c#l2248
    [1] https://hg.python.org/cpython/file/tip/Objects/object.c#l1119
    [2] https://hg.python.org/cpython/file/tip/Objects/object.c#l1125
    [3] https://hg.python.org/cpython/file/tip/Objects/object.c#l1172

    @zhangyangyu
    Copy link
    Member

    popitem() before does the same thing and should never cause a problem.

    Ahh, this seems not true. Test it in py3.5 also crash.

    @methane
    Copy link
    Member

    methane commented Sep 14, 2016

    This issue is caused by dictresize() and _PyObjectDict_SetItem()

    1. a.__dict__.pop('a') convert the dict to combined table which has double keysize.
    2. a.a = 1 converts the dict to split table again if there are no instances sharing key with class.

    As I wrote before, pop, popitem, and del should not increase key size.

    And _PyObjectDict_SetItem shouldn't convert the dict to split-table when the dict doesn't share keys
    with the class before calling PyDict_SetItem.

    @methane
    Copy link
    Member

    methane commented Sep 15, 2016

    xiang is right. Python 3.5 has same issue when using popitem().
    I'll make patch for 3.5. But it will be bit differ from patch for 3.6 and they will conflict.

    @methane
    Copy link
    Member

    methane commented Sep 16, 2016

    This is patch for Python 3.5.
    The patch uses more conservative approach.

    @zhangyangyu
    Copy link
    Member

    LGTM. But why this change?

    -#define ESTIMATE_SIZE(n) (((n)*3) >> 1)
    +#define ESTIMATE_SIZE(n) (((n)*3+1) >> 1)

    @methane
    Copy link
    Member

    methane commented Sep 18, 2016

    xiang:
    dictresize(d, n) may choose keysize==n (when n == 2**m) with this patch.

    If there are any integer a such as ESTIMATE_SIZE(a) == n and n == 2**m and USABLE_FRACTION(n) == a - 1,
    a items cannot be inserted into dict after dictresize(d, ESTIMATE_SIZE(a))

    This is why ESTIMATE_SIZE should round up fraction.

    @zhangyangyu
    Copy link
    Member

    Ahh, I see.

    If there are any integer a such as ESTIMATE_SIZE(a) == n and n == 2**m and USABLE_FRACTION(n) == a - 1.

    There are, such as 11, 43...

    a items cannot be inserted into dict after dictresize(d, ESTIMATE_SIZE(a))

    It can but needs another resize in insertdict which breaks the intention of ESTIMATE_SIZE.

    Then everything looks fine to me. :)

    @methane
    Copy link
    Member

    methane commented Sep 28, 2016

    haypo, Could you review fix-28147-py35.patch and fix-28147.patch ?

    Draft NEWS entry:

    • Issue bpo-28147: Fixed split table dict may consume unlimited memory.

    @ned-deily
    Copy link
    Member

    This issue seems to have slipped through. Should it be a release blocker for 3.6.0 final or can it wait for 3.6.1?

    @serhiy-storchaka
    Copy link
    Member

    Calling pop() for instance's __dict__ is not common operation. I expected that this bug does not affect any real code. But the case in bpo-28894 looks as a real case. This might be a release blocker.

    @methane
    Copy link
    Member

    methane commented Dec 7, 2016

    Ned Deily added the comment:

    This issue seems to have slipped through. Should it be a release blocker for 3.6.0 final or can it wait for 3.6.1?

    On Python 3.5, instance.__dict__.popitem() cause this issue.
    On Python 3.6, instance.__dict__.popitem() and instance.__dict__.pop()
    cause this issue.

    This is not new issue, but there is small regression.

    I don't know this should be fixed in 3.6.0.
    I'll make patch smaller anyway.

    @vstinner
    Copy link
    Member

    vstinner commented Dec 7, 2016

    In this case, I suggest to wait for 3.6.1 to fix it.

    @ned-deily
    Copy link
    Member

    Just follow the normal 3.6 branch which will ensure it gets into 3.6.1. If we end up deciding to also add it to 3.6.0 final, I will handle the cherrypicking.

    @methane
    Copy link
    Member

    methane commented Dec 8, 2016

    Here is patch for Python 3.6.

    @methane
    Copy link
    Member

    methane commented Dec 8, 2016

    http://bugs.python.org/issue28894 was duplicate of this issue.

    Since real world program suffered by this, I'm +1 to fix this by 3.6.0

    @ned-deily
    Copy link
    Member

    I think these proposed patches need careful review but, even so, given the extent of the fix28147-py36-2.patch changes, I don't think they should go into 3.6.0 at the last minute. Unless somebody can convince me otherwise, I'm going to let this wait for 3.6.1.

    @wenzel
    Copy link
    Mannequin

    wenzel mannequin commented Dec 15, 2016

    Hi,

    pybind11 (https://github.com/pybind/pybind11) dev here.

    We're seeing massive memory increases due to this bug, which completely break our test suite (and likely any use of this library, which is commonly used to bind C++ code to Python).

    Please look at the following issue ticket:

    pybind/pybind11#558

    where RSS goes to 43MB to 4.3GB for the basic set of tests. The fancy fancy test suite which also covers NumPy/SciPy can't even be run anymore. All issues disappear when the dict patch listed here is applied.

    We're really concerned that this is not slated to be fixed for 3.6.0. Pybind11 is not doing anything particularly special with dicts -- if this is hitting us, others will likely have issues as well.

    -Wenzel

    @ned-deily
    Copy link
    Member

    It appears this issue has stalled again. Is the most recent patch fix28147-py36-2.patch) ready for review? If so, can someone please review it? Then it needs to be pushed to the 3.6 branch for 3.6.1 and exposed to the buildbots. Only then could we consider cherrypicking for 3.6.0 and a change of the magnitude in the patch would require at least another preview release and delay 3.6.0 final. 3.6.0 is scheduled to be released tomorrow. I am very reluctant to delay now for an issue that has been open now for 3 months throughout the beta phase of the release cycle. If enough people feel we should delay 3.6.0 to address this, we can. But without a fix reviewed and committed and without more feedback from other core developers, 3.6.0 is going out without it. @inada.naoki ? @Haypo ? @serhiy.storchaka ? Others?

    @vstinner
    Copy link
    Member

    fix28147-py36-2.patch: test_dict pass with DEBUG_PYDICT defined.

    @vstinner
    Copy link
    Member

    I reviewed fix28147-py36-2.patch, I have some requests.

    @vstinner
    Copy link
    Member

    Ned: "If so, can someone please review it?"

    Done.

    Ned: "I am very reluctant to delay now for an issue that has been open now for 3 months throughout the beta phase of the release cycle. If enough people feel we should delay 3.6.0 to address this, we can."

    The bug was seen as minor and I didn't expect that anyone would notice it.

    Wenzel Jakob wrote "RSS goes to 43MB to 4.3GB": for me, it's a major regression, and it's not possible to work around it. With such bug, Python 3.6 is unusable on such application (pybind11). I easily imagine that such memory leak is a blocker issue for a server running for days.

    I now consider that Python 3.6.0 must not be released with such blocker issue.

    @vstinner
    Copy link
    Member

    fix28147-py36-3.patch LGTM: Naoki, please push it right now.

    But please see also my comments on fix28147-py36-2.patch: I would still apprecicate that we enhance the comment in dictobject.c. The comment is not a blocker issue for 3.6.0, it can be enhanced later :-D

    @methane
    Copy link
    Member

    methane commented Dec 15, 2016

    This patch updates the comment.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 15, 2016

    New changeset 85be9dcc16a8 by Victor Stinner in branch '3.6':
    Fix a memory leak in split-table dictionaries
    https://hg.python.org/cpython/rev/85be9dcc16a8

    @vstinner
    Copy link
    Member

    I dislike pushing a change written by another core dev, I prefer that he/she directly push it, but we are out of time. Ned asked to first push to Python 3.6 to get a confirmation of buildbots, before being able to ask for a cherry-pick in 3.6.0 final.

    Anyway, thanks for the fix Naoki!

    @serhiy-storchaka
    Copy link
    Member

    I think this can be tested without _testcapi. See an example in bpo-28894.

    @serhiy-storchaka
    Copy link
    Member

    I asked a question about the change in _PyObjectDict_SetItem. It is not clear to me.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 16, 2016

    New changeset b70b2d3f3167 by Victor Stinner in branch '3.6':
    Fix a memory leak in split-table dictionaries
    https://hg.python.org/cpython/rev/b70b2d3f3167

    @ned-deily
    Copy link
    Member

    [cherrypicked for 3.6.0rc2]

    @methane
    Copy link
    Member

    methane commented Dec 16, 2016

    Oh, I'm sorry. It seems I had failed to push the commit yesterday.

    @vstinner
    Copy link
    Member

    Oh, I'm sorry. It seems I had failed to push the commit yesterday.

    No problem. Thanks for your fix! I prefer to not see the bug in Python
    3.6.0 final!

    @ned-deily
    Copy link
    Member

    So now that a fix has been released with 3.6.0rc2, what else needs to be done to close this issue? Why is 3.5 selected in the applicable versions?

    @methane
    Copy link
    Member

    methane commented Dec 17, 2016

    Before Python 3.6, instance.__dict__.pop(key) didn't trigger this.
    But there are many way to trigger this. This bug is live from 3.3.
    This script reproduce this issue on Python 3.5.

    class C: pass
    a = C()
    a.a = 1
    while True:
        a.__dict__.popitem()  # convert split table into combined table.
        a.a = 1   # convert combined table into split table again.

    @vstinner
    Copy link
    Member

    The 3.5 patch LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 20, 2016

    New changeset cc40470c10f8 by INADA Naoki in branch '3.5':
    Issue bpo-28147: Fix a memory leak in split-table dictionaries
    https://hg.python.org/cpython/rev/cc40470c10f8

    @methane methane closed this as completed Dec 20, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life deferred-blocker interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants