classification
Title: C Unpickler memory leak via memo
Type: resource usage Stage: patch review
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: kale-smoothie, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-07-28 18:26 by kale-smoothie, last changed 2020-07-28 19:23 by serhiy.storchaka.

Files
File name Uploaded Description
leak_pickler.py kale-smoothie, 2020-07-28 18:26
Pull Requests
URL Status Linked
PR 21664 open python-dev, 2020-07-28 18:49
Messages (2)
msg374518 - Author: kale-smoothie (kale-smoothie) Date: 2020-07-28 18:26
I'm not familiar with the workings of GC/pickle, but it looks like the traverse code (`tp_traverse`) in the C Unpickler omits a visit to the memo, which could cause a memory leak whenever a memoized object is part of a reference cycle.
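One way to probe this from Python, as a rough sketch rather than a definitive test: `gc.get_referents()` reports what an object's traverse function visits, so a memoized object should show up there only if the memo is traversed. The function name `memo_entry_is_traversed` is made up for illustration, and the sketch assumes CPython with the C accelerator (`_pickle`) available.

```python
import gc
import io
import pickle


def memo_entry_is_traversed():
    """Return True if the unpickler's traverse function reports a memoized object."""
    # A list is memoized while it is being unpickled, and the C Unpickler
    # keeps its memo after load() returns.
    payload = pickle.dumps(["memoized", "list"])
    unpickler = pickle.Unpickler(io.BytesIO(payload))
    loaded = unpickler.load()
    # gc.get_referents() calls the object's tp_traverse, so `loaded` can only
    # appear here if the traverse code visits the memo entries.
    return any(obj is loaded for obj in gc.get_referents(unpickler))


if __name__ == "__main__":
    print("memo entry visited by traverse:", memo_entry_is_traversed())
```

If the report above is right, an affected build would be expected to print `False` and a patched one `True`.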
msg374521 - Author: kale-smoothie (kale-smoothie) Date: 2020-07-28 19:07
The leak demonstrated in the attachment is, to my understanding, caused by memoizing the closure returned from the `find_class` method that's used to intercept global references. The cycle is then: Unpickler, memo table, closure, Unpickler (via cell reference to `self`).

My proposed patch visits every entry in the memo table.
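The cycle described above can be sketched in a few lines. This is not the attached leak_pickler.py, only a hypothetical stand-in: `InterceptingUnpickler` and `unpickler_survives_collection` are names invented here, and the sketch assumes CPython with the C Unpickler and that a pickled global such as `len` is memoized right after `find_class` returns.

```python
import gc
import io
import pickle
import weakref


class InterceptingUnpickler(pickle.Unpickler):
    """Hypothetical subclass whose find_class returns a closure over self."""

    def find_class(self, module, name):
        def intercepted(*args, **kwargs):
            # Using `self` here creates the cell reference back to the unpickler.
            return (self, module, name)

        return intercepted


def unpickler_survives_collection():
    """Return True if the unpickler is still alive after a full GC pass."""
    # Pickling a builtin function emits a global reference followed by a
    # memoize opcode, so the closure returned by find_class lands in the memo.
    payload = pickle.dumps(len)
    unpickler = InterceptingUnpickler(io.BytesIO(payload))
    unpickler.load()
    probe = weakref.ref(unpickler)
    del unpickler
    gc.collect()
    # Cycle: unpickler -> memo table -> closure -> cell -> unpickler.
    return probe() is not None


if __name__ == "__main__":
    print("unpickler leaked:", unpickler_survives_collection())
```

If the memo is not traversed, the collector cannot see the cycle and the function should return `True`; with the memo entries visited, the cycle becomes collectable and it should return `False`.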

Pre-patch run of valgrind on leak_pickler.py:

==20339== HEAP SUMMARY:
==20339==     in use at exit: 190,189,238 bytes in 2,406,919 blocks
==20339==   total heap usage: 3,150,288 allocs, 743,369 frees, 233,766,596 bytes allocated
==20339== 
==20339== LEAK SUMMARY:
==20339==    definitely lost: 0 bytes in 0 blocks
==20339==    indirectly lost: 0 bytes in 0 blocks
==20339==      possibly lost: 190,176,150 bytes in 2,406,835 blocks
==20339==    still reachable: 13,088 bytes in 84 blocks
==20339==         suppressed: 0 bytes in 0 blocks
==20339== Rerun with --leak-check=full to see details of leaked memory

Post-patch run of valgrind on leak_pickler.py:

==20880== HEAP SUMMARY:
==20880==     in use at exit: 667,277 bytes in 6,725 blocks
==20880==   total heap usage: 2,853,739 allocs, 2,847,014 frees, 216,473,216 bytes allocated
==20880== 
==20880== LEAK SUMMARY:
==20880==    definitely lost: 0 bytes in 0 blocks
==20880==    indirectly lost: 0 bytes in 0 blocks
==20880==      possibly lost: 654,624 bytes in 6,646 blocks
==20880==    still reachable: 12,653 bytes in 79 blocks
==20880==         suppressed: 0 bytes in 0 blocks
==20880== Rerun with --leak-check=full to see details of leaked memory
History
Date User Action Args
2020-07-28 19:23:14 serhiy.storchaka set nosy: + serhiy.storchaka
2020-07-28 19:07:30 kale-smoothie set messages: + msg374521
2020-07-28 18:49:37 python-dev set keywords: + patch; nosy: + python-dev; pull_requests: + pull_request20809; stage: patch review
2020-07-28 18:26:29 kale-smoothie create