classification
Title: Memory leak in pickle and cPickle
Type: Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: barry, gvanrossum, naris, vlk
Priority: normal Keywords:

Created on 2001-01-23 15:28 by vlk, last changed 2022-04-10 16:03 by admin. This issue is now closed.

Messages (6)
msg3038 - (view) Author: Vladimir Kralik (vlk) Date: 2001-01-23 15:28
# When a Pickler object is used to dump tuples into a file and an Unpickler is
# used to load them back, the loaded objects are not garbage collected.
# When the function dump(object, file) is used instead, the Unpickler works fine.
# The problem is in both the pickle and cPickle modules.
# Tested on Python 2.0, Red Hat Linux 6.2.

import cPickle			
#import pickle			 
import gc

f=open("xxx","w")
pic=cPickle.Pickler(f,1)	# ERROR
#pic=pickle.Pickler(f,1)	# ERROR
for i in range(100):
	#cPickle.dump(([long(i),long(i),long(i),long(i)],i),f)	# this is OK
	#pickle.dump(([long(i),long(i),long(i),long(i)],i),f)	# this is OK
	pic.dump(([long(i),long(i),long(i),long(i)],i))		# Memory leak

f.close()
gc.set_debug(gc.DEBUG_STATS)
f=open("xxx","r")
u=cPickle.Unpickler(f)
try:
	while 1:
		gc.collect()
		print u.load()
except EOFError:
	pass
f.close()
msg3039 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2001-01-24 03:28
Barry, can you look into this?  I would first see if this is really reproducible without using Insure++; somehow it looks a bit fishy.  It could also be fixed in 2.1, because modules now participate in gc.  Or it could have to do with a __del__?  Also, I doubt the claim that this is a leak with both pickle and cPickle.
msg3040 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2001-02-23 18:58
When cranking up the number of objects placed in the pickle to 10000, I do see some memory growth when unpickling as clocked by top.  The cPickle growth is much smaller than the pickle growth, which already appears fairly minimal.

I will investigate further with Insure++.

I don't see how the problem could be related to __del__ since the only thing we're dumping and loading are builtin objects (tuples, lists, longs, and ints).
msg3041 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2001-02-23 18:59
# When a Pickler object is used to dump tuples into a file and an Unpickler is
# used to load them back, the loaded objects are not garbage collected.  When
# the function dump(object, file) is used instead, the Unpickler works fine.
# The problem is in both the pickle and cPickle modules.  Tested on Python 2.0,
# Red Hat Linux 6.2.

import gc 

def main():
    fp = open('/tmp/xxx', 'w') 
    pic = pickle.Pickler(fp, 1)                   # ERROR 

    for i in range(10000):
        pickle.dump(([long(i), long(i), long(i), long(i)], i), fp) 
        # this is OK 
        pic.dump(([long(i), long(i), long(i), long(i)], i))
        # Memory leak 

    fp.close() 
    gc.set_debug(gc.DEBUG_STATS) 

    fp = open('/tmp/xxx')

    upic = pickle.Unpickler(fp)
    try: 
        while 1: 
            gc.collect() 
            print upic.load() 
    except EOFError: 
        pass

    fp.close()


if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        import cPickle
        pickle = cPickle
    else:
        import pickle

    main()
msg3042 - (view) Author: Naris Siamwalla (naris) Date: 2001-04-24 23:47
Barry's Python snippet still leaks memory on Python 2.1 final
on Red Hat 6.2.
msg3043 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2001-04-25 17:27
I think I know why this is.  It is a false alarm, or at
least something that we cannot fix -- it falls in the
category "don't do this".

The Pickler instance keeps a memo that holds each of the dumped
lists alive, so that if you later pickle a reference to the
same list (or other mutable object) again, it can pickle a
reference rather than a copy of the value. This is a
feature.
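
A minimal sketch of that feature, assuming a modern Python 3 pickle module rather than the Python 2.0 modules discussed in this thread. The helper name pickled_sizes is hypothetical. It dumps the same object twice through one Pickler and reports how many bytes each dump wrote; the second dump should be only a few bytes, because it is a back-reference into the memo rather than a second copy.

```python
import io
import pickle


def pickled_sizes(value):
    """Dump `value` twice through one Pickler; return the byte size of each dump."""
    buf = io.BytesIO()
    pic = pickle.Pickler(buf, 1)   # protocol 1, as in the scripts in this thread
    pic.dump(value)
    first = buf.tell()             # bytes written by the first dump
    pic.dump(value)
    second = buf.tell() - first    # bytes written by the second dump
    return first, second


if __name__ == "__main__":
    first, second = pickled_sizes(list(range(1000)))
    print("first dump: ", first, "bytes")
    print("second dump:", second, "bytes (memo back-reference)")
```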

By using the same Pickler instance to dump 10,000 unrelated
lists, you simply grow the memo data structure beyond
reason. So just don't do this!

Closing this now.
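
For readers who do need to reuse one Pickler for many unrelated objects, the following is a hedged sketch of the memo growth described above and of one possible workaround. It assumes Python 3, where Pickler.clear_memo() is available and Pickler.memo.copy() returns a plain dict; the helper names memo_size and dump_with_shared_pickler are hypothetical. It illustrates the behavior and is not a fix that was applied to this issue.

```python
import io
import pickle


def memo_size(pickler):
    # Assumption: memo.copy() yields a plain dict, which lets us count entries.
    return len(pickler.memo.copy())


def dump_with_shared_pickler(count, clear_each_time=False):
    """Dump `count` unrelated records through one Pickler; return the final memo size."""
    buf = io.BytesIO()
    pic = pickle.Pickler(buf, 1)
    for i in range(count):
        pic.dump(([i, i, i, i], i))   # same record shape as in the report
        if clear_each_time:
            pic.clear_memo()          # drop the references held by the memo
    return memo_size(pic)


if __name__ == "__main__":
    print("memo entries, shared Pickler:   ", dump_with_shared_pickler(1000))
    print("memo entries, with clear_memo():", dump_with_shared_pickler(1000, True))
```

Creating a fresh Pickler per record, or calling the module-level dump(object, file) as in the "this is OK" lines of the original script, avoids the growth in the same way, at the cost of losing shared references between records.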
History
Date User Action Args
2022-04-10 16:03:39 admin set github: 33782
2001-01-23 15:28:11 vlk create