classification
Title: overzealous garbage collector (dict)
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 2.5
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: GreaseMonkey, georg.brandl, paulmelis, pitrou
Priority: normal
Keywords:

Created on 2008-06-13 09:23 by GreaseMonkey, last changed 2008-07-20 11:15 by georg.brandl. This issue is now closed.

Files
File name Uploaded Description
markov.py GreaseMonkey, 2008-06-14 01:31
Messages (8)
msg68142 - Author: (GreaseMonkey) Date: 2008-06-13 09:23
When filled with a massive database (>16MB; I'm not sure how large it's
meant to be), the dict object appears to mysteriously drop objects off
the face of the earth (in this case list objects). Wouldn't it be more
appropriate to raise a memory error rather than fail silently only
to screw up in another way?
msg68145 - Author: Paul Melis (paulmelis) Date: 2008-06-13 11:10
What do you mean by ">16MB"? Is that the total size of all data held
by the dictionary (and if so, how did you measure this)? How many keys
are in the dictionary? And what indication do you have that elements are
being dropped?
msg68148 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-13 12:30
Are you sure the keys for those list objects aren't just equal to others
you insert in the dict?

Witness:

>>> d = {}
>>> d[1] = 'a'
>>> d
{1: 'a'}
>>> d[1.0] = 'b'
>>> d
{1: 'b'}

I'm not sure what the memory limit is for dict objects, but 16MB sounds
quite low.
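
To tell an overwrite apart from a genuine drop, one option is to log every key that is inserted more than once. The sketch below assumes the data arrives as (key, value) pairs; `find_overwrites` is a hypothetical helper, not something from the attached script.

```python
# Keys that compare equal and hash equal share a single dict slot.
d = {}
d[1] = 'a'
d[1.0] = 'b'  # 1 == 1.0 and hash(1) == hash(1.0), so this replaces 'a'

assert hash(1) == hash(1.0)
assert len(d) == 1
assert d[1] == 'b'


def find_overwrites(pairs):
    """Return the keys whose insertion replaced an existing entry.

    Hypothetical helper: `pairs` is any iterable of (key, value) tuples.
    """
    seen = {}
    overwritten = []
    for key, value in pairs:
        if key in seen:
            overwritten.append(key)
        seen[key] = value
    return overwritten
```

If this returns a non-empty list for the real input, the "missing" values were most likely replaced by later insertions under an equal key.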
msg68154 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-06-13 13:20
You'll have to produce a test case for this "dropping" -- otherwise I
don't believe that 16 MB causes *any* memory problems at all.
msg68197 - Author: (GreaseMonkey) Date: 2008-06-14 01:31
I mean that it actually *drops* values, not *overwrites* them.

I have attached the script which demonstrates this quirk in the garbage
collector (it also doubles as a library).

The original text file was an IRC log. Shoving Charles Dickens' "Great
Expectations" 17 times in a text file and then parsing it doesn't show
this problem for some weird reason.

I have python 2.5.1.
msg68201 - Author: Paul Melis (paulmelis) Date: 2008-06-14 07:07
The script is still not a test case, as it doesn't *demonstrate* the
problem when run. You need to provide more information for this to be
reproducible by others.

- what exact input did you use? (e.g. include the IRC log file on which
you claim a bug is exposed)
- what output/behaviour did you expect for the given input?
- how was the actual output/behaviour different from what was expected?
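
A self-checking test case along these lines might look like the sketch below. It states the input, the expected result, and the comparison in one place. The key format and entry count are assumptions chosen for illustration; they are not taken from the attached script or the IRC log.

```python
def build_and_verify(n):
    """Insert n list values into a dict, then check that each is retrievable.

    Returns (number_of_entries, indices_whose_value_is_missing_or_wrong).
    """
    d = {}
    for i in range(n):
        d["key%d" % i] = [i]  # list values, as in the original report

    missing = [i for i in range(n) if d.get("key%d" % i) != [i]]
    return len(d), missing


size, missing = build_and_verify(200000)
print("entries: %d, missing: %d" % (size, len(missing)))
```

A run that reports any missing index would be the kind of evidence asked for above.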
msg68207 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-14 14:37
> The original text file was an IRC log. Shoving Charles Dickens' "Great
> Expectations" 17 times in a text file and then parsing it doesn't show
> this problem for some weird reason.

I'd say the "weird reason" is probably a bug in your script. For example
the following appears very dubious:

		for o in self.wlist:
			if len(o) > 0xFF:
				o = o[:0xFF]
			fp.write(chr(len(o)))
			fp.write(o)
			for s in self.wlist[o]:

In any case, the idea that one of Python's built-in containers would
silently *drop* values (rather than segfault or produce a MemoryError)
is in itself quite unbelievable, due to the way those containers function.
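
To illustrate why the quoted loop looks dubious: once `o` is rebound to its truncated form, `self.wlist[o]` no longer indexes with the original key. The sketch below assumes `wlist` is a dict mapping strings to lists; `truncated_lookups` is a hypothetical helper that replays the loop and reports what each truncated key would find.

```python
def truncated_lookups(wlist, limit=0xFF):
    """Replay the quoted loop and classify each truncated key.

    Returns (missing, aliased): truncated keys absent from wlist, and
    truncated keys that happen to equal a different, shorter key.
    """
    missing, aliased = [], []
    for original in wlist:
        o = original
        if len(o) > limit:
            o = o[:limit]      # same rebinding as in the quoted loop
        if o == original:
            continue           # short key: the lookup is unaffected
        if o in wlist:
            aliased.append(o)  # wlist[o] would return another key's list
        else:
            missing.append(o)  # wlist[o] would raise KeyError
    return missing, aliased


# A small limit stands in for 0xFF to keep the example readable.
wlist = {"abcd": [1], "abcdef": [2], "xyzxyz": [3]}
missing, aliased = truncated_lookups(wlist, limit=4)
```

Only keys longer than the limit are affected, so an input without such long tokens would not trigger this. Whether that explains the difference between the two input files is a guess, not something confirmed in this report.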
msg70071 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-20 11:15
No complaints were voiced, so I'm closing this.
History
Date User Action Args
2008-07-20 11:15:20 georg.brandl set status: pending -> closed; messages: + msg70071
2008-06-14 14:37:21 pitrou set messages: + msg68207
2008-06-14 07:07:59 paulmelis set messages: + msg68201
2008-06-14 01:31:18 GreaseMonkey set files: + markov.py; messages: + msg68197
2008-06-13 13:20:46 georg.brandl set status: open -> pending; nosy: + georg.brandl; messages: + msg68154
2008-06-13 12:30:47 pitrou set nosy: + pitrou; messages: + msg68148
2008-06-13 11:10:46 paulmelis set nosy: + paulmelis; messages: + msg68145
2008-06-13 09:23:38 GreaseMonkey create