classification
Title: PEP MemoryError with a lot of available memory gc not called
Type: resource usage
Stage: needs patch
Components: Interpreter Core
Versions: Python 3.1, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: illume, jimjjewett, loewis, markmat, pitrou (5)
Priority: low
Keywords:

Created on 2006-07-19 02:46 by markmat, last changed 2009-08-05 11:05 by pitrou.

Messages (11)
msg29202 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-19 02:46
Although the gc behavior is consistent with the
documentation, I believe it is wrong. I think that gc
should be called automatically before any memory
allocation error is raised.

Example 1:
for i in range(700): 
   a = [range(5000000)]
   a.append(a)
   print i

This example will crash on any PC with less than
20 GB of RAM. On my PC (Windows 2000, 256 MB) it crashes at
i==7.
Although this example can be fixed by adding a call
to gc.collect() in the loop, in real cases that may be
unreasonable.
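
A scaled-down sketch of Example 1 with the gc.collect() workaround mentioned above. The function name `run_example` and its parameters are illustrative, not from the report. Automatic collection is switched off only to make the comparison deterministic, on the assumption that CPython's cyclic collector is the only thing that can free these self-referencing lists.

```python
import gc


def run_example(iterations, size, collect_each_iteration):
    """Run a small version of Example 1.

    Returns the number of unreachable objects that were still waiting
    for the collector when the loop finished.
    """
    gc.collect()   # start from a clean state
    gc.disable()   # no automatic collections during the demo
    try:
        for i in range(iterations):
            a = [list(range(size))]
            a.append(a)  # the list now refers to itself
            if collect_each_iteration:
                # Frees the cycle orphaned when `a` was rebound above.
                gc.collect()
        del a
        # Whatever is left here was kept alive only by reference cycles.
        return gc.collect()
    finally:
        gc.enable()
```

Comparing `run_example(5, 1000, False)` with `run_example(5, 1000, True)` shows how much cyclic garbage piles up when nothing triggers a collection inside the loop.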
msg29203 - (view) Author: Rene Dudfield (illume) Date: 2006-07-19 23:20
Perhaps better than collecting before every memory allocation
would be to collect only once an allocation has failed.

That way the gc cost is paid only when memory is actually low.

So...

res = malloc(...);
if(!res) {
    /* first attempt failed: collect cyclic garbage, then retry once */
    gc.collect();
    res = malloc(...);
    if(!res) {
        raise memory error.
    }
}
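
The same retry idea can be sketched at the Python level. This is only an illustration: `allocate_with_gc_retry` and the `allocate` callable are hypothetical names, and the change discussed in this issue would have to live in the C allocator rather than in Python code.

```python
import gc


def allocate_with_gc_retry(allocate):
    """Call allocate(); on MemoryError, run the collector and retry once."""
    try:
        return allocate()
    except MemoryError:
        # First attempt failed: reclaim cyclic garbage before giving up.
        gc.collect()
        # A second failure propagates to the caller as MemoryError.
        return allocate()
```

For example, `allocate_with_gc_retry(lambda: bytearray(n))` pays for a collection only on the failure path, so the normal allocation path is unchanged.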



msg29204 - (view) Author: Martin v. Löwis (loewis) Date: 2006-07-23 20:00
This is very difficult to implement. The best way might be
to introduce yet another allocation function, one that
invokes gc before failing, and call that function in all
interesting places (of which there are many).

Contributions are welcome and should probably start with a
PEP first.
msg29205 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-23 20:11
This is exactly what I meant.
As far as I recall, this is the policy of the Java GC. I never
had to handle a memory error in Java, because I knew that I
really did not have any more memory.
msg29206 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-23 20:19
Sorry, my last comment was addressed to illume (I am a slow typer :( )
msg29207 - (view) Author: Jim Jewett (jimjjewett) Date: 2006-08-02 21:52
Doing it everywhere would be a lot of painful changes.

Adding the "oops, failed, call gc and try again" logic to
PyMem_* (currently PyMem_Malloc, PyMem_Realloc, PyMem_New,
and PyMem_Resize, but Brett may be changing that) is far
more reasonable.

Whether it is safe to call gc from there is a different 
question.
msg29208 - (view) Author: Mark Matusevich (markmat) Date: 2006-08-03 10:02
Another problem related to the above example: time is
wasted on swapping before the MemoryError is raised.
A possible solution is to use a dynamic memory limit: GC is
called when the limit is reached, then the limit is adjusted
according to the memory left.
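
A minimal sketch of such a dynamic limit, under stated assumptions: the class `DynamicGcLimit`, the `available_bytes` callable that reports free memory, and the `fraction` used to set the new limit are all hypothetical and not part of CPython.

```python
import gc


class DynamicGcLimit(object):
    """Collect when tracked allocations reach a limit, then re-derive the limit."""

    def __init__(self, initial_limit, available_bytes, fraction=0.5):
        self.limit = initial_limit              # bytes allowed before a collection
        self.available_bytes = available_bytes  # callable returning free memory in bytes
        self.fraction = fraction                # share of free memory used as the next limit
        self.allocated = 0                      # bytes allocated since the last collection
        self.collections = 0                    # number of collections triggered

    def note_allocation(self, nbytes):
        """Record an allocation and collect if the current limit is reached."""
        self.allocated += nbytes
        if self.allocated >= self.limit:
            gc.collect()
            self.collections += 1
            self.allocated = 0
            # Adjust the limit according to the memory that is left.
            self.limit = max(1, int(self.available_bytes() * self.fraction))
```

The trigger here is bytes allocated rather than the number of container objects, which is the distinction the example in this issue depends on.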
msg29209 - (view) Author: Martin v. Löwis (loewis) Date: 2006-08-03 16:43
The example is highly constructed, and it is pointless to
optimize for a boundary case. In the average application,
garbage collection is invoked often enough to reclaim memory
before swapping occurs.
msg86769 - (view) Author: Antoine Pitrou (pitrou) Date: 2009-04-28 21:57
Lowering priority since, as Martin said, it shouldn't be needed in
real-life situations.
msg90581 - (view) Author: Mark Matusevich (markmat) Date: 2009-07-16 20:25
It looks like the severity of this problem is underestimated here.

A programmer who works with a significant amount of data (e.g. a SciPy
user) and uses OOP will face this problem. Most OOP designs result in
some reference loops (e.g. two-way connections). If the program
implements some kind of algorithm (signal processing, image processing
or even 3D games), some objects in those loops will hold huge amounts of
data that were allocated by a single operation.

I apologize that my example is artificial. I had a real-life program of
8000 lines which was going into swap for no apparent reason and then
crashing. But instead of posting those 8000 lines, I posted a simple
example illustrating the problem.
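
A small sketch of the kind of two-way connection described above. The `Image` and `View` classes are hypothetical stand-ins for real application objects, and the behavior shown assumes CPython's reference counting plus its cyclic collector. Automatic collection is disabled only to make the demonstration deterministic.

```python
import gc
import weakref


class Image(object):
    """Holds a large buffer allocated by a single operation."""

    def __init__(self, size):
        self.pixels = bytearray(size)
        self.views = []


class View(object):
    """Refers back to its image, creating a two-way connection."""

    def __init__(self, image):
        self.image = image          # view -> image
        image.views.append(self)    # image -> view


def image_outlives_its_last_name(size=1024):
    """Return (alive before gc.collect(), alive after gc.collect())."""
    gc.disable()
    try:
        image = Image(size)
        View(image)
        probe = weakref.ref(image)  # observes the image without keeping it alive
        del image                   # last external reference is gone
        alive_before = probe() is not None
        gc.collect()
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

The buffer stays allocated after the last name is deleted and is released only when the cyclic collector runs, no matter how large it is.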
msg91312 - (view) Author: Antoine Pitrou (pitrou) Date: 2009-08-05 11:05
I'm not sure what we should do anyway. Your program will first swap out
and thrash before the MemoryError is raised. Invoking the GC when memory
allocation fails would avoid the MemoryError, but not the massive
slowdown due to swapping.
History
Date                 User              Action  Args
2009-08-05 11:05:50  pitrou            set     messages: + msg91312
2009-07-16 20:25:05  markmat           set     messages: + msg90581
2009-04-28 21:57:57  pitrou            set     priority: high -> low; nosy: + pitrou; messages: + msg86769; type: feature request -> resource usage; stage: needs patch
2009-04-27 23:48:03  ajaksu2           set     versions: + Python 3.1, Python 2.7, - Python 2.6
2008-01-05 19:47:02  christian.heimes  set     priority: normal -> high; versions: + Python 2.6
2006-07-19 02:46:19  markmat           create