Classification
Title: MemoryError with a lot of available memory, gc not called
Type: resource usage Stage: needs patch
Components: Interpreter Core Versions:
Process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: Itai.i, brian.curtin, illume, jimjjewett, loewis, markmat, pitrou, swapnil, ysj.ray
Priority: low Keywords:

Created on 2006-07-19 02:46 by markmat, last changed 2010-08-26 21:21 by loewis. This issue is now closed.

Files
File name: unnamed    Uploaded: Itai.i, 2010-08-20 00:58
Messages (20)
msg29202 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-19 02:46
Although the gc behavior is consistent with the
documentation, I believe it is wrong. I think that gc
should be called automatically before any MemoryError
is raised.

Example 1:
for i in range(700): 
   a = [range(5000000)]
   a.append(a)
   print i

This example will crash on any PC with less than
20 GB of RAM. On my PC (Windows 2000, 256 MB) it crashes at
i==7.
Although this example can be fixed by adding a call
to gc.collect() in the loop, in real cases that may be
unreasonable.
msg29203 - (view) Author: Rene Dudfield (illume) Date: 2006-07-19 23:20
Logged In: YES 
user_id=2042

Perhaps better than checking before every memory allocation,
would be to check once a memory error happens in an allocation.

That way the gc cost is only incurred when memory is low.

So, in C-style pseudocode:

res = malloc(...);
if (!res) {
    PyGC_Collect();      /* try to reclaim cyclic garbage first */
    res = malloc(...);   /* retry the allocation once */
    if (!res) {
        /* still failing: raise MemoryError */
    }
}



msg29204 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-07-23 20:00

This is very difficult to implement. The best way might be
to introduce yet another allocation function, one that
invokes gc before failing, and call that function in all
interesting places (of which there are many).

Contributions are welcome and should probably start with a
PEP first.
msg29205 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-23 20:11

This is exactly what I meant.
As far as I recall, this is the policy of the Java GC. I never
had to handle a memory error in Java, because I knew that I
really did not have any more memory.
msg29206 - (view) Author: Mark Matusevich (markmat) Date: 2006-07-23 20:19

Sorry, my last comment was addressed to illume (I am a slow typer :( )
msg29207 - (view) Author: Jim Jewett (jimjjewett) Date: 2006-08-02 21:52

Doing it everywhere would be a lot of painful changes.

Adding the "oops, failed, call gc and try again" logic to
PyMem_* (currently PyMem_Malloc, PyMem_Realloc, PyMem_New,
and PyMem_Resize, but Brett may be changing that) is far
more reasonable.

Whether it is safe to call gc from there is a different 
question.
msg29208 - (view) Author: Mark Matusevich (markmat) Date: 2006-08-03 10:02

Another problem related to the above example: time is wasted
on memory swapping before the MemoryError is raised.
A possible solution is to use a dynamic memory limit: GC is
called when the limit is reached, and the limit is then
adjusted according to the memory left.
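A rough sketch of that dynamic-limit idea in pure Python (the GcThreshold class and its tuning factor are invented for illustration; a real implementation would live inside the interpreter and track bytes rather than object counts):

```python
import gc

class GcThreshold:
    """Collect when the live-object count crosses a soft limit,
    then raise the limit based on what is left after collection."""

    def __init__(self, limit_objects=100000):
        self.limit = limit_objects

    def maybe_collect(self):
        if len(gc.get_objects()) > self.limit:
            gc.collect()
            # adjust: keep a 50% margin above the post-collection count
            self.limit = max(self.limit,
                             int(len(gc.get_objects()) * 1.5))
```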
msg29209 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-08-03 16:43

The example is highly constructed, and it is pointless to
optimize for a boundary case. In the average application,
garbage collection is invoked often enough to reclaim memory
before swapping occurs.
msg86769 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-04-28 21:57
Lowering priority since, as Martin said, it shouldn't be needed in
real-life situations.
msg90581 - (view) Author: Mark Matusevich (markmat) Date: 2009-07-16 20:25
It looks like the severity of this problem is underestimated here.

A programmer who works with a significant amount of data (e.g. a SciPy
user) and uses OOP will face this problem. Most OOP designs result in
some reference cycles (e.g. two-way connections). Some objects in those
cycles will hold huge amounts of data allocated in a single operation
if the program deals with some kind of algorithms (signal processing,
image processing, or even 3D games).

I apologize that my example is artificial. I had a real-life program of
8000 lines which was going into swap for no apparent reason and then
crashing. But instead of posting those 8000 lines, I posted a simple
example illustrating the problem.
msg91312 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-05 11:05
I'm not sure what we should do anyway. Your program will first swap out
and thrash before the MemoryError is raised. Invoking the GC when memory
allocation fails would avoid the MemoryError, but not the massive
slowdown due to swapping.
msg114225 - (view) Author: Itai (Itai.i) Date: 2010-08-18 14:25
Hi all,

I second Mark's assertion - this is a real issue for me too; I've stumbled into this problem as well.
I have a numpy/scipy kind of application (6000+ lines so far) which needs to allocate a lot of memory for statistics derived from "real life data", which are then transformed a few times by different algorithms (allocating more memory, but dropping the previous objects).

Currently I get a MemoryError when I try to use the entire dataset, both on Linux and on Windows, with Python 2.5 on a 64-bit machine with 4 GB of memory. (The Windows Python is a 32-bit build, though, because it needs to be compatible with some DLLs. This is the same reason I use Python 2.5.)
msg114230 - (view) Author: ysj.ray (ysj.ray) Date: 2010-08-18 15:13
How about calling gc.collect() explicitly in the loop?
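For reference, a scaled-down version of the original example with that workaround applied (sizes reduced here so it finishes quickly; the original used range(5000000) and 700 iterations):

```python
import gc

for i in range(100):
    a = [list(range(50000))]   # large payload held by the cycle
    a.append(a)                # self-referencing cycle: refcounting alone can't free it
    gc.collect()               # explicitly reclaim the cycle from the previous pass
```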
msg114237 - (view) Author: Itai (Itai.i) Date: 2010-08-18 16:08
Sure, that's what I'll do for now. It's an OK workaround for me; I was just
posting to support the notion that it's a bug (let's call it a usability
bug) and something that people out there do run into.

There's also a scenario where you couldn't use this workaround - for
example, when the allocation happens inside a library precompiled in a
.pyd.

msg114262 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-08-18 18:27
Anybody *really* interested in this issue: somebody will need to write a PEP, get it accepted, and provide an implementation. Open source is about scratching your own itches: the ones affected by a problem are the ones who are also expected to provide solutions.
msg114416 - (view) Author: Itai (Itai.i) Date: 2010-08-20 00:58
You are right, of course... I haven't got the time to do the right thing,
but I've found another workaround that helped me and might be helpful
to others.

(Not sure it's for this thread, but...) By default, Windows limits the
amount of memory for 32-bit processes to 2 GB. There's a bit in the PE
image, called IMAGE_FILE_LARGE_ADDRESS_AWARE, which tells 64-bit Windows
to give the process 4 GB (on 32-bit Windows, PAE needs to be enabled
too). There's a post-build way to enable it with the editbin.exe utility
which comes with Visual Studio, like this:
editbin.exe /LARGEADDRESSAWARE python.exe

It works for me since it gives me twice the memory on my 64-bit OS.
I have to say it could be dangerous, since it essentially asserts that
nowhere in Python's code are pointers treated as negative numbers. I
figured this should be safe, since there's a 64-bit version of Python...

msg114424 - (view) Author: Swapnil Talekar (swapnil) Date: 2010-08-20 11:11
Mark, are you sure that the above program causes a crash? I had absolutely no problem running it with Python 3.1.2. With Python 2.6.5, the PC went terribly slow, but the program managed to run until i==14 without crashing. I did not wait to see if it reaches 700. I'm running it on XP.
msg114425 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-08-20 13:32
> (not sure its for this thread though but...) Windows on default limits
> the amount of memory for 32 bit processes to 2GB. There's a bit in
> the PE image which tells 64 bit windows to give it 4GB (on 32 bit
> windows PAE needs to be enabled too) which is called
> IMAGE_FILE_LARGE_ADDRESS_AWARE. There's a post-build way to enable
> it with the editbin.exe utility which comes with visual studio like
> this: editbin.exe /LARGEADDRESSAWARE python.exe


See #1449496 if you are interested in that.
msg114987 - (view) Author: Mark Matusevich (markmat) Date: 2010-08-26 15:24
This is what I got on a computer with 512 MB RAM:

Mandriva Linux 2009.1
=============================
Python 2.6.1 (r261:67515, Jul 14 2010, 09:23:11) [GCC 4.3.2]
-----> Python process killed by operating system after 14


Microsoft Windows XP Professional
Version	5.1.2600 Service Pack 2 Build 2600
=============================================
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
-----> MemoryError after 10

Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]
-----> MemoryError after 10

Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
-----> MemoryError after 10

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)]
-----> Successful finish in no time!!!

Unfortunately I cannot test the original program I had the problem with, because since the original post (2006) I have changed employers. Now I use Matlab :(
msg115026 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-08-26 21:21
OK, I'm closing this as "won't fix". The OP doesn't have the issue anymore; anybody else having such an issue, please report it separately (taking into account that you are likely to be asked to provide a patch as well).
History
Date / User / Action / Args
2010-08-26 21:21:29  loewis  set  status: open -> closed; resolution: wont fix; messages: + msg115026
2010-08-26 15:24:58  markmat  set  messages: + msg114987
2010-08-20 13:32:51  brian.curtin  set  nosy: + brian.curtin; messages: + msg114425
2010-08-20 11:11:27  swapnil  set  nosy: + swapnil; messages: + msg114424
2010-08-20 00:58:16  Itai.i  set  files: + unnamed; messages: + msg114416
2010-08-18 18:41:04  belopolsky  set  files: - unnamed
2010-08-18 18:27:06  loewis  set  messages: + msg114262
2010-08-18 16:08:25  Itai.i  set  files: + unnamed; messages: + msg114237
2010-08-18 15:13:13  ysj.ray  set  nosy: + ysj.ray; messages: + msg114230
2010-08-18 14:25:10  Itai.i  set  nosy: + Itai.i; messages: + msg114225; versions: - Python 3.1, Python 2.7
2009-08-05 11:05:50  pitrou  set  messages: + msg91312
2009-07-16 20:25:05  markmat  set  messages: + msg90581
2009-04-28 21:57:57  pitrou  set  priority: high -> low; nosy: + pitrou; messages: + msg86769; type: enhancement -> resource usage; stage: needs patch
2009-04-27 23:48:03  ajaksu2  set  versions: + Python 3.1, Python 2.7, - Python 2.6
2008-01-05 19:47:02  christian.heimes  set  priority: normal -> high; versions: + Python 2.6
2006-07-19 02:46:19  markmat  create