classification
Title: Memory leak in socket.py on Mac OS X
Type:
Stage:
Components: Library (Lib)
Versions:
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: a_lauer, akuchling, bacchusrx, bob.ippolito, christian.heimes, gregory.p.smith, martey, mhammond, schmir, vila
Priority: normal
Keywords:

Created on 2004-12-29 02:09 by bacchusrx, last changed 2008-04-30 05:59 by mhammond. This issue is now closed.

Files
File name Uploaded Description
example.py bacchusrx, 2004-12-29 02:09 example.py
example2.py bacchusrx, 2005-01-01 23:02 example client
server.pl bacchusrx, 2005-01-01 23:03 example server
Messages (17)
msg23834 - (view) Author: bacchusrx (bacchusrx) Date: 2004-12-29 02:09
Some part of socket.py leaks memory on Mac OS X 10.3 (both with 
the python 2.3 that ships with the OS and with python 2.4).

I encountered the problem in John Goerzen's offlineimap. 
Transfers of messages over a certain size would cause the program 
to bail with malloc errors, eg

*** malloc: vm_allocate(size=5459968) failed (error code=3)
*** malloc[13730]: error: Can't allocate region

Inspecting the process as it runs shows that python's total memory
size grows wildly during such transfers.

The bug manifests in _fileobject.read() in socket.py. You can 
replicate the problem easily using the attached example with "nc -l 
-p 9330 < /dev/zero" running on some remote host.

The way _fileobject.read() is written, socket.recv is called with the 
larger of the minimum rbuf size or whatever's left to be read. 
Whatever is received is then appended to a buffer which is joined 
and returned at the end of function.

It looks like each time through the loop, space for recv_size is 
allocated but not freed, so if the loop runs for enough iterations, 
python exhausts the memory available to it.

You can sidestep the condition if recv_size is kept small (on the 
order of _fileobject.default_bufsize).

I can't replicate this problem with python 2.3 on FreeBSD 4.9 or  
FreeBSD 5.2, nor on Mac OS X 10.3 if the logic from 
_fileobject.read() is re-written in Perl (for example).
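
For illustration, the loop described above can be sketched roughly as follows. This is a simplified paraphrase in modern Python, not the actual socket.py source; FakeSocket and read_exactly are hypothetical names, and the 8192-byte buffer size is an assumption.

```python
class FakeSocket:
    """Hypothetical slow peer: hands out at most `chunk` bytes per recv()."""

    def __init__(self, payload, chunk):
        self._payload = payload
        self._chunk = chunk
        self.requested = []  # every size the reader asked recv() for

    def recv(self, size):
        self.requested.append(size)
        n = min(size, self._chunk)
        data, self._payload = self._payload[:n], self._payload[n:]
        return data


def read_exactly(sock, size, rbufsize=8192):
    """Sized read: request max(rbufsize, left) each time, join at the end."""
    buffers = []
    left = size
    while left > 0:
        recv_size = max(rbufsize, left)  # larger of buffer size and bytes left
        data = sock.recv(recv_size)
        if not data:
            break  # peer closed the connection
        buffers.append(data[:left])  # the real code keeps any surplus for later
        left -= len(data)
    return b"".join(buffers)
```

With a peer that delivers 1492 bytes at a time, every iteration asks recv() for nearly the whole outstanding amount, even though only a small packet comes back.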
msg23835 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-01 06:18
I can't reproduce this with either version of Python on a 10.3.7 machine w/ 
1gb ram.  Python's total memory usage seems stable to me even if the 
read is in a while loop.

I can't see anything in sock_recv or _fileobject.read that will in any way 
leak memory.

With a really large buffer size (always >17mb, but it does vary with each 
run) it will get a memory error but the Python process doesn't grow 
beyond 50mb at the samples I looked at.  That's pretty much the amount 
of RAM I'd expect it to use.  

It is kind of surprising it doesn't want to allocate a buffer of that size, 
because I have the RAM for it.. but I don't think this is a bug.
msg23836 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-01 06:27
I just played with it a bit more.  If I catch the MemoryError and try again, 
most of the time it will work (sometimes on the second try).  These 
malloc faults seem to be some kind of temporary condition.
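
A minimal sketch of that retry approach, assuming the failure really is transient. retry_on_memory_error and its parameters are hypothetical names, and this is a stopgap that does nothing about the underlying allocation pattern.

```python
import time


def retry_on_memory_error(operation, attempts=3, delay=0.1):
    """Call operation(); on MemoryError wait briefly and try again."""
    for attempt in range(attempts):
        try:
            return operation()
        except MemoryError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)
```

A caller would wrap the read, for example `retry_on_memory_error(lambda: f.read(size))`, where `f` is the socket file object.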
msg23837 - (view) Author: bacchusrx (bacchusrx) Date: 2005-01-01 23:01
I've been able to replicate the problem reliably on both 10.3.5 and 
10.3.7. I've attached two more examples to demonstrate:

Try this: Do, "dd if=/dev/zero of=./data bs=1024 count=10240" and save 
server.pl wherever you put "data". Have three terminals open. In one, 
run "perl server.pl -s0.25". In another, run "top -ovsize" and in the third 
run "python example2.py". 

After about 100 iterations, python's vsize is +1GB (just about the value 
of cumulative_req in example2.py) and if left running will cause a 
malloc error at around 360 iterations with a vsize over 3.6GB (again, just 
about what cumulative_req reports). Mind you, we've only received 
~512kbytes.

server.pl differs from the netcat method in that it defaults to sending 
only 1492 bytes at a time (configurable with the -b switch) and sleeps for 
however many seconds are specified with the -s switch. This guarantees 
enough iterations to raise the error each time around. When omitting 
the -s switch to server.pl, I don't get the error, but throughput is good 
enough that the loop in readFromSockUntil() only runs a few times.
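
The figures above can be checked with a little arithmetic. The sketch below assumes the 10 MiB file produced by the dd command, 1492 bytes delivered per recv, and a request size equal to whatever is still outstanding; simulate_requests is a hypothetical helper, not code from example2.py.

```python
def simulate_requests(total_size, bytes_per_recv, iterations):
    """Return (cumulative bytes requested, bytes actually received)."""
    left = total_size
    cumulative_req = 0
    received = 0
    for _ in range(iterations):
        if left <= 0:
            break
        cumulative_req += left  # assumes left exceeds the buffer size
        got = min(bytes_per_recv, left)
        received += got
        left -= got
    return cumulative_req, received


TOTAL = 1024 * 10240  # bs=1024 count=10240
print(simulate_requests(TOTAL, 1492, 100))  # (1041190600, 149200)
print(simulate_requests(TOTAL, 1492, 360))  # (3678460560, 537120)
```

Roughly 1 GB has been requested after 100 iterations and about 3.6 GB after 360, while only about 512 KB has arrived.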
msg23838 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:22
Ok.  I've tracked it down.  realloc(...) on Darwin doesn't actually resize 
memory unless it *has* to.  For shrinking an allocation, it does not have 
to, therefore realloc(...) with a smaller size is a no-op.

It seems that this may be a misunderstanding by Python.  The man page 
for realloc(...) does not say that it will EVER free memory, EXCEPT in the 
case where it has to allocate a larger region.

I'll attach an example that demonstrates this outside of Python.
msg23839 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:23
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <malloc/malloc.h>  /* malloc_size() on Darwin */

#define NUM_ALLOCATIONS 100000
#define ALLOC_SIZE 10485760
#define ALLOC_RESIZE 1492

int main(int argc, char **argv) {
    int i;  /* missing from the original paste; see the follow-up message */
    /* exiting will free all this leaked memory */
    for (i = 0; i < NUM_ALLOCATIONS; i++) {
        void *orig_ptr, *new_ptr;
        size_t new_size, orig_size;
        orig_ptr = malloc(ALLOC_SIZE);
        /* check for failure before querying the allocation */
        if (orig_ptr == NULL) {
            printf("failure to malloc %d\n", i);
            abort();
        }
        orig_size = malloc_size(orig_ptr);
        /* shrink the 10 MB block to 1492 bytes */
        new_ptr = realloc(orig_ptr, ALLOC_RESIZE);
        if (new_ptr == NULL) {
            printf("failure to realloc %d\n", i);
            abort();
        }
        new_size = malloc_size(new_ptr);
        printf("resized %zu[%p] -> %zu[%p]\n",
            orig_size, orig_ptr, new_size, new_ptr);
    }
    return 0;
}
msg23840 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:25
that code paste is missing an "int i" at the beginning of main..
msg23841 - (view) Author: Andreas Lauer (a_lauer) Date: 2005-11-10 07:42
The problem also occurs in rare cases under Windows XP with
Python 2.3.4.  I suspect the code line

recv_size = max(self._rbufsize, left)

in socket.py to be a part of the problem.
 
In the case that I investigated, this caused >600 allocations
of up to 5 MBytes (which came in 8 KB packets). 

Sure, the memory allocator should be able to handle this in
_socket.recv (first it allocates the X MBytes buffer, which
is later resized with _PyString_Resize), but I think the correct
line in socket.py
is 

recv_size = min(self._rbufsize, left).

At least, after this my problem was gone.
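
To make the difference between the two lines concrete, here is a small sketch comparing the request sizes each policy produces for a 5 MB read that arrives in 8 KB packets. recv_sizes is a hypothetical helper, and the 8192-byte _rbufsize is an assumption.

```python
def recv_sizes(total, packet, rbufsize, pick):
    """List the size passed to recv() on each iteration of a sized read."""
    left = total
    sizes = []
    while left > 0:
        size = pick(rbufsize, left)  # pick is max (old line) or min (proposed)
        sizes.append(size)
        left -= min(size, packet, left)  # the peer delivers at most one packet
    return sizes


TOTAL = 5 * 1024 * 1024
before = recv_sizes(TOTAL, 8192, 8192, max)
after = recv_sizes(TOTAL, 8192, 8192, min)
print(len(before), max(before))  # 640 5242880
print(len(after), max(after))    # 640 8192
```

Both policies need the same number of recv calls here, but with max() each call first allocates a buffer of up to 5 MB, whereas with min() no request exceeds the buffer size.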

msg59313 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-05 19:39
Probably outdated

I haven't heard or seen any such problems in the past two years.
msg60130 - (view) Author: Martey Dodoo (martey) Date: 2008-01-19 01:24
I am not sure that this is outdated. After running into memory errors
trying to download messages with attachments using imaplib, I ran the
example client and server. After iteration 357, there was a malloc
error, just like etrepum suggested.

I am using Mac OS X 10.5, with Python 2.5 (not the Apple-supplied Python).
msg61341 - (view) Author: Martey Dodoo (martey) Date: 2008-01-20 19:24
Just wanted to note that the good people of comp.lang.python helped me
figure out that the issue is actually
http://bugs.python.org/issue1389051, in case anyone in similar straits
ended up here.
msg62797 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-02-23 19:31
Andreas Lauer's suggested fix is correct.  Applied to 2.6 trunk in rev.
61008 and to 2.5-maint in rev. 61009.
msg65467 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-14 16:33
I think this should be fixed somewhere in the C code. People calling
sock.recv with a large recv size will also trigger this error.
This fix is wrong: the fixed code reads one byte at a time.
See:
http://mail.python.org/pipermail/python-dev/2008-April/078613.html
msg65468 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-04-14 17:30
Note that _rbufsize is only set to 1 if the _fileobject's bufsize is set
to 0.  So perhaps the bug is that some library is turning off buffering
when it shouldn't.

I don't see how you would fix this in the C code, other than manually
doing two separate mallocs and copying the data, which would unfairly
penalize platforms with smarter malloc() implementations.  What sort of
fix would you suggest?
msg65481 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-14 21:07
Well, I think the right thing to do is limit the maximal size to be read
inside the c function (just to make it impossible to pass around large
values). This is basically the same fix just at another place in the code.


http://twistedmatrix.com/trac/ticket/1079 describes the same problem
(but with 64 k read requests: it can even leak with small requests).
The fix there was to really copy those strings around into a StringIO
object.
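
A rough modern-Python analogue of that copying approach, as a sketch only: it reuses one fixed scratch buffer with recv_into and copies just the bytes actually received into an io.BytesIO, so no oversized string is allocated and then shrunk. read_all and chunk_size are hypothetical names, and this is not the Twisted code.

```python
import io
import socket


def read_all(sock, chunk_size=4096):
    """Read until EOF, copying each received piece into a BytesIO."""
    scratch = bytearray(chunk_size)  # allocated once, reused for every recv
    view = memoryview(scratch)
    out = io.BytesIO()
    while True:
        n = sock.recv_into(view)  # number of bytes written into scratch
        if n == 0:
            break  # peer closed the connection
        out.write(view[:n])  # copy only what was received
    return out.getvalue()
```

The cost is one extra copy per packet, which is the penalty on platforms with smarter allocators that was raised earlier in the thread.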

Note that the code does not read byte by byte when passing in no size
argument. Instead it reads recv_size bytes:

            if self._rbufsize <= 1:
                recv_size = self.default_bufsize
            else:
                recv_size = self._rbufsize

This seems clearly wrong to me.
msg65482 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-14 21:10
That is, it seems wrong that it uses 1 byte when a size is given, and
recv_size when size is not given.

By the way I think if you ask for 4096 bytes and the buffering is set to
2048 bytes it should still try to read the full 4096 bytes.
The number of bytes it tries to read, however, should be limited by
whatever the system maximally returns.
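
The asymmetry and the suggested alternative can be sketched side by side. Both functions are illustrative paraphrases with hypothetical names; DEFAULT_BUFSIZE is an assumed value for _fileobject.default_bufsize, and SYSTEM_MAX_RECV is a made-up cap, since the thread does not name a number.

```python
DEFAULT_BUFSIZE = 8192   # assumed value of _fileobject.default_bufsize
SYSTEM_MAX_RECV = 65536  # hypothetical cap, chosen only for illustration


def patched_recv_size(rbufsize, left=None):
    """recv size after the min() fix, as described in this thread."""
    if left is None:
        # No size argument: the branch quoted in the previous message.
        return DEFAULT_BUFSIZE if rbufsize <= 1 else rbufsize
    return min(rbufsize, left)


def proposed_recv_size(left, system_max=SYSTEM_MAX_RECV):
    """Ask for everything still wanted, capped at a system-level maximum."""
    return min(left, system_max)


print(patched_recv_size(1))           # 8192: unbuffered, no size given
print(patched_recv_size(1, 4096))     # 1: unbuffered, size given
print(patched_recv_size(2048, 4096))  # 2048: buffering limits the request
print(proposed_recv_size(4096))       # 4096: the full request is attempted
```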
msg65991 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-04-30 05:59
FYI, #2632 is tracking a regression caused by this change.
History
Date                 User              Action  Args
2008-04-30 05:59:20  mhammond          set     nosy: + mhammond; messages: + msg65991
2008-04-15 15:24:03  gregory.p.smith   set     nosy: + gregory.p.smith
2008-04-14 21:10:17  schmir            set     messages: + msg65482
2008-04-14 21:07:38  schmir            set     messages: + msg65481
2008-04-14 17:30:42  akuchling         set     messages: + msg65468
2008-04-14 16:33:46  schmir            set     nosy: + schmir; messages: + msg65467
2008-03-07 10:24:42  vila              set     nosy: + vila
2008-02-23 19:31:25  akuchling         set     nosy: + akuchling; resolution: out of date -> fixed; messages: + msg62797
2008-01-20 19:24:30  martey            set     messages: + msg61341
2008-01-19 01:24:23  martey            set     nosy: + martey; messages: + msg60130; title: Memory leak in socket.py on Mac OS X 10.3 -> Memory leak in socket.py on Mac OS X
2008-01-05 19:39:04  christian.heimes  set     status: open -> closed; nosy: + christian.heimes; resolution: out of date; messages: + msg59313
2004-12-29 02:09:35  bacchusrx         create