Issue 11849: glibc allocator doesn't release all free()ed memory
Created on 2011-04-15 09:08 by kaifeng, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files

File name | Uploaded | Description | Edit
---|---|---|---
test.py | kaifeng, 2011-04-15 09:08 | |
issue11849_test.py | flox, 2011-04-15 11:39 | raw benchmark test |
issue11849_test2.py | kaifeng, 2011-04-18 00:37 | |
valgrind.log | kaifeng, 2011-04-25 08:01 | |
pymalloc_threshold.diff | neologix, 2011-05-02 16:57 | patch increasing pymalloc threshold | review
pymalloc_frag.diff | neologix, 2011-05-02 21:59 | final patch with pymalloc threshold | review
arenas_mmap.diff | neologix, 2011-11-25 22:45 | | review
Messages (44)
msg133797 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-15 09:08 | |
I'm using xml.etree.ElementTree to parse a large XML file, and the memory usage keeps increasing steadily. You can run the attached test script to reproduce it. According to 'top' on Linux or 'Task Manager' on Windows, the memory usage of the Python process does not drop as expected once 'Done' is printed. Tested with Python 2.5/3.1 on Windows 7, and Python 2.5 on CentOS 5.3. |
|||
msg133799 - (view) | Author: Florent Xicluna (flox) * | Date: 2011-04-15 09:33 |
Do you experience the same issue with current versions of Python (3.2 or 2.7)? The package was upgraded in the latest versions. |
|||
msg133800 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-15 09:52 | |
Yes. I just tested with Python 2.7 and 3.2 on Windows 7, and the memory usage is still unexpectedly high after 'Done' is printed. |
|||
msg133808 - (view) | Author: Florent Xicluna (flox) * | Date: 2011-04-15 11:39 |
I've tested a small variant of your script on OS X. It seems to behave correctly (with 2.5, 2.6, 2.7 and 3.1). You can force Python to release memory immediately by calling "gc.collect()". |
|||
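A minimal sketch of the cleanup flox suggests above; the document built here is only a placeholder for the attached test data, and whether the process's resident size actually shrinks afterwards depends on the libc, as the rest of the thread shows:

```python
import gc
import xml.etree.ElementTree as ET

# Placeholder document standing in for the attached test file.
large_xml_text = "<root>" + "<item>x</item>" * 100000 + "</root>"

tree = ET.fromstring(large_xml_text)
del tree      # drop the last reference to the parsed tree
gc.collect()  # break and collect reference cycles immediately
```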
msg133809 - (view) | Author: Florent Xicluna (flox) * | Date: 2011-04-15 11:41 |
This is the output for 2.7.1:

$ python2.7 issue11849_test.py
*** Python 2.7.1 final
--- PID STAT TIME SL RE PAGEIN VSZ RSS LIM TSIZ %CPU %MEM COMMAND
0   2754 S+ 0:00.07 0 0 0 2441472 5372 - 0 11,7 0,1 python2.7 issue11849_test.py
1   2754 S+ 0:02.36 0 0 0 2520740 83720 - 0 100,0 2,0 python2.7 issue11849_test.py
2   2754 S+ 0:04.89 0 0 0 2596784 158888 - 0 100,0 3,8 python2.7 issue11849_test.py
3   2754 S+ 0:07.28 0 0 0 2668740 230972 - 0 100,0 5,5 python2.7 issue11849_test.py
4   2754 S+ 0:10.11 0 0 0 2740932 303200 - 0 100,0 7,2 python2.7 issue11849_test.py
5   2754 S+ 0:12.85 0 0 0 2812876 375276 - 0 98,4 8,9 python2.7 issue11849_test.py
6   2754 R+ 0:14.95 0 0 0 2885868 447740 - 0 98,9 10,7 python2.7 issue11849_test.py
7   2754 S+ 0:17.91 0 0 0 2962156 522560 - 0 99,1 12,5 python2.7 issue11849_test.py
8   2754 S+ 0:21.08 0 0 0 3034092 594620 - 0 98,3 14,2 python2.7 issue11849_test.py
9   2754 S+ 0:23.20 0 0 0 3106028 667004 - 0 100,0 15,9 python2.7 issue11849_test.py
END 2754 S+ 0:27.50 0 0 0 2551160 114480 - 0 96,3 2,7 python2.7 issue11849_test.py
GC  2754 S+ 0:27.75 0 0 0 2454904 18992 - 0 97,2 0,5 python2.7 issue11849_test.py
*** 2754 S+ 0:27.75 0 0 0 2454904 18992 - 0 3,0 0,5 python2.7 issue11849_test.py
|||
msg133813 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-15 12:32 | |
Python 3.2 on Linux (CentOS 5.3):

*** Python 3.2.0 final
--- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
0   15116 pts/0 S+ 0:00 1 1316 11055 6452 0.6 python3.2 issue11849_test.py
1   15116 pts/0 S+ 0:02 1 1316 53155 47340 4.5 python3.2 issue11849_test.py
2   15116 pts/0 S+ 0:05 1 1316 91051 86364 8.3 python3.2 issue11849_test.py
3   15116 pts/0 S+ 0:08 1 1316 129067 124232 12.0 python3.2 issue11849_test.py
4   15116 pts/0 S+ 0:10 1 1316 166587 162096 15.6 python3.2 issue11849_test.py
5   15116 pts/0 S+ 0:13 1 1316 204483 198824 19.2 python3.2 issue11849_test.py
6   15116 pts/0 S+ 0:17 1 1316 242375 236692 22.8 python3.2 issue11849_test.py
7   15116 pts/0 S+ 0:19 1 1316 284383 277528 26.8 python3.2 issue11849_test.py
8   15116 pts/0 S+ 0:23 1 1316 318371 312452 30.1 python3.2 issue11849_test.py
9   15116 pts/0 S+ 0:25 1 1316 360235 353288 34.1 python3.2 issue11849_test.py
END 15116 pts/0 S+ 0:30 1 1316 393975 388176 37.4 python3.2 issue11849_test.py
GC  15116 pts/0 S+ 0:30 1 1316 352035 347656 33.5 python3.2 issue11849_test.py
*** 15116 pts/0 S+ 0:30 1 1316 352035 347656 33.5 python3.2 issue11849_test.py
|||
msg133929 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-17 14:39 |
The "problem" is not with Python, but with your libc. When a program - such as Python - returns memory, it uses the free(3) library call. But the libc is free to either return the memory immediately to the kernel using the relevant syscall (brk, munmap), or keep it around just in case (to simplify). It seems that RHEL5 and onwards tend to keep a lot of memory around, at least in this case (probably because of the allocation pattern). To sum up, python is returning memory, but your libc is not. You can force it using malloc_trim, see the attached patch (I'm not at all suggesting its inclusion, it's just an illustration). Results with current code: *** Python 3.3.0 alpha --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 29823 pts/0 S+ 0:00 1 1607 168176 8596 0.2 ./python /tmp/issue11849_test.py 1 29823 pts/0 S+ 0:01 1 1607 249400 87088 2.2 ./python /tmp/issue11849_test.py 2 29823 pts/0 S+ 0:03 1 1607 324080 161704 4.1 ./python /tmp/issue11849_test.py 3 29823 pts/0 S+ 0:04 1 1607 398960 235036 5.9 ./python /tmp/issue11849_test.py 4 29823 pts/0 S+ 0:06 1 1607 473356 309464 7.8 ./python /tmp/issue11849_test.py 5 29823 pts/0 S+ 0:07 1 1607 548120 384624 9.8 ./python /tmp/issue11849_test.py 6 29823 pts/0 S+ 0:09 1 1607 622884 458332 11.6 ./python /tmp/issue11849_test.py 7 29823 pts/0 S+ 0:10 1 1607 701864 535736 13.6 ./python /tmp/issue11849_test.py 8 29823 pts/0 S+ 0:12 1 1607 772440 607988 15.5 ./python /tmp/issue11849_test.py 9 29823 pts/0 S+ 0:13 1 1607 851156 685384 17.4 ./python /tmp/issue11849_test.py END 29823 pts/0 S+ 0:16 1 1607 761712 599400 15.2 ./python /tmp/issue11849_test.py GC 29823 pts/0 S+ 0:16 1 1607 680900 519280 13.2 ./python /tmp/issue11849_test.py *** 29823 pts/0 S+ 0:16 1 1607 680900 519288 13.2 ./python /tmp/issue11849_test.py Results with the malloc_trim: *** Python 3.3.0 alpha --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 30020 pts/0 S+ 0:00 1 1607 168180 8596 0.2 ./python /tmp/issue11849_test.py 1 30020 pts/0 S+ 0:01 1 1607 249404 86160 2.1 ./python /tmp/issue11849_test.py 2 30020 pts/0 S+ 0:03 1 1607 324084 160596 4.0 ./python /tmp/issue11849_test.py 3 30020 pts/0 S+ 0:04 1 1607 398964 235036 5.9 ./python /tmp/issue11849_test.py 4 30020 pts/0 S+ 0:06 1 1607 473360 309808 7.9 ./python /tmp/issue11849_test.py 5 30020 pts/0 S+ 0:07 1 1607 548124 383896 9.7 ./python /tmp/issue11849_test.py 6 30020 pts/0 S+ 0:09 1 1607 622888 458716 11.7 ./python /tmp/issue11849_test.py 7 30020 pts/0 S+ 0:10 1 1607 701868 536124 13.6 ./python /tmp/issue11849_test.py 8 30020 pts/0 S+ 0:12 1 1607 772444 607212 15.4 ./python /tmp/issue11849_test.py 9 30020 pts/0 S+ 0:14 1 1607 851160 684608 17.4 ./python /tmp/issue11849_test.py END 30020 pts/0 S+ 0:16 1 1607 761716 599524 15.3 ./python /tmp/issue11849_test.py GC 30020 pts/0 S+ 0:16 1 1607 680776 10744 0.2 ./python /tmp/issue11849_test.py *** 30020 pts/0 S+ 0:16 1 1607 680776 10752 0.2 ./python /tmp/issue11849_test.py |
|||
msg133940 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-04-17 22:27 |
> To sum up, python is returning memory, but your libc is not.
> You can force it using malloc_trim, see the attached patch (I'm not at
> all suggesting its inclusion, it's just an illustration).

That's an interesting thing, perhaps you want to open a feature request as a separate issue?
|||
msg133946 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-18 00:37 | |
I added 'malloc_trim' to the test code and rerun the test with Python 2.5 / 3.2 on CentOS 5.3. The problem still exists. *** Python 2.5.5 final --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 2567 pts/0 S+ 0:00 0 1 8206 4864 0.4 /home/zkf/.programs/python/bin/python issue11849_test.py 1 2567 pts/0 S+ 0:03 0 1 44558 41140 3.9 /home/zkf/.programs/python/bin/python issue11849_test.py 2 2567 pts/0 S+ 0:07 0 1 81166 77728 7.5 /home/zkf/.programs/python/bin/python issue11849_test.py 3 2567 pts/0 S+ 0:12 0 1 117798 114316 11.0 /home/zkf/.programs/python/bin/python issue11849_test.py 4 2567 pts/0 S+ 0:17 0 1 154402 150912 14.5 /home/zkf/.programs/python/bin/python issue11849_test.py 5 2567 pts/0 S+ 0:23 0 1 191018 187500 18.1 /home/zkf/.programs/python/bin/python issue11849_test.py 6 2567 pts/0 S+ 0:29 0 1 227630 224084 21.6 /home/zkf/.programs/python/bin/python issue11849_test.py 7 2567 pts/0 S+ 0:36 0 1 264242 260668 25.1 /home/zkf/.programs/python/bin/python issue11849_test.py 8 2567 pts/0 S+ 0:44 0 1 300882 297288 28.7 /home/zkf/.programs/python/bin/python issue11849_test.py 9 2567 pts/0 S+ 0:53 0 1 337230 333860 32.2 /home/zkf/.programs/python/bin/python issue11849_test.py END 2567 pts/0 S+ 1:02 0 1 373842 370444 35.7 /home/zkf/.programs/python/bin/python issue11849_test.py GC 2567 pts/0 S+ 1:02 0 1 373842 370444 35.7 /home/zkf/.programs/python/bin/python issue11849_test.py *** 2567 pts/0 S+ 1:02 0 1 373714 370436 35.7 /home/zkf/.programs/python/bin/python issue11849_test.py *** Python 3.2.0 final --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 2633 pts/0 S+ 0:00 1 1316 11051 6448 0.6 python3.2 issue11849_test.py 1 2633 pts/0 S+ 0:02 1 1316 53151 47340 4.5 python3.2 issue11849_test.py 2 2633 pts/0 S+ 0:05 1 1316 91051 85216 8.2 python3.2 issue11849_test.py 3 2633 pts/0 S+ 0:08 1 1316 128943 124228 12.0 python3.2 issue11849_test.py 4 2633 pts/0 S+ 0:11 1 1316 166803 162296 15.6 python3.2 issue11849_test.py 5 2633 pts/0 S+ 0:14 1 1316 204475 199972 19.3 python3.2 issue11849_test.py 6 2633 pts/0 S+ 0:17 1 1316 243831 238180 23.0 python3.2 issue11849_test.py 7 2633 pts/0 S+ 0:20 1 1316 284371 277532 26.8 python3.2 issue11849_test.py 8 2633 pts/0 S+ 0:23 1 1316 318187 312456 30.1 python3.2 issue11849_test.py 9 2633 pts/0 S+ 0:26 1 1316 360231 353296 34.1 python3.2 issue11849_test.py END 2633 pts/0 S+ 0:30 1 1316 393971 388184 37.4 python3.2 issue11849_test.py GC 2633 pts/0 S+ 0:30 1 1316 352031 347652 33.5 python3.2 issue11849_test.py *** 2633 pts/0 S+ 0:31 1 1316 351903 347524 33.5 python3.2 issue11849_test.py |
|||
msg133956 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-18 10:01 | |
Found a minor defect of Python 3.2 / 3.3: line 1676 of xml/etree/ElementTree.py was:

    del self.target, self._parser # get rid of circular references

should be:

    del self.target, self._target, self.parser, self._parser # get rid of circular references

While it doesn't help this issue...
|||
msg133980 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-18 16:41 |
> kaifeng <cafeeee@gmail.com> added the comment:
>
> I added 'malloc_trim' to the test code and rerun the test with Python 2.5 / 3.2 on CentOS 5.3. The problem still exists.

Well, malloc_trim can fail, but how did you "add" it? Did you use patch to apply the diff?
Also, could you post the output of:

    ltrace -e malloc_trim python <test script>

For info, the sample outputs I posted above come from a RHEL6 box.
Anyway, I'm 99% sure this isn't a leak but a malloc issue (valgrind --tool=memcheck could confirm this if you want to try; I could be wrong, it wouldn't be the first time ;-)).
By the way, look at what I just found:
http://mail.gnome.org/archives/xml/2008-February/msg00003.html

> Antoine Pitrou <pitrou@free.fr> added the comment:
> That's an interesting thing, perhaps you want to open a feature request as a separate issue?

Dunno. Memory management is a domain which belongs to the operating system/libc, and I think applications shouldn't mess with it (apart from specific cases).
I don't have time to look at this precise problem in greater detail right now, but AFAICT, this looks either like a glibc bug, or at least a corner case with default malloc parameters (M_TRIM_THRESHOLD and friends), affecting only RHEL and derived distributions.
malloc_trim should be called automatically by free if the amount of memory that could be released is above M_TRIM_THRESHOLD. Calling it systematically can have a non-negligible performance impact.
|||
msg134008 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-19 02:41 | |
I applied your patch to Python 3.2, also I added a function call to 'malloc_trim' via ctypes, as you can see in issue11849_test2.py. In fact I have a daemon written in Python 2.5, parsing an XML of size 10+ MB every 5 minutes, after 16+ hours running, the program finally exhausted 4 GB memory and died. I simplified the logic of the daemon and found ElementTree eats too much memory. There comes the attached test script. BTW, after utilize lxml instead of ElementTree, such phenomenon of increasing memory usage disappeared. $ ltrace -e malloc_trim python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- --- SIGCHLD (Child exited) --- *** Python 3.2.0 final --- SIGCHLD (Child exited) --- --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND --- SIGCHLD (Child exited) --- 0 13708 pts/1 S+ 0:00 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:00 1 1316 11055 6440 0.6 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 1 13708 pts/1 S+ 0:00 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:03 1 1316 53155 47332 4.5 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 2 13708 pts/1 S+ 0:00 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:06 1 1316 91055 85204 8.2 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 3 13708 pts/1 S+ 0:01 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:10 1 1316 128947 124212 11.9 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 4 13708 pts/1 S+ 0:01 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:13 1 1316 166807 162280 15.6 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 5 13708 pts/1 S+ 0:01 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:16 1 1316 204483 198808 19.2 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 6 13708 pts/1 S+ 0:02 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:20 1 1316 242379 236672 22.8 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 7 13708 pts/1 S+ 0:02 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:23 1 1316 284383 277508 26.8 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 8 13708 pts/1 S+ 0:03 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:27 1 1316 318191 312436 30.1 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- 9 13708 pts/1 S+ 0:03 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:29 1 1316 360199 353272 34.1 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- END 13708 pts/1 S+ 0:03 1 65 1742 636 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:34 1 1316 393975 388164 37.4 python3.2 Issue11849_test2.py malloc_trim(0, 0, 0x818480a, 0x81a0114, 0xbfb6c940) = 1 --- SIGCHLD (Child exited) --- GC 13708 pts/1 S+ 0:03 1 65 1742 648 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:35 1 1316 351871 347480 33.5 python3.2 Issue11849_test2.py --- SIGCHLD (Child exited) --- *** 13708 pts/1 S+ 0:03 1 65 1742 648 0.0 ltrace -e malloc_trim python3.2 Issue11849_test2.py 13709 pts/1 S+ 0:35 1 1316 351871 347480 33.5 python3.2 Issue11849_test2.py +++ exited (status 0) +++ |
|||
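For reference, a ctypes call along the lines of what kaifeng describes could look like the sketch below; malloc_trim() is glibc-specific, and the library lookup is the only assumption made here:

```python
import ctypes
import ctypes.util

# glibc only: malloc_trim(0) asks the allocator to hand free heap pages
# back to the kernel; it returns 1 if anything was released.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
released = libc.malloc_trim(0)
print("memory released:", bool(released))
```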
msg134083 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-19 17:26 |
> BTW, after utilize lxml instead of ElementTree, such phenomenon of increasing memory usage disappeared.

If you look at the link I posted, you'll see that lxml had some similar issues and solved them by calling malloc_trim systematically when freeing memory. It could also be heap fragmentation, though.
To go further, it'd be nice if you could provide the output of

    valgrind --tool=memcheck --leak-check=full --suppressions=Misc/valgrind-python.supp python <test script>

after uncommenting the relevant lines in Misc/valgrind-python.supp (see http://svn.python.org/projects/python/trunk/Misc/README.valgrind ). It will either confirm a memory leak or a malloc issue (I still favour the latter).
By the way, does

    while True:
        XML(gen_xml())

lead to a constant memory usage increase?
|||
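The gen_xml() helper referred to above lives in the attached benchmark script; a hypothetical stand-in that exercises the same loop could look like this:

```python
import xml.etree.ElementTree as ET

def gen_xml(items=50000):
    # Hypothetical stand-in for the attached script's gen_xml():
    # build a reasonably large XML document as a string.
    return "<root>" + "<item key='1'>value</item>" * items + "</root>"

while True:
    ET.XML(gen_xml())  # if RSS levels off after a few iterations, freed
                       # blocks are being reused rather than leaked
```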
msg134358 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-24 22:58 |
This is definitely a malloc bug. Test with default malloc on a Debian box: cf@neobox:~/cpython$ ./python ../issue11849_test.py *** Python 3.3.0 alpha --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 3778 pts/2 S+ 0:00 1 1790 8245 7024 0.5 ./python ../issue11849_test.py 1 3778 pts/2 S+ 0:17 1 1790 61937 60404 4.6 ./python ../issue11849_test.py 2 3778 pts/2 S+ 0:35 1 1790 110841 108300 8.3 ./python ../issue11849_test.py 3 3778 pts/2 S+ 0:53 1 1790 159885 158540 12.2 ./python ../issue11849_test.py 4 3778 pts/2 S+ 1:10 1 1790 209369 206724 15.9 ./python ../issue11849_test.py 5 3778 pts/2 S+ 1:28 1 1790 258505 255956 19.7 ./python ../issue11849_test.py 6 3778 pts/2 S+ 1:46 1 1790 307669 304964 23.5 ./python ../issue11849_test.py 7 3778 pts/2 S+ 2:02 1 1790 360705 356952 27.5 ./python ../issue11849_test.py 8 3778 pts/2 S+ 2:21 1 1790 405529 404172 31.2 ./python ../issue11849_test.py 9 3778 pts/2 S+ 2:37 1 1790 458789 456128 35.2 ./python ../issue11849_test.py END 3778 pts/2 S+ 3:00 1 1790 504189 501624 38.7 ./python ../issue11849_test.py GC 3778 pts/2 S+ 3:01 1 1790 454689 453476 35.0 ./python ../issue11849_test.py *** 3778 pts/2 S+ 3:01 1 1790 454689 453480 35.0 ./python ../issue11849_test.py [56426 refs] The heap is not trimmed, even after GC collection. Now, using a smaller mmap threshold so that malloc uses mmap instead of brk: cf@neobox:~/cpython$ MALLOC_MMAP_THRESHOLD_=1024 ./python ../issue11849_test.py *** Python 3.3.0 alpha --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 3843 pts/2 S+ 0:00 1 1790 8353 7036 0.5 ./python ../issue11849_test.py 1 3843 pts/2 S+ 0:17 1 1790 62593 59240 4.5 ./python ../issue11849_test.py 2 3843 pts/2 S+ 0:35 1 1790 112321 108304 8.3 ./python ../issue11849_test.py 3 3843 pts/2 S+ 0:53 1 1790 162313 157372 12.1 ./python ../issue11849_test.py 4 3843 pts/2 S+ 1:11 1 1790 212057 206456 15.9 ./python ../issue11849_test.py 5 3843 pts/2 S+ 1:29 1 1790 261749 255484 19.7 ./python ../issue11849_test.py 6 3843 pts/2 S+ 1:47 1 1790 311669 304484 23.5 ./python ../issue11849_test.py 7 3843 pts/2 S+ 2:03 1 1790 365485 356488 27.5 ./python ../issue11849_test.py 8 3843 pts/2 S+ 2:22 1 1790 411341 402568 31.1 ./python ../issue11849_test.py 9 3843 pts/2 S+ 2:38 1 1790 465141 454552 35.1 ./python ../issue11849_test.py END 3843 pts/2 S+ 3:02 1 1790 67173 63892 4.9 ./python ../issue11849_test.py GC 3843 pts/2 S+ 3:03 1 1790 9925 8664 0.6 ./python ../issue11849_test.py *** 3843 pts/2 S+ 3:03 1 1790 9925 8668 0.6 ./python ../issue11849_test.py [56428 refs] Just to be sure, with ptmalloc3 malloc implementation: cf@neobox:~/cpython$ LD_PRELOAD=../ptmalloc3/libptmalloc3.so ./python ../issue11849_test.py *** Python 3.3.0 alpha --- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 0 3898 pts/2 S+ 0:00 1 1790 8369 7136 0.5 ./python ../issue11849_test.py 1 3898 pts/2 S+ 0:17 1 1790 62825 60264 4.6 ./python ../issue11849_test.py 2 3898 pts/2 S+ 0:34 1 1790 112641 110176 8.5 ./python ../issue11849_test.py 3 3898 pts/2 S+ 0:52 1 1790 162689 160048 12.3 ./python ../issue11849_test.py 4 3898 pts/2 S+ 1:09 1 1790 212285 209732 16.2 ./python ../issue11849_test.py 5 3898 pts/2 S+ 1:27 1 1790 261881 259460 20.0 ./python ../issue11849_test.py 6 3898 pts/2 S+ 1:45 1 1790 311929 309332 23.9 ./python ../issue11849_test.py 7 3898 pts/2 S+ 2:01 1 1790 365625 362004 27.9 ./python ../issue11849_test.py 8 3898 pts/2 S+ 2:19 1 1790 411445 408812 31.5 ./python ../issue11849_test.py 9 3898 pts/2 S+ 2:35 1 1790 465205 461536 35.6 ./python ../issue11849_test.py END 3898 pts/2 S+ 2:58 1 1790 72141 
69688 5.3 ./python ../issue11849_test.py GC 3898 pts/2 S+ 2:59 1 1790 15001 13748 1.0 ./python ../issue11849_test.py *** 3898 pts/2 S+ 2:59 1 1790 15001 13752 1.0 ./python ../issue11849_test.py [56428 refs] So the problem is really that glibc/eglibc malloc implementations don't automatically trim memory upon free (this happens if you're only allocating/deallocating small chunks < 64B that come from fastbins, but that's not the case here). By the way, I noticed that dictionnaries are never allocated through pymalloc, since a new dictionnary takes more than 256B... |
|||
msg134359 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-04-24 23:20 |
The MALLOC_MMAP_THRESHOLD improvement is less visible here: $ MALLOC_MMAP_THRESHOLD_=1024 ../opt/python issue11849_test.py *** Python 3.3.0 alpha --- USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND 0 antoine 7703 0.0 0.1 57756 8560 pts/2 S+ 01:16 0:00 ../opt/python issue11849_test.py 1 antoine 7703 62.0 1.0 138892 86100 pts/2 S+ 01:16 0:01 ../opt/python issue11849_test.py 2 antoine 7703 84.6 2.0 213580 160552 pts/2 S+ 01:16 0:02 ../opt/python issue11849_test.py 3 antoine 7703 97.0 2.9 288080 234972 pts/2 S+ 01:16 0:03 ../opt/python issue11849_test.py 4 antoine 7703 85.6 3.9 362852 309408 pts/2 S+ 01:16 0:05 ../opt/python issue11849_test.py 5 antoine 7703 93.4 4.8 437616 383844 pts/2 S+ 01:16 0:06 ../opt/python issue11849_test.py 6 antoine 7703 99.0 5.7 512380 458276 pts/2 S+ 01:16 0:07 ../opt/python issue11849_test.py 7 antoine 7703 89.6 6.7 591360 535672 pts/2 S+ 01:16 0:08 ../opt/python issue11849_test.py 8 antoine 7703 94.9 7.6 661676 607156 pts/2 S+ 01:16 0:10 ../opt/python issue11849_test.py 9 antoine 7703 95.5 8.6 740652 684556 pts/2 S+ 01:16 0:11 ../opt/python issue11849_test.py END antoine 7703 96.1 7.5 650432 597736 pts/2 S+ 01:16 0:13 ../opt/python issue11849_test.py GC antoine 7703 97.2 6.5 570316 519228 pts/2 S+ 01:16 0:13 ../opt/python issue11849_test.py *** antoine 7703 90.8 6.5 569876 518792 pts/2 S+ 01:16 0:13 ../opt/python issue11849_test.py By the way, an easy fix is to use cElementTree instead of ElementTree. It still won't release all memory but it will eat a lot less of it, and be much faster as well. |
|||
msg134360 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-04-24 23:22 |
> By the way, I noticed that dictionaries are never allocated through
> pymalloc, since a new dictionary takes more than 256B...

On 64-bit builds indeed. pymalloc could be improved to handle allocations up to 512B. Want to try and write a patch?
|||
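The size that decides whether a dict goes through pymalloc can be checked directly; the exact number printed varies by Python version and platform, so the comparison rather than the value is the point:

```python
import sys

size = sys.getsizeof({})
print(size, "bytes for an empty dict")
# pymalloc's small-object limit was 256 bytes at the time of this report,
# so anything above that went straight to the system malloc().
print("served by pymalloc" if size <= 256 else "falls through to malloc()")
```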
msg134375 - (view) | Author: kaifeng (kaifeng) | Date: 2011-04-25 08:01 | |
Sorry for the late update. Valgrind shows there is no memory leak (see the attached valgrind.log). The following code,

    while True:
        XML(gen_xml())

has increasing memory usage in the first 5~8 iterations, and fluctuates around a constant level afterwards. So I guess there's a component, maybe libc, the Python interpreter, the ElementTree/pyexpat module or something else, that holds some memory until the process ends.
|||
msg134380 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-25 12:36 |
> The MALLOC_MMAP_THRESHOLD improvement is less visible here:

Are you running on 64-bit? If yes, it could be that you're exhausting M_MMAP_MAX (malloc falls back to brk when there are too many mmap mappings). You could try with

    MALLOC_MMAP_THRESHOLD_=1024 MALLOC_MMAP_MAX_=16777216 ../opt/python issue11849_test.py

By the way, never do that in real life, it's a CPU and memory hog ;-)
I think the root cause is that glibc's malloc coalescing of free chunks is called far less often than in the original ptmalloc version, but I still have to dig some more.

>> By the way, I noticed that dictionaries are never allocated through
>> pymalloc, since a new dictionary takes more than 256B...
>
> On 64-bit builds indeed. pymalloc could be improved to handle allocations up
> to 512B. Want to try and write a patch?

Sure. I'll open another issue.
|||
msg134388 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-04-25 14:55 |
> > The MALLOC_MMAP_THRESHOLD improvement is less visible here:
>
> Are you running on 64-bit ?

Yes.

> If yes, it could be that you're exhausting M_MMAP_MAX (malloc falls
> back to brk when there are too many mmap mappings).
> You could try with
> MALLOC_MMAP_THRESHOLD_=1024 MALLOC_MMAP_MAX_=16777216 ../opt/python
> issue11849_test.py

It isn't better.
|||
msg134392 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-04-25 15:57 |
> It isn't better.

Requests above 256B are directly handled by malloc, so MALLOC_MMAP_THRESHOLD_ should in fact be set to 256 (with 1024 I guess that on 64-bit every mid-sized dictionary gets allocated with brk).
|||
msg134992 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-05-02 16:57 |
I've had some time to look at this, and I've written a quick demo patch that should - hopefully - fix this, and reduce memory fragmentation.

A little bit of background first:
- a couple of years ago (probably true when pymalloc was designed and merged), glibc's malloc used brk for small and medium allocations, and mmap for large allocations, to reduce memory fragmentation (also, because of the processes' VM layout in older Linux 32-bit kernels, you couldn't have a heap bigger than 1GB). The threshold for routing requests to mmap was fixed, and had a default of 256KB (exactly the size of a pymalloc arena). Thus, all arenas were allocated with mmap
- in 2006, a patch was merged to make this mmap threshold dynamic, see http://sources.redhat.com/ml/libc-alpha/2006-03/msg00033.html for more details
- as a consequence, with modern glibc/eglibc versions, the first arenas will be allocated through mmap, but as soon as one of them is freed, subsequent arena allocations will be served from the heap through brk, and not mmap
- imagine the following happens:
  1) the program creates many objects
  2) to store those objects, many arenas are allocated from the heap through brk
  3) the program destroys all the objects created, except one which is in the last allocated arena
  4) since that arena has at least one object in it, it's not deallocated, and thus the heap doesn't shrink, and the memory usage remains high (with a huge hole between the base of the heap and its top)

Note that 3) can be a single leaked reference, or just a variable that doesn't get deallocated immediately.

As an example, here's a demo program that should exhibit this behaviour:
"""
import sys
import gc

# allocate/de-allocate/re-allocate the array to make sure that arenas are
# allocated through brk
tab = []
for i in range(1000000):
    tab.append(i)
tab = []
for i in range(1000000):
    tab.append(i)

print('after allocation')
sys.stdin.read(1)

# allocate a dict at the top of the heap (actually it works even without this)
a = {}

# deallocate the big array
del tab

print('after deallocation')
sys.stdin.read(1)

# collect
gc.collect()

print('after collection')
sys.stdin.read(1)
"""

You should see that even after the big array has been deallocated and collected, the memory usage doesn't decrease.

Also, there's another factor coming into play, the linked list of arenas (the "arenas" variable in Objects/obmalloc.c), which is expanded when there are not enough arenas allocated: if this variable is realloc()ed while the heap is really large and without a hole in it, it will be allocated from the top of the heap, and since it's not resized when the number of used arenas goes down, it will remain at the top of the heap and will also prevent the heap from shrinking.

My demo patch (pymem.diff) thus does two things:
1) use mallopt to fix the mmap threshold so that arenas are allocated through mmap
2) increase the maximum size of requests handled by pymalloc from 256B to 512B (as discussed above with Antoine). The reason is that if a PyObject_Malloc request is not handled by pymalloc from an arena (i.e. greater than 256B) and is less than the mmap threshold, then we can't do anything if it's not freed and remains in the middle of the heap. That's exactly what's happening in the OP's case: some dictionaries aren't deallocated even after the collection (I couldn't quite identify them, but there seem to be some UTF-8 codecs and other stuff).

To sum up, this patch greatly increases the likelihood of Python's objects being allocated from arenas, which should reduce fragmentation (and seems to speed up certain operations quite a bit), and ensures that arenas are allocated from mmap so that a single dangling object doesn't prevent the heap from being trimmed.

I've tested it on RHEL6 64-bit and Debian 32-bit, but it'd be great if someone else could try it - and of course comment on the above explanation/proposed solution.

Here's the result on Debian 32-bit.

Without patch:
*** Python 3.3.0 alpha
--- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
0   1843 pts/1 S+ 0:00 1 1795 9892 7528 0.5 ./python /home/cf/issue11849_test.py
1   1843 pts/1 S+ 0:16 1 1795 63584 60928 4.7 ./python /home/cf/issue11849_test.py
2   1843 pts/1 S+ 0:33 1 1795 112772 109064 8.4 ./python /home/cf/issue11849_test.py
3   1843 pts/1 S+ 0:50 1 1795 162140 159424 12.3 ./python /home/cf/issue11849_test.py
4   1843 pts/1 S+ 1:06 1 1795 211376 207608 16.0 ./python /home/cf/issue11849_test.py
END 1843 pts/1 S+ 1:25 1 1795 260560 256888 19.8 ./python /home/cf/issue11849_test.py
GC  1843 pts/1 S+ 1:26 1 1795 207276 204932 15.8 ./python /home/cf/issue11849_test.py

With patch:
*** Python 3.3.0 alpha
--- PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
0   1996 pts/1 S+ 0:00 1 1795 10160 7616 0.5 ./python /home/cf/issue11849_test.py
1   1996 pts/1 S+ 0:16 1 1795 64168 59836 4.6 ./python /home/cf/issue11849_test.py
2   1996 pts/1 S+ 0:33 1 1795 114160 108908 8.4 ./python /home/cf/issue11849_test.py
3   1996 pts/1 S+ 0:50 1 1795 163864 157944 12.2 ./python /home/cf/issue11849_test.py
4   1996 pts/1 S+ 1:07 1 1795 213848 207008 15.9 ./python /home/cf/issue11849_test.py
END 1996 pts/1 S+ 1:26 1 1795 68280 63776 4.9 ./python /home/cf/issue11849_test.py
GC  1996 pts/1 S+ 1:26 1 1795 12112 9708 0.7 ./python /home/cf/issue11849_test.py

Antoine: since increasing the pymalloc threshold is part of the solution to this problem, I'm attaching a standalone patch here (pymalloc_threshold.diff). It's included in pymem.diff.
I'll try to post some pybench results tomorrow.
|||
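The effect of the patch's mallopt() call can be approximated from user code as well; the sketch below relies on glibc-specific details (M_MMAP_THRESHOLD is -3 in glibc's malloc.h) and only illustrates the idea, it is not what the patch itself does:

```python
import ctypes
import ctypes.util

M_MMAP_THRESHOLD = -3      # glibc-specific constant from <malloc.h>
ARENA_SIZE = 256 * 1024    # pymalloc arena size

# Pin the dynamic mmap threshold at the arena size so that arena-sized
# allocations are served by mmap() and can be returned to the OS when freed.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
if libc.mallopt(M_MMAP_THRESHOLD, ARENA_SIZE) != 1:
    print("mallopt() not supported or rejected on this platform")
```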
msg134995 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-05-02 17:50 |
This is a very interesting patch, thank you. I've tested it on Mandriva 64-bit and it indeed fixes the free() issue on the XML workload. I see no regression on pybench, stringbench or json/pickle benchmarks. I guess the final patch will have to guard the mallopt() call with some #ifdef? (also, I suppose a portable solution would have to call mmap() ourselves for allocation of arenas, but that would probably be a bit more involved) |
|||
msg135010 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-05-02 21:59 |
> I guess the final patch will have to guard the mallopt() call with some #ifdef?

Yes. See the attached patch pymalloc_frag.diff.
It's the first time I'm playing with autotools, so please review this part really carefully ;-)

> (also, I suppose a portable solution would have to call mmap() ourselves
> for allocation of arenas, but that would probably be a bit more involved)

Yes. But since it probably only affects glibc/eglibc malloc versions, I guess that target implementations are likely to provide mallopt(M_MMAP_THRESHOLD). Also, performing anonymous mappings varies even among Unices (the mmapmodule code is scary). I'm not talking about Windows, which I don't know at all.
|||
msg135023 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-05-03 10:00 |
Patch looks fine to me, thank you. |
|||
msg135049 - (view) | Author: Roundup Robot (python-dev) | Date: 2011-05-03 16:19 |
New changeset f8a697bc3ca8 by Antoine Pitrou in branch 'default': Issue #11849: Make it more likely for the system allocator to release http://hg.python.org/cpython/rev/f8a697bc3ca8 |
|||
msg148293 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-25 00:30 |
For the record, this seems to make large allocations slower:

-> with patch:
$ ./python -m timeit "b'x'*200000"
10000 loops, best of 3: 27.2 usec per loop

-> without patch:
$ ./python -m timeit "b'x'*200000"
100000 loops, best of 3: 7.4 usec per loop

Not sure we should care, though. It's still very fast.
(noticed in http://mail.python.org/pipermail/python-dev/2011-November/114610.html )
|||
msg148297 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-25 00:52 |
More surprising is that, even ignoring the allocation cost, other operations on the memory area seem more expensive:

$ ./python -m timeit -s "b=bytearray(500000)" "b[:] = b"
-> python 3.3: 1000 loops, best of 3: 367 usec per loop
-> python 3.2: 10000 loops, best of 3: 185 usec per loop

(note how this is just a dumb memcpy)
|||
msg148308 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-11-25 08:17 |
> For the record, this seems to make large allocations slower:
>
> -> with patch:
> $ ./python -m timeit "b'x'*200000"
> 10000 loops, best of 3: 27.2 usec per loop
>
> -> without patch:
> $ ./python -m timeit "b'x'*200000"
> 100000 loops, best of 3: 7.4 usec per loop

Yes, IIRC, I warned it could be a possible side effect: since we're now using mmap() instead of brk() for large allocations (between 256B and 32/64MB), it can be slower (that's the reason the adaptive mmap threshold was introduced in the first place).

> More surprising is that, even ignoring the allocation cost, other operations on the memory area seem more expensive:

Hum, this is strange. I see you're comparing 3.2 and default: could you run the same benchmark on default with and without the patch?
|||
msg148313 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-25 12:18 |
> I see you're comparing 3.2 and default: could you run the same
> benchmark on default with and without the patch ?

Same results:

-> default branch:
1000 loops, best of 3: 364 usec per loop
-> default branch with patch reverted:
10000 loops, best of 3: 185 usec per loop
(with kernel 2.6.38.8-desktop-8.mga and glibc-2.12.1-11.2.mga1)

And I can reproduce on another machine:

-> default branch:
1000 loops, best of 3: 224 usec per loop
-> default branch with patch reverted:
10000 loops, best of 3: 88 usec per loop
(Debian stable with kernel 2.6.32-5-686 and glibc 2.11.2-10)
|||
msg148314 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-25 12:52 |
Ah, sorry, false alarm. "b[:] = b" actually makes a temporary copy of the bytearray when assigning to itself (!).

However, there's still another strange regression:

$ ./python -m timeit \
  -s "n=300000; f=open('10MB.bin', 'rb', buffering=0); b=bytearray(n)" \
  "f.seek(0);f.readinto(b)"

-> default branch:
10000 loops, best of 3: 43 usec per loop
-> default branch with patch reverted:
10000 loops, best of 3: 27.5 usec per loop

FileIO.readinto executes a single read() into the passed buffer.
|||
msg148363 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-11-25 21:51 |
> However, there's still another strange regression:
>
> $ ./python -m timeit \
> -s "n=300000; f=open('10MB.bin', 'rb', buffering=0); b=bytearray(n)" \
> "f.seek(0);f.readinto(b)"
>
> -> default branch:
> 10000 loops, best of 3: 43 usec per loop
> -> default branch with patch reverted:
> 10000 loops, best of 3: 27.5 usec per loop
>
> FileIO.readinto executes a single read() into the passed buffer.

On my box:

default:
$ ./python -m timeit -s "n=300000; f=open('/tmp/10MB.bin', 'rb'); b=bytearray(n)" "f.seek(0);f.readinto(b)"
1000 loops, best of 3: 640 usec per loop

default without patch ("$ hg revert -r 68258 Objects/obmalloc.c && make"):
$ ./python -m timeit -s "n=300000; f=open('/tmp/10MB.bin', 'rb'); b=bytearray(n)" "f.seek(0);f.readinto(b)"
1000 loops, best of 3: 663 usec per loop

I'm just observing a random variance (but my computer is maybe too slow to notice). However, I really don't see how the patch could play a role here.

Concerning the slight performance regression, if it's a problem, I see two options:
- revert the patch
- replace calls to malloc()/free() by mmap()/munmap() to allocate/free arenas (but I'm not sure anonymous mappings are supported by every OS out there, so this might lead to some ugly #ifdef's...)
|||
msg148364 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-25 22:00 |
> On my box:
> default:
> $ ./python -m timeit -s "n=300000; f=open('/tmp/10MB.bin', 'rb'); b=bytearray(n)" "f.seek(0);f.readinto(b)"
> 1000 loops, best of 3: 640 usec per loop
>
> default without patch ("$ hg revert -r 68258 Objects/obmalloc.c && make"):
> $ ./python -m timeit -s "n=300000; f=open('/tmp/10MB.bin', 'rb'); b=bytearray(n)" "f.seek(0);f.readinto(b)"
> 1000 loops, best of 3: 663 usec per loop
>
> I'm just observing a random variance (but my computer is maybe too
> slow to notice).

Hmm, quite slow indeed, are you sure you're not running in debug mode?

> However, I really don't see how the patch could play a role here.
>
> Concerning the slight performance regression, if it's a problem, I see
> two options:
> - revert the patch
> - replace calls to malloc()/free() by mmap()/munmap() to allocate/free
> arenas (but I'm not sure anonymous mappings are supported by every OS
> out there, so this might lead to some ugly #ifdef's...)

If the performance regression is limited to read(), I don't think it's really an issue, but using mmap/munmap explicitly would probably be nicer anyway (1° because it lets the glibc choose whatever heuristic is best, 2° because it would help release memory on more systems than just glibc systems). I think limiting ourselves to systems which have MAP_ANONYMOUS is good enough.

Here is what the glibc malloc does btw:

/* Nearly all versions of mmap support MAP_ANONYMOUS, so the following
   is unlikely to be needed, but is supplied just in case. */
#ifndef MAP_ANONYMOUS
static int dev_zero_fd = -1; /* Cached file descriptor for /dev/zero. */
#define MMAP(addr, size, prot, flags) ((dev_zero_fd < 0) ? \
   (dev_zero_fd = open("/dev/zero", O_RDWR), \
    mmap((addr), (size), (prot), (flags), dev_zero_fd, 0)) : \
    mmap((addr), (size), (prot), (flags), dev_zero_fd, 0))
#else
#define MMAP(addr, size, prot, flags) \
  (mmap((addr), (size), (prot), (flags)|MAP_ANONYMOUS, -1, 0))
#endif
|||
msg148366 - (view) | Author: Charles-François Natali (neologix) * | Date: 2011-11-25 22:45 |
> Hmm, quite slow indeed, are you sure you're not running in debug mode?

Well, yes, but it's no faster with a non-debug build: my laptop is really crawling :-)

> If the performance regression is limited to read(), I don't think it's
> really an issue, but using mmap/munmap explicitly would probably be nicer
> anyway (1° because it lets the glibc choose whatever heuristic is best,
> 2° because it would help release memory on more systems than just glibc
> systems). I think limiting ourselves to systems which have
> MAP_ANONYMOUS is good enough.

Agreed. Here's a patch.
|||
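What the arenas_mmap.diff approach buys is visible with Python's own mmap module: an anonymous mapping goes back to the kernel the moment it is unmapped, unlike heap memory obtained through brk(). A small sketch, with the arena size as the only assumption:

```python
import mmap

ARENA_SIZE = 256 * 1024

# fd == -1 requests an anonymous mapping, i.e. plain memory not backed by a file.
arena = mmap.mmap(-1, ARENA_SIZE)
arena[:5] = b"hello"   # usable as ordinary read/write memory
arena.close()          # munmap(): the pages are released to the OS immediately
```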
msg148374 - (view) | Author: Roundup Robot (python-dev) | Date: 2011-11-26 00:23 |
New changeset e7aa72e6aad4 by Antoine Pitrou in branch 'default': Better resolution for issue #11849: Ensure that free()d memory arenas are really released http://hg.python.org/cpython/rev/e7aa72e6aad4 |
|||
msg202458 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-09 04:05 |
I just found this issue from this article: http://python.dzone.com/articles/diagnosing-memory-leaks-python

Great job! Using mmap() for arenas is the best solution for this issue. I did something similar on a completely different project (also using its own dedicated memory allocator) to work around fragmentation of the heap memory.
|||
msg202459 - (view) | Author: Tim Peters (tim.peters) * | Date: 2013-11-09 04:54 |
[@haypo]
> http://python.dzone.com/articles/diagnosing-memory-leaks-python
> Great job! Using mmap() for arenas is the best solution for this issue.

? I read the article, and they stopped when they found "there seemed to be a ton of tiny little objects around, like integers.". Ints aren't allocated from arenas to begin with - they have their own (immortal & unbounded) free list in Python 2. No change to pymalloc could make any difference to that.
|||
msg202478 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-09 09:28 |
Extract of the "workaround" section: "You could also run your Python jobs using Jython, which uses the Java JVM and does not exhibit this behavior. Likewise, you could upgrade to Python 3.3 <http://bugs.python.org/issue11849>", which contains a link to this issue.
|||
msg310052 - (view) | Author: Bob Kline (bkline) * | Date: 2018-01-16 08:58 | |
Would it be inappropriate for this fix to be applied to 2.7? |
|||
msg310053 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2018-01-16 08:59 |
It's not really a fix, it's an improvement, and as such doesn't belong in 2.7. Using malloc() and free() is not a bug in itself. |
|||
msg310055 - (view) | Author: Bob Kline (bkline) * | Date: 2018-01-16 09:08 | |
Sorry, I should have used the language of the patch author ("the resolution"). Without the resolution, Python 2.7 eventually runs out of memory and crashes for some correctly written user code. |
|||
msg310058 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2018-01-16 09:11 |
Well, memory fragmentation can happen with any allocation scheme, and it's possible even Python 3 isn't immune to this. Backporting performance improvements is a strain on our resources and also constitutes a maintenance threat (what if the bug hides in the new code?). And Python 2.7 is really nearing its end-of-life more and more every day. So IMHO it's a no-no. |
|||
msg310065 - (view) | Author: Bob Kline (bkline) * | Date: 2018-01-16 09:43 | |
Thanks for your responses to my comments. I'm working as hard as I can to get my customer's systems migrated into the Python 3 world, and I appreciate the efforts of the community to provide incentives (such as the resolution for this failure) for developers to upgrade. However, it's a delicate balancing act sometimes, given that we have critical places in our system for which the same code runs more than twice as slowly on Python 3.6 as on Python 2.7. |
|||
msg310068 - (view) | Author: Inada Naoki (methane) * | Date: 2018-01-16 10:00 |
FYI, jemalloc can reduce memory usage, especially when application is multithreaded. https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html https://zapier.com/engineering/celery-python-jemalloc/ |
|||
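On Linux, trying jemalloc usually requires no rebuild; preloading it is enough, along the lines of `LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 python app.py` (the library path varies by distribution and is only an example). The next message describes why no equally convenient injection mechanism exists on Windows.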
msg310086 - (view) | Author: Bob Kline (bkline) * | Date: 2018-01-16 12:51 | |
> ... jemalloc can reduce memory usage ...

Thanks for the tip. I downloaded the source and successfully built the DLL, then went looking for a way to get it loaded. Unfortunately, DLL injection, which is needed to use this allocator in Python, seems to be much better supported on Linux than on Windows. Basically, Microsoft's documentation [1] for AppInit_DLLs, the shim for DLL injection on Windows, says (in effect) "here's how to use this technique, but we don't recommend using it, so here's a link [2] for what we recommend you do instead." That link takes you to "Try searching for what you need. This page doesn't exist."

[1] https://support.microsoft.com/en-us/help/197571/working-with-the-appinit-dlls-registry-value
[2] https://support.microsoft.com/en-us/help/134655
History

Date | User | Action | Args
---|---|---|---
2022-04-11 14:57:16 | admin | set | github: 56058 |
2018-01-16 12:51:42 | bkline | set | messages: + msg310086 |
2018-01-16 10:00:01 | methane | set | nosy: + methane; messages: + msg310068 |
2018-01-16 09:43:00 | bkline | set | messages: + msg310065 |
2018-01-16 09:11:38 | pitrou | set | messages: + msg310058 |
2018-01-16 09:08:40 | bkline | set | messages: + msg310055 |
2018-01-16 08:59:08 | pitrou | set | messages: + msg310053 |
2018-01-16 08:58:08 | bkline | set | nosy: + bkline; messages: + msg310052 |
2013-11-09 09:28:25 | vstinner | set | messages: + msg202478 |
2013-11-09 04:54:21 | tim.peters | set | nosy: + tim.peters; messages: + msg202459 |
2013-11-09 04:05:24 | vstinner | set | nosy: + vstinner; messages: + msg202458 |
2011-11-26 00:23:42 | python-dev | set | messages: + msg148374 |
2011-11-25 22:45:18 | neologix | set | files: + arenas_mmap.diff; messages: + msg148366 |
2011-11-25 22:00:00 | pitrou | set | messages: + msg148364 |
2011-11-25 21:51:17 | neologix | set | messages: + msg148363 |
2011-11-25 12:52:55 | pitrou | set | messages: + msg148314 |
2011-11-25 12:18:12 | pitrou | set | messages: + msg148313 |
2011-11-25 08:17:07 | neologix | set | messages: + msg148308 |
2011-11-25 00:52:32 | pitrou | set | messages: + msg148297 |
2011-11-25 00:30:00 | pitrou | set | nosy: + eli.bendersky; messages: + msg148293 |
2011-05-03 16:20:57 | pitrou | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
2011-05-03 16:19:20 | python-dev | set | nosy: + python-dev; messages: + msg135049 |
2011-05-03 10:00:59 | pitrou | set | stage: patch review; messages: + msg135023; versions: - Python 3.1, Python 2.7, Python 3.2 |
2011-05-02 21:59:48 | neologix | set | files: - pymem.diff |
2011-05-02 21:59:34 | neologix | set | files: - gc_trim.diff |
2011-05-02 21:59:21 | neologix | set | files: + pymalloc_frag.diff; messages: + msg135010 |
2011-05-02 17:50:30 | pitrou | set | messages: + msg134995 |
2011-05-02 16:57:55 | neologix | set | files: + pymem.diff, pymalloc_threshold.diff; messages: + msg134992 |
2011-04-25 19:02:09 | dmalcolm | set | nosy: + dmalcolm |
2011-04-25 15:57:06 | neologix | set | messages: + msg134392 |
2011-04-25 14:55:11 | pitrou | set | messages: + msg134388 |
2011-04-25 12:36:05 | neologix | set | messages: + msg134380 |
2011-04-25 08:01:32 | kaifeng | set | files: + valgrind.log; messages: + msg134375 |
2011-04-24 23:22:16 | pitrou | set | messages: + msg134360 |
2011-04-24 23:20:06 | pitrou | set | title: ElementTree memory leak -> glibc allocator doesn't release all free()ed memory; messages: + msg134359; versions: + Python 3.3, - Python 2.5 |
2011-04-24 22:58:44 | neologix | set | messages: + msg134358 |
2011-04-19 17:26:47 | neologix | set | messages: + msg134083 |
2011-04-19 02:41:37 | kaifeng | set | messages: + msg134008 |
2011-04-18 16:41:02 | neologix | set | messages: + msg133980 |
2011-04-18 10:01:24 | kaifeng | set | messages: + msg133956 |
2011-04-18 00:37:29 | kaifeng | set | files: + issue11849_test2.py; messages: + msg133946; versions: + Python 2.7, Python 3.2 |
2011-04-17 22:27:53 | pitrou | set | nosy: + pitrou; messages: + msg133940 |
2011-04-17 14:41:41 | neologix | set | files: + gc_trim.diff; keywords: + patch |
2011-04-17 14:39:39 | neologix | set | nosy: + neologix; messages: + msg133929 |
2011-04-15 12:32:47 | kaifeng | set | messages: + msg133813 |
2011-04-15 11:41:26 | flox | set | messages: + msg133809 |
2011-04-15 11:39:27 | flox | set | files: + issue11849_test.py; messages: + msg133808 |
2011-04-15 09:52:26 | kaifeng | set | messages: + msg133800 |
2011-04-15 09:33:32 | flox | set | nosy: + flox; messages: + msg133799 |
2011-04-15 09:08:38 | kaifeng | create |