Issue4177
Created on 2008-10-22 18:40 by surkamp, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
test_MIMEText.tar.bz2 | surkamp, 2008-10-22 18:40 | Test case |
Messages (13) | |||
---|---|---|---|
msg75097 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-22 18:40 | |
If you try to create a MIMEText object from a very large string (the test case includes a 40 MB string), the program just eats all the CPU with high memory usage or raises a MemoryError. Sometimes it just deadlocks when using _charset = "iso-8859-1". Use the submitted file and script to reproduce the problem. ** On Linux it is very slow, but it works ** - the problem occurs on a FreeBSD installation. |
|||
msg75112 - (view) | Author: Roumen Petrov (rpetrov) * | Date: 2008-10-22 21:46 | |
I don't think that the test works on Linux without a MemoryError. What happens if you set user limits on Linux? If you enable core files on Linux, does the test really crash and dump core, or does it just raise an exception and exit without a core dump? |
|||
msg75142 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-23 12:45 | |
Testing on Linux:

$ ulimit -m 128000
$ ulimit -v 196000
$ python test_MIMEText.py
[...]
Traceback (most recent call last):
  File "test_MIMEText.py", line 23, in <module>
    txt = MIMEText(buffer, _subtype="plain", _charset="iso-8859-1")
  File "/usr/lib/python2.5/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.5/email/message.py", line 220, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.5/email/message.py", line 262, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/usr/lib/python2.5/email/charset.py", line 386, in body_encode
    return email.quoprimime.body_encode(s)
  File "/usr/lib/python2.5/email/quoprimime.py", line 198, in encode
    body = fix_eols(body)
  File "/usr/lib/python2.5/email/utils.py", line 77, in fix_eols
    s = re.sub(r'(?<!\r)\n', CRLF, s)
  File "/usr/lib/python2.5/re.py", line 150, in sub
    return _compile(pattern, 0).sub(repl, string, count)
MemoryError

OK. Setting a "low" ulimit for memory and virtual memory raises a MemoryError on Linux too. Checking the same limits on FreeBSD, they are set to unlimited, so the problem should not occur there. |
|||
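The traceback above ends in `fix_eols`, which runs `re.sub(r'(?<!\r)\n', CRLF, s)` over the whole payload. A minimal sketch of what that single substitution does follows. Only the line visible in the traceback is reproduced; the rest of the function is not shown in this thread, and `fix_bare_lf` is a name chosen here for illustration.

```python
import re

CRLF = "\r\n"


def fix_bare_lf(s):
    """Replace every "\n" that is not already preceded by "\r" with CRLF.

    Hypothetical helper: it wraps only the substitution quoted in the
    traceback, not the full fix_eols() from the email package.
    """
    return re.sub(r'(?<!\r)\n', CRLF, s)


sample = "a\nb\r\nc\n"
fixed = fix_bare_lf(sample)
print(repr(fixed))
```

`re.sub` returns a new string, so the input and the result are both alive at the same time. For a 40 MB payload that alone is a large allocation, which is consistent with the MemoryError raised under the low ulimit.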
msg75158 - (view) | Author: Roumen Petrov (rpetrov) * | Date: 2008-10-24 10:21 | |
What about the data segment and stack size limits? |
|||
msg75160 - (view) | Author: STINNER Victor (vstinner) * | Date: 2008-10-24 11:27 | |
Your example works here on:
- Linux, i386, 2 GB of memory, Python 2.5
- FreeBSD in Qemu, i386, 512 MB of memory, Python 2.5

> The program just eats all the CPU with high memory usage or raises a MemoryError

Yes, it takes one minute or more to finish. If there is not enough memory, Python raises a MemoryError. The behaviour is correct: Python doesn't crash, it's just slow. Your text file is ~40 MB. Python may allocate multiple objects bigger than 40 MB to create the email content. The algorithm should be changed to work on a stream (processing small chunks, e.g. 4 KB) instead of manipulating the full text in memory (+40,000 KB). Why do you try to send 40 MB by email? Use FTP or another protocol :-p Or use another encoding (base64) to attach the text to the email. |
|||
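The suggestion to use base64 instead of quoted-printable can be sketched as follows. This is an illustration under stated assumptions, not code from the reporter's application: it targets the `email` package of a recent Python 3 (not the Python 2.5 discussed in this thread), the sample text is made up, and `latin1_b64` is a name chosen here.

```python
from email.charset import BASE64, Charset
from email.mime.text import MIMEText

# Assumption: a Charset object whose body encoding is overridden to base64,
# so that the iso-8859-1 body is not run through quoted-printable encoding.
latin1_b64 = Charset("iso-8859-1")
latin1_b64.body_encoding = BASE64

# Made-up sample text; a real report would be far larger.
text = "relatório de teste\n" * 3
msg = MIMEText(text, _subtype="plain", _charset=latin1_b64)

print(msg["Content-Transfer-Encoding"])
```

Base64 still builds the encoded body in memory, so this changes the encoding path rather than making the construction streaming.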
msg75164 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-24 12:48 | |
> Your text file is ~40 MB. Python may allocate multiple objects bigger than 40 MB to create the email content. The algorithm should be changed to work on a stream (processing small chunks, e.g. 4 KB) instead of manipulating the full text in memory (+40,000 KB).

The original text block is about 5 to 9 MB; it's a server report generated by pflogsum. When it reached our mailing list processing program (written by someone else in Python), the program froze while building the MIMEText object. Actually, no MemoryError is raised, just a sudden freeze of the running thread. Unfortunately, the submitted test script does not show the same behavior; maybe something else is freezing the software instead of raising the MemoryError. I have checked for try: ... except ...: pass blocks that could hide the problem, but found nothing. I have already limited the message size in Postfix, but the strange thing is why this happens on FreeBSD and not on Linux. |
|||
msg75165 - (view) | Author: STINNER Victor (vstinner) * | Date: 2008-10-24 13:03 | |
> The original text block is about 5 to 9 MB (...), the program froze while building the MIMEText object. Actually, no MemoryError is raised, just a sudden freeze of the running thread.

Can you give more details about the freeze?
- FreeBSD version?
- CPU, memory?
- Full Python version?

On "freeze", does the process use 0% or 100% of the CPU time? You can use the strace program to trace Python activity during the freeze. You might try my clone of strace, strace.py, which works on FreeBSD without the Linux emulation (but on FreeBSD, only i386 is supported): http://python-ptrace.hachoir.org/trac

> Unfortunately, the submitted test script does not show the same behavior; maybe something else is freezing the software instead of raising the MemoryError.

Can you try to isolate the bug? Remove some code, disable functions, etc. |
|||
msg75166 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-24 13:17 | |
- FreeBSD version?
FreeBSD 7.0-RELEASE

- CPU, memory?
CPU: 2 x Pentium III 1.133 GHz
Memory: 512 MB

- Full Python version?
Python 2.5.2 (r252:60911, Oct 2 2008, 10:03:50) [GCC 4.2.1 20070719 [FreeBSD]] on freebsd7

> On "freeze", does the process use 0% or 100% of the CPU time? You can use the strace program to trace Python activity during the freeze.

Usually 100%. But I have seen it use more (both CPUs), which I think means that more than one thread froze. I will download your trace program and do some tests with it. I'll try to collect some information using GDB too. |
|||
msg75167 - (view) | Author: STINNER Victor (vstinner) * | Date: 2008-10-24 13:24 | |
> Usually 100%. But I have seen it use more (both CPUs), which I think means that more than one thread froze.

Does the program finish its job after 10 minutes or 1 hour? Using all the CPU doesn't mean that Python is frozen; it's the opposite: Python is working hard to compute the result :-) |
|||
msg75168 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-24 13:39 | |
When I first saw the problem, the email system queue had been stopped for about 2 days (a weekend) :-( The email system limits the number of open threads, so it wasn't opening new threads either, and it was issuing many warnings about that in the logs. Anyway, I have already installed the ptrace tool and I'll start debugging when I come back from lunch. |
|||
msg75297 - (view) | Author: Sérgio Surkamp (surkamp) | Date: 2008-10-28 17:44 | |
OK. Something is very wrong with our code too. I have dumped the text that's causing the "freeze" and run it through the test case scripts. It was slow, but it worked. It seems that our application is eating too much memory on the server (about 60 MB for a 2.4 MB message), so it's obviously an application bug/leak. Unfortunately I can't submit the files for performance testing, because they may contain confidential information. As far as I can see in GDB, the Python process is looping inside these functions:

#0 0x2825798e in memcpy () from /lib/libc.so.7
#1 0x080a4607 in PyUnicodeUCS4_Concat ()
#2 0x080aec8d in PyEval_EvalFrameEx ()
#3 0x080b2c49 in PyEval_EvalCodeEx ()
#4 0x080b111a in PyEval_EvalFrameEx ()
#5 0x080b2c49 in PyEval_EvalCodeEx ()
#6 0x080b111a in PyEval_EvalFrameEx ()
#7 0x080b1f65 in PyEval_EvalFrameEx ()
#8 0x080b2c49 in PyEval_EvalCodeEx ()
#9 0x080b111a in PyEval_EvalFrameEx ()
#10 0x080b2c49 in PyEval_EvalCodeEx ()
#11 0x080eebd6 in PyClassMethod_New ()
#12 0x08059ef7 in PyObject_Call ()
#13 0x0805f341 in PyClass_IsSubclass ()
#14 0x08059ef7 in PyObject_Call ()
#15 0x080ac86c in PyEval_CallObjectWithKeywords ()
#16 0x080629d6 in PyInstance_New ()
#17 0x08059ef7 in PyObject_Call ()
#18 0x080af2bb in PyEval_EvalFrameEx ()
#19 0x080b2c49 in PyEval_EvalCodeEx ()
#20 0x080b111a in PyEval_EvalFrameEx ()
#21 0x080b1f65 in PyEval_EvalFrameEx ()
#22 0x080b1f65 in PyEval_EvalFrameEx ()
#23 0x080b1f65 in PyEval_EvalFrameEx ()
#24 0x080b2c49 in PyEval_EvalCodeEx ()
#25 0x080eec4e in PyClassMethod_New ()
#26 0x08059ef7 in PyObject_Call ()
#27 0x0805f341 in PyClass_IsSubclass ()
#28 0x08059ef7 in PyObject_Call ()
#29 0x080ac86c in PyEval_CallObjectWithKeywords ()
#30 0x080d4b58 in initthread ()
#31 0x28175acf in pthread_getprio () from /lib/libthr.so.3
#32 0x00000000 in ?? ()

Every memcpy call takes a long time to complete, but that seems to be a problem with GDB debugging, as GDB eats 80% to 95% of the CPU and Python just 1% or 2%.

How does Python's charset conversion work internally? Does it duplicate the original string on every character substitution? If so, wouldn't it be better to count the substitutions, calculate the amount of memory needed and make just one allocation for the new string, then copy the unmodified characters from the original to the new string and change the other characters as needed? |
|||
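The question above (copying on every substitution versus a single allocation) can be illustrated with a small sketch. This is not the email package's code; `substitute_by_concat`, `substitute_by_join` and the `table` mapping are hypothetical names used only to contrast the two strategies.

```python
def substitute_by_concat(text, table):
    """Build the result with repeated concatenation.

    Each `+` may copy everything accumulated so far, which is the
    pattern suggested by memcpy under PyUnicodeUCS4_Concat in the
    backtrace.
    """
    out = ""
    for ch in text:
        out = out + table.get(ch, ch)
    return out


def substitute_by_join(text, table):
    """Collect the pieces first, then build the result once with join()."""
    return "".join(table.get(ch, ch) for ch in text)


# Hypothetical substitution table, loosely modelled on quoted-printable.
table = {"\n": "\r\n", "=": "=3D"}
text = "a=b\nc"
by_concat = substitute_by_concat(text, table)
by_join = substitute_by_join(text, table)
print(repr(by_join))
```

Both functions return the same string. The second one lets `str.join` size the result and allocate it once, which is close to the "count, allocate once, then copy" approach proposed in the message.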
msg77496 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2008-12-10 08:32 | |
IIUC, no patch has been proposed. So retargeting it to later versions. |
|||
msg127182 - (view) | Author: STINNER Victor (vstinner) * | Date: 2011-01-27 12:34 | |
> Something is very wrong with our code too. I have dumped the text that's causing the "freeze" and run it through the test case scripts. It was slow, but it worked.

I retried test_MIMEText.tar.bz2 on FreeBSD 8.0 with 640 MB of memory: the program takes ~5 minutes, but it doesn't fail (no memory error or crash). I suppose that the crash cannot be reproduced with the test_MIMEText.tar.bz2 example, only with the full program. Because I don't have access to the full program, I am unable to reproduce the bug, and because there has been no activity on this issue for 2 years, I am closing it. If you have more information (especially a short script to reproduce the crash), reopen the issue or create a new one (maybe more specific, e.g. a patch for MIMEText to use less memory). |
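The idea raised earlier in the thread of processing small chunks (around 4 KB) instead of the whole text at once could look roughly like this for base64. It is a sketch under stated assumptions, not a patch for MIMEText: `b64_encode_stream` is a hypothetical helper, and it assumes the source stream returns full chunks until the end, as `io.BytesIO` and regular files do.

```python
import base64
import io


def b64_encode_stream(src, dst, chunk_size=57 * 72):
    """Base64-encode the binary stream `src` into `dst` chunk by chunk.

    chunk_size is a multiple of 57 because base64.encodebytes() emits one
    76-character line per 57 input bytes; keeping chunks aligned means the
    concatenated output equals encoding the whole input in one call.
    The default, 57 * 72 = 4104 bytes, is roughly 4 KB.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(base64.encodebytes(chunk))


data = bytes(range(256)) * 100          # about 25 KB of sample data
encoded = io.BytesIO()
b64_encode_stream(io.BytesIO(data), encoded)
print(len(encoded.getvalue()))
```

Only one chunk and its encoded form are held by the loop at any time; the output stream can be a file, so the full encoded body never has to exist as a single string.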
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:40 | admin | set | github: 48427 |
2011-01-27 12:34:20 | vstinner | set | status: open -> closed nosy: + vstinner messages: + msg127182 resolution: not a bug |
2009-01-17 01:40:39 | vstinner | set | nosy: - vstinner |
2008-12-10 08:32:42 | loewis | set | nosy: + loewis messages: + msg77496 versions: + Python 2.7, - Python 2.5, Python 2.5.3 |
2008-10-28 17:44:44 | surkamp | set | messages: + msg75297 |
2008-10-24 13:39:47 | surkamp | set | messages: + msg75168 |
2008-10-24 13:24:27 | vstinner | set | messages: + msg75167 |
2008-10-24 13:17:03 | surkamp | set | messages: + msg75166 |
2008-10-24 13:03:23 | vstinner | set | messages: + msg75165 |
2008-10-24 12:48:15 | surkamp | set | messages: + msg75164 |
2008-10-24 11:27:20 | vstinner | set | nosy: + vstinner messages: + msg75160 |
2008-10-24 10:21:04 | rpetrov | set | messages: + msg75158 |
2008-10-23 12:45:06 | surkamp | set | messages: + msg75142 |
2008-10-22 21:46:14 | rpetrov | set | nosy: + rpetrov messages: + msg75112 |
2008-10-22 18:40:38 | surkamp | create |