msg289954 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-03-22 01:10 |
When upgrading to 3.5.3 we noticed that the requests module was leaking memory rather quickly. This led to me logging the issue: https://github.com/kennethreitz/requests/issues/3933. After more investigation I've found that the leak is caused by the raw python SSL sockets. I've created a test file here: https://gist.github.com/thehesiod/ef79dd77e2df7a0a7893dfea6325d30a which allows you to reproduce the leak with raw python ssl socket (CLIENT_TYPE = ClientType.RAW), aiohttp or requests. They all leak in a similar way due to their use of the python SSL socket objects. I tried tracing the memory usage with tracemalloc but nothing interesting popped up so I believe this is a leak in the native code.
A docker cloud image is available here: amohr/testing:stretch_request_leak based on:
```
FROM debian:stretch
COPY request_https_leak.py /tmp/request_https_leak.py
RUN apt-get update && \
apt-get install -y python3.5 python3-pip git
RUN python3 -m pip install requests git+git://github.com/thehesiod/pyca.git@fix-py3#egg=calib setproctitle requests psutil
```
I believe this issue was introduced in python 3.5.3 as we're not seeing the leak with 3.5.2. Also I haven't verified yet if this happens on non-debian systems. I'll update if I have any more info. I believe 3.6 is similarly impacted but am not 100% certain yet.
|
msg289956 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-03-22 01:40 |
validated 3.6 in fedora is affected as well, see github bug for charts. So it seems all 3.5.3+ versions are affected. I'm guessing it was introduced in one of the SSL changes in 3.5.3: https://docs.python.org/3.5/whatsnew/changelog.html#python-3-5-3
|
msg289960 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-03-22 02:38 |
adding valgrind log of 3.5.3 on debian: jessie
|
msg289961 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-03-22 02:41 |
interestingly the valgrind run doesn't show a leak in the profile
|
msg291427 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-10 12:21 |
Can you record sys.getallocatedblocks() to see whether it grows continuously?
|
msg291441 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-10 18:38 |
@pitrou: sys.getallocatedblocks does not seem to increase
|
msg291444 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-10 18:49 |
I see. This may mean the leak is in memory that's not managed directly by Python (e.g. some OpenSSL structure).
Is there a way to reproduce without third-party libraries such as requests?
|
msg291447 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-10 19:03 |
yes, in the gist I created you can switch between the various clients, by default right now it uses raw sockets.
|
msg291497 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 14:14 |
After adapting your test script to run against a local openssl server (`openssl s_server -www`), I can't see a single leak: the process is peaking at 20860 KB RSS. This is with Python 3.5 tip.
Does it need a specific server to test against to showcase the leak?
|
msg291500 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-11 16:04 |
the interesting part is it doesn't leak with a local https server, it appears to need to be a remove server.
|
msg291501 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 16:14 |
Is there a fast enough remote server that shows the leak? I've tested with my own remote server (https://pitrou.net/), but it doesn't leak.
|
msg291502 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-11 16:15 |
ya, my sample script hits google.com <http://google.com/>, it's pretty fast. It just does a "HEAD".
> On Apr 11, 2017, at 9:14 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
>
> Antoine Pitrou added the comment:
>
> Is there a fast enough remote server that shows the leak? I've tested with my own remote server (https://pitrou.net/), but it doesn't leak.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue29870>
> _______________________________________
|
msg291503 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 16:19 |
Google is not very fast here (a couple of requests / sec at most). How many requests does it take to see a clear tendency?
|
msg291504 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-11 16:21 |
see graphs here: https://github.com/kennethreitz/requests/issues/3933, x-axis is number of requests not what it says (seconds).
|
msg291505 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 17:19 |
Ok, thank you. I've tweaked the script to remove most threads and use maps.google.com (which is faster here), and I managed to bisect the leak to deduce that the offending changeset is 598894ff48e9c1171cb2ec1c798235826a75c7e0.
|
msg291508 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 17:57 |
The following addition fixes the leak:
diff --git a/Modules/_ssl.c b/Modules/_ssl.c
index bb40051..8f5facd 100644
--- a/Modules/_ssl.c
+++ b/Modules/_ssl.c
@@ -1203,6 +1203,8 @@ _get_crl_dp(X509 *certificate) {
Py_XDECREF(lst);
#if OPENSSL_VERSION_NUMBER < 0x10001000L
sk_DIST_POINT_free(dps);
+#else
+ CRL_DIST_POINTS_free(dps);
#endif
return res;
}
Christian, what do you think?
|
msg291509 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2017-04-11 18:14 |
CRL_DIST_POINTS_free() should be available in all supported OpenSSL versions. The function is defined by DECLARE_ASN1_FUNCTIONS(CRL_DIST_POINTS).
|
msg291510 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 18:16 |
So we should use it instead of sk_DIST_POINT_free()? I'd like to minimize potential breakage here.
|
msg291511 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2017-04-11 19:11 |
awesome! Thanks for finding a proposing fix pitrou! btw I found an example of freeing this structure here: http://www.zedwood.com/article/c-openssl-parse-x509-certificate-pem
|
msg291512 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2017-04-11 19:15 |
Yes, I'm currently testing the change with a bunch of OpenSSL and LibreSSL versions.
By the way the memory issue can be reproduced with any certificate that contains a CRL distribution point. Letsencrypt certs don't have a CRL DP. I guess Alexander's test cert doesn't have a CRL DP either. The Nokia test cert in our test suite contains one.
---
import _ssl
import sys
PEM = 'Lib/test/nokia.pem'
def mem():
with open('/proc/self/status') as f:
for line in f:
if line.startswith('RssAnon'):
print(line, end='')
for i in range(10000):
if i % 1000 == 0:
mem()
d = _ssl._test_decode_cert(PEM)
assert d['crlDistributionPoints']
mem()
---
Without fix:
$ ./python t.py
RssAnon: 4376 kB
RssAnon: 4840 kB
RssAnon: 5224 kB
RssAnon: 5608 kB
RssAnon: 6120 kB
RssAnon: 6504 kB
RssAnon: 6888 kB
RssAnon: 7272 kB
RssAnon: 7656 kB
RssAnon: 8040 kB
RssAnon: 8424 kB
With fix:
$ ./python t.py
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
|
msg291513 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2017-04-11 20:00 |
Antoine, you might find multissl.py helpful. I wrote a script to automate testing with multiple versions of OpenSSL and libressl. The first time it takes about half an hour to download, compile and install all versions locally. https://github.com/tiran/multissl/blob/master/multissl.py
|
msg291518 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2017-04-11 22:07 |
> Antoine, you might find multissl.py helpful.
Very nice, thank you!
|
msg291692 - (view) |
Author: Mariatta (Mariatta) *  |
Date: 2017-04-15 01:44 |
I believe this was fixed in https://bugs.python.org/issue29738, so I'm closing this.
Thanks :)
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:44 | admin | set | github: 74056 |
2017-06-09 22:42:09 | yselivanov | set | pull_requests:
+ pull_request2111 |
2017-06-09 22:41:45 | yselivanov | set | pull_requests:
+ pull_request2110 |
2017-06-09 22:32:24 | yselivanov | set | pull_requests:
+ pull_request2109 |
2017-04-15 01:44:09 | Mariatta | set | status: open -> closed
nosy:
+ Mariatta messages:
+ msg291692
resolution: fixed stage: resolved |
2017-04-11 22:07:02 | pitrou | set | messages:
+ msg291518 |
2017-04-11 20:00:14 | christian.heimes | set | messages:
+ msg291513 |
2017-04-11 19:33:53 | christian.heimes | set | pull_requests:
+ pull_request1229 |
2017-04-11 19:15:21 | christian.heimes | set | messages:
+ msg291512 |
2017-04-11 19:11:46 | thehesiod | set | messages:
+ msg291511 |
2017-04-11 18:16:48 | pitrou | set | messages:
+ msg291510 |
2017-04-11 18:14:26 | christian.heimes | set | messages:
+ msg291509 |
2017-04-11 17:57:35 | pitrou | set | priority: normal -> high
messages:
+ msg291508 versions:
+ Python 2.7, Python 3.7 |
2017-04-11 17:19:02 | pitrou | set | messages:
+ msg291505 |
2017-04-11 16:21:23 | thehesiod | set | messages:
+ msg291504 |
2017-04-11 16:19:51 | pitrou | set | messages:
+ msg291503 |
2017-04-11 16:15:25 | thehesiod | set | messages:
+ msg291502 |
2017-04-11 16:14:05 | pitrou | set | messages:
+ msg291501 |
2017-04-11 16:04:35 | thehesiod | set | messages:
+ msg291500 |
2017-04-11 14:14:49 | pitrou | set | messages:
+ msg291497 |
2017-04-10 19:03:34 | thehesiod | set | messages:
+ msg291447 |
2017-04-10 18:49:56 | pitrou | set | messages:
+ msg291444 |
2017-04-10 18:38:20 | thehesiod | set | messages:
+ msg291441 |
2017-04-10 12:21:35 | pitrou | set | nosy:
+ pitrou messages:
+ msg291427
|
2017-04-10 09:03:26 | misg | set | pull_requests:
+ pull_request1218 |
2017-03-22 08:17:19 | Lukasa | set | nosy:
+ Lukasa
|
2017-03-22 02:41:43 | thehesiod | set | messages:
+ msg289961 |
2017-03-22 02:38:45 | thehesiod | set | files:
+ valgrind.log.gz
messages:
+ msg289960 |
2017-03-22 01:40:59 | thehesiod | set | messages:
+ msg289956 versions:
+ Python 3.6 |
2017-03-22 01:10:31 | thehesiod | create | |