classification
Title: ssl socket leak
Type: resource usage Stage: resolved
Components: SSL Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: Lukasa, Mariatta, christian.heimes, pitrou, thehesiod
Priority: high Keywords:

Created on 2017-03-22 01:10 by thehesiod, last changed 2017-06-09 22:42 by yselivanov. This issue is now closed.

Files
File name Uploaded Description Edit
valgrind.log.gz thehesiod, 2017-03-22 02:38
Pull Requests
URL Status Linked Edit
PR 981 merged misg, 2017-04-10 09:03
PR 1085 closed christian.heimes, 2017-04-11 19:33
PR 2047 merged yselivanov, 2017-06-09 22:32
PR 2048 merged yselivanov, 2017-06-09 22:41
PR 2049 merged yselivanov, 2017-06-09 22:42
Messages (23)
msg289954 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-03-22 01:10
When upgrading to 3.5.3 we noticed that the requests module was leaking memory rather quickly.  This led to me logging the issue: https://github.com/kennethreitz/requests/issues/3933.  After more investigation I've found that the leak is caused by the raw python SSL sockets.  I've created a test file here: https://gist.github.com/thehesiod/ef79dd77e2df7a0a7893dfea6325d30a which allows you to reproduce the leak with raw python ssl socket (CLIENT_TYPE = ClientType.RAW), aiohttp or requests.  They all leak in a similar way due to their use of the python SSL socket objects.  I tried tracing the memory usage with tracemalloc but nothing interesting popped up so I believe this is a leak in the native code.

A docker cloud image is available here: amohr/testing:stretch_request_leak based on:
```
FROM debian:stretch

COPY request_https_leak.py /tmp/request_https_leak.py

RUN apt-get update && \
    apt-get install -y python3.5 python3-pip git

RUN python3 -m pip install requests git+git://github.com/thehesiod/pyca.git@fix-py3#egg=calib setproctitle requests psutil
```

I believe this issue was introduced in python 3.5.3 as we're not seeing the leak with 3.5.2.  Also I haven't verified yet if this happens on non-debian systems.  I'll update if I have any more info.  I believe 3.6 is similarly impacted but am not 100% certain yet.
msg289956 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-03-22 01:40
validated 3.6 in fedora is affected as well, see github bug for charts.  So it seems all 3.5.3+ versions are affected.  I'm guessing it was introduced in one of the SSL changes in 3.5.3: https://docs.python.org/3.5/whatsnew/changelog.html#python-3-5-3
msg289960 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-03-22 02:38
adding valgrind log of 3.5.3 on debian: jessie
msg289961 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-03-22 02:41
interestingly the valgrind run doesn't show a leak in the profile
msg291427 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-10 12:21
Can you record sys.getallocatedblocks() to see whether it grows continuously?
msg291441 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-10 18:38
@pitrou: sys.getallocatedblocks does not seem to increase
msg291444 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-10 18:49
I see.  This may mean the leak is in memory that's not managed directly by Python (e.g. some OpenSSL structure).

Is there a way to reproduce without third-party libraries such as requests?
msg291447 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-10 19:03
yes, in the gist I created you can switch between the various clients, by default right now it uses raw sockets.
msg291497 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 14:14
After adapting your test script to run against a local openssl server (`openssl s_server -www`), I can't see a single leak: the process is peaking at 20860 KB RSS.  This is with Python 3.5 tip.

Does it need a specific server to test against to showcase the leak?
msg291500 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-11 16:04
the interesting part is it doesn't leak with a local https server, it appears to need to be a remove server.
msg291501 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 16:14
Is there a fast enough remote server that shows the leak? I've tested with my own remote server (https://pitrou.net/), but it doesn't leak.
msg291502 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-11 16:15
ya, my sample script hits google.com <http://google.com/>, it's pretty fast.  It just does a "HEAD".

> On Apr 11, 2017, at 9:14 AM, Antoine Pitrou <report@bugs.python.org> wrote:
> 
> 
> Antoine Pitrou added the comment:
> 
> Is there a fast enough remote server that shows the leak? I've tested with my own remote server (https://pitrou.net/), but it doesn't leak.
> 
> ----------
> 
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue29870>
> _______________________________________
msg291503 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 16:19
Google is not very fast here (a couple of requests / sec at most).  How many requests does it take to see a clear tendency?
msg291504 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-11 16:21
see graphs here: https://github.com/kennethreitz/requests/issues/3933, x-axis is number of requests not what it says (seconds).
msg291505 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 17:19
Ok, thank you. I've tweaked the script to remove most threads and use maps.google.com (which is faster here), and I managed to bisect the leak to deduce that the offending changeset is 598894ff48e9c1171cb2ec1c798235826a75c7e0.
msg291508 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 17:57
The following addition fixes the leak:

diff --git a/Modules/_ssl.c b/Modules/_ssl.c
index bb40051..8f5facd 100644
--- a/Modules/_ssl.c
+++ b/Modules/_ssl.c
@@ -1203,6 +1203,8 @@ _get_crl_dp(X509 *certificate) {
     Py_XDECREF(lst);
 #if OPENSSL_VERSION_NUMBER < 0x10001000L
     sk_DIST_POINT_free(dps);
+#else
+    CRL_DIST_POINTS_free(dps);
 #endif
     return res;
 }


Christian, what do you think?
msg291509 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-04-11 18:14
CRL_DIST_POINTS_free() should be available in all supported OpenSSL versions. The function is defined by DECLARE_ASN1_FUNCTIONS(CRL_DIST_POINTS).
msg291510 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 18:16
So we should use it instead of sk_DIST_POINT_free()? I'd like to minimize potential breakage here.
msg291511 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-04-11 19:11
awesome! Thanks for finding a proposing fix pitrou!  btw I found an example of freeing this structure here: http://www.zedwood.com/article/c-openssl-parse-x509-certificate-pem
msg291512 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-04-11 19:15
Yes, I'm currently testing the change with a bunch of OpenSSL and LibreSSL versions.

By the way the memory issue can be reproduced with any certificate that contains a CRL distribution point. Letsencrypt certs don't have a CRL DP. I guess Alexander's test cert doesn't have a CRL DP either. The Nokia test cert in our test suite contains one.

---
import _ssl
import sys

PEM = 'Lib/test/nokia.pem'

def mem():
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('RssAnon'):
                print(line, end='')

for i in range(10000):
    if i % 1000 == 0:
        mem()
    d = _ssl._test_decode_cert(PEM)
    assert d['crlDistributionPoints']

mem()
---

Without fix:

$ ./python t.py 
RssAnon:            4376 kB
RssAnon:            4840 kB
RssAnon:            5224 kB
RssAnon:            5608 kB
RssAnon:            6120 kB
RssAnon:            6504 kB
RssAnon:            6888 kB
RssAnon:            7272 kB
RssAnon:            7656 kB
RssAnon:            8040 kB
RssAnon:            8424 kB


With fix:

$ ./python t.py 
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
RssAnon:            4376 kB
msg291513 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-04-11 20:00
Antoine, you might find multissl.py helpful. I wrote a script to automate testing with multiple versions of OpenSSL and libressl. The first time it takes about half an hour to download, compile and install all versions locally. https://github.com/tiran/multissl/blob/master/multissl.py
msg291518 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-04-11 22:07
> Antoine, you might find multissl.py helpful.

Very nice, thank you!
msg291692 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-15 01:44
I believe this was fixed in https://bugs.python.org/issue29738, so I'm closing this.
Thanks :)
History
Date User Action Args
2017-06-09 22:42:09yselivanovsetpull_requests: + pull_request2111
2017-06-09 22:41:45yselivanovsetpull_requests: + pull_request2110
2017-06-09 22:32:24yselivanovsetpull_requests: + pull_request2109
2017-04-15 01:44:09Mariattasetstatus: open -> closed

nosy: + Mariatta
messages: + msg291692

resolution: fixed
stage: resolved
2017-04-11 22:07:02pitrousetmessages: + msg291518
2017-04-11 20:00:14christian.heimessetmessages: + msg291513
2017-04-11 19:33:53christian.heimessetpull_requests: + pull_request1229
2017-04-11 19:15:21christian.heimessetmessages: + msg291512
2017-04-11 19:11:46thehesiodsetmessages: + msg291511
2017-04-11 18:16:48pitrousetmessages: + msg291510
2017-04-11 18:14:26christian.heimessetmessages: + msg291509
2017-04-11 17:57:35pitrousetpriority: normal -> high

messages: + msg291508
versions: + Python 2.7, Python 3.7
2017-04-11 17:19:02pitrousetmessages: + msg291505
2017-04-11 16:21:23thehesiodsetmessages: + msg291504
2017-04-11 16:19:51pitrousetmessages: + msg291503
2017-04-11 16:15:25thehesiodsetmessages: + msg291502
2017-04-11 16:14:05pitrousetmessages: + msg291501
2017-04-11 16:04:35thehesiodsetmessages: + msg291500
2017-04-11 14:14:49pitrousetmessages: + msg291497
2017-04-10 19:03:34thehesiodsetmessages: + msg291447
2017-04-10 18:49:56pitrousetmessages: + msg291444
2017-04-10 18:38:20thehesiodsetmessages: + msg291441
2017-04-10 12:21:35pitrousetnosy: + pitrou
messages: + msg291427
2017-04-10 09:03:26misgsetpull_requests: + pull_request1218
2017-03-22 08:17:19Lukasasetnosy: + Lukasa
2017-03-22 02:41:43thehesiodsetmessages: + msg289961
2017-03-22 02:38:45thehesiodsetfiles: + valgrind.log.gz

messages: + msg289960
2017-03-22 01:40:59thehesiodsetmessages: + msg289956
versions: + Python 3.6
2017-03-22 01:10:31thehesiodcreate