This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: HTTPS reads can block when content length not available and timeout set.
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Lukasa, Zachary Salzbank, demian.brecht, piotr.dobrogost, pitrou, python-dev
Priority: normal Keywords: patch

Created on 2015-03-03 16:56 by Lukasa, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name             Uploaded
ssl_read_fixup.patch  pitrou, 2015-03-04 00:13
Messages (6)
msg237151 - (view) Author: Cory Benfield (Lukasa) * Date: 2015-03-03 16:56
Initially reported on the requests bug list at https://github.com/kennethreitz/requests/issues/2467

When a remote web server sends a non-chunked response that does not have a content length, it is possible to get http.client to hang on a read. Hitting this requires a specific set of circumstances:

- Python 3 (this bug is not seen on Python 2 in my testing)
- HTTPS (using plaintext HTTP resolves the problem)
- A socket timeout must be set (leaving it unset, i.e. leaving the socket in blocking mode, resolves this problem)
- The server must not send a Content-Length or set Transfer-Encoding: chunked.
- The read size must not evenly divide the content's length.
- The timeout must be longer than the time the server is willing to keep the connection open (otherwise an exception is thrown).
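
The header condition in the list above can be sketched as a small check. This is only an illustration: the helper name is_eof_delimited and the plain-dict header representation are assumptions made for this example, not part of http.client.

```python
def is_eof_delimited(headers):
    """Return True when the body length is only known by reading to EOF.

    `headers` is assumed to be a plain dict of response headers
    (a hypothetical stand-in for whatever the HTTP library exposes).
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    has_length = "content-length" in lowered
    chunked = "chunked" in lowered.get("transfer-encoding", "").lower()
    # Neither framing mechanism is present: the server signals the end
    # of the body by closing the connection.
    return not has_length and not chunked
```

Only responses for which this returns True fall into the case described here; the other conditions (HTTPS, a socket timeout, the read size) still have to hold as well.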

The following code can be used as a sample to demonstrate the bug:

import http.client
import time

def test(connection_class=http.client.HTTPSConnection, timeout=10, read_size=7):
    start = time.time()
    c = connection_class('sleepy-reaches-6892.herokuapp.com')
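    c.connect()
    # Leaving the timeout unset keeps the socket in blocking mode.
    if timeout is not None:
        c.sock.settimeout(timeout)
    c.request('GET', '/test')
    r = c.getresponse()
    # Read fixed-size pieces until read() returns an empty bytestring.
    while True:
        data = r.read(read_size)
        if not data:
            break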
    print('Finished in {}'.format(time.time() - start))


Below are the results from several different runs:

test(): Finished in 4.8294830322265625
test(connection_class=http.client.HTTPConnection): Finished in 0.3060309886932373
test(timeout=None): Finished in 0.6070599555969238
test(read_size=2): Finished in 0.6600658893585205

As you can see, only this particular set of features causes the bug. Note that if you change the URL to one that does return a Content-Length (e.g. http2bin.org/get), this bug also goes away.

This is a really weird edge case bug. There's some extensive investigation over at the requests bug report, but I've not been able to convincingly point at the problem yet.
msg237169 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-03 23:53
Reproducing seems a bit irregular. Note that the last bytestring (the empty bytestring) is what takes time to read.

Also note that HTTPResponse is a buffered I/O object, so normally you don't need to read up to the empty string. You can stop when you get fewer bytes than you asked for.
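
A minimal sketch of the stop-on-short-read loop suggested above, assuming the object behaves like a buffered reader that only returns fewer bytes than requested at end of stream. The helper name read_until_short is hypothetical, and io.BytesIO stands in for a real response object.

```python
import io


def read_until_short(resp, read_size=7):
    """Read `resp` in pieces of `read_size` bytes, stopping at the first short read."""
    chunks = []
    while True:
        data = resp.read(read_size)
        chunks.append(data)
        # A short (or empty) read is treated as end of stream, so no
        # extra read() is issued just to receive the empty bytestring.
        if len(data) < read_size:
            break
    return b"".join(chunks)


# io.BytesIO is used here purely as a stand-in for an HTTPResponse.
body = read_until_short(io.BytesIO(b"abcdefghij"), read_size=4)
```

When the body length is an exact multiple of read_size, the loop still performs one final read that returns an empty bytestring, which matches the observation that the stall only appears when the read size does not evenly divide the content's length.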
msg237170 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-04 00:13
Varying reproducibility may have to do with sleepy-reaches-6892.herokuapp.com resolving to different endpoints (that domain name has a stupidly small TTL, by the way).

Anyway, for an unknown reason the following patch seems to fix the issue.
msg237205 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-04 20:00
This is now fixed in all branches. Thanks for the report!
msg237206 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-03-04 20:02
New changeset 371cf371a6a1 by Antoine Pitrou in branch '2.7':
Issue #23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
https://hg.python.org/cpython/rev/371cf371a6a1
msg237208 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-03-04 20:19
New changeset 01cf9ce75eda by Antoine Pitrou in branch '3.4':
Issue #23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
https://hg.python.org/cpython/rev/01cf9ce75eda

New changeset fc0201ccbcd4 by Antoine Pitrou in branch 'default':
Issue #23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
https://hg.python.org/cpython/rev/fc0201ccbcd4
History
Date User Action Args
2022-04-11 14:58:13 admin set github: 67764
2015-03-04 20:19:03 python-dev set messages: + msg237208
2015-03-04 20:02:49 python-dev set nosy: + python-dev
    messages: + msg237206
2015-03-04 20:00:18 pitrou set status: open -> closed
    versions: + Python 2.7
    messages: + msg237205
    resolution: fixed
    stage: resolved
2015-03-04 00:13:21 pitrou set files: + ssl_read_fixup.patch
    keywords: + patch
    messages: + msg237170
2015-03-03 23:53:45 pitrou set nosy: + pitrou
    messages: + msg237169
    versions: + Python 3.5
2015-03-03 21:54:27 piotr.dobrogost set nosy: + piotr.dobrogost
2015-03-03 18:49:51 Zachary Salzbank set nosy: + Zachary Salzbank
2015-03-03 18:17:09 demian.brecht set nosy: + demian.brecht
2015-03-03 16:56:13 Lukasa create