HTTPS reads can block when content length not available and timeout set. #67764

Closed
Lukasa mannequin opened this issue Mar 3, 2015 · 6 comments
Labels
stdlib Python modules in the Lib dir

Comments

@Lukasa
Mannequin

Lukasa mannequin commented Mar 3, 2015

BPO 23576
Nosy @pitrou, @demianbrecht, @Lukasa
Files
  • ssl_read_fixup.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-03-04.20:00:18.325>
    created_at = <Date 2015-03-03.16:56:13.386>
    labels = ['library']
    title = 'HTTPS reads can block when content length not available and timeout set.'
    updated_at = <Date 2015-03-04.20:19:03.067>
    user = 'https://github.com/Lukasa'

    bugs.python.org fields:

    activity = <Date 2015-03-04.20:19:03.067>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-03-04.20:00:18.325>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2015-03-03.16:56:13.386>
    creator = 'Lukasa'
    dependencies = []
    files = ['38326']
    hgrepos = []
    issue_num = 23576
    keywords = ['patch']
    message_count = 6.0
    messages = ['237151', '237169', '237170', '237205', '237206', '237208']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'python-dev', 'piotr.dobrogost', 'demian.brecht', 'Lukasa', 'Zachary Salzbank']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue23576'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @Lukasa
    Mannequin Author

    Lukasa mannequin commented Mar 3, 2015

    Initially reported on the requests bug list at https://github.com/kennethreitz/requests/issues/2467

    When a remote web server sends a non-chunked response that does not have a content length, it is possible to get http.client to hang on a read. Triggering the hang requires a specific set of circumstances:

    • Python 3 (this bug is not seen on Python 2 in my testing)
    • HTTPS (using plaintext HTTP resolves the problem)
    • A socket timeout must be set (leaving it unset, i.e. leaving the socket in blocking mode, resolves this problem)
    • The server must not send a Content-Length header or set Transfer-Encoding: chunked.
    • The read size must not evenly divide the content's length.
    • The timeout must be longer than the time the server is willing to keep the connection open (otherwise an exception is thrown).
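
As an illustration of the server-side condition in the list above, the sketch below checks a set of parsed headers for the "no Content-Length, not chunked" case. `is_eof_delimited` is a hypothetical helper, not part of http.client; it assumes the headers are available as an `email.message.Message`-style object, which is the kind of object `HTTPResponse.headers` provides.

```python
from email.message import Message


def is_eof_delimited(headers):
    """Return True if only connection close signals the end of the body.

    Hypothetical helper: true when there is no Content-Length header and
    Transfer-Encoding does not include "chunked".
    """
    has_length = headers.get("Content-Length") is not None
    encoding = headers.get("Transfer-Encoding", "")
    is_chunked = "chunked" in encoding.lower()
    return not has_length and not is_chunked


# Build three header sets by hand to cover each case.
plain = Message()
plain["Content-Type"] = "text/plain"

sized = Message()
sized["Content-Length"] = "20"

chunked = Message()
chunked["Transfer-Encoding"] = "chunked"

print(is_eof_delimited(plain), is_eof_delimited(sized), is_eof_delimited(chunked))
```

Only the first header set matches the kind of response described in this report.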

    The following code can be used as a sample to demonstrate the bug:

    import http.client
    import time

    def test(connection_class=http.client.HTTPSConnection, timeout=10, read_size=7):
        start = time.time()
        c = connection_class('sleepy-reaches-6892.herokuapp.com')
        c.connect()
        # A socket timeout is one of the conditions needed to trigger the hang.
        if timeout is not None:
            c.sock.settimeout(timeout)
        c.request('GET', '/test')
        r = c.getresponse()
        # Read fixed-size pieces until an empty bytestring signals EOF.
        while True:
            data = r.read(read_size)
            if not data:
                break
        print('Finished in {}'.format(time.time() - start))

    Below are the results from several different runs:

    test(): Finished in 4.8294830322265625
    test(connection_class=http.client.HTTPConnection): Finished in 0.3060309886932373
    test(timeout=None): Finished in 0.6070599555969238
    test(read_size=2): Finished in 0.6600658893585205

    As you can see, only this particular set of features causes the bug. Note that if you change the URL to one that does return a Content-Length (e.g. http2bin.org/get), this bug also goes away.
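
For experimenting without the remote test host, a response of the same shape (no Content-Length, not chunked, ended by closing the connection) can be produced locally. The following is a minimal sketch over plain HTTP, so by the conditions listed above it is not expected to show the hang itself; `serve_once`, `fetch`, and `BODY` are hypothetical names, and `socket.create_server` assumes Python 3.8 or later.

```python
import http.client
import socket
import threading

BODY = b"x" * 20  # arbitrary payload; 20 is not a multiple of 7


def serve_once(listener):
    """Accept one connection, send a response with no length info, close."""
    conn, _ = listener.accept()
    with conn:
        # Read the request until the blank line that ends the headers.
        request = b""
        while b"\r\n\r\n" not in request:
            part = conn.recv(1024)
            if not part:
                break
            request += part
        # No Content-Length and no Transfer-Encoding: EOF marks the end.
        conn.sendall(b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n" + BODY)


def fetch(read_size=7):
    """Start the one-shot server and read its response in small pieces."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    thread.start()
    c = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    try:
        c.request("GET", "/test")
        r = c.getresponse()
        received = b""
        while True:
            data = r.read(read_size)
            if not data:
                break
            received += data
    finally:
        c.close()
        thread.join(timeout=5)
        listener.close()
    return received


received = fetch()
print(len(received))
```

Reproducing the HTTPS case would additionally need a TLS-wrapped listener and a certificate, which this sketch leaves out.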

    This is a really weird edge case bug. There's some extensive investigation over at the requests bug report, but I've not been able to convincingly point at the problem yet.

    @Lukasa Lukasa mannequin added the stdlib Python modules in the Lib dir label Mar 3, 2015
    @pitrou
    Member

    pitrou commented Mar 3, 2015

    Reproducing seems a bit irregular. Note that the last bytestring (the empty bytestring) is what takes time to read.

    Also note that HTTPResponse is a buffered I/O object, so normally you don't need to read up to the empty bytestring. You can stop when you get fewer bytes than you asked for.
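
A minimal sketch of that read loop, using `io.BytesIO` as a stand-in for the response object. `read_until_short` is a hypothetical helper; it assumes `read(n)` returns fewer than `n` bytes only at end of stream, as described above for buffered I/O objects.

```python
import io


def read_until_short(stream, read_size):
    """Collect chunks from stream, stopping at the first short read.

    Assumes stream.read(n) only returns fewer than n bytes at end of
    stream, which holds for buffered readers such as io.BytesIO.
    """
    chunks = []
    while True:
        data = stream.read(read_size)
        if data:
            chunks.append(data)
        if len(data) < read_size:
            # Short (or empty) read: end of stream, no further read needed.
            break
    return chunks


# 20 bytes read 7 at a time: 7 + 7 + 6, and the 6-byte read ends the loop.
chunks = read_until_short(io.BytesIO(b"x" * 20), 7)
print([len(c) for c in chunks])
```

When the body length is an exact multiple of the read size, the loop still issues one final read that returns an empty bytestring.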

    @pitrou
    Member

    pitrou commented Mar 4, 2015

    The varying reproducibility may have to do with sleepy-reaches-6892.herokuapp.com resolving to different endpoints (that domain name has a stupidly small TTL, by the way).

    Anyway, for an unknown reason the following patch seems to fix the issue.

    @pitrou
    Member

    pitrou commented Mar 4, 2015

    This is now fixed in all branches. Thanks for the report!

    @pitrou pitrou closed this as completed Mar 4, 2015
    @python-dev
    Mannequin

    python-dev mannequin commented Mar 4, 2015

    New changeset 371cf371a6a1 by Antoine Pitrou in branch '2.7':
    Issue bpo-23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
    https://hg.python.org/cpython/rev/371cf371a6a1

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 4, 2015

    New changeset 01cf9ce75eda by Antoine Pitrou in branch '3.4':
    Issue bpo-23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
    https://hg.python.org/cpython/rev/01cf9ce75eda

    New changeset fc0201ccbcd4 by Antoine Pitrou in branch 'default':
    Issue bpo-23576: Avoid stalling in SSL reads when EOF has been reached in the SSL layer but the underlying connection hasn't been closed.
    https://hg.python.org/cpython/rev/fc0201ccbcd4

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022