classification
Title: REMOTE_USER and Remote-User collision in wsgiref
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Alex.Raitz, pje
Priority: normal
Keywords:

Created on 2010-12-21 22:46 by Alex.Raitz, last changed 2011-01-05 00:24 by Alex.Raitz. This issue is now closed.

Messages (7)
msg124466 - Author: Alex Raitz (Alex.Raitz) Date: 2010-12-21 22:46
Clients can overwrite the value of the 'REMOTE_USER' header variable with an arbitrary 'Remote-User' value by specifying the latter after the former.

This has tricky implications when a proxy server is used: if the proxy passes a rewritten REMOTE_USER but also forwards the user-supplied 'Remote-User', Python WSGI will store the user-supplied 'Remote-User' value as HTTP_REMOTE_USER, because of the order in which the headers are processed.

./python2.6/wsgiref/headers.py:

        for k, v in _params.items():
            if v is None:
                parts.append(k.replace('_', '-'))
            else:
                parts.append(_formatparam(k.replace('_', '-'), v))
msg125082 - Author: PJ Eby (pje) * (Python committer) Date: 2011-01-02 19:52
I don't understand.  HTTP_REMOTE_USER is not the name of a standard CGI variable - it's REMOTE_USER.

It would help if you could show code for what client/proxy/server combination has this problem, what happens when that code runs, and what you want to happen instead.
msg125344 - Author: Alex Raitz (Alex.Raitz) Date: 2011-01-04 17:55
Yes, I was referring to REMOTE_USER, apologies for the conflation with HTTP_REMOTE_USER, which was one of the HTTP headers that a proxy which we were testing was setting.

The customer who reported this issue to us was using Firefox with Tamper Data to set REMOTE-USER, AdNovum Nevis as the proxy, and Splunk as the server.

For example, the following is received by the proxy in question:

Host: foobar:42000
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Cookie: session_id_4200=69e6b6e33510fa64d8b18c34aa73b4b50eff37dc
remote-user: USER-SUPPLIED
Cache-Control: max-age=0 
Connection: Keep-Alive

The proxy sends the following to the server:

Host: localhost:4200
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
REMOTE_USER: normal_user
X-Forwarded-For: 10.3.1.53
X-Forwarded-Host: foobar:42000
X-Forwarded-Server: foobar <http://foobar>
Cookie: session_id_4200=69e6b6e33510fa64d8b18c34aa73b4b50eff37dc
Authorization: Basic Z2FyZXRoOjUzMjc5 
Cache-Control: max-age=0
remote-user: USER-SUPPLIED
Connection: Keep-Alive

In this case, replacing '-' with '_' in wsgiref would overload 'remote_user=normal_user' with 'remote_user=user-supplied'.

When testing with Apache, we found that all user-supplied variables were placed above the proxy-added variables, so that overloading was not an issue.  This seems like the appropriate and expected behavior.

However, given that the customer's chosen proxy did not exhibit this behavior, and searching for a specification for proxy behavior in this situation was inconclusive, our team deemed it advisable to file this issue.

Ideally, Python wsgiref should ensure that the proxy-supplied REMOTE_USER cannot be overloaded by a user-supplied REMOTE-USER that is passed to the server after the proxy-supplied REMOTE_USER.
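One way to get the ordering-independent behavior described above is to have the trusted hop drop any client header whose normalized name collides with a header it is about to set. The following is only a minimal sketch of that idea; the helper names `normalize` and `sanitize_headers` are hypothetical and are not part of wsgiref, CherryPy, or any proxy mentioned in this report.

```python
def normalize(name):
    """Fold a header name CGI-style: '-' becomes '_', then upper-case."""
    return name.replace('-', '_').upper()


def sanitize_headers(client_headers, trusted_headers):
    """Hypothetical helper: drop client headers that collide with trusted ones.

    Both arguments are lists of (name, value) tuples. Any client header whose
    normalized name matches a trusted header is discarded, so the order in
    which headers are later processed no longer matters.
    """
    reserved = {normalize(name) for name, _ in trusted_headers}
    kept = [(n, v) for n, v in client_headers if normalize(n) not in reserved]
    return kept + list(trusted_headers)


client = [('Host', 'foobar:42000'), ('remote-user', 'USER-SUPPLIED')]
trusted = [('REMOTE_USER', 'normal_user')]
forwarded = sanitize_headers(client, trusted)
```

With the sample values above, `forwarded` keeps the `Host` header and the proxy's `REMOTE_USER`, and the client's `remote-user` is gone.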

Please note that Splunk uses wsgiref from the CherryPy framework, but when we investigated the issue we noticed that the replacement of '-' with '_' is the same in both Python and CherryPy wsgiref.  A bug has also been filed against CherryPy.
msg125374 - Author: PJ Eby (pje) * (Python committer) Date: 2011-01-04 22:31
I'm still baffled.  How does this matter to anything?

The HTTP headers you describe would end up in an HTTP_REMOTE_USER environment variable, with no impact on REMOTE_USER.  REMOTE_USER could only be set by an actual web server, not via an HTTP header.

So I don't get how this is a security issue, or even a bug at all.
msg125378 - Author: Alex Raitz (Alex.Raitz) Date: 2011-01-04 22:40
Per the first line of my previous comment, please ignore HTTP_REMOTE_USER.

The risk is that if the proxy does not place the user-supplied 'remote-user=VALUE1' before the proxy-supplied 'REMOTE_USER=VALUE2', wsgiref will overload REMOTE_USER with the value of REMOTE-USER.

1) Client supplies 'REMOTE-USER=admin'
2) Proxy adds 'REMOTE_USER=normal_user' and appends 'REMOTE-USER=admin'
3) Server using wsgiref processes header key/value 'REMOTE_USER=normal_user' and performs lowercase/replace, resulting in 'remote_user=normal_user'
4) Server using wsgiref continues to process the header, performs lowercase/replace on 'REMOTE-USER=admin', resulting in 'remote_user=admin', which overloads the proxy-supplied value for 'remote_user' and allows for arbitrary privilege escalation.
msg125380 - Author: PJ Eby (pje) * (Python committer) Date: 2011-01-04 22:53
You say it "would" do this.  Have you actually *tested* it?

Looking at the code in wsgiref again, I don't think it does what you think it does.  The '_' substitution is done to keyword arguments for header *parameters* only; it's not done to header *names*.

Please write a test case for wsgiref.headers.Headers that demonstrates the behavior you think it would be doing.  AFAICT, you will not even be able to get the replace() calls to execute without writing explicit add_header() calls, and even then, you *still* won't get the results you're describing.
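A test along the lines requested here might look like the sketch below. It uses only `wsgiref.headers.Headers` and its `add_header()`, `get_all()` and item-lookup methods; the header values and the `file_name` parameter are made up for illustration. It checks that the '_' to '-' replacement touches parameter keywords only, and that 'REMOTE_USER' and 'Remote-User' remain two distinct header names.

```python
from wsgiref.headers import Headers

h = Headers([])

# Two headers whose names differ only by '_' versus '-'.
h.add_header('REMOTE_USER', 'normal_user')
h.add_header('Remote-User', 'admin')

# A keyword argument becomes a header *parameter*; this is the only place
# where add_header() replaces '_' with '-'.
h.add_header('Content-Disposition', 'attachment', file_name='x.txt')

remote_underscore = h.get_all('REMOTE_USER')   # lookup is case-insensitive
remote_hyphen = h.get_all('Remote-User')
disposition = h['Content-Disposition']
```

Neither value overwrites the other: each lookup returns only its own header, and the parameter is rendered as `file-name` inside the Content-Disposition value.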
msg125386 - Author: Alex Raitz (Alex.Raitz) Date: 2011-01-05 00:24
I had previously tested it against simple_server.  However, in reviewing my test, I realized that you are correct that wsgiref.headers is not misbehaving.

It appears that in simple_server, the values of remote-user and remote_user both end up in HTTP_REMOTE_USER because of the replacement behavior in simple_server (not in headers).
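The folding described here can be modelled with a few lines of standalone Python. This is a simplified sketch of the key conversion, not the actual simple_server code, and the function name `build_http_environ` is hypothetical; the real handler may differ in detail between Python versions.

```python
def build_http_environ(headers, env=None):
    """Model of CGI-style header folding into a WSGI environ dict.

    headers is a list of (name, value) tuples in the order received.
    """
    env = dict(env or {})
    for name, value in headers:
        key = name.replace('-', '_').upper()   # 'remote-user' -> 'REMOTE_USER'
        value = value.strip()
        if key in env:
            continue                           # key already set by the server
        if 'HTTP_' + key in env:
            env['HTTP_' + key] += ',' + value  # repeated header: comma-join
        else:
            env['HTTP_' + key] = value
    return env


environ = build_http_environ([
    ('REMOTE_USER', 'normal_user'),      # added by the proxy
    ('remote-user', 'USER-SUPPLIED'),    # forwarded from the client
])
```

In this model both headers land in the single key `HTTP_REMOTE_USER` as a comma-joined value, and no `REMOTE_USER` key is created from request headers at all.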

I am withdrawing this bug and will submit a subsequent ticket with the required details.  Thank you for your patience.
History
Date                 User            Action  Args
2011-01-05 00:24:46  Alex.Raitz      set     status: open -> closed
                                             messages: + msg125386
                                             resolution: not a bug
                                             nosy: pje, Alex.Raitz
2011-01-04 22:53:09  pje             set     nosy: pje, Alex.Raitz
                                             messages: + msg125380
2011-01-04 22:40:29  Alex.Raitz      set     nosy: pje, Alex.Raitz
                                             messages: + msg125378
2011-01-04 22:31:31  pje             set     nosy: pje, Alex.Raitz
                                             messages: + msg125374
2011-01-04 17:55:27  Alex.Raitz      set     nosy: pje, Alex.Raitz
                                             messages: + msg125344
2011-01-02 19:52:18  pje             set     nosy: pje, Alex.Raitz
                                             messages: + msg125082
2011-01-02 18:21:47  eric.araujo     set     title: WSGIREF - REMOTE_USER and REMOTE-USER collision -> REMOTE_USER and Remote-User collision in wsgiref
                                             nosy: pje, Alex.Raitz
                                             versions: + Python 3.1, Python 3.2, - Python 2.6
                                             components: + Library (Lib), - Extension Modules
                                             type: security -> behavior
                                             stage: needs patch
2010-12-21 22:54:49  r.david.murray  set     nosy: + pje
2010-12-21 22:46:10  Alex.Raitz      create