classification
Title: CGIHTTPRequestHandler.run_cgi() HTTP_ACCEPT improperly parsed
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: catalin.iacob, demian.brecht, martin.panter, mwatkins, orsenthil, petri.lehtinen, terry.reedy
Priority: normal Keywords: patch

Created on 2009-01-25 15:30 by mwatkins, last changed 2015-05-31 11:10 by martin.panter.

Files
File name Uploaded Description
cgi-accept.patch martin.panter, 2015-02-15 04:26
Messages (4)
msg80510 - Author: Mike Watkins (mwatkins) Date: 2009-01-25 15:30
There appears to be a long-standing bug in how run_cgi() parses 
HTTP_ACCEPT, perhaps dating from the time the code was written. Probably 
few people use this code (I don't either), but recent (post 3.0 release) 
Python 3.x versions appear to have broken something in 
getallmatchingheaders() (which originates in Message), and I stumbled 
upon this condition while searching through the stdlib code.

From line 980 of http.server:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line[:1] in "\t\n\r ":
                accept.append(line.strip())
            else:
                accept = accept + line[7:].split(',')
        env['HTTP_ACCEPT'] = ','.join(accept)


line[:1] in "\t\n\r " was clearly meant to be line[-1].

However, that alone does not completely fix this chunk of code, because 
it makes some assumptions about what getallmatchingheaders() delivers 
that are not accurate. The following behaves as expected and feels safer:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line.lower().startswith("accept:"):
                line = line[7:]
            for part in line.split(','):
                part = part.strip()
                if part:
                    accept.append(part)
        env['HTTP_ACCEPT'] = ','.join(accept)
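
To try the proposed logic outside of run_cgi(), here is a minimal standalone sketch. The helper name build_http_accept and the sample input are hypothetical; the sketch assumes that each element is one raw header line with its line ending intact, and that continuation lines arrive as separate elements without the field name.

```python
def build_http_accept(raw_lines):
    """Join the media ranges found in raw Accept header lines (hypothetical helper)."""
    accept = []
    for line in raw_lines:
        # Drop the field name from the first line of the header.
        if line.lower().startswith("accept:"):
            line = line[7:]
        # Continuation lines carry no field name, so they fall straight through.
        for part in line.split(","):
            part = part.strip()
            if part:
                accept.append(part)
    return ",".join(accept)


# Assumed input: one header line plus one tab-prefixed continuation line.
example = ["Accept: text/html, application/xml\r\n", "\ttext/plain\r\n"]
result = build_http_accept(example)
```

Because every part is stripped and empty parts are skipped, stray whitespace and trailing commas in the raw lines do not end up in the joined value.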


Note that since the Python 3.0 release, 
http.client.HTTPMessage.getallmatchingheaders() has been broken. I have 
just reported this and proposed a fix in #5053.
msg108657 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-06-26 00:10
I hope that someone who knows more than me on this subject takes a look at this.
msg236019 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-15 04:26
Posting a patch for this so that we can get rid of the broken HTTPMessage.getallmatchingheaders() method in Issue 5053.
msg236020 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-15 04:34
BTW in the original code, I think line[:1] in "\t\n\r " might have been correct. It looks like the getallmatchingheaders() method was actually meant to return continued lines separately, prefixed with whitespace. My patch is probably only appropriate for Python 3; maybe Mike’s code will work for Python 2.
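
A small sketch of the difference between the two checks, under the assumption described above: raw lines keep their line endings, and a continued line is delivered as its own element starting with whitespace. The helper names and the sample lines are made up for illustration.

```python
WHITESPACE = "\t\n\r "


def leading_check(line):
    # The test used in the original code: does the line START with whitespace?
    return line[:1] in WHITESPACE


def trailing_check(line):
    # The alternative suggested in msg80510: does the line END with whitespace?
    return line[-1:] in WHITESPACE


# Assumed input: a header line followed by its continuation line.
raw_lines = ["Accept: text/html,\r\n", " text/plain\r\n"]

leading = [leading_check(line) for line in raw_lines]
trailing = [trailing_check(line) for line in raw_lines]
```

With this input, the leading check is false for the header line and true for the continuation, while the trailing check is true for both because each raw line ends in a newline. Under that assumption, only the leading check tells the two kinds of line apart.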
History
Date User Action Args
2015-05-31 11:13:17 martin.panter link issue5053 dependencies
2015-05-31 11:10:59 martin.panter set stage: patch review
2015-02-16 19:01:36 demian.brecht set nosy: + demian.brecht
2015-02-15 04:34:53 martin.panter set messages: + msg236020
2015-02-15 04:26:24 martin.panter set files: + cgi-accept.patch; nosy: + martin.panter; messages: + msg236019; keywords: + patch
2012-11-26 08:25:36 catalin.iacob set nosy: + catalin.iacob
2011-11-23 19:26:41 petri.lehtinen set nosy: + petri.lehtinen
2010-06-26 02:21:49 r.david.murray set nosy: + orsenthil
2010-06-26 00:10:37 terry.reedy set nosy: + terry.reedy; messages: + msg108657; versions: + Python 3.2, - Python 2.5, Python 2.4, Python 3.0
2009-01-25 15:30:02 mwatkins create