Author mwatkins
Recipients mwatkins
Date 2009-01-25.15:30:01
Message-id <1232897403.01.0.0399148938595.issue5054@psf.upfronthosting.co.za>
In-reply-to
Content
There appears to be a long-standing bug in how run_cgi() parses 
HTTP_ACCEPT, perhaps dating from when the code was written. Probably not 
many people use this code (I don't either), but a change made after the 
3.0 release appears to have broken getallmatchingheaders() (which 
originates in Message), and I stumbled upon this condition while 
searching through the stdlib code.

From line 980 of http.server:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line[:1] in "\t\n\r ":
                accept.append(line.strip())
            else:
                accept = accept + line[7:].split(',')
        env['HTTP_ACCEPT'] = ','.join(accept)


line[:1] in "\t\n\r " was clearly meant to be line[-1].
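
For illustration, here is a minimal sketch of what the quoted loop yields when it is fed raw header lines. The function name quoted_loop and the sample lines are hypothetical, and the sketch assumes getallmatchingheaders() returns raw lines with their line endings, continuation lines included; that may not match what the method actually delivers.

```python
def quoted_loop(header_lines):
    """Copy of the loop quoted from run_cgi(), applied to a plain list."""
    accept = []
    for line in header_lines:
        if line[:1] in "\t\n\r ":
            # Treated as a continuation line: whitespace is stripped.
            accept.append(line.strip())
        else:
            # Treated as "Accept:..." and sliced, but the parts are not stripped.
            accept = accept + line[7:].split(',')
    return ','.join(accept)


# Hypothetical input: one Accept header folded over two raw lines.
sample_lines = ["Accept: text/html,\r\n", "\tapplication/xhtml+xml\r\n"]
print(repr(quoted_loop(sample_lines)))
```

Under these assumptions the joined value keeps the leading space and the embedded line ending from the first line.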

However, that change alone does not completely fix this chunk of code, 
because it makes assumptions about what getallmatchingheaders() delivers 
that aren't accurate. The following behaves as expected and feels safer:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line.lower().startswith("accept:"):
                line = line[7:]
            for part in line.split(','):
                part = part.strip()
                if part:
                    accept.append(part)
        env['HTTP_ACCEPT'] = ','.join(accept)
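
As a rough, self-contained check of the proposed loop, the sketch below wraps it in a function and runs it on sample lines. The name build_http_accept and the sample input are hypothetical, and the raw-line shape (field name on the first line, folded continuation lines, trailing line endings) is an assumption about what getallmatchingheaders() delivers.

```python
def build_http_accept(header_lines):
    """Join raw Accept header lines into one comma-separated value."""
    accept = []
    for line in header_lines:
        # Drop the field name if present; continuation lines do not have it.
        if line.lower().startswith("accept:"):
            line = line[7:]
        for part in line.split(','):
            part = part.strip()
            if part:  # skip empty pieces left by trailing commas or line endings
                accept.append(part)
    return ','.join(accept)


# Hypothetical input: a folded header plus a second, lower-case header line.
sample = [
    "Accept: text/html,\r\n",
    "\tapplication/xhtml+xml\r\n",
    "accept: */*;q=0.8\r\n",
]
print(build_http_accept(sample))
# prints: text/html,application/xhtml+xml,*/*;q=0.8
```

Stripping each part and skipping empty ones means the result does not depend on whether a line carries the field name, leading whitespace, or a trailing line ending.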


Note that http.client.HTTPMessage.getallmatchingheaders() has been 
broken since the Python 3.0 release. I've just reported this and 
proposed a fix in #5053.
History
Date User Action Args
2009-01-25 15:30:03 mwatkins set recipients: + mwatkins
2009-01-25 15:30:03 mwatkins set messageid: <1232897403.01.0.0399148938595.issue5054@psf.upfronthosting.co.za>
2009-01-25 15:30:02 mwatkins link issue5054 messages
2009-01-25 15:30:01 mwatkins create