Message 80510 - Python tracker

Author	mwatkins
Recipients	mwatkins
Date	2009-01-25.15:30:01
SpamBayes Score	1.6869142e-05
Marked as misclassified	No
Message-id	<1232897403.01.0.0399148938595.issue5054@psf.upfronthosting.co.za>
In-reply-to

Content
A bug in how HTTP_ACCEPT is parsed appears to have lived in run_cgi()
for a long time, perhaps since the code was written. Few people may be
using this code (I am not either), but recent (post 3.0 release)
Python 3.x appears to have broken something in getallmatchingheaders()
(which originates in Message), and I stumbled upon this condition while
searching through the stdlib code.

From line 980 of http.server:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line[:1] in "\t\n\r ":
                accept.append(line.strip())
            else:
                accept = accept + line[7:].split(',')
        env['HTTP_ACCEPT'] = ','.join(accept)


line[:1] in "\t\n\r " was clearly meant to be line[-1].
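
For illustration, here is a minimal sketch of what the loop quoted above yields when run outside the handler. The function name original_accept and the sample lines are hypothetical. The sketch assumes getallmatchingheaders() hands back raw header lines, with the field name on the first line and leading whitespace on continuation lines. That input shape is an assumption, not confirmed behaviour.

```python
def original_accept(lines):
    """Replay the quoted run_cgi() loop on a list of raw header lines."""
    accept = []
    for line in lines:
        if line[:1] in "\t\n\r ":
            # Treated as a continuation line: stripped, but not split.
            accept.append(line.strip())
        else:
            # Treated as the first line: "Accept:" is cut by position,
            # and the pieces are split but never stripped.
            accept = accept + line[7:].split(',')
    return ','.join(accept)


# Hypothetical input: one folded Accept header as two raw lines.
sample = [
    "Accept: text/html, application/xhtml+xml,\r\n",
    "\tapplication/xml;q=0.9\r\n",
]
result = original_accept(sample)
print(repr(result))
# → ' text/html, application/xhtml+xml,\r\n,application/xml;q=0.9'
```

Under that assumption the joined value keeps the leading spaces and an embedded "\r\n" piece, because the parts taken from the first line are never stripped.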

However, that change alone does not completely fix this chunk of code,
because it makes some assumptions about what getallmatchingheaders()
delivers that are not accurate. The following behaves as expected and
feels safer:

        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line.lower().startswith("accept:"):
                line = line[7:]
            for part in line.split(','):
                part = part.strip()
                if part:
                    accept.append(part)
        env['HTTP_ACCEPT'] = ','.join(accept)
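
The replacement loop can be tried on its own with a small sketch like the one below. The helper name collect_accept and the sample lines are hypothetical. The list of strings stands in for whatever getallmatchingheaders('accept') returns, assumed here to be raw header lines.

```python
def collect_accept(lines):
    """Apply the proposed loop to raw header lines and return the joined value."""
    accept = []
    for line in lines:
        # Drop the field name when the line carries it.
        if line.lower().startswith("accept:"):
            line = line[7:]
        # First lines and continuation lines are then handled the same way:
        # split on commas, strip each piece, skip empty pieces.
        for part in line.split(','):
            part = part.strip()
            if part:
                accept.append(part)
    return ','.join(accept)


# Hypothetical input: one folded Accept header as two raw lines.
sample = [
    "Accept: text/html, application/xhtml+xml,\r\n",
    "\tapplication/xml;q=0.9\r\n",
]
print(collect_accept(sample))
# → text/html,application/xhtml+xml,application/xml;q=0.9
```

Because every piece is stripped and empty pieces are skipped, the result does not depend on whether a line starts with the field name or with whitespace.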


Note that after the Python 3.0 release,
http.client.HTTPMessage.getallmatchingheaders() was broken. I have
just reported this and proposed a fix in #5053.

History
Date	User	Action	Args
2009-01-25 15:30:03	mwatkins	set	recipients: + mwatkins
2009-01-25 15:30:03	mwatkins	set	messageid: <1232897403.01.0.0399148938595.issue5054@psf.upfronthosting.co.za>
2009-01-25 15:30:02	mwatkins	link	issue5054 messages
2009-01-25 15:30:01	mwatkins	create