classification
Title: CGI library - Using unicode in header fields
Type: behavior Stage:
Components: Unicode Versions: Python 3.5
process
Status: pending Resolution: duplicate
Dependencies: Superseder: support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage
View: 23434
Assigned To: Nosy List: Olivier.Le.Moign, ezio.melotti, martin.panter, vstinner
Priority: normal Keywords:

Created on 2016-03-10 11:04 by Olivier.Le.Moign, last changed 2020-03-22 06:01 by martin.panter.

Messages (3)
msg261491 - (view) Author: Olivier Le Moign (Olivier.Le.Moign) Date: 2016-03-10 11:04
According to RFC 5987 (http://tools.ietf.org/html/rfc5987), header field parameters can use encodings other than ASCII.
Specifically in the CGI library, posting a file whose name contains non-ASCII characters produces a header parameter such as filename*=utf-8''xxxxx, which is not recognised:

cgi.py, line 513:

if 'filename' in pdict:
    self.filename = pdict['filename']
self._binary_file = self.filename is not None 

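Because the parameter key is filename* rather than filename, the check fails, self.filename stays None and self._binary_file is False. The file will thus be treated as a string.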
The correction isn't too big but being a total newbie, I'm a bit scared to suggest a patch.
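For illustration only, a minimal sketch of how an RFC 5987 extended value could be decoded and preferred over the plain filename. The helper names decode_ext_value and pick_filename are hypothetical, and the sketch assumes that pdict stores the raw extended value under the key 'filename*'. It is not the patch discussed in Issue 23434.

```python
from urllib.parse import unquote


def decode_ext_value(raw):
    """Decode an RFC 5987 ext-value of the form charset'language'pct-encoded.

    Hypothetical helper, not part of the cgi module.
    """
    charset, sep1, rest = raw.partition("'")
    language, sep2, encoded = rest.partition("'")
    if not (charset and sep1 and sep2):
        raise ValueError("not an RFC 5987 ext-value: %r" % raw)
    # Percent-decode the value using the charset named in the header.
    return unquote(encoded, encoding=charset, errors="strict")


def pick_filename(pdict):
    """Prefer 'filename*' when it decodes cleanly, else fall back to 'filename'.

    Assumes pdict maps lowercased parameter names to raw string values.
    """
    if "filename*" in pdict:
        try:
            return decode_ext_value(pdict["filename*"])
        except (ValueError, LookupError):
            # Malformed value, undecodable bytes, or unknown charset.
            pass
    return pdict.get("filename")


if __name__ == "__main__":
    print(pick_filename({"filename*": "utf-8''na%C3%AFve.txt"}))
```

With something like this in place, self.filename would no longer be None for such uploads, so self._binary_file would be set as it is for ASCII filenames.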
msg261492 - (view) Author: Olivier Le Moign (Olivier.Le.Moign) Date: 2016-03-10 11:12
I guess this is handled by https://pypi.python.org/pypi/rfc6266. I should have searched more carefully, sorry.
msg364787 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2020-03-22 06:01
I’m not an expert on the topic, but it sounds like this might be a duplicate of Issue 23434, which has more discussion.
History
Date User Action Args
2020-03-22 06:01:40 martin.panter set status: open -> pending
nosy: + martin.panter
messages: + msg364787
superseder: support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage
resolution: duplicate
2016-03-10 11:12:02 Olivier.Le.Moign set messages: + msg261492
2016-03-10 11:04:31 Olivier.Le.Moign create