classification
Title: cgi.parse() fatally attempts str.decode when handling multipart/form-data
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: James Edington, ethan.furman, kxrob
Priority: normal Keywords:

Created on 2020-02-23 05:34 by James Edington, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
curlLogs.txt James Edington, 2020-02-23 05:34 terminal transcript demonstrating bug
demo.py James Edington, 2020-02-23 05:36 demonstration file of the issue
Messages (3)
msg362490 - (view) Author: James Edington (James Edington) Date: 2020-02-23 05:34
It appears that cgi.parse() in Python 3.7.6 [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] fatally chokes on POST requests with multipart/form-data due to some internal processing still relying on assumptions from when str and bytes were the same object.

I'll attach as the first comment the "try-it-at-home" file to demonstrate this error.
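
For readers without the attachment, the failure mode named in the title can be illustrated without the cgi module itself. This is only a minimal sketch, not the code from demo.py or from cgi.parse(): `boundary_to_text` is a hypothetical helper and the boundary value is made up. It shows that in Python 3 only bytes has a .decode() method, so code that unconditionally decodes a value that is already a str raises AttributeError.

```python
def boundary_to_text(boundary):
    """Hypothetical helper: return the boundary as str, accepting bytes or str."""
    if isinstance(boundary, bytes):
        return boundary.decode("ascii")
    return boundary


boundary = "----example-boundary"  # made-up value that is already a str
error_message = None

try:
    # A Python 2 era assumption: in Python 3, str has no .decode() method.
    boundary.decode("ascii")
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(boundary_to_text(boundary))
print(boundary_to_text(b"----example-boundary"))
```

A type check like the one in the helper is one possible way to tolerate both input types; whether the eventual fix takes that shape is not stated in this report.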
msg362491 - (view) Author: James Edington (James Edington) Date: 2020-02-23 05:36
Here is a file to try it out in an instant.

(lines 11–28 are not necessary; they are just "luxuries" allowing easier testing of the issue in a web browser)
msg376299 - (view) Author: Robert (kxrob) * Date: 2020-09-03 13:42
Would this patch already solve it?

https://github.com/python/cpython/pull/19130

There seems to be another bug: the strange 'latin-1' default encoding of cgi.parse(), which only takes effect in the non-multipart case:

    if hasattr(fp,'encoding'):
        encoding = fp.encoding
    else:
        encoding = 'latin-1'


(cgi.FieldStorage and the other functions in cgi and urllib.parse correctly use a 'utf-8' default, and do not try fp.encoding, which is usually not present and not reasonable in form-handling WSGI. And 'application/x-www-form-urlencoded' implies utf-8.)

=> That default should possibly become utf-8. Optionally, cgi.parse() could take an extra parameter encoding=None.
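
The practical effect of the default can be seen with urllib.parse.parse_qs, which accepts an encoding argument. The sketch below is only an illustration under stated assumptions: the query string is made up, and `parse_form` is a hypothetical wrapper showing what an optional encoding=None parameter with a utf-8 fallback might look like. It is not the cgi.parse() signature.

```python
from urllib.parse import parse_qs


def parse_form(qs, encoding=None):
    """Hypothetical wrapper: parse a urlencoded string, defaulting to utf-8."""
    if encoding is None:
        encoding = "utf-8"
    return parse_qs(qs, encoding=encoding)


qs = "name=caf%C3%A9"  # made-up input: "café" percent-encoded as UTF-8

as_utf8 = parse_form(qs)                        # percent-escapes decoded as utf-8
as_latin1 = parse_form(qs, encoding="latin-1")  # same bytes decoded as latin-1

print(as_utf8)
print(as_latin1)
```

With the latin-1 setting, the two UTF-8 bytes of "é" come back as two separate characters, which is the kind of mojibake a utf-8 default would avoid for typical form submissions.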
History
Date User Action Args
2022-04-11 14:59:27 admin set github: 83908
2020-09-03 13:42:42 kxrob set nosy: + kxrob; messages: + msg376299
2020-07-20 20:50:53 Rhodri James set nosy: - Rhodri James
2020-02-29 00:03:28 terry.reedy set nosy: + ethan.furman, Rhodri James
2020-02-23 05:36:26 James Edington set files: + demo.py; messages: + msg362491
2020-02-23 05:34:16 James Edington create