Title: cgi.parse() fatally attempts str.decode when handling multipart/form-data
Created on 2020-02-23 05:34 by James Edington, last changed 2020-09-03 13:42 by kxrob.

curlLogs.txt James Edington, 2020-02-23 05:34 terminal transcript demonstrating bug
James Edington, 2020-02-23 05:36 demonstration file of the issue
Author: James Edington (James Edington) Date: 2020-02-23 05:34
It appears that cgi.parse() in Python 3.7.6 [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] fatally chokes on POST requests with multipart/form-data, because some internal processing still relies on assumptions from when str and bytes were the same type.

I'll attach as the first comment the "try-it-at-home" file to demonstrate this error.
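
For readers without the attachment, the failure mode named in the title can be sketched without importing cgi at all. This is only an illustration, not the attached demonstration file: it assumes the str value involved is the multipart boundary taken from a str Content-Type header (the report does not say which value it is), and the header value below is made up.

```python
# Made-up Content-Type header, as a str (the way it would arrive via os.environ).
content_type = "multipart/form-data; boundary=----demo"

# Assumption: the boundary is extracted from the str header, so it is a str too.
boundary = content_type.split("boundary=", 1)[1]

try:
    # In Python 3, str has no .decode(); only bytes does.
    boundary.decode("ascii")
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)

# Going the other way (str -> bytes) is what works in Python 3.
boundary_bytes = boundary.encode("ascii")
```

Code written when str and bytes were interchangeable could call .decode() on such a value without trouble; in Python 3 the same call raises AttributeError.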
Author: James Edington (James Edington) Date: 2020-02-23 05:36
Here is a file to try it out in an instant.

(lines 11–28 are not necessary; they are just "luxuries" allowing easier testing of the issue in a web browser)
Author: Robert (kxrob) Date: 2020-09-03 13:42
Would this patch already solve the problem?

There seems to be another bug: the strange 'latin-1' default encoding of cgi.parse(), which only takes effect for non-multipart requests:

    if hasattr(fp,'encoding'):
        encoding = fp.encoding
    else:
        encoding = 'latin-1'

(cgi.FieldStorage and the other functions in cgi and urllib.parse correctly use a 'utf-8' default and do not try fp.encoding, which is usually not present and not meaningful for form handling under WSGI. And 'application/x-www-form-urlencoded' implies utf-8.)

=> That default should possibly become utf-8. Optionally, cgi.parse() could take an extra parameter encoding=None.
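
To make the effect of the default concrete, here is a rough sketch of how the two encodings decode the same urlencoded body. decode_form_body is a hypothetical helper written for this comment, not part of cgi; it only imitates the proposed encoding=None parameter with a utf-8 fallback and uses urllib.parse.parse_qs for the actual parsing. The sample body is made up.

```python
from urllib.parse import parse_qs


def decode_form_body(body, encoding=None):
    """Hypothetical helper: parse an urlencoded body given as bytes.

    encoding=None falls back to utf-8, as suggested above for cgi.parse().
    """
    if encoding is None:
        encoding = "utf-8"
    # The body itself is ASCII; the encoding matters for the %XX escapes.
    return parse_qs(body.decode(encoding), encoding=encoding)


# "café" submitted by a browser as UTF-8, then percent-encoded.
body = b"name=caf%C3%A9"

as_utf8 = decode_form_body(body)
as_latin1 = decode_form_body(body, encoding="latin-1")

print(as_utf8)    # {'name': ['café']}
print(as_latin1)  # {'name': ['cafÃ©']}
```

With latin-1 the two UTF-8 bytes of "é" are decoded as two separate characters, which is the kind of mojibake a caller would see whenever fp has no encoding attribute.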
