classification
Title: cgi.FieldStorage can't parse simple body with Content-Length and no Content-Disposition
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Decorater, X-Istence, berker.peksag, ned.deily, rr-, srittau, watusimoto
Priority: normal Keywords: patch

Created on 2016-08-16 13:55 by rr-, last changed 2018-06-19 19:07 by watusimoto.

Files
File name Uploaded Description Edit
test rr-, 2016-08-16 13:55
cgi.py Decorater, 2016-08-16 14:16
bug27777.patch X-Istence, 2016-11-18 05:19 patchset for bug27777 review
Pull Requests
URL Status Linked Edit
PR 7804 open watusimoto, 2018-06-19 18:43
Messages (16)
msg272856 - (view) Author: (rr-) Date: 2016-08-16 13:55
Sending requests with Content-Length but without Content-Disposition headers causes following error:

Traceback (most recent call last):
  File "./test", line 19, in <module>
    form = cgi.FieldStorage(fp=env['wsgi.input'], environ=env)
  File "/usr/lib/python3.5/cgi.py", line 561, in __init__
    self.read_single()
  File "/usr/lib/python3.5/cgi.py", line 740, in read_single
    self.read_binary()
  File "/usr/lib/python3.5/cgi.py", line 762, in read_binary
    self.file.write(data)
TypeError: write() argument must be str, not bytes

I've attached a test file that reproduces the issue.

The issue is because read_single decides whether to read the content as binary or text depending on content-length - if it's > 0, it uses read_binary which assumes binary input, and rewrites this input to self.file, assuming self.file is opened in binary mode.

At the same, self.file is opened in text mode, because self._binary_file is set to False, which in turn is because there's no Content-Disposition header.

At very least, the decision whether to use binary or text should be consistent in both places (self.length >= 0 vs self._binary_file).

Related: https://bugs.python.org/issue27308
Note that unlike https://bugs.python.org/issue24764 this issue does NOT concern multipart requests.
msg272857 - (view) Author: Decorater (Decorater) * Date: 2016-08-16 14:12
hmm into looking it should check if it is in actuality a binary file the length of the file data does not really determine anything on encoding really.

if self._binary_file:

would suffice on determining binary mode or not.
msg272858 - (view) Author: Decorater (Decorater) * Date: 2016-08-16 14:16
Here is a patch to review (note I only had disc space to clone 3.6 so I had to manually download this version of the file).
msg277482 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-09-27 03:40
On line #890 in self.make_file() the check for _binary_file should be changed to also check for self.length >= 0.

https://github.com/python/cpython/blob/3.4/Lib/cgi.py#L890

becomes:

if self._binary_file or self.length >= 0:

_binary_file is only ever set if there is a content disposition, which there is not in the test case provided. In the case of no content disposition we can use the content-length as a hint that we have a file that has been uploaded. All files uploaded should be treated as binary if they are not a text type.

This is a duplicate of #27308, however the patch in that report is incorrect IMHO.
msg277483 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-09-27 03:44
Updated versions this applies to.
msg277536 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-27 18:27
Thanks for triaging this, Bert. Would you like to propose a patch with a test case?

Note that we can't fix this in 3.3 and 3.4 because they are in security-fix-only mode. See https://docs.python.org/devguide/index.html#status-of-python-branches for details.
msg281075 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-11-18 05:19
@berker.peksag:

Attached is a patch with a test case that exercises this issue.

Code path is that read_single() checks if the length is greater than 0, and then it reads binary, otherwise it reads it as a single line. This fixes make_file so that if self.length is greater than or equal to 0, it opens the file in binary mode, this matches the checks that write stuff as binary.

This also undoes the change that was made in https://bugs.python.org/issue24764. Fixing this issue fixed that one as well, and arguably throwing data away doesn't seem like a good idea.
msg281793 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-11-26 19:30
Berker asks in IRC whether this change should go into 3.6.0 (at rc1).  While it is affecting a relatively self-contained part of the standard library (cgi), the issue doesn't seem to be "release critical".  Further, it is changing behavior that was changed barely a year ago for Issue24764.  My preference would be to try to have this change reviewed and/or tested by at least some of the people involved with the earlier issue and, if there is a consensus for it, target the change for 3.6.1.
msg281794 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-11-26 19:44
Unfortunately I need to spin another patch, the one I created didn't solve the issue for one of WebOb's users:

https://github.com/Pylons/webob/pull/300 (Thanks Julien Meyer!)

I have his permission to grab his test/patch and update this patch, I will get this done later today.

That being said, this is a real issue, and WebOb will be shipping a fix for Python less than 3.6 anyway, so whether this gets released in 3.6 or not doesn't matter to me. I'd prefer this to be fixed in the standard library for all users, rather than just for WebOb users.

Even if this were released for 3.6.1, WebOb will have to carry the fix for the foreseeable future.
msg319179 - (view) Author: Chris Eykamp (watusimoto) * Date: 2018-06-09 20:25
I've been experiencing the same issue, which is triggered in the exception handling of web.py.

Bert's proposed fix, adding the zero byte check (if self._binary_file or self.length >= 0:) addresses the issue I'm seeing (tested on 3.5, it's what's available where I can reproduce the error).

This issue seems to be languishing.  Is there any way we could push this forward, even if it doesn't address every problem with the lib?
msg319438 - (view) Author: Chris Eykamp (watusimoto) * Date: 2018-06-13 09:06
This also manifests itself when using web.py: if the underlying code throws an exception, this is emitted:

  File "/usr/local/lib/python3.5/dist-packages/web/webapi.py", line 364, in input
    out = rawinput(_method)
  File "/usr/local/lib/python3.5/dist-packages/web/webapi.py", line 341, in rawinput
    a = cgi.FieldStorage(fp=fp, environ=e, keep_blank_values=1)
  File "/usr/lib/python3.5/cgi.py", line 561, in __init__
    self.read_single()
  File "/usr/lib/python3.5/cgi.py", line 740, in read_single
    self.read_binary()
  File "/usr/lib/python3.5/cgi.py", line 762, in read_binary
    self.file.write(data)
TypeError: write() argument must be str, not bytes
msg319443 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-06-13 09:57
Thank you for the ping, Chris. I will try to combine Bert's and Julien's patches and prepare a PR this weekend.
msg319451 - (view) Author: Chris Eykamp (watusimoto) * Date: 2018-06-13 10:45
I've already got a PR based on the patch listed under the Files section (it's prepared, not yet submitted), but if you want to do something more, I'll step back and let you do it.
msg319480 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-06-13 17:54
That's even better! :) Please submit your work as a pull request.

Did you take a look at https://github.com/Pylons/webob/pull/300 as well? Can we use the test in the PR? Is it possible to adapt it solve both this and WebOb issues?
msg319534 - (view) Author: Chris Eykamp (watusimoto) * Date: 2018-06-14 17:42
I'll get a PR submitted this weekend, and post back here.  It will not explicitly address that other case, as I don't have the capacity or wherewithal for that.  Alas.
msg319993 - (view) Author: Chris Eykamp (watusimoto) * Date: 2018-06-19 19:07
Packaged patch offered below into PR 7804

https://github.com/python/cpython/pull/7804
History
Date User Action Args
2018-06-19 19:07:24watusimotosetmessages: + msg319993
versions: + Python 3.5
2018-06-19 18:43:15watusimotosetpull_requests: + pull_request7409
2018-06-14 17:42:08watusimotosetmessages: + msg319534
2018-06-13 17:54:38berker.peksagsetmessages: + msg319480
2018-06-13 10:45:36watusimotosetmessages: + msg319451
2018-06-13 09:57:36berker.peksagsetmessages: + msg319443
versions: + Python 3.8, - Python 3.5
2018-06-13 09:37:56vstinnersetnosy: - vstinner
2018-06-13 09:06:36watusimotosetmessages: + msg319438
2018-06-09 20:25:17watusimotosetnosy: + watusimoto
messages: + msg319179
2017-11-15 01:55:21srittausetnosy: + srittau
2017-11-15 01:53:55berker.peksaglinkissue32029 superseder
2016-11-26 19:44:39X-Istencesetmessages: + msg281794
2016-11-26 19:30:54ned.deilysetnosy: + ned.deily
messages: + msg281793
2016-11-26 12:10:07berker.peksagsetstage: needs patch -> patch review
2016-11-18 05:19:52X-Istencesetfiles: + bug27777.patch
keywords: + patch
messages: + msg281075
2016-09-27 18:27:03berker.peksagsettype: behavior
stage: needs patch
messages: + msg277536
versions: - Python 3.3, Python 3.4
2016-09-27 03:44:39X-Istencesetmessages: + msg277483
versions: + Python 3.3, Python 3.4, Python 3.6, Python 3.7
2016-09-27 03:40:39X-Istencesetnosy: + berker.peksag
2016-09-27 03:40:02X-Istencesetnosy: + vstinner, X-Istence
messages: + msg277482
2016-08-16 14:16:13Decoratersetfiles: + cgi.py

messages: + msg272858
2016-08-16 14:12:17Decoratersetnosy: + Decorater
messages: + msg272857
2016-08-16 13:55:45rr-create