classification
Title: Content-length not set for HTTP methods expecting body when body is None
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.5, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, demian.brecht, jimr, martin.panter, orsenthil, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2015-02-27 17:30 by demian.brecht, last changed 2015-03-22 19:39 by jimr. This issue is now closed.

Files
File name Uploaded Description Edit
issue23539-py27.patch jimr, 2015-02-27 20:28 review
issue23539-py27.patch jimr, 2015-02-27 22:09 review
issue23539-py27.patch jimr, 2015-02-28 16:13 review
issue23539-py3.patch jimr, 2015-03-19 22:32 review
issue23539-py27.patch jimr, 2015-03-19 22:40 review
Messages (36)
msg236803 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-27 17:30
#14721 solved setting the Content-Length header for 0-length bodies. However, it doesn't account for cases where body is None (reported by James Rutherford here: http://bugs.python.org/issue14721#msg236600).

One method of solving this might be something like this:

_METHODS_EXPECTING_BODIES = {'OPTIONS', 'POST', 'PUT', 'PATCH'}
if method.upper() in _METHODS_EXPECTING_BODIES and \
        'content-length' not in header_names:
    self._set_content_length(body)

(_set_content_length would have to be updated in order to allow for None)

This ensures that Content-Length will not be set for methods not expecting a body.

RFC 7230, Section 3.3.2:

   A user agent SHOULD NOT send a Content-Length header field when the
   request message does not contain a payload body and the method 
   semantics do not anticipate such a body.
msg236812 - (view) Author: James Rutherford (jimr) * Date: 2015-02-27 18:35
Thanks for setting up the new issue, I'll cook up a patch. I'm assuming this affects all Python 3.X versions but I've specifically encountered it on Python 2.7.
msg236815 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-27 18:52
> I'm assuming this affects all Python 3.X versions but I've specifically encountered it on Python 2.7.

Unless there are any core dev objections, I think it's applicable to 2.7, 3.4 and 3.5 as other minor 3.x versions are in security mode (https://docs.python.org/devguide/devcycle.html#security-branches).
msg236817 - (view) Author: James Rutherford (jimr) * Date: 2015-02-27 18:58
OK, thanks.
msg236822 - (view) Author: James Rutherford (jimr) * Date: 2015-02-27 19:24
OK, I've got a patch but it's failing on 'test_send_file'[1], which is sending a body on a GET request. According to the IETF memo[2]:

   Bodies on GET requests have no defined semantics.  Note that sending
   a body on a GET request might cause some existing implementations to
   reject the request.

So I don't know whether to remove the method constraint so this test passes, or to "fix" this test to use a different method. My feeling is that stripping out Content-Length for GET requests could cause problems for some code that for whatever reason uses this, so we should remove that constraint (it was only "SHOULD NOT" after all) and go back to the original proposal of just setting the content length for bodies that are None as well as empty strings.

[1] https://hg.python.org/cpython/file/325aec842e3e/Lib/test/test_httplib.py#l297
[2] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-7.3
msg236831 - (view) Author: James Rutherford (jimr) * Date: 2015-02-27 20:28
Patch attached for the 2.7 branch, including updated tests. All tests pass. Let me know if this looks like a sensible approach and I'll produce something comparable for 3.X. 

The logic now is as it was before, except that we set a content length of zero if the body is None and the method is one of OPTIONS, PATCH, PUT, or POST.
msg236832 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-27 20:30
My concern about this is around the combination of the following two passages:

draft-ietf-httpbis-p2-semantics-14, Section 7.3:

>   Bodies on GET requests have no defined semantics.  Note that sending
>   a body on a GET request might cause some existing implementations to
>   reject the request.

RFC 7230, Section 3.3.2:

> The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field.

My concern around the above is the possibility that servers may have implementations that reject requests with methods that are not expecting bodies solely on the existence of the Content-Length header rather than the value (which may be 0). Without a good amount of research around whether or not this would be an actual problem in practice, I wouldn’t be comfortable adding Content-Length header to every request.

> My feeling is that stripping out Content-Length for GET requests could cause problems for some code that for whatever reason uses this

That’s definitely a valid concern. What if the logic was tweaked a little:

expecting_len = method.upper() in _METHODS_EXPECTING_BODIES or \
    body is not None

if expecting_len and 'content-length' not in header_names:
    self._set_content_length(body)

That way, if someone is intentionally sending a body in a GET request, the value of Content-Length would still be automatically added, but would still be set to 0 for all other cases where bodies are expected.
msg236840 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-27 21:11
> The logic now is as it was before, except that we set a content length of zero if the body is None and the method is one of OPTIONS, PATCH, PUT, or POST.

I see we definitely have similar thinking on the modifications required for this, but I don’t think I’m in favour of the approach in the patch. Consider the case where you have a GET request without a body. Now, instead of simply circumventing the _set_content_length method altogether as would have been done previously, the code path will now go into _set_content_length, have a TypeError exception raised on (str(len([None])) and then an AttributeError exception raised on os.fstat([None].fileno()). That seems to be quite a bit of additional overhead when it could be avoided altogether by an earlier check as I have in my example.

What do you think?
msg236847 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-27 21:47
I like Demian’s “expecting_len” technique slightly better because it would avoid calling None.fileno() and len(None), as Demian just said.

Also, I think OPTIONS should be removed from the list of methods that enforce a Content-Length. I wouldn’t normally expect any payload for OPTIONS, since RFC 7231 explicitly says it does not define a use for a payload, but requires a Content-Type if a payload is sent.
msg236850 - (view) Author: James Rutherford (jimr) * Date: 2015-02-27 22:09
The first patch should actually be modified so the condition reads (update attached):

    if body is None and method_expects_body:
        thelen = 0
    elif body is not None:
        ...

Demian, I believe this is equivalent to your 'expecting_len' proposal, I've just put the logic inside _set_content_length. My approach was to leave as much existing logic as possible in place to avoid the introduction of new bugs (first patch wasn't entirely successful in this!).

As for OPTION, 

   If the OPTIONS request includes a message-body (as indicated by the
   presence of Content-Length or Transfer-Encoding), then the media type
   MUST be indicated by a Content-Type field.  Although this
   specification does not define any use for such a body, future
   extensions to HTTP might use the OPTIONS body to make more detailed
   queries on the server.

From http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-7.2
msg236863 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-28 01:24
> Also, I think OPTIONS should be removed from the list of methods that enforce a Content-Length. I wouldn’t normally expect any payload for OPTIONS, since RFC 7231 explicitly says it does not define a use for a payload, but requires a Content-Type if a payload is sent.

I think Martin's right about this and it's consistent with my concern about servers treating the existence of a Content-Length header as an indicator of a request body.

In any event, with the new logic, a Content-Length header will be added if the body is not None.

> I've just put the logic inside _set_content_length

Fair enough. It /does/ encapsulate all logic around setting the content length within the method.


I've left a couple minor comments in Rietveld. In addition to what's there, docs should also be updated to reflect the new behaviour.

Thanks for the work on this!
msg236893 - (view) Author: James Rutherford (jimr) * Date: 2015-02-28 16:13
Happy to remove OPTIONS from the list of methods that gets a content-length where body is None, but do we also want to consider behaviour if it's the empty string? My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.

I've updated the patch with this logic, including ensuring proper explicit test coverage for the cases described. Docs updated too.
msg236912 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-28 21:36
Yes I agree with the behaviour that None means no body (for requests such as GET), and an empty string means an empty but present body. Correct me if I’m wrong, but I think the fix for Issue 14721 already does that.

Perhaps the new behaviour needs a “Changed in version X” tag? I’m not sure, but I tend to think of this as a new feature, rather than a bug fix.
msg236917 - (view) Author: James Rutherford (jimr) * Date: 2015-02-28 22:07
I actually consider this a fix for the fix in 14721, rather than a new feature. The only new behaviour here is setting content length to be zero if body is None on PATCH, POST, or PUT. Happy to change the labeling if that's the consensus but IMO it's a bugfix.
msg236929 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-01 01:26
> My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.

Although I understand your thinking here, I think that bodies that are either None or empty string should be treated identically. Whether or not the Content-Length header is set should be dependent on the method used.

From a client standpoint, there’s no difference between sending a body with a value of None and an empty string (the latter will get further down the call stack, but no data is actually sent over the wire).

Likewise from the server standpoint, there is no functional difference between a missing or empty body. However, there’s still the possibility of unexpected behaviour due to the inclusion of a Content-Length header for a method not expecting it.

In light of that, I think that HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’) and HTTPConnection(‘example.com’).request(‘GET’, ‘/‘) should result in identical headers with no Content-Length set.

If someone /really/ wants to set the Content-Length header for a method that doesn’t expect it, they still have the ability to do so: HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’, {‘Content-Length’: 0}). However, if the default behaviour is to include the Content-Length header for an empty string, the user then has to change what they’re sending to be None instead of empty string even though they should be functionally equivalent.
msg236930 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-01 01:30
> I actually consider this a fix for the fix in 14721, rather than a new feature.

+1
msg236967 - (view) Author: James Rutherford (jimr) * Date: 2015-03-01 16:07
>> My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.
> ...
> In light of that, I think that HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’) and HTTPConnection(‘example.com’).request(‘GET’, ‘/‘) should result in identical headers with no Content-Length set.

I get your reasoning here, but I wonder if that goes beyond the scope of this issue? My initial thinking was the same, but then realised that was inconsistent with the current behaviour. For example, a GET with '' in the body already sets content-length: 0, but a GET with None for the body doesn't set a content length at all. I haven't changed this behaviour in my patch, except for PUT, POST, and PATCH where None and '' are now equivalent for those methods.
msg236970 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-01 17:30
> but I wonder if that goes beyond the scope of this issue?

I think it’s worthwhile to fix it while you’re already working on the logic there. I’d consider Content-Type being set for methods not expecting it as a (very) minor bug and it’s better to clear up the inconsistent behaviour. That said, it /is/ breaking backwards compatibility (however slightly) and would likely need to get sign-off from a couple core devs.
msg236971 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-01 18:27
+David, Senthil: Thoughts on this?
msg237049 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-02 15:19
Don't break backward compatibility.  It's not like this was reported as a bug that caused a problem for real world code, it is about theoretical consistency.  The risk of breaking someone code is much higher than the benefit of any such consistency, when talking about a bug fix.

Aside from that, however, I see request.('GET', '/') and request.('GET', '/', '') as clearly *different* from an API call standpoint, so I would in any case preserve the existing behavior.
msg237065 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-02 17:19
> Aside from that, however, I see request.('GET', '/') and request.('GET', '/', '') as clearly *different* from an API call standpoint, so I would in any case preserve the existing behavior.

I do understand the case that the the two examples look different from a caller’s standpoint. However as a user, I’d expect the library to do the Right Thing due to a deeper understanding of the theory behind the API that I’m using. Perhaps it would have made more sense to default body to empty string rather than None in which case there wouldn’t be a need for this distinction.

That said, I do understand the reasoning behind not breaking backwards compatibility and it’s not a problem until reported and am not looking to get into an API design argument (especially for something as low impact as what I was proposing). So @James, I stand corrected :)
msg237066 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-02 17:41
Not to get into an API argument, but rather to convey some Python philosophy as I understand it: there *is* often a distinction in Python between the value 'None' and a boolean-False-value like ''.  It is often the case in Python APIs that None means something different from other arguably-semantically-equivalent values.  In this case I believe it is entirely logical that 'None' (the default value) for a method that does not normally take a body and makes a distinction based on Content-Length (or some other header) makes sense, and that explicitly saying "I want an empty body, including the standard headers" is actually a sensible Python API.  If we were designing it from scratch with these issues in mind, we might well come up with exactly what we've got...or we might not, because wanting to specify an empty body on such a command is a really low probability event :)

Regardless, we're constrained by backward compatibility here, so the question is moot.
msg237067 - (view) Author: James Rutherford (jimr) * Date: 2015-03-02 17:51
OK, sounds like we're approaching consensus? And I believe that the patch as-is captures that consensus, so should I proceed and make another for 3.X for review?
msg237071 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2015-03-02 18:24
One patch should be enough, @James. The committer will take care of
porting. Thanks!  ( I am yet to completely read the discussion/ review the
patch and I will do that shortly.)
msg237075 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-02 18:55
> OK, sounds like we're approaching consensus? And I believe that the patch as-is captures that consensus

Yes, I believe that we’ve reached a consensus and that your patch captures it. Thanks for the work!
msg237171 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-04 02:38
Single patch, yes, but FYI we actually prefer a python3 patch against default as the single patch.  (Unless there are major differences, which there aren't in this case.)

I've made some minor review comments.  You can either ack my changes (or disagree :) and I'll make them when I commit, or update the patch yourself if you are so moved.
msg237177 - (view) Author: James Rutherford (jimr) * Date: 2015-03-04 09:29
Single patch against default makes sense and I'll do that in future.

As for the review comments, I'm happy to go with all of your suggestions but have offered a tweak to the docstring that you can take or leave at your discretion :)
msg238017 - (view) Author: James Rutherford (jimr) * Date: 2015-03-13 09:18
Hi all, apologies for the spam, but I just wanted to confirm that no-one is waiting on anything from me... I'm happy to consolidate the final minor points & make the patch against python3 if that would simplify things.
msg238025 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-13 12:48
It would...I've started twice to do the commit and gotten interrupted both times.  Having a patch I can just apply would help.  If you would be willing to also generate a patch for python3, that would be a help as well.
msg238040 - (view) Author: James Rutherford (jimr) * Date: 2015-03-13 17:02
Ok I'll have a go at a consolidated python3 patch tomorrow.
msg238570 - (view) Author: James Rutherford (jimr) * Date: 2015-03-19 22:32
Python 3 patch attached. The documentation has changed structure a little so I've adapted (simplified) this from the original. Otherwise, it's pretty much the same, except with python3 fixes, and incorporated feedback. I'll upload an updated 2.7 patch as well in a few minutes.
msg238573 - (view) Author: James Rutherford (jimr) * Date: 2015-03-19 22:40
Updated 2.7 patch attached.
msg238604 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-20 05:52
One super minor comment left in Rietveld (py3 patch). Otherwise LGTM.
msg238926 - (view) Author: Roundup Robot (python-dev) Date: 2015-03-22 19:19
New changeset 5c3dc817ffd7 by R David Murray in branch '2.7':
#23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
https://hg.python.org/cpython/rev/5c3dc817ffd7

New changeset f6f72422df96 by R David Murray in branch '3.4':
#23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
https://hg.python.org/cpython/rev/f6f72422df96

New changeset 643471ed8415 by R David Murray in branch 'default':
Merge: #23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
https://hg.python.org/cpython/rev/643471ed8415
msg238927 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-22 19:21
Thanks James and Demien.  I decided to rewrite the 'request' docs for Python3 to make them easier to follow (and more accurate).  Hopefully I didn't make any mistakes, but you might want to review it just in case.
msg238930 - (view) Author: James Rutherford (jimr) * Date: 2015-03-22 19:39
Updated docs look good to me, thanks!
History
Date User Action Args
2015-03-22 19:39:09jimrsetmessages: + msg238930
2015-03-22 19:21:10r.david.murraysetstatus: open -> closed
resolution: fixed
messages: + msg238927

stage: commit review -> resolved
2015-03-22 19:19:32python-devsetnosy: + python-dev
messages: + msg238926
2015-03-20 05:52:01demian.brechtsetmessages: + msg238604
2015-03-20 00:58:48berker.peksagsetnosy: + berker.peksag
2015-03-19 22:40:13jimrsetfiles: + issue23539-py27.patch

messages: + msg238573
2015-03-19 22:32:13jimrsetfiles: + issue23539-py3.patch

messages: + msg238570
2015-03-13 17:02:36jimrsetmessages: + msg238040
2015-03-13 12:48:15r.david.murraysetmessages: + msg238025
2015-03-13 09:18:28jimrsetmessages: + msg238017
2015-03-04 09:29:12jimrsetmessages: + msg237177
2015-03-04 02:38:09r.david.murraysetmessages: + msg237171
2015-03-02 18:56:05demian.brechtsetstage: patch review -> commit review
2015-03-02 18:55:44demian.brechtsetmessages: + msg237075
2015-03-02 18:24:55orsenthilsetmessages: + msg237071
2015-03-02 17:51:39jimrsetmessages: + msg237067
2015-03-02 17:41:15r.david.murraysetmessages: + msg237066
2015-03-02 17:19:31demian.brechtsetmessages: + msg237065
2015-03-02 15:19:07r.david.murraysetmessages: + msg237049
2015-03-01 18:27:37demian.brechtsetnosy: + orsenthil, r.david.murray
messages: + msg236971
2015-03-01 17:30:10demian.brechtsetmessages: + msg236970
2015-03-01 16:07:09jimrsetmessages: + msg236967
2015-03-01 01:30:06demian.brechtsetmessages: + msg236930
2015-03-01 01:26:36demian.brechtsetmessages: + msg236929
2015-02-28 22:07:39jimrsetmessages: + msg236917
2015-02-28 21:36:26martin.pantersetmessages: + msg236912
2015-02-28 16:13:14jimrsetfiles: + issue23539-py27.patch

messages: + msg236893
2015-02-28 01:24:44demian.brechtsetmessages: + msg236863
stage: needs patch -> patch review
2015-02-27 22:09:59jimrsetfiles: + issue23539-py27.patch

messages: + msg236850
2015-02-27 21:47:40martin.pantersetnosy: + martin.panter
messages: + msg236847
2015-02-27 21:11:37demian.brechtsetmessages: + msg236840
2015-02-27 20:30:41demian.brechtsetmessages: + msg236832
2015-02-27 20:28:56jimrsetfiles: + issue23539-py27.patch
keywords: + patch
messages: + msg236831
2015-02-27 19:24:27jimrsetmessages: + msg236822
2015-02-27 18:58:38jimrsetmessages: + msg236817
versions: - Python 3.2, Python 3.3
2015-02-27 18:52:50demian.brechtsetmessages: + msg236815
2015-02-27 18:35:22jimrsetnosy: + jimr

messages: + msg236812
versions: + Python 2.7, Python 3.2, Python 3.3, Python 3.4
2015-02-27 17:30:43demian.brechtcreate