New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Content-length not set for HTTP methods expecting body when body is None #67727
Comments
bpo-14721 solved setting the Content-Length header for 0-length bodies. However, it doesn't account for cases where body is None (reported by James Rutherford here: http://bugs.python.org/issue14721#msg236600). One method of solving this might be something like this: _METHODS_EXPECTING_BODIES = {'OPTIONS', 'POST', 'PUT', 'PATCH'}
if method.upper() in _METHODS_EXPECTING_BODIES and \
'content-length' not in header_names:
self._set_content_length(body) (_set_content_length would have to be updated in order to allow for None) This ensures that Content-Length will not be set for methods not expecting a body. RFC 7230, Section 3.3.2: A user agent SHOULD NOT send a Content-Length header field when the |
Thanks for setting up the new issue, I'll cook up a patch. I'm assuming this affects all Python 3.X versions but I've specifically encountered it on Python 2.7. |
Unless there are any core dev objections, I think it's applicable to 2.7, 3.4 and 3.5 as other minor 3.x versions are in security mode (https://docs.python.org/devguide/devcycle.html#security-branches). |
OK, thanks. |
OK, I've got a patch but it's failing on 'test_send_file'[1], which is sending a body on a GET request. According to the IETF memo[2]: Bodies on GET requests have no defined semantics. Note that sending So I don't know whether to remove the method constraint so this test passes, or to "fix" this test to use a different method. My feeling is that stripping out Content-Length for GET requests could cause problems for some code that for whatever reason uses this, so we should remove that constraint (it was only "SHOULD NOT" after all) and go back to the original proposal of just setting the content length for bodies that are None as well as empty strings. [1] https://hg.python.org/cpython/file/325aec842e3e/Lib/test/test_httplib.py#l297 |
Patch attached for the 2.7 branch, including updated tests. All tests pass. Let me know if this looks like a sensible approach and I'll produce something comparable for 3.X. The logic now is as it was before, except that we set a content length of zero if the body is None and the method is one of OPTIONS, PATCH, PUT, or POST. |
My concern about this is around the combination of the following two passages: draft-ietf-httpbis-p2-semantics-14, Section 7.3:
RFC 7230, Section 3.3.2:
My concern around the above is the possibility that servers may have implementations that reject requests with methods that are not expecting bodies solely on the existence of the Content-Length header rather than the value (which may be 0). Without a good amount of research around whether or not this would be an actual problem in practice, I wouldn’t be comfortable adding Content-Length header to every request.
That’s definitely a valid concern. What if the logic was tweaked a little: expecting_len = method.upper() in _METHODS_EXPECTING_BODIES or \
body is not None
if expecting_len and 'content-length' not in header_names:
self._set_content_length(body) That way, if someone is intentionally sending a body in a GET request, the value of Content-Length would still be automatically added, but would still be set to 0 for all other cases where bodies are expected. |
I see we definitely have similar thinking on the modifications required for this, but I don’t think I’m in favour of the approach in the patch. Consider the case where you have a GET request without a body. Now, instead of simply circumventing the _set_content_length method altogether as would have been done previously, the code path will now go into _set_content_length, have a TypeError exception raised on (str(len([None])) and then an AttributeError exception raised on os.fstat([None].fileno()). That seems to be quite a bit of additional overhead when it could be avoided altogether by an earlier check as I have in my example. What do you think? |
I like Demian’s “expecting_len” technique slightly better because it would avoid calling None.fileno() and len(None), as Demian just said. Also, I think OPTIONS should be removed from the list of methods that enforce a Content-Length. I wouldn’t normally expect any payload for OPTIONS, since RFC 7231 explicitly says it does not define a use for a payload, but requires a Content-Type if a payload is sent. |
The first patch should actually be modified so the condition reads (update attached): if body is None and method_expects_body:
thelen = 0
elif body is not None:
... Demian, I believe this is equivalent to your 'expecting_len' proposal, I've just put the logic inside _set_content_length. My approach was to leave as much existing logic as possible in place to avoid the introduction of new bugs (first patch wasn't entirely successful in this!). As for OPTION, If the OPTIONS request includes a message-body (as indicated by the From http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-7.2 |
I think Martin's right about this and it's consistent with my concern about servers treating the existence of a Content-Length header as an indicator of a request body. In any event, with the new logic, a Content-Length header will be added if the body is not None.
Fair enough. It /does/ encapsulate all logic around setting the content length within the method. I've left a couple minor comments in Rietveld. In addition to what's there, docs should also be updated to reflect the new behaviour. Thanks for the work on this! |
Happy to remove OPTIONS from the list of methods that gets a content-length where body is None, but do we also want to consider behaviour if it's the empty string? My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body. I've updated the patch with this logic, including ensuring proper explicit test coverage for the cases described. Docs updated too. |
Yes I agree with the behaviour that None means no body (for requests such as GET), and an empty string means an empty but present body. Correct me if I’m wrong, but I think the fix for bpo-14721 already does that. Perhaps the new behaviour needs a “Changed in version X” tag? I’m not sure, but I tend to think of this as a new feature, rather than a bug fix. |
I actually consider this a fix for the fix in 14721, rather than a new feature. The only new behaviour here is setting content length to be zero if body is None on PATCH, POST, or PUT. Happy to change the labeling if that's the consensus but IMO it's a bugfix. |
Although I understand your thinking here, I think that bodies that are either None or empty string should be treated identically. Whether or not the Content-Length header is set should be dependent on the method used. From a client standpoint, there’s no difference between sending a body with a value of None and an empty string (the latter will get further down the call stack, but no data is actually sent over the wire). Likewise from the server standpoint, there is no functional difference between a missing or empty body. However, there’s still the possibility of unexpected behaviour due to the inclusion of a Content-Length header for a method not expecting it. In light of that, I think that HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’) and HTTPConnection(‘example.com’).request(‘GET’, ‘/‘) should result in identical headers with no Content-Length set. If someone /really/ wants to set the Content-Length header for a method that doesn’t expect it, they still have the ability to do so: HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’, {‘Content-Length’: 0}). However, if the default behaviour is to include the Content-Length header for an empty string, the user then has to change what they’re sending to be None instead of empty string even though they should be functionally equivalent. |
+1 |
I get your reasoning here, but I wonder if that goes beyond the scope of this issue? My initial thinking was the same, but then realised that was inconsistent with the current behaviour. For example, a GET with '' in the body already sets content-length: 0, but a GET with None for the body doesn't set a content length at all. I haven't changed this behaviour in my patch, except for PUT, POST, and PATCH where None and '' are now equivalent for those methods. |
I think it’s worthwhile to fix it while you’re already working on the logic there. I’d consider Content-Type being set for methods not expecting it as a (very) minor bug and it’s better to clear up the inconsistent behaviour. That said, it /is/ breaking backwards compatibility (however slightly) and would likely need to get sign-off from a couple core devs. |
+David, Senthil: Thoughts on this? |
Don't break backward compatibility. It's not like this was reported as a bug that caused a problem for real world code, it is about theoretical consistency. The risk of breaking someone code is much higher than the benefit of any such consistency, when talking about a bug fix. Aside from that, however, I see request.('GET', '/') and request.('GET', '/', '') as clearly *different* from an API call standpoint, so I would in any case preserve the existing behavior. |
I do understand the case that the the two examples look different from a caller’s standpoint. However as a user, I’d expect the library to do the Right Thing due to a deeper understanding of the theory behind the API that I’m using. Perhaps it would have made more sense to default body to empty string rather than None in which case there wouldn’t be a need for this distinction. That said, I do understand the reasoning behind not breaking backwards compatibility and it’s not a problem until reported and am not looking to get into an API design argument (especially for something as low impact as what I was proposing). So @james, I stand corrected :) |
Not to get into an API argument, but rather to convey some Python philosophy as I understand it: there *is* often a distinction in Python between the value 'None' and a boolean-False-value like ''. It is often the case in Python APIs that None means something different from other arguably-semantically-equivalent values. In this case I believe it is entirely logical that 'None' (the default value) for a method that does not normally take a body and makes a distinction based on Content-Length (or some other header) makes sense, and that explicitly saying "I want an empty body, including the standard headers" is actually a sensible Python API. If we were designing it from scratch with these issues in mind, we might well come up with exactly what we've got...or we might not, because wanting to specify an empty body on such a command is a really low probability event :) Regardless, we're constrained by backward compatibility here, so the question is moot. |
OK, sounds like we're approaching consensus? And I believe that the patch as-is captures that consensus, so should I proceed and make another for 3.X for review? |
One patch should be enough, @james. The committer will take care of |
Yes, I believe that we’ve reached a consensus and that your patch captures it. Thanks for the work! |
Single patch, yes, but FYI we actually prefer a python3 patch against default as the single patch. (Unless there are major differences, which there aren't in this case.) I've made some minor review comments. You can either ack my changes (or disagree :) and I'll make them when I commit, or update the patch yourself if you are so moved. |
Single patch against default makes sense and I'll do that in future. As for the review comments, I'm happy to go with all of your suggestions but have offered a tweak to the docstring that you can take or leave at your discretion :) |
Hi all, apologies for the spam, but I just wanted to confirm that no-one is waiting on anything from me... I'm happy to consolidate the final minor points & make the patch against python3 if that would simplify things. |
It would...I've started twice to do the commit and gotten interrupted both times. Having a patch I can just apply would help. If you would be willing to also generate a patch for python3, that would be a help as well. |
Ok I'll have a go at a consolidated python3 patch tomorrow. |
Python 3 patch attached. The documentation has changed structure a little so I've adapted (simplified) this from the original. Otherwise, it's pretty much the same, except with python3 fixes, and incorporated feedback. I'll upload an updated 2.7 patch as well in a few minutes. |
Updated 2.7 patch attached. |
One super minor comment left in Rietveld (py3 patch). Otherwise LGTM. |
New changeset 5c3dc817ffd7 by R David Murray in branch '2.7': New changeset f6f72422df96 by R David Murray in branch '3.4': New changeset 643471ed8415 by R David Murray in branch 'default': |
Thanks James and Demien. I decided to rewrite the 'request' docs for Python3 to make them easier to follow (and more accurate). Hopefully I didn't make any mistakes, but you might want to review it just in case. |
Updated docs look good to me, thanks! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: