Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content-length not set for HTTP methods expecting body when body is None #67727

Closed
demianbrecht mannequin opened this issue Feb 27, 2015 · 36 comments
Closed

Content-length not set for HTTP methods expecting body when body is None #67727

demianbrecht mannequin opened this issue Feb 27, 2015 · 36 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@demianbrecht
Copy link
Mannequin

demianbrecht mannequin commented Feb 27, 2015

BPO 23539
Nosy @orsenthil, @bitdancer, @berkerpeksag, @vadmium, @demianbrecht
Files
  • issue23539-py27.patch
  • issue23539-py27.patch
  • issue23539-py27.patch
  • issue23539-py3.patch
  • issue23539-py27.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-03-22.19:21:10.807>
    created_at = <Date 2015-02-27.17:30:43.379>
    labels = ['type-bug', 'library']
    title = 'Content-length not set for HTTP methods expecting body when body is None'
    updated_at = <Date 2015-03-22.19:39:09.799>
    user = 'https://github.com/demianbrecht'

    bugs.python.org fields:

    activity = <Date 2015-03-22.19:39:09.799>
    actor = 'jimr'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-03-22.19:21:10.807>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2015-02-27.17:30:43.379>
    creator = 'demian.brecht'
    dependencies = []
    files = ['38267', '38270', '38276', '38569', '38570']
    hgrepos = []
    issue_num = 23539
    keywords = ['patch']
    message_count = 36.0
    messages = ['236803', '236812', '236815', '236817', '236822', '236831', '236832', '236840', '236847', '236850', '236863', '236893', '236912', '236917', '236929', '236930', '236967', '236970', '236971', '237049', '237065', '237066', '237067', '237071', '237075', '237171', '237177', '238017', '238025', '238040', '238570', '238573', '238604', '238926', '238927', '238930']
    nosy_count = 7.0
    nosy_names = ['orsenthil', 'r.david.murray', 'python-dev', 'berker.peksag', 'martin.panter', 'demian.brecht', 'jimr']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23539'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Feb 27, 2015

    bpo-14721 solved setting the Content-Length header for 0-length bodies. However, it doesn't account for cases where body is None (reported by James Rutherford here: http://bugs.python.org/issue14721#msg236600).

    One method of solving this might be something like this:

    _METHODS_EXPECTING_BODIES = {'OPTIONS', 'POST', 'PUT', 'PATCH'}
    if method.upper() in _METHODS_EXPECTING_BODIES and \
            'content-length' not in header_names:
        self._set_content_length(body)

    (_set_content_length would have to be updated in order to allow for None)

    This ensures that Content-Length will not be set for methods not expecting a body.

    RFC 7230, Section 3.3.2:

    A user agent SHOULD NOT send a Content-Length header field when the
    request message does not contain a payload body and the method
    semantics do not anticipate such a body.

    @demianbrecht demianbrecht mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Feb 27, 2015
    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 27, 2015

    Thanks for setting up the new issue, I'll cook up a patch. I'm assuming this affects all Python 3.X versions but I've specifically encountered it on Python 2.7.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Feb 27, 2015

    I'm assuming this affects all Python 3.X versions but I've specifically encountered it on Python 2.7.

    Unless there are any core dev objections, I think it's applicable to 2.7, 3.4 and 3.5 as other minor 3.x versions are in security mode (https://docs.python.org/devguide/devcycle.html#security-branches).

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 27, 2015

    OK, thanks.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 27, 2015

    OK, I've got a patch but it's failing on 'test_send_file'[1], which is sending a body on a GET request. According to the IETF memo[2]:

    Bodies on GET requests have no defined semantics. Note that sending
    a body on a GET request might cause some existing implementations to
    reject the request.

    So I don't know whether to remove the method constraint so this test passes, or to "fix" this test to use a different method. My feeling is that stripping out Content-Length for GET requests could cause problems for some code that for whatever reason uses this, so we should remove that constraint (it was only "SHOULD NOT" after all) and go back to the original proposal of just setting the content length for bodies that are None as well as empty strings.

    [1] https://hg.python.org/cpython/file/325aec842e3e/Lib/test/test_httplib.py#l297
    [2] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-7.3

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 27, 2015

    Patch attached for the 2.7 branch, including updated tests. All tests pass. Let me know if this looks like a sensible approach and I'll produce something comparable for 3.X.

    The logic now is as it was before, except that we set a content length of zero if the body is None and the method is one of OPTIONS, PATCH, PUT, or POST.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Feb 27, 2015

    My concern about this is around the combination of the following two passages:

    draft-ietf-httpbis-p2-semantics-14, Section 7.3:

    Bodies on GET requests have no defined semantics. Note that sending
    a body on a GET request might cause some existing implementations to
    reject the request.

    RFC 7230, Section 3.3.2:

    The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field.

    My concern around the above is the possibility that servers may have implementations that reject requests with methods that are not expecting bodies solely on the existence of the Content-Length header rather than the value (which may be 0). Without a good amount of research around whether or not this would be an actual problem in practice, I wouldn’t be comfortable adding Content-Length header to every request.

    My feeling is that stripping out Content-Length for GET requests could cause problems for some code that for whatever reason uses this

    That’s definitely a valid concern. What if the logic was tweaked a little:

    expecting_len = method.upper() in _METHODS_EXPECTING_BODIES or \
        body is not None
    
    if expecting_len and 'content-length' not in header_names:
        self._set_content_length(body)

    That way, if someone is intentionally sending a body in a GET request, the value of Content-Length would still be automatically added, but would still be set to 0 for all other cases where bodies are expected.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Feb 27, 2015

    The logic now is as it was before, except that we set a content length of zero if the body is None and the method is one of OPTIONS, PATCH, PUT, or POST.

    I see we definitely have similar thinking on the modifications required for this, but I don’t think I’m in favour of the approach in the patch. Consider the case where you have a GET request without a body. Now, instead of simply circumventing the _set_content_length method altogether as would have been done previously, the code path will now go into _set_content_length, have a TypeError exception raised on (str(len([None])) and then an AttributeError exception raised on os.fstat([None].fileno()). That seems to be quite a bit of additional overhead when it could be avoided altogether by an earlier check as I have in my example.

    What do you think?

    @vadmium
    Copy link
    Member

    vadmium commented Feb 27, 2015

    I like Demian’s “expecting_len” technique slightly better because it would avoid calling None.fileno() and len(None), as Demian just said.

    Also, I think OPTIONS should be removed from the list of methods that enforce a Content-Length. I wouldn’t normally expect any payload for OPTIONS, since RFC 7231 explicitly says it does not define a use for a payload, but requires a Content-Type if a payload is sent.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 27, 2015

    The first patch should actually be modified so the condition reads (update attached):

        if body is None and method_expects_body:
            thelen = 0
        elif body is not None:
            ...

    Demian, I believe this is equivalent to your 'expecting_len' proposal, I've just put the logic inside _set_content_length. My approach was to leave as much existing logic as possible in place to avoid the introduction of new bugs (first patch wasn't entirely successful in this!).

    As for OPTION,

    If the OPTIONS request includes a message-body (as indicated by the
    presence of Content-Length or Transfer-Encoding), then the media type
    MUST be indicated by a Content-Type field. Although this
    specification does not define any use for such a body, future
    extensions to HTTP might use the OPTIONS body to make more detailed
    queries on the server.

    From http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-7.2

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Feb 28, 2015

    Also, I think OPTIONS should be removed from the list of methods that enforce a Content-Length. I wouldn’t normally expect any payload for OPTIONS, since RFC 7231 explicitly says it does not define a use for a payload, but requires a Content-Type if a payload is sent.

    I think Martin's right about this and it's consistent with my concern about servers treating the existence of a Content-Length header as an indicator of a request body.

    In any event, with the new logic, a Content-Length header will be added if the body is not None.

    I've just put the logic inside _set_content_length

    Fair enough. It /does/ encapsulate all logic around setting the content length within the method.

    I've left a couple minor comments in Rietveld. In addition to what's there, docs should also be updated to reflect the new behaviour.

    Thanks for the work on this!

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 28, 2015

    Happy to remove OPTIONS from the list of methods that gets a content-length where body is None, but do we also want to consider behaviour if it's the empty string? My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.

    I've updated the patch with this logic, including ensuring proper explicit test coverage for the cases described. Docs updated too.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 28, 2015

    Yes I agree with the behaviour that None means no body (for requests such as GET), and an empty string means an empty but present body. Correct me if I’m wrong, but I think the fix for bpo-14721 already does that.

    Perhaps the new behaviour needs a “Changed in version X” tag? I’m not sure, but I tend to think of this as a new feature, rather than a bug fix.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Feb 28, 2015

    I actually consider this a fix for the fix in 14721, rather than a new feature. The only new behaviour here is setting content length to be zero if body is None on PATCH, POST, or PUT. Happy to change the labeling if that's the consensus but IMO it's a bugfix.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 1, 2015

    My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.

    Although I understand your thinking here, I think that bodies that are either None or empty string should be treated identically. Whether or not the Content-Length header is set should be dependent on the method used.

    From a client standpoint, there’s no difference between sending a body with a value of None and an empty string (the latter will get further down the call stack, but no data is actually sent over the wire).

    Likewise from the server standpoint, there is no functional difference between a missing or empty body. However, there’s still the possibility of unexpected behaviour due to the inclusion of a Content-Length header for a method not expecting it.

    In light of that, I think that HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’) and HTTPConnection(‘example.com’).request(‘GET’, ‘/‘) should result in identical headers with no Content-Length set.

    If someone /really/ wants to set the Content-Length header for a method that doesn’t expect it, they still have the ability to do so: HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’, {‘Content-Length’: 0}). However, if the default behaviour is to include the Content-Length header for an empty string, the user then has to change what they’re sending to be None instead of empty string even though they should be functionally equivalent.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 1, 2015

    I actually consider this a fix for the fix in 14721, rather than a new feature.

    +1

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 1, 2015

    > My feeling is that '' implies "present but empty" (so should have a content-length set to zero), whereas None implies "missing" (so should only have a content-length header set to zero if the method is expecting a body.
    ...
    In light of that, I think that HTTPConnection(‘example.com’).request(‘GET’, ‘/‘, ‘’) and HTTPConnection(‘example.com’).request(‘GET’, ‘/‘) should result in identical headers with no Content-Length set.

    I get your reasoning here, but I wonder if that goes beyond the scope of this issue? My initial thinking was the same, but then realised that was inconsistent with the current behaviour. For example, a GET with '' in the body already sets content-length: 0, but a GET with None for the body doesn't set a content length at all. I haven't changed this behaviour in my patch, except for PUT, POST, and PATCH where None and '' are now equivalent for those methods.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 1, 2015

    but I wonder if that goes beyond the scope of this issue?

    I think it’s worthwhile to fix it while you’re already working on the logic there. I’d consider Content-Type being set for methods not expecting it as a (very) minor bug and it’s better to clear up the inconsistent behaviour. That said, it /is/ breaking backwards compatibility (however slightly) and would likely need to get sign-off from a couple core devs.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 1, 2015

    +David, Senthil: Thoughts on this?

    @bitdancer
    Copy link
    Member

    Don't break backward compatibility. It's not like this was reported as a bug that caused a problem for real world code, it is about theoretical consistency. The risk of breaking someone code is much higher than the benefit of any such consistency, when talking about a bug fix.

    Aside from that, however, I see request.('GET', '/') and request.('GET', '/', '') as clearly *different* from an API call standpoint, so I would in any case preserve the existing behavior.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 2, 2015

    Aside from that, however, I see request.('GET', '/') and request.('GET', '/', '') as clearly *different* from an API call standpoint, so I would in any case preserve the existing behavior.

    I do understand the case that the the two examples look different from a caller’s standpoint. However as a user, I’d expect the library to do the Right Thing due to a deeper understanding of the theory behind the API that I’m using. Perhaps it would have made more sense to default body to empty string rather than None in which case there wouldn’t be a need for this distinction.

    That said, I do understand the reasoning behind not breaking backwards compatibility and it’s not a problem until reported and am not looking to get into an API design argument (especially for something as low impact as what I was proposing). So @james, I stand corrected :)

    @bitdancer
    Copy link
    Member

    Not to get into an API argument, but rather to convey some Python philosophy as I understand it: there *is* often a distinction in Python between the value 'None' and a boolean-False-value like ''. It is often the case in Python APIs that None means something different from other arguably-semantically-equivalent values. In this case I believe it is entirely logical that 'None' (the default value) for a method that does not normally take a body and makes a distinction based on Content-Length (or some other header) makes sense, and that explicitly saying "I want an empty body, including the standard headers" is actually a sensible Python API. If we were designing it from scratch with these issues in mind, we might well come up with exactly what we've got...or we might not, because wanting to specify an empty body on such a command is a really low probability event :)

    Regardless, we're constrained by backward compatibility here, so the question is moot.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 2, 2015

    OK, sounds like we're approaching consensus? And I believe that the patch as-is captures that consensus, so should I proceed and make another for 3.X for review?

    @orsenthil
    Copy link
    Member

    One patch should be enough, @james. The committer will take care of
    porting. Thanks! ( I am yet to completely read the discussion/ review the
    patch and I will do that shortly.)

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 2, 2015

    OK, sounds like we're approaching consensus? And I believe that the patch as-is captures that consensus

    Yes, I believe that we’ve reached a consensus and that your patch captures it. Thanks for the work!

    @bitdancer
    Copy link
    Member

    Single patch, yes, but FYI we actually prefer a python3 patch against default as the single patch. (Unless there are major differences, which there aren't in this case.)

    I've made some minor review comments. You can either ack my changes (or disagree :) and I'll make them when I commit, or update the patch yourself if you are so moved.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 4, 2015

    Single patch against default makes sense and I'll do that in future.

    As for the review comments, I'm happy to go with all of your suggestions but have offered a tweak to the docstring that you can take or leave at your discretion :)

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 13, 2015

    Hi all, apologies for the spam, but I just wanted to confirm that no-one is waiting on anything from me... I'm happy to consolidate the final minor points & make the patch against python3 if that would simplify things.

    @bitdancer
    Copy link
    Member

    It would...I've started twice to do the commit and gotten interrupted both times. Having a patch I can just apply would help. If you would be willing to also generate a patch for python3, that would be a help as well.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 13, 2015

    Ok I'll have a go at a consolidated python3 patch tomorrow.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 19, 2015

    Python 3 patch attached. The documentation has changed structure a little so I've adapted (simplified) this from the original. Otherwise, it's pretty much the same, except with python3 fixes, and incorporated feedback. I'll upload an updated 2.7 patch as well in a few minutes.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 19, 2015

    Updated 2.7 patch attached.

    @demianbrecht
    Copy link
    Mannequin Author

    demianbrecht mannequin commented Mar 20, 2015

    One super minor comment left in Rietveld (py3 patch). Otherwise LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 22, 2015

    New changeset 5c3dc817ffd7 by R David Murray in branch '2.7':
    bpo-23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
    https://hg.python.org/cpython/rev/5c3dc817ffd7

    New changeset f6f72422df96 by R David Murray in branch '3.4':
    bpo-23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
    https://hg.python.org/cpython/rev/f6f72422df96

    New changeset 643471ed8415 by R David Murray in branch 'default':
    Merge: bpo-23539: Set Content-Length to 0 for PUT, POST, and PATCH if body is None.
    https://hg.python.org/cpython/rev/643471ed8415

    @bitdancer
    Copy link
    Member

    Thanks James and Demien. I decided to rewrite the 'request' docs for Python3 to make them easier to follow (and more accurate). Hopefully I didn't make any mistakes, but you might want to review it just in case.

    @jimr
    Copy link
    Mannequin

    jimr mannequin commented Mar 22, 2015

    Updated docs look good to me, thanks!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants