Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distutils upload/register should use CRLF in HTTP requests #54719

Closed
BrianJones mannequin opened this issue Nov 23, 2010 · 25 comments
Closed

distutils upload/register should use CRLF in HTTP requests #54719

BrianJones mannequin opened this issue Nov 23, 2010 · 25 comments
Assignees
Labels
easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@BrianJones
Copy link
Mannequin

BrianJones mannequin commented Nov 23, 2010

BPO 10510
Nosy @orsenthil, @tarekziade, @merwok, @bitdancer, @sigmavirus24
Files
  • compliant_distutils.patch: Issue patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/merwok'
    closed_at = <Date 2014-09-27.20:59:57.489>
    created_at = <Date 2010-11-23.05:11:52.451>
    labels = ['easy', 'type-bug', 'library']
    title = 'distutils upload/register should use CRLF in HTTP requests'
    updated_at = <Date 2014-11-21.23:35:14.210>
    user = 'https://bugs.python.org/BrianJones'

    bugs.python.org fields:

    activity = <Date 2014-11-21.23:35:14.210>
    actor = 'eric.araujo'
    assignee = 'eric.araujo'
    closed = True
    closed_date = <Date 2014-09-27.20:59:57.489>
    closer = 'r.david.murray'
    components = ['Distutils']
    creation = <Date 2010-11-23.05:11:52.451>
    creator = 'Brian.Jones'
    dependencies = []
    files = ['35067']
    hgrepos = []
    issue_num = 10510
    keywords = ['patch', 'easy']
    message_count = 25.0
    messages = ['122192', '122210', '122220', '122223', '122440', '122455', '122457', '122458', '122459', '122460', '124093', '139069', '139106', '139109', '139139', '146215', '211826', '211828', '217336', '218822', '219157', '221100', '227728', '227760', '231502']
    nosy_count = 10.0
    nosy_names = ['orsenthil', 'jerub', 'tarek', 'eric.araujo', 'r.david.murray', 'mitchellh', 'Brian.Jones', 'python-dev', 'icordasc', 'Matthew.Iversen']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue10510'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @BrianJones
    Copy link
    Mannequin Author

    BrianJones mannequin commented Nov 23, 2010

    In trying to write a PyPI service, I of course need to support the registration and upload features of setup.py, which uses distutils for those actions. One thing making this a bit more difficult than need be is the fact that distutils.command.register and distutils.command.upload both use bare line feeds as a line terminator in the HTTP message headers. While there are probably tolerant servers out there, I think this behavior should be considered a bug in light of the HTTP spec, which reads:

    "The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR."

    The second sentence can be interpreted in multiple ways that either agree with or contradict the first sentence, but the first sentence is at least perfectly clear, and I'd like to see "\r\n" added to the beginning of (specifically in this case) the Content-Disposition headers assembled by distutils.command.upload and distutils.command.register.

    The change involves taking identical lines in each file that currently look like this:

           title = '\nContent-Disposition: form-data; name="%s"' % key
    

    ...and making them look like this:

           title = '\r\nContent-Disposition: form-data; name="%s"' % key
    

    The result is that servers which do not support a lax adherence to the protocol will still be able to support users of distutils in Python 2.6, 2.7, and 3.1 (and maybe others).

    @BrianJones BrianJones mannequin added the type-bug An unexpected behavior, bug, or error label Nov 23, 2010
    @merwok
    Copy link
    Member

    merwok commented Nov 23, 2010

    Thank you for the report. This is how I understand the part of the spec you quoted: Strictly valid HTTP uses CRLF, but servers and clients should be liberal in what they accept and deal with LF too. Senthil, do you agree with that?

    There are a small number of PyPI-like servers out there which seem to deal fine with what distutils produces. We have strong backward compatibility constraints in distutils, so we try to make the less changes possible to fix bugs. Because we’ve haven’t had this reported by those other servers before, I’m tempted to reject this report. The good part is that since you’re still writing the server, you can change its code to follow Postel’s Law and be liberal in what you accept. Does that sound acceptable to you?

    Best regards

    @merwok merwok added the stdlib Python modules in the Lib dir label Nov 23, 2010
    @merwok merwok self-assigned this Nov 23, 2010
    @BrianJones
    Copy link
    Mannequin Author

    BrianJones mannequin commented Nov 23, 2010

    In truth, I don't personally know if the other PyPI server implementations also have to work around this issue. Other comments on that are welcome.

    As for my own implementation, I've implemented a workaround to this, but I'm working around and reimplementing what is otherwise a feature built into the server (and every other one).

    I believe the suggested change makes the world a better place, and having administered several different web servers, I can't imagine ill effects from making the change. Web servers support *at least* the standard. Any ill effects we might imagine are, at least for my part, pure speculation. I, on the other hand, have found a real-life problem now :)

    I guess in the end I think that servers are more likely to err on the side of strict HTTP than make allowances for every "SHOULD" in the protocol spec, and leaving things as-is relies on a "SHOULD", which I submit is incorrect behavior, and therefore a bug.

    It's not like a fix is going to magically improve my life, as I'll have to support older buggy versions anyway, but making the change now can help grease the wheels for future implementations who won't have to hit this hurdle.

    My $.02.

    @orsenthil
    Copy link
    Member

    I think, it is better that distutils.command.register and
    distutils.command.upload use CRLF as the line terminator for header
    values.
    It just helps in many cases, we can safely side by this case, but not
    relying LR or CR only may not be always possible.

    @merwok
    Copy link
    Member

    merwok commented Nov 26, 2010

    I am convinced. Do you want to work on a patch? Guidelines are found at http://www.python.org/dev/patches/

    @merwok merwok changed the title distutils.command.upload/register HTTP message headers: bad line termination distutils upload/register should use CRLF in HTTP requests Nov 26, 2010
    @BrianJones
    Copy link
    Mannequin Author

    BrianJones mannequin commented Nov 26, 2010

    Sure. I'll create a patch in the next few days and submit it. Thanks for the link to the guidelines. :)

    @BrianJones
    Copy link
    Mannequin Author

    BrianJones mannequin commented Nov 26, 2010

    So... have I missed a memo, or is it currently impossible to test the current svn version of distutils in the current svn version of Python?

    The tests for (at least) register and upload are written using Python 2.x syntax and modules. How are new features and fixes for distutils being tested? Is it valid to use a 2.x version of Python to test the trunk version of distutils?

    Sorry if these questions have really obvious answers. First time patcher :?

    @tarekziade
    Copy link
    Mannequin

    tarekziade mannequin commented Nov 26, 2010

    If you can make your server work with the current implementation, I'd rather not change this in distutils but in distutils2.

    Distutils is frozen and we make only bug fixes. By bug fix I mean anything that is a bug. A non-strict implementation of the EOLs is not a bug, and I suggest that you write your patch against distutils2.

    Eric, thanks for all the hard work in triaging the bugs ! Let's make sure we do the bare minimal maintenance in distutils, and suggest to people to contribute in distutils2.

    Brian, for distutils1, the register and upload command would be considered buggy if they don't work for PyPI, which is not the case.

    Thanks!

    @BrianJones
    Copy link
    Mannequin Author

    BrianJones mannequin commented Nov 26, 2010

    If it's not a bug in distutils1, I imagine it will not be a bug in distutils2, since that will also presumably work with PyPI, and PyPI will be the single solitary supported implementation of the service?

    I also don't see distutils2 in this list http://svn.python.org/projects/ which means either distutils2 isn't part of the standard library, or it doesn't follow the documented patch submission process, or perhaps both? Insight welcome here.

    @tarekziade
    Copy link
    Mannequin

    tarekziade mannequin commented Nov 26, 2010

    If it's not a bug in distutils1, I imagine it will not be a bug in distutils2

    Yes that would be a feature request. we would be happy to add. e.g. make the commands works with server X ou server Y.

    I also don't see distutils2 in this list http://svn.python.org/projects/ which means either distutils2 isn't part of the standard library, or it doesn't follow the documented patch submission process, or perhaps both? Insight welcome here.

    Distutils2 is developed at http://hg.python.org/distutils2 and will be included when ready in the stdlib (circa 3.3), as a distutils replacer. It's in fact the distutils trunk, with lots of changes (some are backward incompatible). It will be installable from 2.4 onward.

    It does not follow the same process. Which I have described in python-dev last week: http://mail.python.org/pipermail/python-dev/2010-November/105772.html

    @merwok
    Copy link
    Member

    merwok commented Dec 15, 2010

    Updated guidelines: http://wiki.python.org/moin/Distutils/FixingBugs

    is it currently impossible to test the current svn version of
    distutils in the current svn version of Python?
    It is. Branches that get bugfixes are 3.2, 3.1 and 2.7; each one has its interpreter code and its distutils lib and tests. I’m not sure what prompted your question: was it something on the patches guideline page?

    @merwok merwok removed the stdlib Python modules in the Lib dir label Dec 15, 2010
    @merwok merwok added the easy label Jun 9, 2011
    @jerub
    Copy link
    Mannequin

    jerub mannequin commented Jun 25, 2011

    I'm having a look at this ticket now. It looks like this can be rewritten to use common code, and it would probably be good to use the 'email' module for creating the MIME segements properly.

    @jerub
    Copy link
    Mannequin

    jerub mannequin commented Jun 25, 2011

    Okay, I looked at this, then I ran into str/byte type problems with the email module. Will wait until 'email' is sorted out before I consider a ticket like this one again.

    @merwok
    Copy link
    Member

    merwok commented Jun 25, 2011

    I'm having a look at this ticket now. It looks like this can be
    rewritten to use common code

    Indeed! As such, I’m folding this feature request into bpo-12169, which has a patch under review.

    @merwok merwok changed the title distutils upload/register should use CRLF in HTTP requests packaging upload/register should use CRLF in HTTP requests Jun 25, 2011
    @merwok merwok closed this as completed Jun 25, 2011
    @bitdancer
    Copy link
    Member

    I know this is closed so the comment may not be relevant, but the byte/string issues with email should be sorted out in the code that is in 3.2/3.3 at this point. That is, it is possible once again to work with binary data, using the correct (new) classes. If you have issues where things still don't work, please report them. Note also that email now allows control of the generated linesep characters. (That, unfortunately, won't help the python2 backport.)

    @merwok
    Copy link
    Member

    merwok commented Oct 23, 2011

    Mitchell Hashimoto provided this link on a duplicate report:

    RFC2616 page 31 (http://tools.ietf.org/html/rfc2616#page-31) states that headers must be separated
    by CRLF. Specifically, the above "\n\n" for the header separator is causing issues with some
    minimal RFC-compliant web servers.

    So I checked the RFC carefully again and found this:

    http://tools.ietf.org/html/rfc2616#section-3.7.1

    When in canonical form, media subtypes of the "text" type use CRLF as
    the text line break. HTTP relaxes this requirement and allows the
    transport of text media with plain CR or LF alone representing a line
    break when it is done consistently for an entire entity-body. HTTP
    applications MUST accept CRLF, bare CR, and bare LF as being
    representative of a line break in text media received via HTTP

    The requests we send are multipart/form-data, so the RFC exception for text/* would not apply. On one hand, I don’t think we can say that sending LF is not a bug, on the other hand, I believe nearly all HTTP servers just accept all newlines anyway.

    @merwok merwok changed the title packaging upload/register should use CRLF in HTTP requests distutils upload/register should use CRLF in HTTP requests Oct 23, 2011
    @MatthewIversen
    Copy link
    Mannequin

    MatthewIversen mannequin commented Feb 21, 2014

    Sorry, I referenced http://bugs.python.org/issue12169 before.

    distutils multipart/form-data encoding still breaks the spec for MIME, which demands CRLF line endings.

    Especially since it is now sending HTTP 1.1 requests which should conform.

    The patch / resulting branch from the above issue should fix this, I believe it was meant for a 'distutils2/packaging' effort before? But I can't see why it wouldn't still be applicable to current distutils to fix this issue.

    Revelant parts of RFCs are https://tools.ietf.org/html/rfc2045#section-2.10 and https://tools.ietf.org/html/rfc822#section-3.2

    @merwok
    Copy link
    Member

    merwok commented Feb 21, 2014

    Some clarifications:

    • distutils2/packaging is now abandoned.
    • distutils can get bug fixes, unless they break e.g. work-arounds that have been working in setuptools or setup.py scripts for years.
    • distutils can get new features in 3.5.

    For this issue, I think distutils in all active branches can get fixed to use CRLF for multipart POSTs.

    @merwok merwok reopened this Feb 21, 2014
    @merwok merwok added the stdlib Python modules in the Lib dir label Mar 13, 2014
    @sigmavirus24
    Copy link
    Mannequin

    sigmavirus24 mannequin commented Apr 28, 2014

    I've attached a patch that should fix this issue. Please review and let me know if changes are necessary.

    @merwok
    Copy link
    Member

    merwok commented May 19, 2014

    Patch looks good, will apply. Thanks.

    @sigmavirus24
    Copy link
    Mannequin

    sigmavirus24 mannequin commented May 26, 2014

    Per discussion on twitter (https://twitter.com/merwok_/status/468518605135835136) I'm bumping this to make sure it's merged.

    @sigmavirus24
    Copy link
    Mannequin

    sigmavirus24 mannequin commented Jun 20, 2014

    Bumping this once more.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 27, 2014

    New changeset 5e3f8bd33cf2 by R David Murray in branch '3.4':
    bpo-10510: make distuitls upload/register use HTML standards compliant CRLF.
    https://hg.python.org/cpython/rev/5e3f8bd33cf2

    New changeset ea665bae2ea0 by R David Murray in branch 'default':
    Merge: bpo-10510: make distuitls upload/register use HTML standards compliant CRLF.
    https://hg.python.org/cpython/rev/ea665bae2ea0

    New changeset 9ad78b4b169c by R David Murray in branch '2.7':
    bpo-10510: make distuitls upload/register use HTML standards compliant CRLF.
    https://hg.python.org/cpython/rev/9ad78b4b169c

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 28, 2014

    New changeset 6375bf34fff6 by R David Murray in branch '3.4':
    bpo-10510: Fix bug in forward port of 2.7 distutils patch.
    https://hg.python.org/cpython/rev/6375bf34fff6

    New changeset 90b07d422bd9 by R David Murray in branch 'default':
    bpo-10510: Fix bug in forward port of 2.7 distutils patch.
    https://hg.python.org/cpython/rev/90b07d422bd9

    @merwok
    Copy link
    Member

    merwok commented Nov 21, 2014

    Thanks bitdancer for applying this.

    FTR I just stumbled on code extracted from chishop that shows Django had issues with the non-CRLF requests sent by distutils: https://github.com/disqus/djangopypi/blob/master/djangopypi/http.py#L19

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants