New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
distutils upload/register should use CRLF in HTTP requests #54719
Comments
In trying to write a PyPI service, I of course need to support the registration and upload features of setup.py, which uses distutils for those actions. One thing making this a bit more difficult than need be is the fact that distutils.command.register and distutils.command.upload both use bare line feeds as a line terminator in the HTTP message headers. While there are probably tolerant servers out there, I think this behavior should be considered a bug in light of the HTTP spec, which reads: "The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR." The second sentence can be interpreted in multiple ways that either agree with or contradict the first sentence, but the first sentence is at least perfectly clear, and I'd like to see "\r\n" added to the beginning of (specifically in this case) the Content-Disposition headers assembled by distutils.command.upload and distutils.command.register. The change involves taking identical lines in each file that currently look like this:
...and making them look like this:
The result is that servers which do not support a lax adherence to the protocol will still be able to support users of distutils in Python 2.6, 2.7, and 3.1 (and maybe others). |
Thank you for the report. This is how I understand the part of the spec you quoted: Strictly valid HTTP uses CRLF, but servers and clients should be liberal in what they accept and deal with LF too. Senthil, do you agree with that? There are a small number of PyPI-like servers out there which seem to deal fine with what distutils produces. We have strong backward compatibility constraints in distutils, so we try to make the less changes possible to fix bugs. Because we’ve haven’t had this reported by those other servers before, I’m tempted to reject this report. The good part is that since you’re still writing the server, you can change its code to follow Postel’s Law and be liberal in what you accept. Does that sound acceptable to you? Best regards |
In truth, I don't personally know if the other PyPI server implementations also have to work around this issue. Other comments on that are welcome. As for my own implementation, I've implemented a workaround to this, but I'm working around and reimplementing what is otherwise a feature built into the server (and every other one). I believe the suggested change makes the world a better place, and having administered several different web servers, I can't imagine ill effects from making the change. Web servers support *at least* the standard. Any ill effects we might imagine are, at least for my part, pure speculation. I, on the other hand, have found a real-life problem now :) I guess in the end I think that servers are more likely to err on the side of strict HTTP than make allowances for every "SHOULD" in the protocol spec, and leaving things as-is relies on a "SHOULD", which I submit is incorrect behavior, and therefore a bug. It's not like a fix is going to magically improve my life, as I'll have to support older buggy versions anyway, but making the change now can help grease the wheels for future implementations who won't have to hit this hurdle. My $.02. |
I think, it is better that distutils.command.register and |
I am convinced. Do you want to work on a patch? Guidelines are found at http://www.python.org/dev/patches/ |
Sure. I'll create a patch in the next few days and submit it. Thanks for the link to the guidelines. :) |
So... have I missed a memo, or is it currently impossible to test the current svn version of distutils in the current svn version of Python? The tests for (at least) register and upload are written using Python 2.x syntax and modules. How are new features and fixes for distutils being tested? Is it valid to use a 2.x version of Python to test the trunk version of distutils? Sorry if these questions have really obvious answers. First time patcher :? |
If you can make your server work with the current implementation, I'd rather not change this in distutils but in distutils2. Distutils is frozen and we make only bug fixes. By bug fix I mean anything that is a bug. A non-strict implementation of the EOLs is not a bug, and I suggest that you write your patch against distutils2. Eric, thanks for all the hard work in triaging the bugs ! Let's make sure we do the bare minimal maintenance in distutils, and suggest to people to contribute in distutils2. Brian, for distutils1, the register and upload command would be considered buggy if they don't work for PyPI, which is not the case. Thanks! |
If it's not a bug in distutils1, I imagine it will not be a bug in distutils2, since that will also presumably work with PyPI, and PyPI will be the single solitary supported implementation of the service? I also don't see distutils2 in this list http://svn.python.org/projects/ which means either distutils2 isn't part of the standard library, or it doesn't follow the documented patch submission process, or perhaps both? Insight welcome here. |
Yes that would be a feature request. we would be happy to add. e.g. make the commands works with server X ou server Y.
Distutils2 is developed at http://hg.python.org/distutils2 and will be included when ready in the stdlib (circa 3.3), as a distutils replacer. It's in fact the distutils trunk, with lots of changes (some are backward incompatible). It will be installable from 2.4 onward. It does not follow the same process. Which I have described in python-dev last week: http://mail.python.org/pipermail/python-dev/2010-November/105772.html |
Updated guidelines: http://wiki.python.org/moin/Distutils/FixingBugs
|
I'm having a look at this ticket now. It looks like this can be rewritten to use common code, and it would probably be good to use the 'email' module for creating the MIME segements properly. |
Okay, I looked at this, then I ran into str/byte type problems with the email module. Will wait until 'email' is sorted out before I consider a ticket like this one again. |
Indeed! As such, I’m folding this feature request into bpo-12169, which has a patch under review. |
I know this is closed so the comment may not be relevant, but the byte/string issues with email should be sorted out in the code that is in 3.2/3.3 at this point. That is, it is possible once again to work with binary data, using the correct (new) classes. If you have issues where things still don't work, please report them. Note also that email now allows control of the generated linesep characters. (That, unfortunately, won't help the python2 backport.) |
Mitchell Hashimoto provided this link on a duplicate report:
So I checked the RFC carefully again and found this:
The requests we send are multipart/form-data, so the RFC exception for text/* would not apply. On one hand, I don’t think we can say that sending LF is not a bug, on the other hand, I believe nearly all HTTP servers just accept all newlines anyway. |
Sorry, I referenced http://bugs.python.org/issue12169 before. distutils multipart/form-data encoding still breaks the spec for MIME, which demands CRLF line endings. Especially since it is now sending HTTP 1.1 requests which should conform. The patch / resulting branch from the above issue should fix this, I believe it was meant for a 'distutils2/packaging' effort before? But I can't see why it wouldn't still be applicable to current distutils to fix this issue. Revelant parts of RFCs are https://tools.ietf.org/html/rfc2045#section-2.10 and https://tools.ietf.org/html/rfc822#section-3.2 |
Some clarifications:
For this issue, I think distutils in all active branches can get fixed to use CRLF for multipart POSTs. |
I've attached a patch that should fix this issue. Please review and let me know if changes are necessary. |
Patch looks good, will apply. Thanks. |
Per discussion on twitter (https://twitter.com/merwok_/status/468518605135835136) I'm bumping this to make sure it's merged. |
Bumping this once more. |
New changeset 5e3f8bd33cf2 by R David Murray in branch '3.4': New changeset ea665bae2ea0 by R David Murray in branch 'default': New changeset 9ad78b4b169c by R David Murray in branch '2.7': |
New changeset 6375bf34fff6 by R David Murray in branch '3.4': New changeset 90b07d422bd9 by R David Murray in branch 'default': |
Thanks bitdancer for applying this. FTR I just stumbled on code extracted from chishop that shows Django had issues with the non-CRLF requests sent by distutils: https://github.com/disqus/djangopypi/blob/master/djangopypi/http.py#L19 |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: