classification
Title: distutils upload/register should use CRLF in HTTP requests
Type: behavior Stage: resolved
Components: Distutils Versions: Python 3.5, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.araujo Nosy List: Brian.Jones, Matthew.Iversen, eric.araujo, icordasc, jerub, mitchellh, orsenthil, python-dev, r.david.murray, tarek
Priority: normal Keywords: easy, patch

Created on 2010-11-23 05:11 by Brian.Jones, last changed 2014-11-21 23:35 by eric.araujo. This issue is now closed.

Files
File name Uploaded Description Edit
compliant_distutils.patch icordasc, 2014-04-28 02:03 Issue patch review
Messages (25)
msg122192 - (view) Author: Brian Jones (Brian.Jones) * Date: 2010-11-23 05:11
In trying to write a PyPI service, I of course need to support the registration and upload features of setup.py, which uses distutils for those actions. One thing making this a bit more difficult than need be is the fact that distutils.command.register and distutils.command.upload both use bare line feeds as a line terminator in the HTTP message headers. While there are probably tolerant servers out there, I think this behavior should be considered a bug in light of the HTTP spec, which reads: 

"The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR."

The second sentence can be interpreted in multiple ways that either agree with or contradict the first sentence, but the first sentence is at least perfectly clear, and I'd like to see "\r\n" added to the beginning of (specifically in this case) the Content-Disposition headers assembled by distutils.command.upload and distutils.command.register.  

The change involves taking identical lines in each file that currently look like this: 


           title = '\nContent-Disposition: form-data; name="%s"' % key

...and making them look like this: 


           title = '\r\nContent-Disposition: form-data; name="%s"' % key

The result is that servers which do not support a lax adherence to the protocol will still be able to support users of distutils in Python 2.6, 2.7, and 3.1 (and maybe others).
msg122210 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-23 11:50
Thank you for the report.  This is how I understand the part of the spec you quoted: Strictly valid HTTP uses CRLF, but servers and clients should be liberal in what they accept and deal with LF too.  Senthil, do you agree with that?

There are a small number of PyPI-like servers out there which seem to deal fine with what distutils produces.  We have strong backward compatibility constraints in distutils, so we try to make the less changes possible to fix bugs.  Because we’ve haven’t had this reported by those other servers before, I’m tempted to reject this report.  The good part is that since you’re still writing the server, you can change its code to follow Postel’s Law and be liberal in what you accept.  Does that sound acceptable to you?

Best regards
msg122220 - (view) Author: Brian Jones (Brian.Jones) * Date: 2010-11-23 15:51
In truth, I don't personally know if the other PyPI server implementations also have to work around this issue. Other comments on that are welcome. 

As for my own implementation, I've implemented a workaround to this, but I'm working around and reimplementing what is otherwise a feature built into the server (and every other one). 

I believe the suggested change makes the world a better place, and having administered several different web servers, I can't imagine ill effects from making the change. Web servers support *at least* the standard. Any ill effects we might imagine are, at least for my part, pure speculation. I, on the other hand, have found a real-life problem now :) 

I guess in the end I think that servers are more likely to err on the side of strict HTTP than make allowances for every "SHOULD" in the protocol spec, and leaving things as-is relies on a "SHOULD", which I submit is incorrect behavior, and therefore a bug. 

It's not like a fix is going to magically improve my life, as I'll have to support older buggy versions anyway, but making the change now can help grease the wheels for future implementations who won't have to hit this hurdle. 

My $.02.
msg122223 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-11-23 16:12
I think, it is better that  distutils.command.register and
distutils.command.upload  use CRLF as the line terminator for header
values.
It just helps in many cases, we can safely side by this case, but not
relying LR or CR only may not be always possible.
msg122440 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-26 05:33
I am convinced.  Do you want to work on a patch?  Guidelines are found at http://www.python.org/dev/patches/
msg122455 - (view) Author: Brian Jones (Brian.Jones) * Date: 2010-11-26 13:25
Sure. I'll create a patch in the next few days and submit it. Thanks for the link to the guidelines. :)
msg122457 - (view) Author: Brian Jones (Brian.Jones) * Date: 2010-11-26 14:12
So... have I missed a memo, or is it currently impossible to test the current svn version of distutils in the current svn version of Python?

The tests for (at least) register and upload are written using Python 2.x syntax and modules. How are new features and fixes for distutils being tested? Is it valid to use a 2.x version of Python to test the trunk version of distutils? 

Sorry if these questions have really obvious answers. First time patcher :?
msg122458 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2010-11-26 14:18
If you can make your server work with the current implementation, I'd rather not change this in distutils but in distutils2.

Distutils is frozen and we make only bug fixes. By bug fix I mean anything that is a bug. A non-strict implementation of the EOLs is not a bug, and I suggest that you write your patch against distutils2.

Eric, thanks for all the hard work in triaging the bugs ! Let's make sure we do the bare minimal maintenance in distutils, and suggest to people to contribute in distutils2. 

Brian, for distutils1, the register and upload command would be  considered buggy if they don't work for PyPI, which is not the case.

Thanks!
msg122459 - (view) Author: Brian Jones (Brian.Jones) * Date: 2010-11-26 14:44
If it's not a bug in distutils1, I imagine it will not be a bug in distutils2, since that will also presumably work with PyPI, and PyPI will be the single solitary supported implementation of the service? 

I also don't see distutils2 in this list http://svn.python.org/projects/  which means either distutils2 isn't part of the standard library, or it doesn't follow the documented patch submission process, or perhaps both? Insight welcome here.
msg122460 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2010-11-26 15:01
> If it's not a bug in distutils1, I imagine it will not be a bug in distutils2

Yes that would be a feature request. we would be happy to add. e.g. make the commands works with server X ou server Y.


> I also don't see distutils2 in this list http://svn.python.org/projects/  which means either distutils2 isn't part of the standard library, or it doesn't follow the documented patch submission process, or perhaps both? Insight welcome here.

Distutils2 is developed at http://hg.python.org/distutils2 and will be included when ready in the stdlib (circa 3.3), as a distutils replacer. It's in fact the distutils trunk, with lots of changes (some are backward incompatible). It will be installable from 2.4 onward. 

It does not follow the same process. Which I have described in python-dev last week: http://mail.python.org/pipermail/python-dev/2010-November/105772.html
msg124093 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-12-15 23:48
Updated guidelines: http://wiki.python.org/moin/Distutils/FixingBugs

> is it currently impossible to test the current svn version of
> distutils in the current svn version of Python?
It is.  Branches that get bugfixes are 3.2, 3.1 and 2.7; each one has its interpreter code and its distutils lib and tests.  I’m not sure what prompted your question: was it something on the patches guideline page?
msg139069 - (view) Author: Stephen Thorne (jerub) * Date: 2011-06-25 14:18
I'm having a look at this ticket now. It looks like this can be rewritten to use common code, and it would probably be good to use the 'email' module for creating the MIME segements properly.
msg139106 - (view) Author: Stephen Thorne (jerub) * Date: 2011-06-25 17:39
Okay, I looked at this, then I ran into str/byte type problems with the email module. Will wait until 'email' is sorted out before I consider a ticket like this one again.
msg139109 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-06-25 19:49
> I'm having a look at this ticket now. It looks like this can be
> rewritten to use common code

Indeed!  As such, I’m folding this feature request into #12169, which has a patch under review.
msg139139 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-06-26 03:19
I know this is closed so the comment may not be relevant, but the byte/string issues with email should be sorted out in the code that is in 3.2/3.3 at this point.  That is, it is possible once again to work with binary data, using the correct (new) classes.  If you have issues where things still don't work, please report them.  Note also that email now allows control of the generated linesep characters.  (That, unfortunately, won't help the python2 backport.)
msg146215 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-23 02:13
Mitchell Hashimoto provided this link on a duplicate report:

> RFC2616 page 31 (http://tools.ietf.org/html/rfc2616#page-31) states that headers must be separated
> by CRLF. Specifically, the above "\n\n" for the header separator is causing issues with some
> minimal RFC-compliant web servers.

So I checked the RFC carefully again and found this:

> http://tools.ietf.org/html/rfc2616#section-3.7.1
>
> When in canonical form, media subtypes of the "text" type use CRLF as
> the text line break. HTTP relaxes this requirement and allows the
> transport of text media with plain CR or LF alone representing a line
> break when it is done consistently for an entire entity-body. HTTP
> applications MUST accept CRLF, bare CR, and bare LF as being
> representative of a line break in text media received via HTTP

The requests we send are multipart/form-data, so the RFC exception for text/* would not apply.  On one hand, I don’t think we can say that sending LF is not a bug, on the other hand, I believe nearly all HTTP servers just accept all newlines anyway.
msg211826 - (view) Author: Matthew Iversen (Matthew.Iversen) Date: 2014-02-21 08:33
Sorry, I referenced http://bugs.python.org/issue12169 before.

distutils multipart/form-data encoding still breaks the spec for MIME, which demands CRLF line endings.

Especially since it is now sending HTTP 1.1 requests which should conform.

The patch / resulting branch from the above issue should fix this, I believe it was meant for a 'distutils2/packaging' effort before? But I can't see why it wouldn't still be applicable to current distutils to fix this issue.

Revelant parts of RFCs are https://tools.ietf.org/html/rfc2045#section-2.10 and https://tools.ietf.org/html/rfc822#section-3.2
msg211828 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2014-02-21 08:38
Some clarifications:

* distutils2/packaging is now abandoned.
* distutils can get bug fixes, unless they break e.g. work-arounds that have been working in setuptools or setup.py scripts for years.
* distutils can get new features in 3.5.

For this issue, I think distutils in all active branches can get fixed to use CRLF for multipart POSTs.
msg217336 - (view) Author: Ian Cordasco (icordasc) * Date: 2014-04-28 02:03
I've attached a patch that should fix this issue. Please review and let me know if changes are necessary.
msg218822 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2014-05-19 22:27
Patch looks good, will apply.  Thanks.
msg219157 - (view) Author: Ian Cordasco (icordasc) * Date: 2014-05-26 14:37
Per discussion on twitter (https://twitter.com/merwok_/status/468518605135835136) I'm bumping this to make sure it's merged.
msg221100 - (view) Author: Ian Cordasco (icordasc) * Date: 2014-06-20 16:24
Bumping this once more.
msg227728 - (view) Author: Roundup Robot (python-dev) Date: 2014-09-27 20:59
New changeset 5e3f8bd33cf2 by R David Murray in branch '3.4':
#10510: make distuitls upload/register use HTML standards compliant CRLF.
https://hg.python.org/cpython/rev/5e3f8bd33cf2

New changeset ea665bae2ea0 by R David Murray in branch 'default':
Merge: #10510: make distuitls upload/register use HTML standards compliant CRLF.
https://hg.python.org/cpython/rev/ea665bae2ea0

New changeset 9ad78b4b169c by R David Murray in branch '2.7':
#10510: make distuitls upload/register use HTML standards compliant CRLF.
https://hg.python.org/cpython/rev/9ad78b4b169c
msg227760 - (view) Author: Roundup Robot (python-dev) Date: 2014-09-28 15:02
New changeset 6375bf34fff6 by R David Murray in branch '3.4':
#10510: Fix bug in forward port of 2.7 distutils patch.
https://hg.python.org/cpython/rev/6375bf34fff6

New changeset 90b07d422bd9 by R David Murray in branch 'default':
#10510: Fix bug in forward port of 2.7 distutils patch.
https://hg.python.org/cpython/rev/90b07d422bd9
msg231502 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2014-11-21 23:35
Thanks bitdancer for applying this.

FTR I just stumbled on code extracted from chishop that shows Django had issues with the non-CRLF requests sent by distutils: https://github.com/disqus/djangopypi/blob/master/djangopypi/http.py#L19
History
Date User Action Args
2014-11-21 23:35:14eric.araujosetmessages: + msg231502
2014-09-28 15:02:21python-devsetmessages: + msg227760
2014-09-27 20:59:57r.david.murraysetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2014-09-27 20:59:21python-devsetnosy: + python-dev
messages: + msg227728
2014-06-20 16:24:22icordascsetmessages: + msg221100
2014-05-26 14:37:19icordascsetmessages: + msg219157
2014-05-19 22:27:31eric.araujosetstage: needs patch -> commit review
messages: + msg218822
versions: + Python 3.5, - Python 3.3
2014-04-28 02:03:47icordascsetfiles: + compliant_distutils.patch
keywords: + patch
messages: + msg217336
2014-04-23 00:41:52icordascsetnosy: + icordasc
2014-03-13 20:19:42eric.araujosetcomponents: + Distutils, - Distutils2
2014-02-21 08:38:54eric.araujosetstatus: closed -> open

stage: needs patch
messages: + msg211828
versions: + Python 2.7, Python 3.4
2014-02-21 08:33:22Matthew.Iversensetnosy: + Matthew.Iversen
messages: + msg211826
2011-10-23 02:14:03eric.araujosetresolution: duplicate -> (no value)
title: packaging upload/register should use CRLF in HTTP requests -> distutils upload/register should use CRLF in HTTP requests
superseder: Factor out common code for d2 commands register, upload and upload_docs ->
stage: resolved -> (no value)
2011-10-23 02:13:08eric.araujosetnosy: + mitchellh
messages: + msg146215
2011-10-23 02:08:03eric.araujolinkissue13132 superseder
2011-06-26 03:19:50r.david.murraysetnosy: + r.david.murray
messages: + msg139139
2011-06-25 19:49:47eric.araujosetstatus: open -> closed
title: distutils upload/register should use CRLF in HTTP requests -> packaging upload/register should use CRLF in HTTP requests
superseder: Factor out common code for d2 commands register, upload and upload_docs
messages: + msg139109

resolution: duplicate
stage: needs patch -> resolved
2011-06-25 17:39:55jerubsetmessages: + msg139106
2011-06-25 14:18:29jerubsetnosy: + jerub
messages: + msg139069
2011-06-09 15:14:42eric.araujosetkeywords: + easy
versions: + Python 3.3, - 3rd party
2010-12-15 23:48:02eric.araujosetnosy: orsenthil, tarek, eric.araujo, Brian.Jones
messages: + msg124093
components: - Distutils
versions: - Python 3.1, Python 2.7, Python 3.2
2010-11-26 15:01:59tareksetmessages: + msg122460
2010-11-26 14:44:24Brian.Jonessetmessages: + msg122459
2010-11-26 14:18:06tareksetmessages: + msg122458
2010-11-26 14:12:00Brian.Jonessetmessages: + msg122457
2010-11-26 13:25:42Brian.Jonessetmessages: + msg122455
2010-11-26 05:33:11eric.araujosettitle: distutils.command.upload/register HTTP message headers: bad line termination -> distutils upload/register should use CRLF in HTTP requests
messages: + msg122440
stage: needs patch
2010-11-23 16:12:59orsenthilsetmessages: + msg122223
2010-11-23 15:51:27Brian.Jonessetmessages: + msg122220
2010-11-23 11:50:49eric.araujosetversions: + 3rd party, Python 3.2, - Python 2.6
nosy: + eric.araujo, tarek, orsenthil

messages: + msg122210

assignee: eric.araujo
components: + Distutils, Distutils2
2010-11-23 05:11:52Brian.Jonescreate