Message150246
The concern here is a request line that looks something like this:
Method SP Request-URI SP HTTP-Version <ANY_\r_\n_\r\n_Combination>\r\n
The previous behavior would have resulted in
Method SP Request-URI SP HTTP-Version <ANY_\r_\n_\r\n_Combination>
That is, it removed only the final \r\n, whereas the current change would make it
Method SP Request-URI SP HTTP-Version
That is, it removes all of the trailing \r\n combinations.
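To make the difference concrete, here is a minimal sketch of the two stripping strategies. The function names strip_final_eol and strip_all_eol are hypothetical and only approximate the logic discussed here; they are not the actual code in http/server.py.

```python
def strip_final_eol(requestline):
    """Approximation of the previous behavior: drop only one trailing line ending."""
    if requestline.endswith('\r\n'):
        return requestline[:-2]
    if requestline.endswith('\n'):
        return requestline[:-1]
    return requestline


def strip_all_eol(requestline):
    """Approximation of the current behavior: drop every trailing CR and LF."""
    return requestline.rstrip('\r\n')


# A request line followed by extra, stray line-ending characters.
raw = 'GET /index.html HTTP/1.1\r\n\n\r\n'
old = strip_final_eol(raw)  # the stray '\r\n\n' is still attached
new = strip_all_eol(raw)    # only the three request-line elements remain
```

With the old approach, any code that later splits the request line on whitespace or compares the version string still has to cope with the leftover CR and LF characters.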
BTW, note that this applies only to the request line and not to the header
lines. For the request line, both the HTTP 1.0 and HTTP 1.1 specs have this in
section 5.1:
5.1 Request-Line
The Request-Line begins with a method token, followed by the
Request-URI and the protocol version, and ending with CRLF. The
elements are separated by SP characters. No CR or LF are allowed
except in the final CRLF sequence.
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
This leads me to believe that removing all of the trailing \r\n is a fine thing
to do and should not be harmful.
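As an illustration of the section 5.1 grammar quoted above, a strict check could look roughly like this. The helper is_valid_request_line is a hypothetical name, not part of http.server, and the sketch assumes a full three-element request line (it ignores the HTTP/0.9 Simple-Request form).

```python
def is_valid_request_line(raw):
    """Return True if raw matches: Method SP Request-URI SP HTTP-Version CRLF."""
    # The line must end with exactly the final CRLF.
    if not raw.endswith('\r\n'):
        return False
    body = raw[:-2]
    # No CR or LF may appear anywhere before the final CRLF.
    if '\r' in body or '\n' in body:
        return False
    # Exactly three non-empty elements separated by single SP characters.
    parts = body.split(' ')
    return len(parts) == 3 and all(parts)
```

A line with extra trailing \r\n fails this check, which is consistent with the view that such characters carry no meaning in a request line and can be dropped.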
Just to augment this with a few other things I found while (re-)reading the spec.
This advice is different from the advice for a header's trailing whitespace, which is
called Linear White Space (LWS). Consider a Host header that looks like, e.g.,
"Host: www.foo.com \r\n" (notice the trailing white space).
According to RFC 2616 (HTTP 1.1), section 4.2 Message Headers:
The field-content does not include any leading or trailing LWS:
linear white space occurring before the first non-whitespace
character of the field-value or after the last non-whitespace
character of the field-value. Such leading or trailing LWS MAY be
removed without changing the semantics of the field value.
RFC 1945 (HTTP 1.0), section 4.2 Message Headers does not make such an explicit
statement.
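The LWS rule from RFC 2616 section 4.2 can be sketched as follows. The function split_header and the LWS constant are hypothetical names used only for illustration, and the sketch assumes a single-line header with no continuation lines.

```python
# Characters treated as linear white space for this sketch: SP, HT, CR, LF.
LWS = ' \t\r\n'


def split_header(line):
    """Split 'Name: value' and drop leading/trailing LWS from the value."""
    name, _, value = line.partition(':')
    return name, value.strip(LWS)


name, value = split_header('Host: www.foo.com \r\n')
# The trailing space and the CRLF are both removed from the field value.
```

Removing this whitespace does not change the semantics of the field value, which is why the header case and the request-line case are covered by different parts of the spec.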
My guess about the former behavior in http/server.py is that the Request-Line
was thought to follow something like section 4.2 of the HTTP 1.0 spec, and so only
the last two characters were removed. But actually, per the spec, the request line
should have only one CRLF, as its end marker. In the docstring of the
BaseHTTPServer class, there is a mention of preserving the trailing
white-space, but it does not point to any authoritative reference, so I am not sure
taking the docstring as a reference to preserve the behavior is a good idea.
Before delving into the reason, I was wondering whether reverting the patch in
2.7 and 3.1 would be a good idea. But given that the change has support from the
older specs as well as the newer ones, I am inclined to think that leaving the
change as it is (without reverting) should be fine as well.
Only if we find a stronger backwards compatibility argument for leaving
trailing \r\n in the request-line should we remove it in 2.7 and 3.2;
otherwise we can leave it as such.
Date | User | Action | Args
2011-12-25 06:16:55 | orsenthil | set | recipients: + orsenthil, ezio.melotti, eric.araujo, karlcow, maker, python-dev
2011-12-25 06:16:54 | orsenthil | link | issue13294 messages
2011-12-25 06:16:53 | orsenthil | create |