classification
Title: SimpleHTTPServer reports wrong content-length for text files
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.6, Python 2.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: jlgijsbers
Nosy List: amaury.forgeotdarc, bluebird, georg.brandl, ggenellina, irmen, jlgijsbers
Priority: normal
Keywords: easy, patch

Created on 2003-11-10 20:42 by irmen, last changed 2008-07-20 11:26 by georg.brandl. This issue is now closed.

Files
File name     Uploaded                 Description
diff.txt      irmen, 2003-11-10 20:42  SimpleHTTPServer patch
httptest.zip  irmen, 2004-06-06 11:16  test case that shows the problem on Windows.
Messages (11)
msg44870 - Author: Irmen de Jong (irmen) (Python triager) Date: 2003-11-10 20:42
(Python 2.3.2 on Windows)

SimpleHTTPServer reports the size of the file on disk
as Content-Length. This works except for text files.
If the content type starts with "text/", it opens the
file in 'text' mode rather than 'binary' mode. At least on
Windows this causes newline translation, which makes
the actual size of the content transmitted *less* than
the reported Content-Length!

I don't know why SimpleHTTPServer reads text files
in text mode. The included patch removes this distinction
so that all files are opened in binary mode (and, on
Windows too, the actual size transmitted is the same as the
reported Content-Length).

--Irmen de Jong
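
The size mismatch described above can be reproduced without a server. The following is a minimal sketch, not code from the report: it assumes Python 3, where opening a file with `newline=None` applies newline translation on every platform, and uses that as a stand-in for Windows text mode in Python 2. The helper name `size_mismatch` and the sample data are hypothetical.

```python
import os
import tempfile

def size_mismatch(data: bytes):
    """Return (size on disk, length after a text-mode read) for `data`."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        # This is the number a server would report as Content-Length.
        on_disk = os.path.getsize(path)
        # newline=None turns on newline translation: "\r\n" is read as "\n".
        with open(path, "r", encoding="latin-1", newline=None) as f:
            translated = len(f.read())
        return on_disk, translated
    finally:
        os.remove(path)

# Three CR LF line endings: each one shrinks by a byte when translated.
on_disk, translated = size_mismatch(b"line one\r\nline two\r\nline three\r\n")
print("on disk:", on_disk, "after text-mode read:", translated)
```

Each translated CR LF pair accounts for one missing byte, so the gap between the two numbers equals the number of CR LF line endings in the file.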
msg44871 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2004-05-13 11:21
Logged In: YES 
user_id=129426

This bug is also still present in the SimpleHTTPServer.py
from Python 2.3.3 (and in the current CVS version, too).

Is there a reason why it treats text files differently? If
so, then at least the reported content-length must be fixed.
msg44872 - Author: Irmen de Jong (irmen) (Python triager) Date: 2004-05-31 16:51
The attached trivial patch removes the special case for text
files.
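
The contents of diff.txt are not reproduced on this page, so the following is only a sketch of the idea, assuming Python 3: with no special case for text content types, the file is always read in binary mode and the number of bytes read equals the size on disk. The helper `read_for_sending` and the file name are hypothetical.

```python
import os
import tempfile

def read_for_sending(path: str) -> bytes:
    """Hypothetical helper: read the file exactly as stored, text or not."""
    with open(path, "rb") as f:  # binary mode, so no newline translation
        return f.read()

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "page.html")
    with open(path, "wb") as f:
        f.write(b"<p>hi</p>\r\n<p>there</p>\r\n")
    body = read_for_sending(path)
    size_on_disk = os.path.getsize(path)

# The CR LF pairs survive, so the body length matches the size on disk.
sizes_match = len(body) == size_on_disk
print("sizes match:", sizes_match)
```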
msg44873 - Author: Irmen de Jong (irmen) (Python triager) Date: 2004-06-06 11:18
The attached httptest.zip contains a test scenario. When run
on Windows, it shows the problem.
First start 'startserver.py', and then run 'test.py' from the
same directory.
I get this:

[E:\temp\httptest]python test.py
The reported content-length is: 1047 bytes
The real filesize is: 1047 bytes
The data I actually received from the httpserver is: 1028 bytes
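
In the output above, the 19-byte shortfall (1047 - 1028) would be consistent with 19 CR LF line endings each losing one byte, although the test file itself is not shown here. The scripts in httptest.zip are not shown either, so the following is a hypothetical sketch of the same kind of check, assuming Python 3.7 or later, where the module is `http.server` and files are served in binary mode. The names `QuietHandler` and `check_content_length` are illustrative, and on a fixed version all three numbers should agree.

```python
import functools
import http.client
import http.server
import os
import tempfile
import threading

class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the output readable

def check_content_length(data: bytes, name: str = "sample.txt"):
    """Serve `data` from a temp dir; return (reported, on_disk, received)."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, name)
        with open(path, "wb") as f:
            f.write(data)
        handler = functools.partial(QuietHandler, directory=root)
        # Port 0 asks the OS for any free port.
        server = http.server.HTTPServer(("127.0.0.1", 0), handler)
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            port = server.server_address[1]
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/" + name)
            resp = conn.getresponse()
            reported = int(resp.getheader("Content-Length"))
            received = len(resp.read())
            conn.close()
            return reported, os.path.getsize(path), received
        finally:
            server.shutdown()
            server.server_close()

reported, on_disk, received = check_content_length(b"a\r\nb\r\nc\r\n")
print("The reported content-length is: %d bytes" % reported)
print("The real filesize is: %d bytes" % on_disk)
print("The data I actually received from the httpserver is: %d bytes" % received)
```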
msg44874 - Author: Irmen de Jong (irmen) (Python triager) Date: 2004-07-19 20:59
Hm, perhaps the easy way out (see my patch) is not the best
solution:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1

It seems it's best to convert text responses to CR LF
format? If we should do this, we must somehow 're-calculate'
the content-length after the CR LF conversion.
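
If the conversion route were taken, the Content-Length would have to be computed from the converted body rather than from the size on disk. The following is a minimal sketch of that idea, assuming the whole body fits in memory; the helper `to_crlf` is hypothetical and is not part of SimpleHTTPServer.

```python
import re

def to_crlf(body: bytes) -> bytes:
    """Rewrite CR LF, bare CR, and bare LF line breaks as CR LF."""
    # "\r\n" is listed first so an existing CR LF pair is matched as one unit.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", body)

original = b"one\ntwo\r\nthree\rfour"
body = to_crlf(original)
# The header value must describe the bytes actually sent.
content_length = len(body)
print(len(original), "->", content_length)
```

Here the converted body is longer than the original, so reusing the on-disk size would under-report the length.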
msg44875 - Author: Irmen de Jong (irmen) (Python triager) Date: 2005-01-10 00:44
Upon re-reading the w3 spec, it seems that we're safe, as
long as the use of CR, LF, or CR+LF is consistent throughout
the whole text file.
The spec says: "HTTP relaxes this requirement [=the
requirement of being in canonical form] and allows the
transport of text media with plain CR or LF alone
representing a line break when it is done consistently for
an entire entity-body. HTTP applications MUST accept CRLF,
bare CR, and bare LF as being representative of a line break
in text media received via HTTP."

So my patch is safe, I think.
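
The consistency condition quoted from the spec can be checked mechanically. The following is a small sketch with hypothetical helper names, assuming the body is available as bytes; it only reports which line break styles occur and does not modify anything.

```python
import re

def line_break_styles(body: bytes) -> set:
    """Return the set of distinct line break sequences found in `body`."""
    # CR LF is tried first so it is not counted as a bare CR plus a bare LF.
    return set(re.findall(rb"\r\n|\r|\n", body))

def is_consistent(body: bytes) -> bool:
    """True if the body uses at most one line break style throughout."""
    return len(line_break_styles(body)) <= 1

print(is_consistent(b"a\r\nb\r\n"))  # only CR LF
print(is_consistent(b"a\r\nb\n"))    # CR LF mixed with bare LF
```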
msg44876 - Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2005-01-10 09:28
Logged In: YES 
user_id=469548

Okay, I've checked in the fix on maint24 and HEAD.
msg68140 - Author: Gabriel Genellina (ggenellina) Date: 2008-06-13 09:19
As noted by Leo Jay in this message
<http://mail.python.org/pipermail/python-list/2008-June/495718.html>,
this bug was supposedly fixed, but it is still present.

It looks like the patch was only applied to release24-maint, not to the
trunk. Neither the 2.5 nor the 2.6 sources have the patch applied.
3.0 doesn't have this problem, but it was probably fixed independently.
msg69332 - Author: Bluebird (bluebird) * Date: 2008-07-06 14:27
I confirm that the problem is present in Python 2.5 on Windows, and that
the attached patch fixes it.
msg69355 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-07-06 21:36
Committed as r64762.
Maybe a 2.5 backport candidate, but I fear that it may break existing code.
msg70081 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-20 11:26
Better not to backport it then.
History
Date                 User                Action  Args
2008-07-20 11:26:37  georg.brandl        set     status: pending -> closed; nosy: + georg.brandl; messages: + msg70081
2008-07-06 21:36:19  amaury.forgeotdarc  set     status: open -> pending; nosy: + amaury.forgeotdarc; resolution: fixed; messages: + msg69355
2008-07-06 14:27:59  bluebird            set     nosy: + bluebird; messages: + msg69332
2008-06-19 17:27:40  akuchling           set     keywords: + easy
2008-06-13 22:02:01  terry.reedy         set     status: closed -> open; resolution: accepted -> (no value)
2008-06-13 09:19:52  ggenellina          set     nosy: + ggenellina; messages: + msg68140; versions: + Python 2.6, Python 2.5
2003-11-10 20:42:43  irmen               create