
classification
Title: SimpleHTTPServer sends wrong Content-Length header
Components: Library (Lib)
process
Status: closed
Resolution: fixed
Assigned To: jlgijsbers
Nosy List: irmen, jlgijsbers, josiahcarlson, razmatazz
Priority: normal

Created on 2005-01-07 02:34 by razmatazz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg23879 - Author: David Schachter (razmatazz) Date: 2005-01-07 02:34
On Microsoft Windows, text files use \r\n for newline.
The SimpleHTTPRequestHandler class's "send_head()" method opens
files with "r" or "rb" mode depending on the MIME type.
Files opened in "r" mode will have \r\n -> \n
translation performed automatically, so the stream of
bytes sent to the client will be smaller than the size
of the file on disk.

Unfortunately, the send_head() method sets the
Content-Length header using the file size on disk,
without compensating for the \r\n -> \n translation.
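
The mismatch described above can be reproduced with a small sketch. This is not the SimpleHTTPServer code: the helper name `sizes_for` and the sample data are made up for illustration, and it assumes Python 3, where the default text-mode newline handling translates \r\n to \n on read on every platform (the report itself concerns Python 2 on Windows).

```python
import os
import tempfile


def sizes_for(data: bytes):
    """Return (size on disk, characters read in text mode) for a temp file.

    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Write the raw bytes, \r\n included, without any translation.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # The on-disk size, which is what the report says ends up in
        # the Content-Length header.
        with open(path, "rb") as f:
            disk_size = os.fstat(f.fileno()).st_size
        # A text-mode read collapses each \r\n into a single \n.
        with open(path, "r", encoding="ascii") as f:
            text_len = len(f.read())
    finally:
        os.remove(path)
    return disk_size, text_len


disk_size, text_len = sizes_for(b"line one\r\nline two\r\n")
print(disk_size, text_len)
```

With two \r\n line endings in the sample, the text-mode length comes out two smaller than the size on disk, which is the gap between the advertised Content-Length and the body that is actually sent.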

I remedied this on my copy thusly:

      if mode == "r":
        content = f.read()
        contentLength = str(len(content))
        f.seek(0)
      else:
        contentLength = str(os.fstat(f.fileno())[6])

      self.send_header("Content-Length", contentLength)
 
This may not be as inefficient as it seems: the entire
file was going to be read in anyway for the newline
translation.

Hmmm. The code could be slightly simpler:

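      if mode == "r":
        # Text mode: count what is left after \r\n -> \n translation.
        contentLength = len(f.read())
        f.seek(0)
      else:
        # Binary mode: index 6 of the fstat result is the size in bytes.
        contentLength = os.fstat(f.fileno())[6]

      self.send_header("Content-Length", str(contentLength))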


The documentation for SimpleHTTPServer in Python 2.3.4
for Windows says:

   A 'Content-type:' with the guessed content type is
   output, and then a blank line, signifying end of 
   headers, and then the contents of the file. The file
   is always opened in binary mode.

Actually, after Content-type, the Content-Length header
is sent.

It would probably be nice if "Content-Length" was
"Content-length" or if "Content-type" was
"Content-Type", for consistency. The latter is probably
best, per RFC 2616.
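
HTTP header field names are case-insensitive, so the capitalization point is about consistency rather than behavior. A minimal sketch of that, using the standard library's `email.message.Message` (the base class of `http.client.HTTPMessage`) as a stand-in for a client's header parser; the header values are made up for illustration:

```python
from email.message import Message

# Store the headers with the mixed capitalization quoted in the report.
headers = Message()
headers["Content-type"] = "text/plain"
headers["Content-Length"] = "20"

# Lookups ignore case, so either spelling finds the same field.
content_type = headers["CONTENT-TYPE"]
content_length = headers["content-length"]
print(content_type, content_length)
```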


By the way, clients weren't caching the files I sent. I
added another line after the Content-Length handling:

      self.send_header("Expires", "Fri, 31 Dec 2100 12:00:00 GMT")

This is egregiously wrong in the general case and just
fine in my case.
msg23880 - Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-01-09 09:26
Would it be wrong to open all files with a mode of 'rb',
regardless of file type?

While I don't know MIME embeddings all that well, I do have
experience with email, and most codecs that use MIME
embeddings (like base 64, 85, 95, etc.) are \r, \n and \r\n
agnostic.
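
The suggestion to always open files in 'rb' mode can be sketched as follows. This is an illustration only, not the patch discussed in this thread: the helper name `binary_length_matches` is hypothetical, and it simply checks that in binary mode the size reported by fstat equals the number of bytes read.

```python
import os
import tempfile


def binary_length_matches(data: bytes) -> bool:
    """Check that fstat's size equals the bytes read in 'rb' mode.

    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            # The value that would go into the Content-Length header.
            reported = os.fstat(f.fileno()).st_size
            # The body that would be sent; binary mode does no translation.
            sent = f.read()
    finally:
        os.remove(path)
    return reported == len(sent) and sent == data


print(binary_length_matches(b"a\r\nb\r\n"))
```

In binary mode the \r\n sequences reach the client unchanged, so the header and the body agree without reading the file twice.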
msg23881 - Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2005-01-09 17:18
http://python.org/sf/839496 has a patch for this, but the
submitter isn't sure whether it's correct. Maybe one of you
could take a look at it?
msg23882 - Author: Irmen de Jong (irmen) (Python triager) Date: 2005-01-10 00:49
http://sourceforge.net/support/tracker.php?aid=839496 is a
better url because it has been moved.
I've added a comment to my patch, because I'm now quite sure
it is good after all.
msg23883 - Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2005-01-10 09:29
Okay, fixed on maint24 and HEAD by applying patch 839496.
History
Date User Action Args
2022-04-11 14:56:09 admin set github: 41404
2005-01-07 02:34:42 razmatazz create