classification
Title: BaseHTTPServer reinventing rfc822 date formatting
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: rfc2822 formatdate functionality duplication
View: 747320
Assigned To: Nosy List: ajaksu2, berker.peksag, eric.araujo, karlcow, l0nwlf, r.david.murray, schmiddy, techtonik
Priority: normal Keywords: patch

Created on 2009-11-20 16:22 by schmiddy, last changed 2016-03-07 10:32 by berker.peksag. This issue is now closed.

Files
File name Uploaded Description Edit
BaseHTTPServer.patch schmiddy, 2009-11-20 16:22 patch of BaseHTTPServer.py and rfc822.py typofix
BaseHTTPServer.patch l0nwlf, 2010-06-06 18:20 patch of BaseHTTPServer.py making it RFC822 compliant
test_httpserver.patch l0nwlf, 2010-06-13 17:17 unit test for BaseHTTPServer.patch
Messages (19)
msg95554 - (view) Author: Josh Kupershmidt (schmiddy) Date: 2009-11-20 16:22
While digging through Lib/BaseHTTPServer.py, I noticed that the
date_time_string() function duplicates rfc822.formatdate(). Attached is
a patch to eliminate this duplication of code.
msg102664 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2010-04-09 01:22
Thanks for the patch. Per issue 2849, use of rfc822 should be gone from the stdlib. Please re-open if you disagree.
msg102665 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-04-09 01:28
Following the last link on #2859, I’ve found that “rfc822.formatdate(time)” can be changed to “email.utils.formatdate(time, usegmt=True)”.

I’ll make a real diff in a few days if noone beats me to it.

Regards
msg102666 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-09 01:49
The issue is not invalid. The code duplication should be removed, but using the email module as Éric suggests.  Reopening.
msg102914 - (view) Author: Shashwat Anand (l0nwlf) Date: 2010-04-12 02:01
Instead of “email.utils.formatdate(time, usegmt=True)” we can simply use time.strftime() and clean the code in a better way. The duplication is there in date_time_string() as well as log_date_time_string(). Submitting the patch for review.
msg102931 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-04-12 09:25
Nice catch. The patch looks good to me and applies correctly on my trunk copy. There seems to be no test about this in the test suite; do you have a little test script to compare old and new code?

On a sidenote, I find all this business with time.time, time.gmtime, time.localtime and time.strftime always confusing. We have datetime objects now, would it be ok to use them in this module?

Regards
msg102946 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-12 13:31
There are a couple problems with this patch.  The first is that fixing date_time_string by using strftime rather than email.utils.formatdate is suboptimal from a code reuse standpoint.  The reason is that the date in HTTP message headers is required to conform to the same RFC standard as that generated by formatdate, so it is better to use the same routine to generate it.  That way any bug fixes needed to handle RFC compliance are centralized in once place.

The second problem with the patch is that strftime generates locale-aware week and month names, but per RFC the header timestamps must use English names (see for example msg53731 in issue 665194; the comment about locale applies to both strftime and strptime).

If issue 665194 were implemented formatdate could use it, and then BaseHTTPServer could also use it directly.  But absent that it should use email.util.formatdate.  (That issue should also answer Éric's question about whether we can use DateTime here: not yet.)

Now, the logging routine is a different story.  That timestamp isn't required to follow the RFC, and one could argue that it makes sense for its timestamp to use the locale.  (One could also ask whether BaseHTTPServer should use the logging module, but that is a whole separate issue.)

We definitely should have a unit test before applying this patch, that makes sure the timestamp gets generated without error.  Checking the detailed format of the timestamp can be assumed to be covered by the unit tests for formatdate.  (I don't think those tests are completely adequate; for example they don't test that the date remains in English if the locale is different, but again that is a different issue.)
msg102947 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-04-12 13:33
“One could also ask whether BaseHTTPServer should use the logging 
module, but that is a whole separate issue.”
Indeed: http://www.python.org/dev/peps/pep-0337/

Cheers
msg102967 - (view) Author: Shashwat Anand (l0nwlf) Date: 2010-04-12 19:06
Quoting from the docstring of trunk/Lib/email/utils.py -> formatdate()

"We cannot use strftime() because that honors the locale and RFC 2822 requires that day and month names be the English abbreviations."

So yes, I do agree that email.utils.formatdate() should be used instead of time.strftime() to remove duplicate codes and be compliant with RFC.
Removing previous patch and submitting another.
msg107183 - (view) Author: Shashwat Anand (l0nwlf) Date: 2010-06-06 04:31
Seems that earlier patch was incorrect. Rectifying and submitting the correct patch.
msg107200 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-06 18:03
You could get a minor speedup by doing “from email.utils import formatdate”.

Do we have tests know to check that the patch does not break anything?

Can this still go into 2.7?
msg107202 - (view) Author: Shashwat Anand (l0nwlf) Date: 2010-06-06 18:20
> You could get a minor speedup by doing “from email.utils import formatdate”.

I guess I shall do that. 
Applied the patch and tested it, it does not break anything IMO and should go in python2.7
msg107205 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-06 18:40
Opinions are nice, tests are better! :)
msg107740 - (view) Author: Shashwat Anand (l0nwlf) Date: 2010-06-13 17:17
Agreed.
Attaching the unit test.
msg107741 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-13 17:45
The skeleton is good but you have to change one thing. Your test should
exercise a function or method of BaseHTTPServer, not the underlying
implementation detail. I failed to explain that earlier:

1a) Write a test that checks that the current code produces right values
for a handful of cases (perhaps the RFC has them, else pick at random);
1b) Check that the test pass with the current code;
2a) Change BaseHTTPServer to use emails.utils.formatdate;
2b) Check that the test still pass.

End of HOWTO write a regression test :)
msg107909 - (view) Author: anatoly techtonik (techtonik) Date: 2010-06-16 09:13
I do not like the idea that BaseHTTPServer depends on email package, which in turn may depend on another package etc. Having date formatting function inside of email package breaks "single responsibility" principle that would be nice to have in stdlib.
msg108017 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-06-17 13:58
The HTTP RFCs reference the email RFCs for the date format, so the email package is the logical place for this function: email is the correct responsible party.  In any case, the function resides in email.utils, which has no dependencies on anything else in the email package.  Of course, it does have dependencies on other parts of the python standard library, but I hardly think you'd want every module re-implementing every stdlib function.
msg182981 - (view) Author: karl (karlcow) * Date: 2013-02-25 20:14
I think it is now fixed by my patch in http://bugs.python.org/issue747320
msg261290 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-03-07 10:32
There is an up-to-date patch for Python 3 in issue 747320. Closing this as a duplicate.
History
Date User Action Args
2016-03-07 10:32:41berker.peksagsetstatus: open -> closed

superseder: rfc2822 formatdate functionality duplication
nosy: + berker.peksag

messages: + msg261290
type: behavior -> enhancement
resolution: duplicate
stage: test needed -> resolved
2013-02-25 20:14:36karlcowsetnosy: + karlcow
messages: + msg182981
2010-06-17 13:58:59r.david.murraysetmessages: + msg108017
2010-06-16 13:19:39brian.curtinsettitle: patch: BaseHTTPServer reinventing rfc822 date formatting -> BaseHTTPServer reinventing rfc822 date formatting
2010-06-16 09:13:28techtoniksetnosy: + techtonik
messages: + msg107909
2010-06-13 17:45:05eric.araujosetmessages: + msg107741
2010-06-13 17:17:26l0nwlfsetfiles: + test_httpserver.patch

messages: + msg107740
2010-06-06 18:40:04eric.araujosetmessages: + msg107205
2010-06-06 18:20:34l0nwlfsetfiles: + BaseHTTPServer.patch

messages: + msg107202
2010-06-06 18:15:20l0nwlfsetfiles: - BaseHTTPServer.patch
2010-06-06 18:03:32eric.araujosetmessages: + msg107200
2010-06-06 04:31:16l0nwlfsetfiles: + BaseHTTPServer.patch

messages: + msg107183
2010-06-06 04:29:09l0nwlfsetfiles: - BaseHTTPServer.patch
2010-04-12 21:08:14l0nwlfsetfiles: + BaseHTTPServer.patch
2010-04-12 19:07:12l0nwlfsetfiles: - BaseHTTPServer.patch
2010-04-12 19:06:51l0nwlfsetmessages: + msg102967
2010-04-12 13:33:49eric.araujosetmessages: + msg102947
2010-04-12 13:31:09r.david.murraysetmessages: + msg102946
2010-04-12 09:25:35eric.araujosetmessages: + msg102931
2010-04-12 02:01:46l0nwlfsetfiles: + BaseHTTPServer.patch
nosy: + l0nwlf
messages: + msg102914

2010-04-09 01:49:58r.david.murraysettitle: patch: BaseHTTPServer reinventing rfc822 -> patch: BaseHTTPServer reinventing rfc822 date formatting
2010-04-09 01:49:23r.david.murraysetversions: + Python 3.2
2010-04-09 01:49:03r.david.murraysetstatus: closed -> open

nosy: + r.david.murray
messages: + msg102666

resolution: not a bug -> (no value)
stage: resolved -> test needed
2010-04-09 01:28:04eric.araujosetnosy: + eric.araujo
messages: + msg102665
2010-04-09 01:22:52ajaksu2setstatus: open -> closed
priority: normal


nosy: + ajaksu2
messages: + msg102664
resolution: not a bug
stage: resolved
2009-11-20 16:22:43schmiddycreate