classification
Title: Logging in BaseHTTPServer.BaseHTTPRequestHandler causes lag
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: aerodonkey, eric.araujo, knicker.kicker, neologix, orsenthil, pitrou, python-dev, santoso.wijaya, schmir, ubershmekel, wmgaca
Priority: normal Keywords: needs review, patch

Created on 2009-05-22 10:12 by aerodonkey, last changed 2012-04-29 05:45 by orsenthil. This issue is now closed.

Files
File name Uploaded Description Edit
server_resolve.diff neologix, 2012-02-25 11:41 review
Messages (19)
msg88188 - (view) Author: Wentao Han (aerodonkey) Date: 2009-05-22 10:12
The invocation of socket.getfqdn to get FQDN of the host may delay for 
several seconds on some platforms when no appropriate host name is found. 
This causes lag when using SimpleXMLRPCServer and other class derived from 
BaseHTTPServer for every request.
msg103928 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2010-04-22 00:03
There is a simple workaround to bypass the hostname resolution mechanism, so that the server can respond faster.

# This is a hack to patch slow socket.getfqdn calls that
# BaseHTTPServer (and its subclasses) make.
# See: http://bugs.python.org/issue6085
# See: http://www.answermysearches.com/xmlrpc-server-slow-in-python-how-to-fix/2140/

import BaseHTTPServer

def _bare_address_string(self):
    host, port = self.client_address[:2]
    return '%s' % host

BaseHTTPServer.BaseHTTPRequestHandler.address_string = \
        _bare_address_string

# End hack.
msg103940 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-22 07:32
Yes, performing a DNS resolution every time we log something is definitely sub-optimal. We should perform it only once, and cache the result.
msg104154 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-25 18:36
The attached patch caches the result of FQDN lookup.
msg104159 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2010-04-25 20:12
Doesn't that only cache the first remote client it encounters, though? Maybe a dictionary of caches?
msg104164 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-25 21:10
> Doesn't that only cache the first remote client it encounters, though? Maybe a dictionary of caches?

A BaseHTTPRequestHandler is instantiated every time a client connects, so there should be only one client per handler, no (the cache is an attribute of the handler, not the server) ?
msg104207 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-26 10:00
Actually, DNS resolution should be disabled by default, as in most HTTP servers probably.
msg104209 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-26 10:37
And for this specific request, it fdqn is looked up only for "logging" to sys.stderr. Either removing the fqdn call or just caching per connection it as the patch does is both fine. I doubt if someone is relying this logging anywhere in the code. This is useful only when BaseHTTPServer is started in the command like as a standalone HTTP server.

What shall we do for this? remove the fqdn or cache it once and use it. Either is fine. And there is not testing code present (or required too - as this is a cli invocation scenario).
msg104213 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-26 10:45
> And for this specific request, it fdqn is looked up only for "logging"
> to sys.stderr. Either removing the fqdn call or just caching per
> connection it as the patch does is both fine. I doubt if someone is
> relying this logging anywhere in the code. This is useful only when
> BaseHTTPServer is started in the command like as a standalone HTTP
> server.

I think we should totally remove the call to fqdn. The common practice
with HTTP servers is to log numeric IPs and, if desired, let a batch log
analysis process (such as awstats) deal with DNS resolution and caching.

> And there is not testing code present (or required too - as this is a
> cli invocation scenario).

What do you mean? BaseHTTPRequestHandler is tested in test_httpservers.
msg104216 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-26 10:57
On Mon, Apr 26, 2010 at 10:45:56AM +0000, Antoine Pitrou wrote:

> What do you mean? BaseHTTPRequestHandler is tested in test_httpservers.

I meant specifically for that function which is logging to sys.stderr.
No return values, state changes and no tests present in test_httpservers.
Seems to me the cli invokation scenario of the BaseHTTPServer module.

Only other place where fqdn look up happens is HTTPServer.server_bind
and in the setting of server_name, if any code is relying on it could
break.
msg104267 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2010-04-26 19:14
Please correct my understanding if I am mistaken, but from skimming through the source code, it seems that a new BaseHTTPRequestHandler is instantiated per request. If a caching mechanism is to be adopted as per the patch, wouldn't it be forgotten when a new request from the same client arrives in the future?
msg104278 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-26 21:28
@santa4nt: You're correct, the cache is retained only from within the same handler: the other solution would be to keep the cache at the server level, but then you'd have to deal with the cache size (you don't want it to grow forever), and it seems a bit overkill. I just kept the FQDN resolution because the code seemed to do it on purpose, but I agree that if nothing relies on this behaviour, it might be as simple to remove the resolution altogether.
msg154195 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-25 07:17
+1 to Antoine’s proposal of removal.
msg154244 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-02-25 11:41
> +1 to Antoine’s proposal of removal.

Agreed.

Here's a patch.

Do note, however, that it's a behavior change: the address_string()
method is documented to return a resolved hostname (when possible).
msg158740 - (view) Author: Yuval Greenfield (ubershmekel) * Date: 2012-04-19 19:06
I agree with Antoine on this. Though the suggested patch is wrong. I believe we should leave address_string alone. Simply stop the log_message method from using it.

Either way we'd be changing the log format but if we don't have to then we shouldn't completely change the meaning of a method while leaving its name intact.
msg159573 - (view) Author: Roundup Robot (python-dev) Date: 2012-04-29 04:52
New changeset 64a6824d129d by Senthil Kumaran in branch 'default':
Fix Issue6085 - SimpleHTTPServer address_string to return client ip instead of client hostname
http://hg.python.org/cpython/rev/64a6824d129d
msg159578 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-04-29 05:26
The original change was introduced in this issue401197 which seems to use  fqdn at *all appropriate places*. In this case, after about a decade, it was realized that using fdqn for client connection may not be appropriate when hostname is not set.

In 3.3, this is fixed by making address_string return client ip instead of client hostname.

In 2.7 and 3.2, I would like to see it fixed wherein, inside the log_message call, the address_string is not called instead direct client address is used. In this manner, we are not changing any return values of public api (address_string), but we are effectively eliminating the delay cased by fqdn looking while logging to sys.stderr.
msg159579 - (view) Author: Roundup Robot (python-dev) Date: 2012-04-29 05:44
New changeset fb1e71c7619a by Senthil Kumaran in branch '2.7':
Fix issue6085 - Remove the delay caused by fqdn lookup while logging in BaseHTTPRequestHandler
http://hg.python.org/cpython/rev/fb1e71c7619a

New changeset 6adad27fcc8c by Senthil Kumaran in branch '3.2':
Fix issue6085 - Remove the delay caused by fqdn lookup while logging in BaseHTTPRequestHandler
http://hg.python.org/cpython/rev/6adad27fcc8c

New changeset f9f11998a20d by Senthil Kumaran in branch 'default':
issue6085 - update docs in default branch
http://hg.python.org/cpython/rev/f9f11998a20d
msg159580 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-04-29 05:45
Fixed in all codelines. Closing this.
History
Date User Action Args
2012-05-30 07:03:14neologixlinkissue14960 superseder
2012-04-29 05:45:26orsenthilsetstatus: open -> closed
resolution: fixed
messages: + msg159580

stage: patch review -> resolved
2012-04-29 05:44:35python-devsetmessages: + msg159579
2012-04-29 05:26:27orsenthilsetmessages: + msg159578
2012-04-29 04:52:17python-devsetnosy: + python-dev
messages: + msg159573
2012-04-19 19:06:32ubershmekelsetnosy: + ubershmekel
messages: + msg158740
2012-04-19 18:54:21pitroulinkissue14622 superseder
2012-02-25 11:41:26neologixsetkeywords: + needs review
files: + server_resolve.diff
messages: + msg154244

stage: test needed -> patch review
2012-02-25 07:17:47eric.araujosetnosy: + eric.araujo

messages: + msg154195
versions: + Python 3.3, - Python 3.2
2012-02-24 13:06:01wmgacasetnosy: + wmgaca
2012-02-13 20:38:29schmirsetnosy: + schmir
2011-03-28 14:26:08neologixsetfiles: - base_http_server_fqdn_lag.diff
2011-03-28 14:22:11knicker.kickersetnosy: + knicker.kicker
2010-04-26 21:28:50neologixsetmessages: + msg104278
2010-04-26 19:14:04santoso.wijayasetmessages: + msg104267
2010-04-26 10:57:41orsenthilsetmessages: + msg104216
2010-04-26 10:45:54pitrousetmessages: + msg104213
2010-04-26 10:37:43orsenthilsetassignee: orsenthil
messages: + msg104209
2010-04-26 10:00:40pitrousetnosy: + pitrou

messages: + msg104207
stage: test needed
2010-04-25 21:10:31neologixsetfiles: + base_http_server_fqdn_lag.diff

messages: + msg104164
2010-04-25 20:47:01neologixsetfiles: - base_http_server_fqdn_lag.diff
2010-04-25 20:12:51santoso.wijayasetmessages: + msg104159
2010-04-25 18:36:34neologixsetfiles: + base_http_server_fqdn_lag.diff
keywords: + patch
messages: + msg104154
2010-04-22 18:52:37pitrousetpriority: normal
nosy: + orsenthil

versions: + Python 3.2, - Python 2.6
2010-04-22 07:32:20neologixsetnosy: + neologix
messages: + msg103940
2010-04-22 00:03:57santoso.wijayasetnosy: + santoso.wijaya
messages: + msg103928
2009-05-22 10:12:39aerodonkeycreate