classification
Title: Logging in BaseHTTPServer.BaseHTTPRequestHandler causes lag
Type: performance Stage: test needed
Components: Library (Lib) Versions: Python 3.2
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: aerodonkey, knicker.kicker, neologix, orsenthil, pitrou, santa4nt
Priority: normal Keywords: patch

Created on 2009-05-22 10:12 by aerodonkey, last changed 2011-03-28 14:26 by neologix.

Messages (12)
msg88188 - (view) Author: Wentao Han (aerodonkey) Date: 2009-05-22 10:12
The invocation of socket.getfqdn to get FQDN of the host may delay for 
several seconds on some platforms when no appropriate host name is found. 
This causes lag when using SimpleXMLRPCServer and other class derived from 
BaseHTTPServer for every request.
msg103928 - (view) Author: Santoso Wijaya (santa4nt) Date: 2010-04-22 00:03
There is a simple workaround to bypass the hostname resolution mechanism, so that the server can respond faster.

# This is a hack to patch slow socket.getfqdn calls that
# BaseHTTPServer (and its subclasses) make.
# See: http://bugs.python.org/issue6085
# See: http://www.answermysearches.com/xmlrpc-server-slow-in-python-how-to-fix/2140/

import BaseHTTPServer

def _bare_address_string(self):
    host, port = self.client_address[:2]
    return '%s' % host

BaseHTTPServer.BaseHTTPRequestHandler.address_string = \
        _bare_address_string

# End hack.
msg103940 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-22 07:32
Yes, performing a DNS resolution every time we log something is definitely sub-optimal. We should perform it only once, and cache the result.
msg104154 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-25 18:36
The attached patch caches the result of FQDN lookup.
msg104159 - (view) Author: Santoso Wijaya (santa4nt) Date: 2010-04-25 20:12
Doesn't that only cache the first remote client it encounters, though? Maybe a dictionary of caches?
msg104164 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-25 21:10
> Doesn't that only cache the first remote client it encounters, though? Maybe a dictionary of caches?

A BaseHTTPRequestHandler is instantiated every time a client connects, so there should be only one client per handler, no (the cache is an attribute of the handler, not the server) ?
msg104207 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-26 10:00
Actually, DNS resolution should be disabled by default, as in most HTTP servers probably.
msg104209 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-26 10:37
And for this specific request, it fdqn is looked up only for "logging" to sys.stderr. Either removing the fqdn call or just caching per connection it as the patch does is both fine. I doubt if someone is relying this logging anywhere in the code. This is useful only when BaseHTTPServer is started in the command like as a standalone HTTP server.

What shall we do for this? remove the fqdn or cache it once and use it. Either is fine. And there is not testing code present (or required too - as this is a cli invocation scenario).
msg104213 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-26 10:45
> And for this specific request, it fdqn is looked up only for "logging"
> to sys.stderr. Either removing the fqdn call or just caching per
> connection it as the patch does is both fine. I doubt if someone is
> relying this logging anywhere in the code. This is useful only when
> BaseHTTPServer is started in the command like as a standalone HTTP
> server.

I think we should totally remove the call to fqdn. The common practice
with HTTP servers is to log numeric IPs and, if desired, let a batch log
analysis process (such as awstats) deal with DNS resolution and caching.

> And there is not testing code present (or required too - as this is a
> cli invocation scenario).

What do you mean? BaseHTTPRequestHandler is tested in test_httpservers.
msg104216 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-26 10:57
On Mon, Apr 26, 2010 at 10:45:56AM +0000, Antoine Pitrou wrote:

> What do you mean? BaseHTTPRequestHandler is tested in test_httpservers.

I meant specifically for that function which is logging to sys.stderr.
No return values, state changes and no tests present in test_httpservers.
Seems to me the cli invokation scenario of the BaseHTTPServer module.

Only other place where fqdn look up happens is HTTPServer.server_bind
and in the setting of server_name, if any code is relying on it could
break.
msg104267 - (view) Author: Santoso Wijaya (santa4nt) Date: 2010-04-26 19:14
Please correct my understanding if I am mistaken, but from skimming through the source code, it seems that a new BaseHTTPRequestHandler is instantiated per request. If a caching mechanism is to be adopted as per the patch, wouldn't it be forgotten when a new request from the same client arrives in the future?
msg104278 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-26 21:28
@santa4nt: You're correct, the cache is retained only from within the same handler: the other solution would be to keep the cache at the server level, but then you'd have to deal with the cache size (you don't want it to grow forever), and it seems a bit overkill. I just kept the FQDN resolution because the code seemed to do it on purpose, but I agree that if nothing relies on this behaviour, it might be as simple to remove the resolution altogether.
History
Date User Action Args
2011-03-28 14:26:08neologixsetfiles: - base_http_server_fqdn_lag.diff
2011-03-28 14:22:11knicker.kickersetnosy: + knicker.kicker
2010-04-26 21:28:50neologixsetmessages: + msg104278
2010-04-26 19:14:04santa4ntsetmessages: + msg104267
2010-04-26 10:57:41orsenthilsetmessages: + msg104216
2010-04-26 10:45:54pitrousetmessages: + msg104213
2010-04-26 10:37:43orsenthilsetassignee: orsenthil
messages: + msg104209
2010-04-26 10:00:40pitrousetnosy: + pitrou

messages: + msg104207
stage: test needed
2010-04-25 21:10:31neologixsetfiles: + base_http_server_fqdn_lag.diff

messages: + msg104164
2010-04-25 20:47:01neologixsetfiles: - base_http_server_fqdn_lag.diff
2010-04-25 20:12:51santa4ntsetmessages: + msg104159
2010-04-25 18:36:34neologixsetfiles: + base_http_server_fqdn_lag.diff
keywords: + patch
messages: + msg104154
2010-04-22 18:52:37pitrousetpriority: normal
nosy: + orsenthil

versions: + Python 3.2, - Python 2.6
2010-04-22 07:32:20neologixsetnosy: + neologix
messages: + msg103940
2010-04-22 00:03:57santa4ntsetnosy: + santa4nt
messages: + msg103928
2009-05-22 10:12:39aerodonkeycreate