classification
Title: Python 3 logging HTTPHandler sends duplicate Host header
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: lhelwerd
Priority: normal Keywords:

Created on 2017-07-11 14:11 by lhelwerd, last changed 2017-07-11 14:11 by lhelwerd.

Messages (1)
msg298161 - (view) Author: Leon Helwerda (lhelwerd) Date: 2017-07-11 14:11
The logging HTTPHandler sends two Host headers which confuses certain servers.

Tested versions:
Python 3.6.1
lighttpd/1.4.45

Steps to reproduce (MWE):

1) Set up a lighttpd server which is to act as the logging host (we do not actually implement anything that accepts the input here). Optionally enable the debug settings from https://redmine.lighttpd.net/projects/1/wiki/DebugVariables (specifically, add debug.log-condition-handling = "enable" to /etc/lighttpd/lighttpd.conf) to follow what is happening inside the server.
2) In python3:
import logging
import logging.handlers
handler = logging.handlers.HTTPHandler('localhost', '/')
logging.getLogger().setLevel(logging.INFO)
logging.getLogger().addHandler(handler)
logging.info('hello world')
3) Notice that the access logs in /var/log/lighttpd/access.log show a 400 response for the request (in Python, the response is ignored). If the debugging from 1) is enabled, then /var/log/lighttpd/error.log contains a line "duplicate Host-header -> 400".

This is not a bug in lighttpd. The server adheres to RFC7320 (sec. 5.4, p. 44): "A server MUST respond with a 400 (Bad Request) status code [...] to any request message that contains more than one Host header field".

A workaround is to put the full URL in the second argument of HTTPRequest:
handler = logging.handlers.HTTPHandler('localhost', 'http://localhost/')
Then lighttpd follows RFC2616 (sec. 5.2, p. 37): "If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored."

The origin of this issue is that the http.client.HTTPConnection.putrequest method (called by HTTPHandler.emit) already adds a Host header unless skip_host=True is given. Thus the manual addition of a Host header is duplicate.

Other versions like Python 2.7 might also be affected but I did not test.
History
Date User Action Args
2017-07-11 14:11:00lhelwerdcreate