classification
Title: SysLogHandler incorrectly appends \000 to messages
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 3.1, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: Carl.Crowder, eric.araujo, petri.lehtinen, python-dev, r.david.murray, vinay.sajip
Priority: normal Keywords:

Created on 2011-05-24 11:44 by Carl.Crowder, last changed 2011-06-10 17:54 by python-dev. This issue is now closed.

Messages (14)
msg136743 - (view) Author: Carl Crowder (Carl.Crowder) Date: 2011-05-24 11:44
logging.handlers.SysLogHandler contains this class variable and comment:

    # curious: when talking to the unix-domain '/dev/log' socket, a
    # zero-terminator seems to be required.  this string is placed
    # into a class variable so that it can be overridden if
    # necessary.
    log_format_string = '<%d>%s\000'

And separately, in emit:

    msg = self.format(record) + '\000'

The assumption here is that a null character must be appended to delimit the syslog message. In RFC5424, there is no mention of a message delimiter, and in fact the previous syslog RFC, RFC3164, specifically states:

>  The MSG part will fill the remainder of the syslog packet.  This will usually contain some additional information of the process that generated the message, and then the text of the message.  There is no ending delimiter to this part.

I believe this comment and behaviour is due to an older version of syslogd. Checking the manpage for an older version of rsyslog for example includes this piece of information [1]:

> There is probably one important consideration when installing rsyslogd. It is dependent on proper formatting of messages by the syslog function. The functioning of the syslog function in the shared libraries changed somewhere in the region of libc.so.4.[2-4].n.   The specific change was to null-terminate the message before transmitting it to the /dev/log socket. Proper functioning of this  version of rsyslogd is dependent on null-termination of the message.

I'm running Ubuntu 11.04 with rsyslogd 4.6.4 (that is, the standard version). In the manpage for this version of rsyslogd, there is no reference to null-terminators. Removing "+ '\000'" from the SysLogHandler results in messages still being received correctly.

Problem behaviour:
1) When running any RFC compliant syslog receiver that handles syslog messages, such as flume[2], this null character is not stripped as it is not expected to be present. Current versions of syslog cope because previously they assumed it existed.
2) The log_format_string class variable is not actually used anywhere, so it cannot be overridden usefully.

Removing the null terminator will cause older typical versions of syslogd to fail to receive messages, however including it causes any normal receiver that does not implement the non-standard behaviour to receive an additional unwanted null.

Suggestion for a fix is either to properly use the log_format_string class variable, or to allow an optional "append_null" argument to the SysLogHandler constructor. By default, this should be True, as it will continue to work with the main use case, which is unix syslog demons. Having the option will allow other use cases to also use the SysLogHandler.

[1] http://manpages.ubuntu.com/manpages/hardy/man8/rsyslogd.8.html#contenttoc8
[2] http://www.cloudera.com/blog/category/flume/
msg136767 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-05-24 16:37
While I agree that it should ideally be possible to actually control this behavior as indicated by the comment, any syslog handler that does not sanitize the messages it receives is broken.
msg136768 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-05-24 16:38
s/handler/receiver/
msg136771 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-05-24 17:06
Even though expecting the NUL-terminator is legacy behaviour, newer versions of the syslog daemons presumably follow Postel's rule of "be conservative in what you do, be liberal in what you accept from others". Can flume not be changed to follow this principle, given that ordinarily you wouldn't expect a NUL terminator byte on syslog messages?

When the implementation of SysLogHandler was changed to comply (more closely) with RFC5424, the log_format_string value was inadvertently left behind in Python 2.x, and it makes sense to remove it since changing it doesn't do anything useful.

Having an append_nul parameter to control NUL-termination is certainly doable, and this approach will keep the 2.x and 3.x implementations more or less the same (as 3.x doesn't have a log_format_string attribute). I suggest "nul" rather than "null" since we're talking about appending a NUL byte to the message.

However, having it as an additional keyword parameter to the constructor could cause dictConfig failures if configurations specifying it are used with an old version of Python (e.g. 3.2.0) which doesn't have the parameter. For this reason, it might be better to have an append_nul class attribute (defaulting to True) which can be overridden with an instance attribute set to False. (This approach can still be used with dictConfig, as you can use a factory function other than the SysLogHandler class to construct a SysLogHandler instance.)
msg136847 - (view) Author: Carl Crowder (Carl.Crowder) Date: 2011-05-25 14:12
Flume certainly could avoid parsing certain values. However, while a syslog application "should avoid octet values below 32", they are still "legal" [1]. I don't think that adjusting flume to reject legal values due to legacy behaviour in some unix syslog daemons is the Right Thing™ here.

[1] http://tools.ietf.org/html/rfc5424#section-6.4
msg136856 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-05-25 15:28
And it goes on to say:

"[...] but a syslog application MAY modify these characters upon reception.  For example, it might change them into an escape sequence (e.g., value 0 may be changed to "\0")"

Flume should not break if it receives NULs.  But there should be a way to prevent generating them, and Vinay has agreed with that.  Now we just need a patch :)  (he's probably working on one...)
msg136857 - (view) Author: Carl Crowder (Carl.Crowder) Date: 2011-05-25 15:31
Oh, I understand. Flume doesn't break, it handles the \0 just fine, the problem is that I ended up with a message with that additional byte on the end. Sorry for the confusion!
msg136858 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2011-05-25 15:35
FWIW, I once stumbled on this problem, and solved it by making my own log handler that uses functions from the syslog module.
msg136859 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-05-25 16:02
> Petri Lehtinen <petri@digip.org> added the  comment:

> 
> FWIW, I once stumbled on this problem, and solved it by making  my own log 
>handler that uses functions from the syslog  module.

@Petri: It's great that it worked for you, but IIRC the syslog module has had 
thread-safety issues at least on some platforms, which is why it's not used in 
the SysLogHandler implementation (disclaimer: this was a while ago, I don't know 
if the syslog module is thread-safe now - I believe the thread-safety issues 
were in the underlying C APIs).

I am working on a patch, watch this space :-)
msg136860 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2011-05-25 16:06
Vinay Sajip wrote:
> @Petri: It's great that it worked for you, but IIRC the syslog module has had 
> thread-safety issues at least on some platforms, which is why it's not used in 
> the SysLogHandler implementation (disclaimer: this was a while ago, I don't know 
> if the syslog module is thread-safe now - I believe the thread-safety issues 
> were in the underlying C APIs).

Ah, I was not aware of thread safety issues. A quick search revealed
that there has been a non-standard syslog_r() function that's thread
safe, but nowadays POSIX requires syslog() to be thread safe, too.
msg136875 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-05-25 17:16
Please note: this change is too late for inclusion in Python 3.2.1 :-( as it has already rc status. So I will probably push this change once 3.2.1 is out.
msg137996 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-09 15:55
New changeset 260b84851d1f by Vinay Sajip in branch '3.2':
Issue #12168: SysLogHandler now allows NUL termination to be controlled using a new 'append_nul' attribute on the handler.
http://hg.python.org/cpython/rev/260b84851d1f

New changeset ac1217099b3f by Vinay Sajip in branch 'default':
Merged fix for issue #12168 from 3.2.
http://hg.python.org/cpython/rev/ac1217099b3f
msg138112 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-06-10 17:02
Should this new attribute be documented?
msg138127 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-10 17:54
New changeset 3cf5d61fd6d7 by Vinay Sajip in branch '3.2':
Documented change for Issue #12168.
http://hg.python.org/cpython/rev/3cf5d61fd6d7

New changeset 6658b9b9f5f3 by Vinay Sajip in branch 'default':
Merged documentation change for Issue #12168.
http://hg.python.org/cpython/rev/6658b9b9f5f3
History
Date User Action Args
2011-06-10 17:54:27python-devsetmessages: + msg138127
2011-06-10 17:02:46eric.araujosetnosy: + eric.araujo
messages: + msg138112
2011-06-09 17:38:19vinay.sajipsetstatus: open -> closed
resolution: fixed
2011-06-09 15:55:34python-devsetnosy: + python-dev
messages: + msg137996
2011-05-25 17:16:34vinay.sajipsetmessages: + msg136875
2011-05-25 16:06:56petri.lehtinensetmessages: + msg136860
2011-05-25 16:02:11vinay.sajipsetmessages: + msg136859
2011-05-25 15:35:47petri.lehtinensetnosy: + petri.lehtinen
messages: + msg136858
2011-05-25 15:31:21Carl.Crowdersetmessages: + msg136857
2011-05-25 15:28:39r.david.murraysetmessages: + msg136856
2011-05-25 14:12:03Carl.Crowdersetmessages: + msg136847
title: SysLogHandler incorrectly appents \000 to messages -> SysLogHandler incorrectly appends \000 to messages
2011-05-24 17:06:35vinay.sajipsetmessages: + msg136771
2011-05-24 16:38:13r.david.murraysetmessages: + msg136768
2011-05-24 16:37:01r.david.murraysetnosy: + r.david.murray
messages: + msg136767
2011-05-24 16:35:24eric.araujosetassignee: vinay.sajip

nosy: + vinay.sajip
versions: + Python 3.1, Python 3.2, Python 3.3
2011-05-24 11:44:12Carl.Crowdercreate