classification
Title: syslog logging handler fails with address in unix abstract namespace
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: vinay.sajip, vstinner, xdegaye
Priority: normal Keywords: patch

Created on 2016-04-19 07:35 by xdegaye, last changed 2016-04-21 11:42 by xdegaye.

Files
File name Uploaded Description Edit
unix_abstract_namespace.patch xdegaye, 2016-04-19 07:35 review
unix_abstract_namespace_2.patch xdegaye, 2016-04-20 19:00 address in abstract namespace as a string with file system encoding review
unix_abstract_namespace_3.patch xdegaye, 2016-04-20 19:12 review
Messages (14)
msg263715 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-19 07:35
The traceback:

    Traceback (most recent call last):
      File "Lib/logging/handlers.py", line 917, in emit
        self.socket.sendto(msg, self.address)
    TypeError: getsockaddrarg: AF_INET address must be tuple, not bytes

The attached patch reproduces the issue with a test case and fixes it.
msg263745 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-19 14:58
The following change looks good to me:

-        if isinstance(address, str):
+        if isinstance(address, (str, bytes)):
             self.unixsocket = True
             self._connect_unixsocket(address)

But I don't understand the testcase. Is an address starting with a NULL character a special address? How does it the bytes address case?

+        # override the definition in the base class
+        self.address = '\x00python_logging_test'

"syslog logging handler fails with address in unix abstract namespace"

What is the unix abstract namespace?
msg263802 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 07:06
A unix abstract socket address is defined here:
    http://man7.org/linux/man-pages/man7/unix.7.html
as:

"Traditionally, UNIX domain sockets can be either unnamed, or bound to
a filesystem pathname (marked as being of type socket).  Linux also
supports an abstract namespace which is independent of the
filesystem."

and:

"abstract: an abstract socket address is distinguished (from a
pathname socket) by the fact that sun_path[0] is a null byte ('\0').
..."

It is also documented in the Python socket module documentation at
    https://docs.python.org/3/library/socket.html
and tested in the TestLinuxAbstractNamespace test of
Lib/test/test_socket.py.
msg263810 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-20 07:56
Ok. And what is the link between abstract namespace and the fact that the syslog handler gets bytes? Is it tested by your change?
msg263817 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 10:28
The SysLogHandlerTest class instantiates a 'server_class' that eventually calls the
server_bind() method of the TCPServer class of the socketserver module.

This server_bind() method executes the statements:

    self.socket.bind(self.server_address)
    self.server_address = self.socket.getsockname()

and when self.server_address is a string and contains a null byte, then
getsockname() returns a bytes object.

So finally, self.server in SysLogHandlerTest is a bytes object in this case,
hence the needed to fix Lib/logging/handlers.py.
msg263828 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-20 13:01
Xavier de Gaye added the comment:
> when self.server_address is a string and contains a null byte, then
> getsockname() returns a bytes object.

Ah? It sounds strange to me that the type of getsockname() depends on
the NULL byte. I would also expect a Unicode name for the abstract
namespace. What do you think?
msg263837 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 13:45
makesockaddr() in socketmodule.c calls PyBytes_FromStringAndSize() when the
first byte is a null byte.

My opinion is not worth much in this matter :). The socket documentation does
say that AF_UNIX addresses are "represented as a string, using the file system
encoding", so the implementation looks wrong. However it is possible that
PyUnicode_DecodeFSDefault() fails for some file system encodings when the
encoded address contains null bytes ?
msg263838 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-20 13:49
> However it is possible that
PyUnicode_DecodeFSDefault() fails for some file system encodings when the
encoded address contains null bytes ?

No, it's not possible. Undecode bytes are escaped as surrogate characters using the surrogateescape error handler. See the PEP 393.
msg263839 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 13:56
Then it seems that makesockaddr() should be fixed instead of the logging
handler.
msg263840 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-20 13:57
> Then it seems that makesockaddr() should be fixed instead of the logging handler.

Yes. That's why I didn't understand the issue at the beginning :-) Would you like to work on fixing the _socket module?

It may be worth to keep your syslog unit test. It's up to you.
msg263841 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 14:03
I will give it a try.
msg263857 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 19:00
Here is the patch.

As a  reference, issue #8373 with changeset 1f23bb74f4bc changed the AF_UNIX
socket addresses encoding from UTF-8 to the file system encoding and the
'surrogateescape' error handler (PEP 383).
msg263858 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-20 19:12
This patch corrects the non-ascii quote character in the Misc/NEWS diff.
msg263913 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2016-04-21 11:42
In makesockaddr(), the current implementation does not do any decoding of AF_UNIX addresses in the abstract namespace in the struct sockaddr_un to bytes direction, i.e. system to python direction, but does encode a string or bytes object to struct sockaddr_un in the reverse direction, in getsockaddrarg(). This is not correct.

So the patch does more than fixing the type of the addresses as it also fixes this in a non backward compatible way. Maybe this should be indicated in Misc/NEWS.
History
Date User Action Args
2016-04-21 11:42:16xdegayesetmessages: + msg263913
2016-04-20 19:12:19xdegayesetfiles: + unix_abstract_namespace_3.patch

messages: + msg263858
2016-04-20 19:00:33xdegayesetfiles: + unix_abstract_namespace_2.patch

messages: + msg263857
2016-04-20 14:03:58xdegayesetmessages: + msg263841
2016-04-20 13:57:23vstinnersetmessages: + msg263840
2016-04-20 13:56:01xdegayesetmessages: + msg263839
2016-04-20 13:49:01vstinnersetmessages: + msg263838
2016-04-20 13:45:49xdegayesetmessages: + msg263837
2016-04-20 13:01:58vstinnersetmessages: + msg263828
2016-04-20 10:28:42xdegayesetmessages: + msg263817
2016-04-20 07:56:53vstinnersetmessages: + msg263810
2016-04-20 07:06:58xdegayesetmessages: + msg263802
2016-04-19 14:58:33vstinnersetnosy: + vstinner
messages: + msg263745
2016-04-19 07:35:16xdegayecreate