Author baikie
Recipients baikie
Date 2010-04-11.18:45:24
In 3.x, the socket module assumes that AF_UNIX addresses use
UTF-8 encoding - this means, for example, that accept() will
raise UnicodeDecodeError if the peer socket path is not valid
UTF-8, which could crash an unwary server.

Python 3.1.2 (r312:79147, Mar 23 2010, 19:02:21) 
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from socket import *
>>> s = socket(AF_UNIX, SOCK_STREAM)
>>> s.bind(b"\xff")
>>> s.getsockname()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: unexpected code byte

I'm attaching a patch to handle socket paths according to PEP
383.  Normally this would use PyUnicode_FSConverter, but there
are a couple of ways in which the address handling currently
differs from normal filename handling.
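
As a rough illustration of the PEP 383 approach, the sketch below
round-trips the problematic path from the session above through the
"surrogateescape" error handler.  The helper names decode_path and
encode_path are hypothetical, and hard-coding UTF-8 is an assumption
made for the example; real filename handling uses the filesystem
encoding.

```python
def decode_path(raw: bytes) -> str:
    """Decode a socket path; undecodable bytes become lone surrogates."""
    # Hypothetical helper. UTF-8 is assumed here for illustration only.
    return raw.decode("utf-8", "surrogateescape")


def encode_path(text: str) -> bytes:
    """Reverse decode_path(), turning escaped surrogates back into bytes."""
    return text.encode("utf-8", "surrogateescape")


raw = b"\xff"                    # the path bound in the session above
text = decode_path(raw)          # no UnicodeDecodeError is raised
round_trip = encode_path(text)   # the original byte is recovered

# Strict decoding of the same bytes still fails, as in the traceback.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

The byte 0xff is mapped to the lone surrogate U+DCFF and back again,
so no information about the path is lost.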

One is that embedded null bytes are passed through to the system
instead of being rejected, which is needed for the Linux abstract
namespace.  These abstract addresses are returned as bytes
objects, but they can currently be specified as strings with
embedded null characters as well.  The patch preserves this behavior.
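
A minimal sketch of the abstract namespace case follows.  It assumes
Linux and a current Python 3; the function name and the socket name
are made up for the example, and the function returns None where an
abstract address cannot be bound.

```python
import os
import socket
import sys


def abstract_name_roundtrip():
    """Bind to a Linux abstract address and report what getsockname() returns.

    Returns None if the platform is not Linux or the bind fails.
    """
    if not sys.platform.startswith("linux"):
        return None
    # A leading null byte selects the abstract namespace; the rest of the
    # name is arbitrary and chosen here only to avoid collisions.
    name = b"\0issue8373-demo-" + str(os.getpid()).encode("ascii")
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.bind(name)
            return name, s.getsockname()
    except OSError:
        return None


result = abstract_name_roundtrip()
```

On Linux the second element of the result should be a bytes object
that still begins with the null byte, rather than a decoded string.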

The current code also accepts read-only buffer objects (it uses
the "s#" format), so in order to accept these as well as
bytearray filenames (which the posix module accepts), the patch
simply accepts any single-segment buffer, read-only or not.
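
The following sketch shows the kind of call this is meant to allow.
It assumes a POSIX system and a Python version whose socket module
accepts writable bytes-like objects for AF_UNIX addresses (the
documentation notes this from 3.5 onwards); bind_with is a
hypothetical helper, not part of the patch.

```python
import os
import socket
import tempfile


def bind_with(path_obj):
    """Bind a fresh AF_UNIX socket to path_obj and return getsockname()."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.bind(path_obj)
        return s.getsockname()


with tempfile.TemporaryDirectory() as tmp:
    base = os.fsencode(tmp)
    results = [
        bind_with(base + b"/a.sock"),              # read-only bytes
        bind_with(bytearray(base + b"/b.sock")),   # writable buffer
        bind_with(memoryview(base + b"/c.sock")),  # buffer view
    ]
```

Each call binds a socket file inside the temporary directory, and
getsockname() reports the path as a string regardless of which
buffer type supplied it.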

This patch applies on top of the patches I submitted for issue
#8372, so that it does not knowingly run past the end of sun_path.