socket.recv(size, MSG_TRUNC) returns more than size bytes #69121
In [1]: import socket
In [2]: sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
In [3]: sks[1].send("asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
Out[3]: 42
In [4]: sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
Out[4]: 'a\x00\x00\x00\xc0\xbf8\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

recv() returns a buffer. The size of this buffer is equal to the size of the transferred data, but only the first byte was initialized. What is the idea behind this behavior? Usually |
sendto(4, "asdfasdfsadfasdfsdfsadfsdfasdfsd"..., 42, 0, NULL, 0) = 42
recvfrom(3, "a\0n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\0\0\0\0\2\0\0\0"..., 1, MSG_TRUNC, NULL, NULL) = 42

I think the return value is interpreted incorrectly. In this case it isn't equal to the number of bytes received. Python then copies that many bytes out of a smaller buffer, so it may access memory that is not allocated or that was allocated by someone else. valgrind detects this type of error:

import os
import socket
import sys

sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
pid = os.fork()
if pid == 0:
    # Child: send a datagram much larger than the receiver's buffer.
    sks[1].send("\0" * 4096)
    sys.exit(0)
sk = sks[0]
# Parent: ask for 1 byte with MSG_TRUNC; recv() reports the full 4096.
print sk.recv(1, socket.MSG_TRUNC)
|
Evidently, the recv code doesn't know anything about MSG_TRUNC, which causes it to do incorrect things when the output length is greater than the buffer length. |
Python 3.4 shows the same behavior:
>>> sks[1].send(b"asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
42
>>> sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
b'a\x00Nx\x94\x7f\x00\x00sadfasdfsdfsadfsdfasdfsdfasdfsadfa'
>>> |
As far as I know, passing MSG_TRUNC into recv() is Linux-specific. I guess the “right” portable way to get a message size is to know it in advance, or guess and expand the buffer if MSG_PEEK cannot return the whole message.

Andrey: I don’t think we are accessing _unallocated_ memory (which could crash Python). If you look at _PyBytes_Resize(), I think it correctly allocates the memory, and just leaves it uninitialized.

Some options:
|
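For reference, the portable approach mentioned above (guess a size, peek, and grow the buffer until the whole message fits) could look roughly like the sketch below. The helper name recv_whole_datagram and its initial_size/max_size parameters are made up for illustration and are not part of the socket module; it only uses MSG_PEEK, not MSG_TRUNC.

```python
import socket


def recv_whole_datagram(sock, initial_size=64, max_size=65536):
    """Hypothetical helper: read one datagram without knowing its size.

    Peeks with a growing buffer until the peeked data is shorter than the
    buffer (so the whole datagram fit), then consumes it with a plain recv().
    """
    size = initial_size
    while True:
        peeked = sock.recv(size, socket.MSG_PEEK)
        # A peek shorter than the buffer means nothing was cut off.
        if len(peeked) < size or size >= max_size:
            break
        size = min(size * 2, max_size)
    # Datagrams longer than max_size are still truncated here.
    return sock.recv(size)
```

The cost is one extra recv() call per doubling, which is the price of not relying on the Linux-specific flag.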
MSG_TRUNC literally causes a buffer overflow. In the example sock_recv() and friends only allocate a buffer of size 1 on the heap. With MSG_TRUNC recv() ignores the maximum size and writes beyond the buffer. We cannot recover from a buffer overflow because the overflow might have damaged other data structures. Instead Python should detect the problem and forcefully abort() the process with Py_FatalError(). |
Ah, I misunderstood MSG_TRUNC. It's not a buffer overflow. MSG_TRUNC does not write beyond the end of the buffer. In this example the libc function recv() writes two bytes into the buffer but returns a value larger than 2.

---
import socket
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
result = b.recv(2, socket.MSG_TRUNC)
print(len(result), result)

stdout: 2 b'ab'

To fix the wrong result of recv() with MSG_TRUNC, only resize when outlen < recvlen (line 3089). To get the size of the message, you have to use recv_into() with a buffer.

---
import socket
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
msg = bytearray(2)
result = b.recv_into(msg, flags=socket.MSG_TRUNC)
print(result, msg)
--- |
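Building on the recv_into() suggestion, a possible way to read a whole datagram is to peek at its size first and then receive it with a buffer of that size. This is only a sketch: it assumes Linux semantics for MSG_TRUNC (the flag is not portable, as noted earlier in this thread), and the helper name recv_exact_datagram is invented for this example.

```python
import socket


def recv_exact_datagram(sock):
    """Hypothetical helper: size a datagram via MSG_PEEK | MSG_TRUNC, then read it.

    Assumes Linux, where recv_into() with MSG_TRUNC returns the real
    datagram length even when the buffer is smaller.
    """
    probe = bytearray(1)
    # MSG_PEEK leaves the datagram queued; MSG_TRUNC reports its full length.
    size = sock.recv_into(probe, 1, socket.MSG_PEEK | socket.MSG_TRUNC)
    # Plain recv() without MSG_TRUNC, so the returned bytes match what was read.
    return sock.recv(max(size, 1))
```

Because the second call does not pass MSG_TRUNC, it does not depend on how recv() handles an oversized return value.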
FWIW, PyPy just got the same bug report: https://foss.heptapod.net/pypy/pypy/-/issues/3864 We'll likely fix it in the way that @tiran suggested, "only resize when outlen < recvlen" |
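To make the "only resize when outlen < recvlen" rule concrete, here is a small pure-Python model of it. This is not the CPython or PyPy source; the function name bytes_from_recv and its arguments are invented for illustration, with recvlen standing for the requested buffer size and outlen for the value recv() returned.

```python
def bytes_from_recv(buf, recvlen, outlen):
    """Hypothetical model of the suggested fix.

    buf     -- buffer of recvlen bytes that recv() wrote into
    recvlen -- size the caller asked for
    outlen  -- value returned by recv(); may exceed recvlen with MSG_TRUNC
    """
    if outlen < recvlen:
        # Short read: shrink the result to the bytes actually written.
        return bytes(buf[:outlen])
    # outlen >= recvlen: never grow past the buffer that was allocated.
    return bytes(buf[:recvlen])
```

With this rule, a 1-byte request answered by a 42-byte datagram yields a 1-byte result instead of 42 bytes of mostly uninitialized memory.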