Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket.recv(size, MSG_TRUNC) returns more than size bytes #69121

Open
AndreyWagin mannequin opened this issue Aug 25, 2015 · 8 comments
Open

socket.recv(size, MSG_TRUNC) returns more than size bytes #69121

AndreyWagin mannequin opened this issue Aug 25, 2015 · 8 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@AndreyWagin
Copy link
Mannequin

AndreyWagin mannequin commented Aug 25, 2015

BPO 24933
Nosy @tiran, @benjaminp, @berkerpeksag, @vadmium

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-08-25.11:28:15.516>
labels = ['extension-modules', 'type-bug', '3.7']
title = 'socket.recv(size, MSG_TRUNC) returns more than size bytes'
updated_at = <Date 2016-09-26.15:31:46.213>
user = 'https://bugs.python.org/AndreyWagin'

bugs.python.org fields:

activity = <Date 2016-09-26.15:31:46.213>
actor = 'christian.heimes'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules']
creation = <Date 2015-08-25.11:28:15.516>
creator = 'Andrey Wagin'
dependencies = []
files = []
hgrepos = []
issue_num = 24933
keywords = []
message_count = 7.0
messages = ['249114', '249121', '249127', '249214', '264343', '277427', '277429']
nosy_count = 5.0
nosy_names = ['christian.heimes', 'benjamin.peterson', 'berker.peksag', 'martin.panter', 'Andrey Wagin']
pr_nums = []
priority = 'high'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue24933'
versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']

@AndreyWagin
Copy link
Mannequin Author

AndreyWagin mannequin commented Aug 25, 2015

In [1]: import socket

In [2]: sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

In [3]: sks[1].send("asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
Out[3]: 42

In [4]: sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
Out[4]: 'a\\x00\\x00\\x00\\xc0\\xbf8\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'

recv() returns a buffer. The size of this buffer is equal to the size of transferred data, but only the first symbol was initialized. What is the idea of this behavior.

Usually recv(sk, NULL, 0, socket.MSG_PEEK | socket.MSG_TRUNC) is used to get a message size. What is the right way to get a message size in Python?

@AndreyWagin AndreyWagin mannequin added the stdlib Python modules in the Lib dir label Aug 25, 2015
@AndreyWagin
Copy link
Mannequin Author

AndreyWagin mannequin commented Aug 25, 2015

sendto(4, "asdfasdfsadfasdfsdfsadfsdfasdfsd"..., 42, 0, NULL, 0) = 42
recvfrom(3, "a\\0n\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\5\\0\\0\\0\\0\\0\\0\\0\\2\\0\\0\\0"..., 1, MSG_TRUNC, NULL, NULL) = 42

I think the exit code is interpreted incorrectly. In this case it isn't equal to the number of bytes received. Then python copies this number of bytes from the buffer with smaller size, so it may access memory which are not allocated or allocated by someone else.

valgrind detects this type of errors:
[avagin@localhost ~]$ cat sock.py
import socket, os, sys

sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
pid = os.fork()
if pid == 0:
	sks[1].send("\0" * 4096)
	sys.exit(0)
sk = sks[0]
print sk.recv(1, socket.MSG_TRUNC )
[avagin@localhost \~]$ valgrind python sock.py
==25511== Memcheck, a memory error detector
==25511== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==25511== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==25511== Command: python sock.py
==25511== 
==25511== Syscall param write(buf) points to uninitialised byte(s)
==25511==    at 0x320B4F0940: \_\_write_nocancel (in /usr/lib64/libc-2.20.so)
==25511==    by 0x320B478D2C: \_IO_file_write@@GLIBC_2.2.5 (in /usr/lib64/libc-2.20.so)
==25511==    by 0x320B4794EE: \_IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib64/libc-2.20.so)
==25511==    by 0x320B46EE68: fwrite (in /usr/lib64/libc-2.20.so)
==25511==    by 0x369CC90210: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CC85EAE: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CC681AB: PyFile_WriteObject (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CCE08F9: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CCE340F: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CCE3508: PyEval_EvalCode (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CCFC91E: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511==    by 0x369CCFDB41: PyRun_FileExFlags (in /usr/lib64/libpython2.7.so.1.0)

@benjaminp
Copy link
Contributor

Evidently, the recv code doesn't know anything about MSG_TRUNC, which causes it to do incorrect things when the output length is greater than the buffer length.

@AndreyWagin
Copy link
Mannequin Author

AndreyWagin mannequin commented Aug 26, 2015

There is the same behavior for python 3.4
>>> sks[1].send(b"asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
42
>>> sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
b'a\x00Nx\x94\x7f\x00\x00sadfasdfsdfsadfsdfasdfsdfasdfsadfa'
>>>

@AndreyWagin AndreyWagin mannequin added the type-security A security issue label Aug 26, 2015
@vadmium
Copy link
Member

vadmium commented Apr 27, 2016

As far as I know, passing MSG_TRUNC into recv() is Linux-specific. I guess the “right” portable way to get a message size is to know it in advance, or guess and expand the buffer if MSG_PEEK cannot return the whole message.

Andrey: I don’t think we are accessing _unallocated_ memory (which could crash Python). If you look at _PyBytes_Resize(), I think it correctly allocates the memory, and just leaves it uninitialized.

Some options:

  • Document that arbitrary flags like Linux’s MSG_TRUNC not supported
  • Limit the returned buffer to the original buffer size
  • Raise an exception or warning if recv() returns more than the original buffer size
  • Reject unsupported flags like MSG_TRUNC
  • Initialize the expanded buffer with zeros

@vadmium vadmium added extension-modules C modules in the Modules dir and removed stdlib Python modules in the Lib dir labels Apr 27, 2016
@tiran
Copy link
Member

tiran commented Sep 26, 2016

MSG_TRUNC literally causes a buffer overflow. In the example sock_recv() and friends only allocate a buffer of size 1 on the heap. With MSG_TRUNC recv() ignores the maximum size and writes beyond the buffer. We cannot recover from a buffer overflow because the overflow might have damanged other data structures. Instead Python should detect the problem and forcefully abort() the process with Py_FatalError().

@tiran tiran added the 3.7 (EOL) end of life label Sep 26, 2016
@tiran
Copy link
Member

tiran commented Sep 26, 2016

Ah, I misunderstood MSG_TRUNC. It's not a buffer overflow. MSG_TRUNC does not write beyond the end of the buffer. In this example the libc function recv() writes two bytes into the buffer but returns a larger value than 2.

---

import socket
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
result = b.recv(2, socket.MSG_TRUNC)
print(len(result), result)

stdout: 2 b'ab'

To fix the wrong result of recv() with MSG_TRUNC, only resize when outlen < recvlen (line 3089).

To get the size of the message, you have to use recv_into() with a buffer.

---

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
msg = bytearray(2)
result = b.recv_into(msg, flags=socket.MSG_TRUNC)
print(result, msg)

---
stdout: 8 bytearray(b'ab')

@tiran tiran added type-bug An unexpected behavior, bug, or error and removed type-security A security issue labels Sep 26, 2016
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@cfbolz
Copy link
Contributor

cfbolz commented Dec 3, 2022

FWIW, PyPy just got the same bug report: https://foss.heptapod.net/pypy/pypy/-/issues/3864

We'll likely fix it in the way that @tiran suggested, "only resize when outlen < recvlen"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants