classification
Title: Valgrind reports Syscall param epoll_ctl(event) points to uninitialised byte(s)
Type: compile error Stage: resolved
Components: Interpreter Core Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ZackerySpytz, benjamin.peterson, miss-islington, nikratio, skrah, vstinner
Priority: normal Keywords: patch

Created on 2018-12-22 15:22 by nikratio, last changed 2020-01-19 22:44 by miss-islington. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 17782 closed ZackerySpytz, 2020-01-01 14:45
PR 18060 merged ZackerySpytz, 2020-01-19 06:54
PR 18070 merged miss-islington, 2020-01-19 22:39
PR 18071 merged miss-islington, 2020-01-19 22:39
Messages (14)
msg332348 - (view) Author: Nikolaus Rath (nikratio) * Date: 2018-12-22 15:22
With current git master, configured --with-valgrind --with-pydebug, I get:

==31074== Command: /home/nikratio/clones/cpython/python /home/nikratio/in-progress/pyfuse3/test/../examples/hello.py /tmp/pytest-of-nikratio/pytest-11/test_hello_hello_py_0
==31074== 
==31074== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==31074==    at 0x584906A: epoll_ctl (syscall-template.S:84)
==31074==    by 0xBDAA493: pyepoll_internal_ctl (selectmodule.c:1392)
==31074==    by 0xBDAA59F: select_epoll_register_impl (selectmodule.c:1438)
==31074==    by 0xBDAA5F8: select_epoll_register (selectmodule.c.h:599)
==31074==    by 0x174E15: _PyMethodDef_RawFastCallKeywords (call.c:658)
==31074==    by 0x300BCA: _PyMethodDescr_FastCallKeywords (descrobject.c:290)
==31074==    by 0x21FC05: call_function (ceval.c:4611)
==31074==    by 0x22B5E7: _PyEval_EvalFrameDefault (ceval.c:3183)
==31074==    by 0x2206FF: PyEval_EvalFrameEx (ceval.c:533)
==31074==    by 0x173B61: function_code_fastcall (call.c:285)
==31074==    by 0x174737: _PyFunction_FastCallKeywords (call.c:410)
==31074==    by 0x21FDF4: call_function (ceval.c:4634)
==31074==  Address 0xffeffeb4c is on thread 1's stack
==31074==  in frame #1, created by pyepoll_internal_ctl (selectmodule.c:1379)

To reproduce:

$ python-dev -m pip install --user pyfuse3  # for dependencies
$ git clone https://github.com/libfuse/pyfuse3.git
$ valgrind --trace-children=yes "--trace-children-skip=*mount*" python-dev -m pytest test/


pyfuse3 provides a C extension module, but I believe the problem is in the interpreter core as the stacktrace does not include anything from the extension.
msg332349 - (view) Author: Nikolaus Rath (nikratio) * Date: 2018-12-22 15:32
Same error with 3.7.1 and 3.6.7 (though line numbers differ slightly).
msg332351 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2018-12-22 15:44
"--with-valgrind --with-pydebug" looks suspicious, it essentially
mixes two different memory checkers.

1) CFLAGS="-O0 -g" --without-pymalloc

2) CFLAGS="-O0 -g" --with-valgrind

should both work.


Can you try if this fixes the error?
msg332352 - (view) Author: Nikolaus Rath (nikratio) * Date: 2018-12-22 15:51
$ ./configure CFLAGS="-O0 -g" --with-valgrind && make -j8

still gives

==13281== Memcheck, a memory error detector
==13281== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13281== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==13281== Command: /home/nikratio/clones/cpython/python /home/nikratio/in-progress/pyfuse3/test/../examples/hello.py /tmp/pytest-of-nikratio/pytest-15/test_hello_hello_py_0
==13281== 
==13281== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==13281==    at 0x584906A: epoll_ctl (syscall-template.S:84)
==13281==    by 0xB5C07FA: pyepoll_internal_ctl (selectmodule.c:1392)
==13281==    by 0xB5C08CB: select_epoll_register_impl (selectmodule.c:1438)
==13281==    by 0xB5C112A: select_epoll_register (selectmodule.c.h:599)
==13281==    by 0x173031: _PyMethodDef_RawFastCallKeywords (call.c:658)
==13281==    by 0x2FEFCD: _PyMethodDescr_FastCallKeywords (descrobject.c:290)
==13281==    by 0x21ED25: call_function (ceval.c:4611)
==13281==    by 0x21AB4E: _PyEval_EvalFrameDefault (ceval.c:3183)
==13281==    by 0x21007C: PyEval_EvalFrameEx (ceval.c:533)
==13281==    by 0x17245F: function_code_fastcall (call.c:285)
==13281==    by 0x1728B5: _PyFunction_FastCallKeywords (call.c:410)
==13281==    by 0x21EDF4: call_function (ceval.c:4634)
==13281==  Address 0xffeff2d94 is on thread 1's stack
==13281==  in frame #1, created by pyepoll_internal_ctl (selectmodule.c:1379)
msg332503 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-12-25 04:36
I suspect Valgrind is being too conservative. union epoll_data is 64 bits wide but the file descriptor only occupies the first 32 bits. The last 32 bits don't need to be initialized by the application.
msg348066 - (view) Author: Nikolaus Rath (nikratio) * Date: 2019-07-17 15:08
Is there a good reason to not explicitly initialize them nevertheless? Valgrind is a common tool, and this false positive will affect everyone attempting to run it on a Python extension module.
msg359433 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-06 16:20
On my Fedora 31, epoll_event structure is defined in sys/epoll.h as:

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

struct epoll_event
{
  uint32_t events;	/* Epoll events */
  epoll_data_t data;	/* User data variable */
} __EPOLL_PACKED;


I can reproduce the issue using this Python script:

import select
p = select.epoll()
p.register(1, select.EPOLLIN)
p.poll(0)

The Linux syscall is defined as:

asmlinkage long sys_epoll_ctl(int epfd, int op, int fd,
				struct epoll_event __user *event);

It's implemented in fs/eventpoll.c:

https://github.com/torvalds/linux/blob/c79f46a282390e0f5b306007bf7b11a46d529538/fs/eventpoll.c#L2077-L2083

Valgrind complains that ev.data is only partially initialized: it's a 64-bit structure, but we only set ev.data.fd. I guess that the glibc implements epoll_ctl() as a raw system call: Valgrind is unable to know if it's ok to only partially initialize the epoll_data union.


Benjamin: "I suspect Valgrind is being too conservative. union epoll_data is 64 bits wide but the file descriptor only occupies the first 32 bits. The last 32 bits don't need to be initialized by the application."

Right. But Valgrind is not smart enough.

I see 2 options:

* Initialize ev.data to 0
* Use Misc/valgrind-python.supp to ignore this *false alarm*
msg359434 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-06 16:23
> Initialize ev.data to 0

I dislike this option, since the code is legit: Valgrind produces a false alarm.

> Use Misc/valgrind-python.supp to ignore this *false alarm*

I prefer this option.
msg359833 - (view) Author: Zackery Spytz (ZackerySpytz) * (Python triager) Date: 2020-01-12 06:54
Thank you, Victor, for the comments.  I will update PR 17782.
msg360079 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-15 21:47
> Thank you, Victor, for the comments.  I will update PR 17782.

Since it's a very different approach (modify C vs suppress the false alarm), I suggest instead to open a new different PR.
msg360274 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-19 22:38
New changeset d8ef64422a75f40cecdb1a7ee43492607d3daaf6 by Victor Stinner (Zackery Spytz) in branch 'master':
bpo-35561: Supress valgrind false alarm on epoll_ctl(event) (GH-18060)
https://github.com/python/cpython/commit/d8ef64422a75f40cecdb1a7ee43492607d3daaf6
msg360275 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-19 22:40
Thanks Nikolaus Rath for the bug report, and thnaks Zackery Spytz for the fix!

The fix will be backported automatically to 3.7 and 3.8 as soon as CI tests pass.
msg360276 - (view) Author: miss-islington (miss-islington) Date: 2020-01-19 22:44
New changeset 296383b6d05f9617283aeb5b601106f84b016198 by Miss Islington (bot) in branch '3.7':
bpo-35561: Supress valgrind false alarm on epoll_ctl(event) (GH-18060)
https://github.com/python/cpython/commit/296383b6d05f9617283aeb5b601106f84b016198
msg360277 - (view) Author: miss-islington (miss-islington) Date: 2020-01-19 22:44
New changeset 23793edf0d61aa98478541d0ff8f2b900ff1813d by Miss Islington (bot) in branch '3.8':
bpo-35561: Supress valgrind false alarm on epoll_ctl(event) (GH-18060)
https://github.com/python/cpython/commit/23793edf0d61aa98478541d0ff8f2b900ff1813d
History
Date User Action Args
2020-01-19 22:44:39miss-islingtonsetmessages: + msg360277
2020-01-19 22:44:07miss-islingtonsetnosy: + miss-islington
messages: + msg360276
2020-01-19 22:40:28vstinnersetstatus: open -> closed
versions: + Python 3.7
messages: + msg360275

resolution: fixed
stage: patch review -> resolved
2020-01-19 22:39:08miss-islingtonsetpull_requests: + pull_request17464
2020-01-19 22:39:02miss-islingtonsetpull_requests: + pull_request17463
2020-01-19 22:38:51vstinnersetmessages: + msg360274
2020-01-19 06:54:51ZackerySpytzsetpull_requests: + pull_request17454
2020-01-15 21:47:31vstinnersetmessages: + msg360079
2020-01-12 06:54:58ZackerySpytzsetmessages: + msg359833
2020-01-06 16:23:58vstinnersetmessages: + msg359434
2020-01-06 16:20:30vstinnersetnosy: + vstinner
messages: + msg359433
2020-01-01 14:46:22ZackerySpytzsetnosy: + ZackerySpytz

versions: + Python 3.9
2020-01-01 14:45:16ZackerySpytzsetkeywords: + patch
stage: patch review
pull_requests: + pull_request17215
2019-07-17 15:08:17nikratiosetmessages: + msg348066
2018-12-25 04:36:42benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg332503
2018-12-22 15:51:17nikratiosetmessages: + msg332352
2018-12-22 15:44:44skrahsetnosy: + skrah
messages: + msg332351
2018-12-22 15:32:09nikratiosetmessages: + msg332349
2018-12-22 15:22:42nikratiocreate