Issue45325
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2021-09-29 18:08 by pablogsal, last changed 2022-04-11 14:59 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 28634 | open | pablogsal, 2021-09-29 18:22 |
Messages (18) | |||
---|---|---|---|
msg402897 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 18:08 | |
We should allow Py_BuildValue to take a "p" argument that takes a C int and produces a Python Bool so it would be symmetric with the p format code for PyArg_Parse that takes a Python object and returns a C int containing PyObject_IsTrue(obj). |
|||
msg402899 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-09-29 19:48 | |
There was no much need of this feature. In rare cases when we needed to build a bool in Py_BuildValue (I have found only 2 cases in the stdlib, and one of them is added by me) we use the following idiom: Py_BuildValue("...O...", ..., expr ? Py_True : Py_False, ...) You can have a temptation to write it as Py_BuildValue("...p...", ..., expr, ...) but there is a catch -- the arguments should be a C int. If it is not, you can break a stack. Explicit cast to int ("(int)expr") is not always correct, so you will need to write "expr ? 1 : 0" which is not much better than "expr ? Py_True : Py_False". |
|||
msg402900 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-09-29 20:20 | |
I would be interested to see how the "p" format could be used in existing code. I found a few places would benefit of it. Either create a new draft PR, or put it in the same PR 28634. Modules/_itertoolsmodule.c: return Py_BuildValue("O(N)(OO)", Py_TYPE(lz), it, lz->saved, Py_True); return Py_BuildValue("O(O)(OO)", Py_TYPE(lz), lz->it, lz->saved, lz->firstpass ? Py_True : Py_False); Modules/_ssl.c: ok = RAND_bytes((unsigned char*)PyBytes_AS_STRING(bytes), len); if (ok == 0 || ok == 1) return Py_BuildValue("NO", bytes, ok == 1 ? Py_True : Py_False); return Py_BuildValue( "{sksssssssisi" "sOssssssss" "}", "id", cipher_id, "name", cipher_name, "protocol", cipher_protocol, "description", buf, "strength_bits", strength_bits, "alg_bits", alg_bits ,"aead", aead ? Py_True : Py_False, "symmetric", skcipher, "digest", digest, "kea", kx, "auth", auth ); Modules/main.c: runargs = PyTuple_Pack(2, module, set_argv0 ? Py_True : Py_False); Objects/fileobject.c: stream = _PyObject_CallMethodId(io, &PyId_open, "isisssO", fd, mode, buffering, encoding, errors, newline, closefd ? Py_True : Py_False); |
|||
msg402901 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 20:23 | |
> Either create a new draft PR, or put it in the same PR 28634. I prefer to reach an agreement first here before I add more things to the PR |
|||
msg402902 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 20:25 | |
> There was no much need of this feature. In rare cases when we needed to build a bool in Py_BuildValue (I have found only 2 cases in the stdlib, and one of them is added by me) we use the following idiom: Is true that this is not very common, but I have find this in several occasions writing extension code and given the small impact and maintenance cost and the symmetry with PyArg_Parse I figured that could be useful |
|||
msg402904 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-09-29 20:33 | |
> but there is a catch -- the arguments should be a C int. If it is not, you can break a stack. I never understood how "..." arguments work under the hood. What happens if you pass a double, it is stored as a double on the C stack, and then Py_BuildValue() will read junk data? |
|||
msg402910 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 20:48 | |
The va_start macro just sets up a pointer to the first function parameter, e.g.:- void func (int a, ...) { // va_start char *p = (char *) &a + sizeof a; } which makes p point to the second parameter. The va_arg macro does this:- void func (int a, ...) { // va_start char *p = (char *) &a + sizeof a; // va_arg int i1 = *((int *)p); p += sizeof (int); // va_arg int i2 = *((int *)p); p += sizeof (int); // va_arg long i2 = *((long *)p); p += sizeof (long); } |
|||
msg402915 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-09-29 21:46 | |
> What happens if you pass a double, it is stored as a double on the C stack, and then Py_BuildValue() will read junk data? AFAIK, it is complicated. On old computer primitive compilers just pushed arguments one by one on the stack (in platform-depending order). When you push 64-bytes double and read 32-bit int, you get a junk not only for this read, but for reads of all following or preceding arguments. Modern compilers on modern platforms can use registers to pass variable arguments. I do not know details, but doubles and ints are passed using different registers, so the difference can be even larger. My concern is that such feature can provoke bugs. The risk can exceed the minor benefit. |
|||
msg402917 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 21:58 | |
You can see how is implemented on am64 for example here: https://blog.nelhage.com/2010/10/amd64-and-va_arg/ |
|||
msg402918 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-29 21:58 | |
> My concern is that such feature can provoke bugs. The risk can exceed the minor benefit. But isn't that risk the same for other formatting parameters? |
|||
msg402921 - (view) | Author: Matt Wozniski (godlygeek) * | Date: 2021-09-29 23:25 | |
> but there is a catch -- the arguments should be a C int Or a type that promotes to int. If you pass a C short or char, or a C++ bool, it is implicitly promoted to int. > so you will need to write "expr ? 1 : 0" Or alternatively "!!expr" > which is not much better than "expr ? Py_True : Py_False" I had to write that recently after reading through the Py_BuildValue docs twice to make sure I wasn't missing a format code I could use. It's a strange omission, especially because 'p' exists for PyArg_Parse. I'd very much like to see this change. |
|||
msg402923 - (view) | Author: Matt Wozniski (godlygeek) * | Date: 2021-09-29 23:38 | |
I spotted three other uses in the stdlib: Modules/_io/_iomodule.c raw = PyObject_CallFunction(RawIO_class, "OsOO", path_or_fd, rawmode, closefd ? Py_True : Py_False, opener); wrapper = PyObject_CallFunction((PyObject *)&PyTextIOWrapper_Type, "OsssO", buffer, encoding, errors, newline, line_buffering ? Py_True : Py_False); Modules/_sqlite/connection.c if (PySys_Audit("sqlite3.enable_load_extension", "OO", self, onoff ? Py_True : Py_False) < 0) { return NULL; } |
|||
msg402924 - (view) | Author: Matt Wozniski (godlygeek) * | Date: 2021-09-29 23:49 | |
The leftmost argument of the ternary is an int for every example that Victor and I found in the stdlib, so no casting would be required in any of these cases. |
|||
msg402938 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-09-30 08:12 | |
> But isn't that risk the same for other formatting parameters? I think that the risk for other formatting parameters is smaller, because you know, that there are different formatting parameters for different integer and floating point types, and for pointers, and you know that you should care about truncation and overflow if the type of the argument is different from the type of the parameter. But I am not in strict opposition against this feature. It is just that I have fixed so many bugs, that I try to search potential flaws and drawbacks in any new feature. Now you know about this, and you can decide whether the benefit is larger than the risk of potential damages. To me they look equally small. Technically PR 28634 LGTM. |
|||
msg402941 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-09-30 09:04 | |
> I think that the risk for other formatting parameters is smaller, because you know, that there are different formatting parameters for different integer and floating point types, and for pointers, and you know that you should care about truncation and overflow if the type of the argument is different from the type of the parameter. IMO such problem can be solved with documentation. Pablo: can you try to explain the problem in the documentation, and maybe provide an example in the doc showing how to avoid it? I guess that a generic fix is to replace "value" with "value != 0". In C, "expr != 0" is an int, no? What I understood is that Py_BuildValue("p", value) is problematic if value type is not int. "!value" or "!!value" also has the issue if I understood correctly. Like: long value = 3; Py_BuildValue("p", !value); I agree with Serhiy that Py_BuildValue() is ok-ish with other formats, but IMO it's the responsibility of the developer to pass the expect type (int for "p" format). This problem is not specific to Py_BuildValue(). printf() also has exactly the same issue. Such code has an undefined behavior: long long x = 1; print("%d\n", x); You must cast explicitly: long long x = 1; print("%d\n", (int)x); Or use a different format, like: long long x = 1; print("%lld\n", x); |
|||
msg402951 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-09-30 12:04 | |
> But I am not in strict opposition against this feature. It is just that I have fixed so many bugs, that I try to search potential flaws and drawbacks in any new feature. Now you know about this, and you can decide whether the benefit is larger than the risk of potential damages. To me they look equally small. Thanks, Serhiy, for your insights and careful consideration. Among many things, your attention to detail, experience and careful consideration are what I admire the most! You rock :) I will look at the stdlib examples and will incorporated them into the PR and then I will carefully consider the change again. Thanks for the comments! |
|||
msg402965 - (view) | Author: Matt Wozniski (godlygeek) * | Date: 2021-09-30 15:44 | |
> "!value" or "!!value" also has the issue if I understood correctly. No, just as "value != 0" is an int, so is "!value". |
|||
msg412234 - (view) | Author: Jelle Zijlstra (JelleZijlstra) * | Date: 2022-02-01 03:59 | |
I stumbled upon the PR and tend to agree with Serhiy that this could be a source of nasty bugs. Passing the wrong type to a va_arg() argument is undefined behavior (e.g. https://wiki.sei.cmu.edu/confluence/display/c/EXP47-C.+Do+not+call+va_arg+with+an+argument+of+the+incorrect+type), and it does seem easy to do that by accident for an int-like type. For printf() compilers implement custom checks that make sure the types are right, but we don't get that luxury with Py_BuildValue(). It's also easy to make a bool yourself with the O code and Py_True/Py_False, so I don't think the additional code is worth the risk. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:50 | admin | set | github: 89488 |
2022-02-01 03:59:42 | JelleZijlstra | set | nosy:
+ JelleZijlstra messages: + msg412234 |
2021-09-30 15:44:46 | godlygeek | set | messages: + msg402965 |
2021-09-30 12:04:41 | pablogsal | set | messages: + msg402951 |
2021-09-30 09:04:20 | vstinner | set | messages: + msg402941 |
2021-09-30 08:12:20 | serhiy.storchaka | set | messages: + msg402938 |
2021-09-29 23:49:32 | godlygeek | set | messages: + msg402924 |
2021-09-29 23:38:20 | godlygeek | set | messages: + msg402923 |
2021-09-29 23:25:08 | godlygeek | set | nosy:
+ godlygeek messages: + msg402921 |
2021-09-29 21:58:50 | pablogsal | set | messages: + msg402918 |
2021-09-29 21:58:01 | pablogsal | set | messages: + msg402917 |
2021-09-29 21:46:30 | serhiy.storchaka | set | messages: + msg402915 |
2021-09-29 20:48:05 | pablogsal | set | messages: + msg402910 |
2021-09-29 20:33:40 | vstinner | set | messages: + msg402904 |
2021-09-29 20:25:22 | pablogsal | set | messages: + msg402902 |
2021-09-29 20:23:57 | pablogsal | set | messages: + msg402901 |
2021-09-29 20:20:56 | vstinner | set | nosy:
+ vstinner messages: + msg402900 |
2021-09-29 19:48:54 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg402899 |
2021-09-29 18:22:35 | pablogsal | set | keywords:
+ patch stage: patch review pull_requests: + pull_request27003 |
2021-09-29 18:08:04 | pablogsal | create |