classification
Title: PyList_SET_ITEM could be safer
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Christian.Tismer, ZackerySpytz, espie, petr.viktorin, rhettinger, serhiy.storchaka, skrah, vstinner
Priority: normal Keywords: patch

Created on 2017-05-24 15:14 by espie, last changed 2020-12-30 11:43 by Christian.Tismer. This issue is now closed.

Pull Requests
URL Status Linked
PR 19975 merged ZackerySpytz, 2020-05-07 05:58
PR 23645 closed vstinner, 2020-12-04 18:54
PR 23654 merged vstinner, 2020-12-05 10:41
Messages (22)
msg294362 - (view) Author: Espie Marc (espie) Date: 2017-05-24 15:14
Documentation says PyList_SET_ITEM is void, but it lies. The macro is such that it yields the actual element being set.

Wrapping the macro content in a do {} while (0) makes sure PyList_SET_ITEM is really void, e.g.:
#define PyList_SET_ITEM(op, i, v) do { (((PyListObject *)(op))->ob_item[i] = (v)); } while (0)


I just ran into the problem while compiling py-qt4 with clang.
There was some confusion between PyList_SET_ITEM and PyList_SetItem:

if (obj == NULL || PyList_SET_ITEM (l, i, obj) < 0)

g++ didn't catch it (because it doesn't see negative pointers as a problem), but clang++ instantly broke.

With PyList_SET_ITEM truly void the problem disappears.
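
A minimal sketch of the difference described above. MockList and the MOCK_SET_ITEM_* macros are hypothetical stand-ins for PyListObject and PyList_SET_ITEM, so the example compiles without Python.h; it only illustrates the idea and is not the CPython source. The expression form yields the stored pointer, while the do {} while (0) form is a statement and cannot appear in a condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyListObject: just an array of item pointers. */
typedef struct {
    void **ob_item;
} MockList;

/* Expression form, mirroring the shape of the macro quoted above:
   the whole macro has the value of the assignment. */
#define MOCK_SET_ITEM_EXPR(op, i, v) \
    (((MockList *)(op))->ob_item[i] = (v))

/* Statement form, as proposed above: it stores the item but has no value. */
#define MOCK_SET_ITEM_STMT(op, i, v) \
    do { (((MockList *)(op))->ob_item[i] = (v)); } while (0)

/* Returns 1 if both forms store their item and the expression form
   also hands the stored pointer back to the caller. */
static int demo_set_item_forms(void)
{
    void *slots[2] = { NULL, NULL };
    MockList list = { slots };
    int a = 1, b = 2;

    /* The expression form can be used as a value. */
    void *leaked = MOCK_SET_ITEM_EXPR(&list, 0, &a);

    /* The statement form can only be used as a statement. */
    MOCK_SET_ITEM_STMT(&list, 1, &b);

    /* Uncommenting the next line should be a compile error:
       if (MOCK_SET_ITEM_STMT(&list, 1, &b) < 0) { } */

    return leaked == &a && slots[0] == &a && slots[1] == &b;
}
```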
msg294421 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-25 00:55
The docs do make the claim of returning void: https://docs.python.org/3/c-api/list.html#c.PyList_SET_ITEM

However, I think we should change the docs rather than changing the macro.  This macro is very old and very widely used.  Changing it is likely to break existing code which may rely on the current behavior.

Also, I don't want to encourage code like what you saw in py-qt4.  It is very indirect about what it is trying to express.
msg294450 - (view) Author: Espie Marc (espie) Date: 2017-05-25 09:10
Well, there is not going to be a lot of breakage. This problem is the only instance I've encountered in the full OpenBSD ports tree.

I thought Python was supposed to be a clean language that didn't shy away from removing or tweaking stuff to achieve that goal?

The py-kde4 error was deadly. I'm lucky clang finally caught it, but I'd rather this kind of stuff just not compile.

I think we're in a world where *correctness* is finally beginning to matter a bit more than *compatibility forever whatever the cost*.
msg294451 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-25 09:56
I think it would be safer to just cast the result of the expression to void. This decreases the chance of breaking valid code, that for example uses PyList_SET_ITEM() in a comma expression.
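
A small sketch of why the cast is the gentler option, again with a hypothetical MockList and MOCK_SET_ITEM_VOID macro rather than the real API. A void expression is still an expression, so it keeps working as an operand of the comma operator, which a do {} while (0) statement would not; only uses of the result stop compiling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyListObject. */
typedef struct {
    void **ob_item;
} MockList;

/* The assignment is evaluated and its value is discarded by the cast. */
#define MOCK_SET_ITEM_VOID(op, i, v) \
    ((void)(((MockList *)(op))->ob_item[i] = (v)))

/* Returns 1 if all three slots were filled through a comma expression. */
static int demo_void_cast(void)
{
    void *slots[3] = { NULL, NULL, NULL };
    MockList list = { slots };
    int value = 7;
    size_t i;

    /* Still valid: a void expression may be the left operand of a comma. */
    for (i = 0; i < 3; MOCK_SET_ITEM_VOID(&list, i, &value), i++) {
    }

    /* No longer valid, because a void expression has no value to compare:
       if (MOCK_SET_ITEM_VOID(&list, 0, &value) < 0) { } */

    return slots[0] == &value && slots[1] == &value && slots[2] == &value;
}
```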
msg294472 - (view) Author: Espie Marc (espie) Date: 2017-05-25 12:10
yep, casting to (void) would be safer indeed. didn't think of that one ;)
msg294490 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-25 15:24
> Well, there is not going to be a lot of breakage. This 
> problem is the only instance I've encountered in the 
> full OpenBSD ports tree.

That is no basis for knowing what the entire rest of the world has done with this API (perhaps valid working code using chained assignments).
msg294499 - (view) Author: Espie Marc (espie) Date: 2017-05-25 17:09
Note that the API is fully documented for returning void... not anything else.

"No basis" right. We're talking 10000 pieces of software, a lot of what is actually used in the world.

I'm very surprised, considering python has routinely done "spring cleanup" by breaking fairly old apis.
 
If this breaks, people will fix their code, seriously.

In most places, we would rather have undocumented, unportable code, break *cleanly*, rather than rely on a fuzzy behavior that could possibly change at any moment, and that was never documented as doing anything.


Or is there some kind of mystique that, because this is low-level C implementation, somehow, python programmers are not going to be able to cope with the internals ?....
msg294502 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-05-25 17:20
I guess since it's documented, it would be ok to change for 3.7.  BTW, we also allow "static inline" now in headers, so perhaps we could make it a function.
msg294507 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-25 18:38
I prefer to leave this as macro rather than assuming the compiler will heed the inline hint.
msg294511 - (view) Author: Espie Marc (espie) Date: 2017-05-25 20:11
it's still 100% safe as a macro since each parameter is not used more than once. only the return type is an issue.
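
For comparison, a sketch of what the "static inline" alternative mentioned above could look like. The names MockList and mock_list_set_item are hypothetical, and nothing here is taken from the CPython headers. A function with a void return type gives the same compile-time protection as the cast, and its arguments are evaluated exactly once by construction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyListObject. */
typedef struct {
    void **ob_item;
} MockList;

/* Unchecked setter as a static inline function instead of a macro. */
static inline void mock_list_set_item(MockList *op, size_t i, void *v)
{
    op->ob_item[i] = v;
}

/* Returns 1 if the item was stored and the index side effect ran once. */
static int demo_inline_setter(void)
{
    void *slots[2] = { NULL, NULL };
    MockList list = { slots };
    int x = 3;
    size_t idx = 0;

    /* Function arguments are evaluated exactly once, so idx++ is safe. */
    mock_list_set_item(&list, idx++, &x);

    /* A call to a void function has no value, so this should not compile:
       if (mock_list_set_item(&list, 0, &x) < 0) { } */

    return idx == 1 && slots[0] == &x && slots[1] == NULL;
}
```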
msg382550 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-05 01:28
I propose to merge my PR 23645 change right now. If it breaks too many C extensions, we still have time before Python 3.10.0 final, scheduled for Monday, 2021-10-04, to revert the change.


"if (obj == NULL || PyList_SET_ITEM (l, i, obj) < 0)"

Well, this is obviously a bug in py-qt4 and it's a good thing to get a compiler error on such code. I guess that the intent was to use PyList_SetItem().

It's also an issue for Python implementations other than CPython, which do not behave exactly like the CPython C API. I prefer to fix the C API to make it respect its own documentation (return "void").


Espie Marc: "Well, there is not going to be a lot of breakage. This problem is the only instance I've encountered in the full OpenBSD ports tree."

Oh, thanks for providing this very useful feedback! That's very promising.

Did you report the issue to py-qt4 upstream? If yes, can you please copy here the link to your report?
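
A sketch of the pattern the py-qt4 condition quoted above was probably aiming for: a checked setter that reports failure through an int, so that "< 0" is a meaningful test. MockList, mock_list_set_item_checked and store_or_fail are hypothetical names modeled only on the return convention of PyList_SetItem (0 on success, -1 on failure); reference counting and exception setting are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyListObject, with a length for bounds checks. */
typedef struct {
    void **ob_item;
    size_t size;
} MockList;

/* Checked setter: returns 0 on success, -1 if the index is out of range. */
static int mock_list_set_item_checked(MockList *op, size_t i, void *v)
{
    if (op == NULL || i >= op->size) {
        return -1;
    }
    op->ob_item[i] = v;
    return 0;
}

/* Same shape as the quoted condition, but with the checked setter. */
static int store_or_fail(MockList *l, size_t i, void *obj)
{
    if (obj == NULL || mock_list_set_item_checked(l, i, obj) < 0) {
        return -1;
    }
    return 0;
}

/* Returns 1 if success and both failure paths behave as described. */
static int demo_checked_setter(void)
{
    void *slots[1] = { NULL };
    MockList list = { slots, 1 };
    int x = 5;

    return store_or_fail(&list, 0, &x) == 0
        && slots[0] == &x
        && store_or_fail(&list, 1, &x) == -1
        && store_or_fail(&list, 0, NULL) == -1;
}
```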
msg382560 - (view) Author: Espie Marc (espie) Date: 2020-12-05 09:20
I'm sorry, it's been so long ago, I can't remember.

I've been dealing with 10s of other bugs since.
msg382561 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-05 09:38
I prefer PR 19975 to PR 23645. It solves the initial issue and is much simpler.
msg382562 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-05 10:35
New changeset 556d97f473fa538cef780f84bd29239ecf57d9c5 by Zackery Spytz in branch 'master':
bpo-30459: Cast the result of PyList_SET_ITEM() to void (GH-19975)
https://github.com/python/cpython/commit/556d97f473fa538cef780f84bd29239ecf57d9c5
msg382626 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-07 10:56
New changeset 0ef96c2b2a291c9d2d9c0ba42bbc1900a21e65f3 by Victor Stinner in branch 'master':
bpo-30459: Cast the result of PyCell_SET to void (GH-23654)
https://github.com/python/cpython/commit/0ef96c2b2a291c9d2d9c0ba42bbc1900a21e65f3
msg382627 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-07 10:57
Thanks Espie Marc for the bug report, it's now fixed in the master branch. IMO not only clang users will benefit from a better-defined API. For example, it should help other Python implementations to implement such an API, without the weird side effects of a macro.
msg382825 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2020-12-10 12:02
This change goes directly against PEP 387.
msg382830 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-10 13:18
> This change goes directly against PEP 387.

The change respects the documentation which always documented the result type as "void".

3.10: https://docs.python.org/dev/c-api/tuple.html#c.PyTuple_SET_ITEM
3.5: https://docs.python.org/3.5/c-api/tuple.html#c.PyTuple_SET_ITEM
2.7: https://docs.python.org/2.7/c-api/tuple.html#c.PyTuple_SET_ITEM

This change is backward incompatible on purpose: it's to implement the documented behavior, and the macro was misused leading to a bug: "PyList_SET_ITEM (l, i, obj) < 0" in py-qt4.

I would also prefer a deprecation warning, but I don't see how to detect when a macro is abused to write "x = SET();" or "SET() = x;". Maybe a C linter could detect that, but I don't know of any tool doing that.

The best we can do is to announce the change. That's why I documented in What's New in Python 3.10, and not only "hidden" in the Changelog:
https://docs.python.org/dev/whatsnew/3.10.html#id2

My expectation is that apart from py-qt4, no project abuses these 3 macros.
msg382831 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-10 13:31
I checked: commit 0ef96c2b2a291c9d2d9c0ba42bbc1900a21e65f3 is part of Python 3.10.0a3 (released 3 days ago).
msg382832 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2020-12-10 14:12
> The change respects the documentation which always documented the result type as "void".

Then, IMO, the documentation should be fixed.

> My expectation is that apart py-qt4, no project abuse these 3 macros.

That's not true; at least ALSA's python bindings use PyTuple_SET_ITEM as a lvalue as well.
msg382971 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-14 09:56
> That's not true; at least ALSA's python bindings use PyTuple_SET_ITEM as a lvalue as well.

alsa-python used PyTuple_SET_ITEM(..., obj) to decide if it should call Py_INCREF(obj). This code looks suspicious. PyTuple_SET_ITEM() should not be used to set an item to NULL.

It's already fixed:
https://github.com/alsa-project/alsa-python/commit/5ea2f8709b4d091700750661231f8a3ddce0fc7c

IMO it's a good thing that such suspicious code is discovered. The surprising part is that it worked previously :-)

Downstream Fedora issue: https://bugzilla.redhat.com/show_bug.cgi?id=1906380 (CLOSED)
msg384058 - (view) Author: Christian Tismer (Christian.Tismer) * (Python committer) Date: 2020-12-30 11:43
Congrats on that change!
History
Date User Action Args
2020-12-30 11:43:15 Christian.Tismer set nosy: + Christian.Tismer; messages: + msg384058
2020-12-14 09:56:04 vstinner set messages: + msg382971
2020-12-10 14:12:11 petr.viktorin set messages: + msg382832
2020-12-10 13:31:47 vstinner set messages: + msg382831
2020-12-10 13:18:17 vstinner set messages: + msg382830
2020-12-10 12:02:14 petr.viktorin set nosy: + petr.viktorin; messages: + msg382825
2020-12-07 10:57:40 vstinner set status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: + Python 3.10, - Python 3.9
2020-12-07 10:57:33 vstinner set messages: + msg382627
2020-12-07 10:56:27 vstinner set messages: + msg382626
2020-12-05 10:41:46 vstinner set pull_requests: + pull_request22523
2020-12-05 10:35:20 vstinner set messages: + msg382562
2020-12-05 09:38:39 serhiy.storchaka set messages: + msg382561
2020-12-05 09:20:31 espie set messages: + msg382560
2020-12-05 01:28:33 vstinner set messages: + msg382550
2020-12-04 18:54:52 vstinner set pull_requests: + pull_request22513
2020-05-07 06:13:16 ZackerySpytz set versions: + Python 3.9, - Python 3.7
2020-05-07 05:58:06 ZackerySpytz set keywords: + patch; nosy: + ZackerySpytz; pull_requests: + pull_request19291; stage: patch review
2017-05-25 20:11:07 espie set messages: + msg294511
2017-05-25 18:38:25 rhettinger set messages: + msg294507
2017-05-25 17:20:23 skrah set nosy: + skrah; messages: + msg294502
2017-05-25 17:09:24 espie set messages: + msg294499
2017-05-25 15:24:11 rhettinger set messages: + msg294490
2017-05-25 12:10:33 espie set messages: + msg294472
2017-05-25 09:56:22 serhiy.storchaka set nosy: + vstinner, serhiy.storchaka; messages: + msg294451; versions: + Python 3.7, - Python 3.6
2017-05-25 09:10:38 espie set messages: + msg294450
2017-05-25 00:55:09 rhettinger set nosy: + rhettinger; messages: + msg294421
2017-05-24 15:14:12 espie create