classification
Title: Multi-phase extension module initialization, inconsistent exceptions and conflicts between code and PEP
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: ncoghlan, petr.viktorin, python-dev, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-08-17 06:04 by xiang.zhang, last changed 2016-09-07 06:06 by xiang.zhang. This issue is now closed.

Files
File name Uploaded Description Edit
issue27782.patch xiang.zhang, 2016-08-17 17:50 review
Messages (13)
msg272903 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-17 06:04
From doc [1], when create_module returns a non-module instance,  m_methods, m_traverse, m_clear, m_free must be NULL. But actually in the codes, only m_traverse, m_clear, m_free are checked and emitting consistent errors. If m_methods is NULL, it will fail in [2] and emit an inconsistent misleading argument error. And what's more confusing is, in [3], it says "regardless of type, the module's functions are initialized from m_methods, if any", which I think conflicts with the codes and doc.

[1] https://docs.python.org/3.6/c-api/module.html#c.Py_mod_create
[2] https://hg.python.org/cpython/file/tip/Objects/moduleobject.c#l300
[3] https://www.python.org/dev/peps/pep-0489/#post-creation-steps
msg272906 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-08-17 06:52
I think the intended behaviour here is that which was documented in the PEP: that m_methods should still work based on ducktyping for reading a __name__ attribute and setting the method attributes, even if the result of Py_create_mod isn't a module type object.

However, it isn't currently working due to the PyModule_Check call in PyModule_GetNameObject, which is in turn called from PyModule_AddFunctions. The Py_mod_create docs then reflect that limitation of the implementation.

The cleanest way to refactor and fix this that comes to mind would be to make static _get_object_name() and _add_methods_to_object() functions in moduleobject.c (which omit any strict type checks), and then call those from PyModule_GetNameObject and PyModule_AddFunctions with the explicit typecheck.

That way we don't have to support this for arbitrary third party code calling in to the C API, while still allowing it for the specific case of objects returned from Py_mod_create.
msg272907 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-17 07:11
> The cleanest way to refactor and fix this that comes to mind would be to make static _get_object_name() and _add_methods_to_object() functions in moduleobject.c (which omit any strict type checks), and then call those from PyModule_GetNameObject and PyModule_AddFunctions with the explicit typecheck.

That's one solution. How about loosing PyModule_GetNameObject's constraint? Let it accept non-ModuleType objects? Actually without the constraint of PyModule_GetNameObject, PyModule_AddFunctions can handle non-ModuleType objects.
msg272908 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-08-17 07:26
Loosening the constraint on PyModule_GetNameObject would indeed work, but it means the code still has a readability problem: the convention in the C API is that officially ducktyped APIs use the PyObject_* prefix, or one of the other abstract protocols (PyNumber_*, PyMapping_*, etc), rather than a concrete type name like PyModule_*. Relying on folks to "just know" that these particular APIs deliberately don't enforce the type constraint is a recipe for future confusion, even if it's documented that way.

Such a change also has potential ripple effects on other implementations that emulate the C API, and hence isn't something I'd be comfortable with changing in a maintenance release, whereas fixing the implementation to match the PEP could be done for the next 3.5.x update.
msg272909 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-17 07:38
Nice point. I'll write a patch to fix this these days.
msg272957 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2016-08-17 14:55
Hi! I'm on a tight schedule this week, so I'm not looking into this in detail. But please let me know if you need any help and I'll raise the priority.
msg272977 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-17 17:50
Thanks Petr. I'd appreciate it if you are willing to review the patch.

Upload a patch to fix this, along with tests and doc updating.

But here is something different. In PEP489, it is explicitly stated that "Py_mod_create slot is not responsible for setting import-related attributes specified in PEP 451 (such as __name__ or __loader__ ) on the new module". So when an object(even ModuleType instances) is returned, it's __name__ attribute is not set and we can't rely on it (which means we can't even use PyModule_GetNameObject). I then use the name attribute of the spec. Looking forward to feedback.
msg273269 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-08-21 05:57
The patch looks good to me, and the relevant contributor licensing is in place, so I'll be applying this shortly :)
msg273275 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-21 07:44
New changeset 913268337886 by Nick Coghlan in branch '3.5':
Issue #27782: Fix m_methods handling in multiphase init
https://hg.python.org/cpython/rev/913268337886

New changeset fb509792dffc by Nick Coghlan in branch 'default':
Merge #27782 fix from 3.5
https://hg.python.org/cpython/rev/fb509792dffc
msg273276 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-08-21 07:46
Thanks for the bug report and the patch!

These kinds of collaborative interactions are my favourite aspect of open source participation :)
msg273280 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-21 08:21
Thanks for your work too, Nick! :) Active reply from the core devs always gives me more motivation to open source.
msg274762 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-09-07 05:32
Xiang, if multi-phase initialisation is an area you're interested in, it occurs to me you may also want to take a look at Petr's proposal to provide efficient access to global module state from methods of extension types: https://mail.python.org/pipermail/import-sig/2016-August/001065.html

With the 3.6 feature freeze deadline imminent it's looking like that will be deferred to 3.7, but it's still a significant improvement that should encourage greater use of the newer more submodule and reloading friendly approach to C extension module initialisation.
msg274767 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-09-07 06:06
Thanks for your notice, Nick. :) Of course I am interested. I'll start following import-sig and reading Petr's good idea.
History
Date User Action Args
2016-09-07 06:06:00xiang.zhangsetmessages: + msg274767
2016-09-07 05:32:42ncoghlansetmessages: + msg274762
2016-08-21 08:21:19xiang.zhangsetmessages: + msg273280
2016-08-21 07:46:56ncoghlansetstatus: open -> closed
resolution: fixed
messages: + msg273276

stage: commit review -> resolved
2016-08-21 07:44:13python-devsetnosy: + python-dev
messages: + msg273275
2016-08-21 05:57:07ncoghlansetmessages: + msg273269
stage: commit review
2016-08-21 05:47:34ncoghlansetassignee: ncoghlan
2016-08-17 17:50:02xiang.zhangsetfiles: + issue27782.patch
keywords: + patch
messages: + msg272977
2016-08-17 14:55:49petr.viktorinsetmessages: + msg272957
2016-08-17 07:38:34xiang.zhangsetmessages: + msg272909
2016-08-17 07:26:17ncoghlansetmessages: + msg272908
2016-08-17 07:11:00xiang.zhangsetmessages: + msg272907
2016-08-17 06:52:38ncoghlansetmessages: + msg272906
2016-08-17 06:04:37xiang.zhangcreate