classification
Title: PEP 3121, 384 Refactoring applied to xx module
Type: resource usage Stage: resolved
Components: Documentation, Extension Modules Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Robin.Schreiber, belopolsky, docs@python, loewis, pitrou, vstinner
Priority: normal Keywords: patch, pep3121

Created on 2012-09-02 14:32 by Robin.Schreiber, last changed 2020-06-22 09:33 by vstinner.

Files
File name Uploaded Description Edit
xxmodule_pep3121-384_v0.patch Robin.Schreiber, 2012-09-02 14:32
xxmodule_pep3121-384_v1.patch Robin.Schreiber, 2013-08-13 08:34 review
xxmodule_pep3121-384_v2.patch Robin.Schreiber, 2013-08-18 13:17 review
Messages (17)
msg169702 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-09-02 14:32
Changes proposed in PEP3121 and PEP384 have now been applied to the xx module!
msg169728 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2012-09-02 23:12
xxmodule.c is used as an example in PEP 3121 itself.  To the extent the recipe in the PEP is complete, the changes to actual xxmodule.c should follow the text.  For example, the text in PEP recommends to leave m_free member of PyModuleDef 0:

static struct PyModuleDef xxmodule = {
  {}, /* m_base */
  sizeof(struct xxstate),
  &xx_methods,
  0,  /* m_reload */
  xx_traverse,
  xx_clear,
  0,  /* m_free - not needed, since all is done in m_clear */
}


In your patch, member names are not shown in the initializer.  The PEP text also omits them when it is obvious from the value.  (xx_clear initializes m_clear.)  I think all lines in the initializer should include the member name in comments.  The reason is that people will use xxmodule.c as a template for their code and may want to replace xx_clear with something that is not as suggestive.

You should add some tests demonstrating that module load/unload cycles do not introduce reference leaks.
msg169731 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2012-09-02 23:37
+#define xx_state_global 
+	((xxstate *)PyModule_GetState(PyState_FindModule(&xxmodule)))

This is unsafe: PyState_FindModule(&xxmodule) can return NULL.  I think code should account for this possibility and not use this macro.  For example, XxoObject_Check(v) should be defined as

(PyState_FindModule(&xxmodule) == NULL ? 0 : \
  (Py_TYPE(v) == xx_state(PyState_FindModule(&xxmodule))->Xxo_Type))

(Should this also check PyModule_GetState() return?)
msg194893 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2013-08-11 12:37
I absolutely agree on mentioning the member names in the comments. :-)

In the example Martin gave in his PEP 3121, the PyInit does not perform any INCREFs on the Variables that are referenced from inside the module state.
He therefore left out m_free completely as there was nothing to DECREF within the module state.
Back when I did my GSoC together with Martin, we decided that the Module state itself can be considered a valid container object, an therefore has to INCREF and in the end of its lifecycle (that is within m_free) also DECREF every object reference it holds. I therefore decided to include that into every module I refactored, and consequently also the xxmodule.

I was also thinking about redefining the macro of xx_state_global with a NULL check, however this would lead either to a redundant call of PyState_FindModule (Which may lead to unnecessary performance degregation as xx_state_global is used quite frequently in some parts of the respective module) or I had to find some awkward way to store the result o f FindModule in some local variable exapnded by the macro, which I would not consider a good idea. Instead Martin and I were thinking of including a NULL safe variant of xx_state_global only in CPython Debug Builds. What do you think about that?
msg195055 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2013-08-13 08:34
Updated the patch, corrected multiple syntax errors and missing INCREFS. Also added the comments that include the members names. I am yet undecided regarding the NULL-check for FindModule.

Apart from that I have tried to build some tests that prove that loading and unloading the module do not cause any memory leaks. This has turned up several problems: For one, the only possibility to check for the leaks that PEP 3121 tries to fix, is to run PyInit of the respective module multiple times. This is only possible if Py_finalize() has been called after the module has been imported beforehand. This means we can not test for these leaks from within Python, but need some C-Code that calls
Py_initialize(); ... import xx ... Py_finalize(); multiple times. The problem is that also the plain Py_initialize(); Py_finalize(); calls leak memory. Unfortunately the amount of objects that are leaked also seems to vary, so there is no constant factor that I can subtract to determine how much the imported module itself leaks.
So I am kind of on a dead end here. I could upload the tests scripts that I have written so far, if that helps.
msg195057 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-13 09:04
> In the example Martin gave in his PEP 3121, the PyInit does not perform any INCREFs on
> the Variables that are referenced from inside the module state.
> He therefore left out m_free completely as there was nothing to DECREF within the module
> state.
> Back when I did my GSoC together with Martin, we decided that the Module state itself can 
> be considered a valid container object, an therefore has to INCREF and in the end of its
> lifecycle (that is within m_free) also DECREF every object reference it holds. I therefore 
> decided to include that into every module I refactored, and consequently also the xxmodule.

I agree with that, but then PEP 3121's example code should be fixed.

> I am yet undecided regarding the NULL-check for FindModule.

Not checking for NULL makes it dangerous to implement unloading of extension modules: see issue18674.

If checking for NULL makes extension code too complicated, then please take a look at the helper API I've suggested in issue18710.

Also, it would be nice if you could also read the following python-dev thread, since it discusses concrete issues and possible solutions:
http://mail.python.org/pipermail/python-dev/2013-August/127862.html

> This means we can not test for these leaks from within Python, but need some C-Code

You can use _testcapi.run_in_subinterp() to run custom code in a sub-interpreter.

Here is an example of a non-leaking extension module, and then a (presumably) leaking one:

$ ./python -Xshowrefcount
Python 3.4.0a1+ (default:9e61563edb67, Aug 12 2013, 14:52:25) 
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _testcapi
[64175 refs, 19937 blocks]
>>> _testcapi.run_in_subinterp("import resource")
0
[68839 refs, 20969 blocks]
>>> _testcapi.run_in_subinterp("import resource")
0
[68852 refs, 20980 blocks]
>>> _testcapi.run_in_subinterp("import resource")
0
[68850 refs, 20979 blocks]
>>> _testcapi.run_in_subinterp("import resource")
0
[68852 refs, 20980 blocks]
>>> _testcapi.run_in_subinterp("import resource")
0
[68850 refs, 20979 blocks]
>>> _testcapi.run_in_subinterp("import _socket")
0
[71840 refs, 22059 blocks]
>>> _testcapi.run_in_subinterp("import _socket")
0
[71911 refs, 22083 blocks]
>>> _testcapi.run_in_subinterp("import _socket")
0
[71981 refs, 22107 blocks]
>>> _testcapi.run_in_subinterp("import _socket")
0
[72051 refs, 22131 blocks]
msg195553 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2013-08-18 10:44
Antoine, regarding that the last pending problem with the patch is related to the NULL checking of FindModule, I would be inclined to include your proposed helper API. I see that issue18710 has not been included into the trunk yet, but I think this is mostly due to the additional _csv patch and not because of the proposed API itself.
So I will upload a corrected version of the patch soon, but it will rely on issue18710 to be accepted beforehand...
msg195558 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2013-08-18 13:17
Updated patch accordingly. 
Regarding the problem in http://mail.python.org/pipermail/python-dev/2013-August/127862.html , it can indeed be solved by returning the already existing module object, if PyInit is called multiple times. I followed the discussion and can not make out a definite decision on how to handle this.
My understanding is, that up to now extension modules are only supposed to be initialized once per interpreter, so the check I have included in, for example issue15651, makes sense from this perspective. There have been reasonable desires (from Eli) to make the module state separate from each imported module within the interpreter, but this would involve more drastic changes and will be rather part of future Python releases.
Should we therefore (for now) make this a mandatory PEP3121 convention and include it into the xxmodule.c refactoring?
msg195559 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-18 13:19
> Updated patch accordingly. 

I don't understand why your patch still has a module state while all
objects can be looked up in the module dict.
msg195562 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2013-08-18 13:34
Everything except for the Xxo_Type. But you are right. Then again, why are these global static variables from the start? Isn't this because xxmodule is some kind of "dummy" module that is supposed to demonstrate some coding conventions on how to build extension modules? 
Another possibility could be the faster lookup of the variables, which is now of course invalidated if we store it within the module state.
msg195563 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-18 13:45
> Everything except for the Xxo_Type. But you are right. Then again, why
> are these global static variables from the start? Isn't this because
> xxmodule is some kind of "dummy" module that is supposed to
> demonstrate some coding conventions on how to build extension
> modules? 

I don't really know what the purpose of xxmodule is. But I don't think
redundant storage is a good idea, especially with different
lifetimes :-)
msg222856 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-12 16:59
I added "documentation" to the components list because this in the main purpose of this module - to serve as a template for extension module writers.
msg222860 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-07-12 17:49
I think we should perhaps leave the xxmodule as an example for
static types and create another pep-384 version that mentions
*potential* performance traps.

Of course many modules won't suffer from this, but first-time
extension writers tend to paste from the examples and should
know the alternatives if they happen to write a performance
sensitive application.
msg222861 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-07-12 18:15
> create another pep-384 version

+1 - and with a more descriptive name than xxmodule.c

Suggestions:

* pep384module.c
* pep384demo.c
* pep384.c
* abidemo.c
msg372071 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 09:30
Fixed by:

commit d5cacbb1d9c3edc02bf0ba01702e7c06da5bc318
Author: Nick Coghlan <ncoghlan@gmail.com>
Date:   Sat May 23 22:24:10 2015 +1000

    PEP 489: Multi-phase extension module initialization
    
    Known limitations of the current implementation:
    
    - documentation changes are incomplete
    - there's a reference leak I haven't tracked down yet
    
    The leak is most visible by running:
    
      ./python -m test -R3:3 test_importlib
    
    However, you can also see it by running:
    
      ./python -X showrefcount
    
    Importing the array or _testmultiphase modules, and
    then deleting them from both sys.modules and the local
    namespace shows significant increases in the total
    number of active references each cycle. By contrast,
    with _testcapi (which continues to use single-phase
    initialisation) the global refcounts stabilise after
    a couple of cycles.
msg372072 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 09:31
Hum, I reopen the issue: the xx module still defines types statically and so the "PEP 384" part of this issue is not fixed yet.
msg372075 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 09:33
See also bpo-40077: "Convert static types to PyType_FromSpec()".
History
Date User Action Args
2020-06-22 09:33:41vstinnersetmessages: + msg372075
2020-06-22 09:31:45vstinnersetstatus: closed -> open
resolution: fixed ->
messages: + msg372072
2020-06-22 09:30:02vstinnersetstatus: open -> closed

nosy: + vstinner
messages: + msg372071

resolution: fixed
stage: resolved
2014-10-14 15:45:55skrahsetnosy: - skrah
2014-07-12 18:15:45belopolskysetmessages: + msg222861
2014-07-12 17:49:17skrahsetnosy: + skrah
messages: + msg222860
2014-07-12 16:59:14belopolskysetversions: + Python 3.5, - Python 3.4
2014-07-12 16:59:03belopolskysetnosy: + docs@python
messages: + msg222856

assignee: docs@python
components: + Documentation
2013-08-18 13:45:19pitrousetmessages: + msg195563
2013-08-18 13:34:56Robin.Schreibersetmessages: + msg195562
2013-08-18 13:19:40pitrousetmessages: + msg195559
2013-08-18 13:17:42Robin.Schreibersetfiles: + xxmodule_pep3121-384_v2.patch

messages: + msg195558
2013-08-18 10:44:34Robin.Schreibersetmessages: + msg195553
2013-08-13 09:04:51pitrousetnosy: + pitrou
messages: + msg195057
2013-08-13 08:34:30Robin.Schreibersetfiles: + xxmodule_pep3121-384_v1.patch
keywords: + patch
messages: + msg195055
2013-08-11 12:37:10Robin.Schreibersetmessages: + msg194893
2012-11-08 13:24:29Robin.Schreibersetkeywords: + pep3121, - patch
2012-09-02 23:37:12belopolskysetmessages: + msg169731
2012-09-02 23:12:13belopolskysetnosy: + loewis
messages: + msg169728
2012-09-02 22:47:49belopolskylinkissue15787 dependencies
2012-09-02 14:32:33Robin.Schreibercreate