This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add tests to test_interpreters to import C extension modules converted to PEP 489 multiphase initialization
Type: Stage:
Components: Tests Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: corona10, shihai1991, vstinner
Priority: normal Keywords:

Created on 2020-06-15 16:06 by vstinner, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg371568 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-15 16:06
In bpo-1635741, many C extension modules were ported to the new PEP 489 multiphase initialization. Some of these changes suddenly make old bugs visible. I mean that bugs were there for years, but nobody noticed them previously.

* _weakref in importlib: bpo-40050 (commit 83d46e0622d2efdf5f3bf8bf8904d0dcb55fc322)
* leak in _testcapi: commit 310e2d25170a88ef03f6fd31efcc899fe062da2c (bpo-36854)
* Reference leak in select: bpo-32604 (commit 18a90248fdd92b27098cc4db773686a2d10a4d24)
* test_threading vs the garbage collector: bpo-40217 "The garbage collector doesn't take in account that objects of heap allocated types hold a strong reference to their type"... This issue has a controversial history... Read bpo-40217 discussion.
* etc.

The problem is that these issues are discovered afterwards, usually when "Refleaks" buildbots complete.

I propose to add a new test case to test_interpreters which would run "import xxx" in _testcapi.run_in_subinterp() on the 34 C extensions already ported to PyModuleDef_Init():

$ grep -l PyModuleDef_Init Modules/*.c
Modules/_abc.c
Modules/arraymodule.c
Modules/atexitmodule.c
Modules/audioop.c
Modules/binascii.c
Modules/_bz2module.c
Modules/_codecsmodule.c
Modules/_collectionsmodule.c
Modules/_contextvarsmodule.c
Modules/_cryptmodule.c
Modules/errnomodule.c
Modules/fcntlmodule.c
Modules/_functoolsmodule.c
Modules/_heapqmodule.c
Modules/itertoolsmodule.c
Modules/_json.c
Modules/_localemodule.c
Modules/mathmodule.c
Modules/mmapmodule.c
Modules/nismodule.c
Modules/_operator.c
Modules/posixmodule.c
Modules/resource.c
Modules/_stat.c
Modules/_statisticsmodule.c
Modules/syslogmodule.c
Modules/_testmultiphase.c
Modules/timemodule.c
Modules/_uuidmodule.c
Modules/_weakref.c
Modules/xxlimited.c
Modules/xxmodule.c
Modules/xxsubtype.c
Modules/_zoneinfo.c

The test case should explain the purpose of these tests:

* ensure that the import doesn't crash
* help to discover reference leaks when running "python -m test test_interpreters -R 3:3"

Maybe later we could elaborate these tests to ensure that the sub interpreter module is unrelated to the module in the main interpreter. For example, add a specific attribute in the subinterpreter, and then check that it doesn't exist in the main interpreter. But I'm not sure if it's a good idea to import 34 modules in test_interpreters.
msg371679 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-16 16:08
According to Dong-hee, importing dbm in a subinterpreter leaks references:
https://github.com/python/cpython/pull/20920#issuecomment-644856401
msg371684 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-06-16 16:42
> According to Dong-hee, importing dbm in a subinterpreter leaks references:

GH-20920 fixed the leak :)
msg371721 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-06-17 08:51
Hmm today I personally wrote the tests for it.

Two goals should be achived
1. import doesn't crash
2. Memory leak should be detected.

Running "import xxx" in _testcapi.run_in_subinterp()
only detect that the import doesn't crash

But the memory leak was not detected with a leaky module.
So I'd like to suggest to write import tests on their module test
by using _testcapi.run_in_subinterp() for each case.
msg371726 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-17 10:14
> So I'd like to suggest to write import tests on their module test
> by using _testcapi.run_in_subinterp() for each case.

I like this idea. But I would prefer to factorize the code as much as possible. Like being able to add more checks later without having to modify N files. In test_xxx, I expect something like:
---
class SubinterpreterTests(support.SubinterpreterTestCase):
    module_name = "xxx"
---

Nothing else, all logic would live in a different file, like test.support.

I prefer to use a string for the module. For example, it is possible to import a fresh copy of the module in the current interpreter using a module name.

We can start with a single test which imports the module in run_in_subinterp(), but it becomes possible to add more tests later. For example, import in the current interpreter, import in a subinterpreter, and ensure that the two modules are not the same object.

I see an advantage of putting such tests in each test_xxx file: it prevents to have to import 50+ modules in test_interpreters which would increase its memory footprint, likely have unexpected side effects, etc.

Maybe support.SubinterpreterTestCase should give the ability to override some method. For example, if we need a fresh copy of a module, maybe we should give the ability to provide a list of blocked modules. That's important for "io" vs "_pyio" tests, or "datetime" vs "_datetime", etc. (PEP 399)
History
Date User Action Args
2022-04-11 14:59:32adminsetgithub: 85159
2020-06-17 10:42:52shihai1991setnosy: + shihai1991
2020-06-17 10:14:11vstinnersetmessages: + msg371726
2020-06-17 08:51:40corona10setmessages: + msg371721
2020-06-16 16:42:11corona10setmessages: + msg371684
2020-06-16 16:08:59vstinnersetmessages: + msg371679
2020-06-15 16:06:29vstinnercreate