classification
Title: Python 3.5.1 C API, the global variable is not destroyed when delete the module
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Jack Liu, brett.cannon, eric.snow, josh.r, ncoghlan, pitrou, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2016-09-19 05:02 by Jack Liu, last changed 2019-10-23 00:29 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
PyTest.zip Jack Liu, 2016-09-20 05:48
Messages (14)
msg276945 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-19 05:02
I have an app that loads python35.dll. It uses the Python API PyImport_AddModule to run a .py file and PyDict_DelItemString to delete the module. There is a global variable in the .py file. The global variable is not destroyed when PyDict_DelItemString is called to delete the module, which causes a memory leak.

But it is OK with python33.dll: the global variable is destroyed when PyDict_DelItemString is called to delete the module.

How can I resolve the problem? Is there a workaround? I need to use python35.dll and would like the global variables in a module to be released automatically when PyDict_DelItemString is called to delete the module.

Here is the Python test code:

class Simple:
    def __init__(self):
        print('Simple__init__')
    def __del__(self):
        print('Simple__del__')

simple = Simple()
msg276960 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-19 08:56
I have an app that loads python35.dll. It uses the Python API PyImport_AddModule to run a .py file and PyDict_DelItemString to delete the module. There is a global variable in the .py file. The global variable is not destroyed when PyDict_DelItemString is called to delete the module; it is destroyed only when Py_Finalize is called, which is too late. This causes a memory leak, because Py_Initialize is called at app startup and Py_Finalize at app shutdown.

But it is OK with python33.dll: the global variable is destroyed when PyDict_DelItemString is called to delete the module.

How can I resolve the problem? Is there a workaround? I need to use python35.dll and would like the global variables in a module to be released automatically when PyDict_DelItemString is called to delete the module.

Here is the Python test code:

    class Simple:
        def __init__(self):
            print('Simple__init__')
        def __del__(self):
            print('Simple__del__')

    simple = Simple()
msg276980 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2016-09-19 15:07
To make sure I'm understanding:

* you are using PyDict_DelItemString() on sys.modules
* a module-level variable in the module is not getting cleaned up when the module is deleted from sys.modules
* this worked in Python 3.3 but not in 3.5

It may help to have a more complete test case, perhaps uploaded to this issue with the multiple files zipped up.

Also, does 3.2 have the same behavior as 3.3 or 3.5?  What about 3.6 (currently in beta)?

Note that deleting the module from sys.modules only reduces the refcount by one.  Other objects may still hold a reference to the module or any of its variables.  So nothing in the module would be cleaned up until the refcount hits zero.  For example, if the module was imported in another module and that second module still has a variable bound to the imported module (or the not-destroyed variable) then you would not see your printed message.
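
A minimal Python sketch of the point above, assuming CPython's reference counting. The module name `refcount_demo` and the `holder` variable are hypothetical stand-ins for "another module that still has a variable bound to the imported module":

```python
import sys
import types
import weakref


def module_survives_sys_modules_removal():
    # Create an empty module and register it, as an import would.
    mod = types.ModuleType("refcount_demo")  # hypothetical module name
    sys.modules["refcount_demo"] = mod

    probe = weakref.ref(mod)   # lets us observe the module without keeping it alive
    holder = mod               # a second strong reference, e.g. held by an importer
    del mod

    # Removing the sys.modules entry drops exactly one reference.
    del sys.modules["refcount_demo"]
    alive_while_held = probe() is not None

    # Only when the last strong reference goes away is the module deallocated.
    del holder
    alive_after_release = probe() is not None
    return alive_while_held, alive_after_release


print(module_survives_sys_modules_removal())
```

The first value shows that the module outlives its `sys.modules` entry while something else still refers to it; the second shows that it goes away once that last reference is dropped.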

The fact that the behavior is different between 3.3 and 3.5 is concerning though.  I'd expect 3.3 to behave like 3.5 does.  It could be that a change in Lib/importlib (or Python/import.c) since 3.3 is leaking module references, though it's unlikely.
msg276999 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-20 03:10
@eric.snow, Thank you for the reply. You understood correctly.

I run this module as the __main__ module, so no other module references it. As I debugged, the reference count of this module became 0 after calling PyDict_DelItemString, but the global variable in this module was not released with Python 3.5.1. Is that a memory leak? As I said, it worked on Python 3.3. Is it a regression in Python 3.5.1? Is there any workaround to resolve this problem? It's an urgent issue for me.
msg277011 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-20 05:48
I wrote the test code below and also attached the files.
Python
=========================================================

class Simple:
    def __init__(self):
        print('Simple__init__')
    def __del__(self):
        print('Simple__del__')

simple = None

def run():
    global simple
    simple = Simple()

if __name__ == '__main__':
    run()
==============================================================

C++
=========================================================================

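#include "stdafx.h"   // MSVC precompiled header
#include <Python.h>   // included before the standard headers, as the Python docs recommend
#include <string>
#include <iostream>
#include <fstream>
#include <iterator>   // std::istreambuf_iterator
#include <locale>     // std::wstring_convert
#include <codecvt>
#include <map>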

using namespace std;

namespace {
	wstring readfile(const wchar_t *filename)
	{
		wifstream wifs;

		wifs.open(filename);
		//wifs.imbue(locale(wifs.getloc(), new codecvt_utf8<wchar_t, 0x10ffff, consume_header>()));
		wstring wstr((std::istreambuf_iterator<wchar_t>(wifs)),
			std::istreambuf_iterator<wchar_t>());
		return wstr;
	}

	string wstrtostr(const wstring& ws)
	{
		std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
		return converter.to_bytes(ws);
	}
}

int main()
{
	cout << "Input py file full path:" << endl;
	wstring filePath;
	wcin >> filePath;
	//filePath = L"L:\\Dev\\PyTest\\SimpleTest.py";

	string moduleName = "__main__";

	string script = wstrtostr(readfile(filePath.c_str()));
	if (script.empty())
	{
		cout << "Invalid python file path" << endl;
		return 1;
	}

	string sfileName = wstrtostr(filePath);

	// Initialize the Python Interpreter
	Py_Initialize();

	PyObject *py_module = PyImport_AddModule(moduleName.c_str());
	PyObject *py_dict = PyModule_GetDict(py_module);

	PyObject* res = nullptr;
	auto arena = PyArena_New();
	if (arena)
	{
		auto mod = PyParser_ASTFromString(script.c_str(), sfileName.c_str(), Py_file_input, nullptr, arena);
		if (mod)
		{
			auto co = PyAST_Compile(mod, sfileName.c_str(), nullptr, arena);
			if (co)
			{
				res = PyEval_EvalCode((PyObject*)(co), py_dict, py_dict);
				Py_DECREF(co);
			}
		}
		PyArena_Free(arena);
	}
	if (res)
	{
		Py_DECREF(res);
	}

	// Delete the module from sys.modules
	PyObject* modules = PyImport_GetModuleDict();
	cout << "PyDict_DelItemString" << endl;
	PyDict_DelItemString(modules, moduleName.c_str());

	// May run many scripts here

	// Finish the Python Interpreter
	cout << "Py_Finalize" << endl;
	Py_Finalize();

	return 0;
}
===================================================================

The expected output in console should be:
Simple__init__
PyDict_DelItemString
Simple__del__
Py_Finalize

I tested with Python 3.2.5, 3.3.5, 3.5.1 and 3.6.0 beta.
It worked as expected with Python 3.2.5 and 3.3.5, but not with Python 3.5.1 and 3.6.0 beta.

On Python 3.5.1 and 3.6.0 beta, the output is:
Simple__init__
PyDict_DelItemString
Py_Finalize
Simple__del__
That means the Simple object is not released at PyDict_DelItemString; it is released at Py_Finalize.

So is it a regression since Python 3.5? I hope there is a way to resolve the memory leak.
msg277014 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-20 06:27
I know there is a workaround: set the global variables to None at the end of the Python scripts. But my app just provides a framework for my customers to run Python scripts, so the workaround would require my customers to update their scripts. That may make them unhappy :(. First, we need to confirm whether it's a bug in Python 3.5. If it is, is there a workaround on the C++ side of my code, using the Python C API, that resolves the memory leak? If it's not a bug in Python 3.5, is there a mistake in my C++ code?
msg277086 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-21 03:44
The problem is resolved if I call PyGC_Collect() after PyDict_DelItemString(). Is it expected to call PyGC_Collect() here?
msg277090 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-09-21 04:25
The fact that it's resolved by PyGC_Collect indicates there is a reference cycle somewhere. PyGC_Collect is just looking for cyclic garbage and breaking the cycles so it can be cleaned; it would happen eventually unless GC was explicitly disabled or the process exited before the next implicit GC invocation, so it means this bug is really about timing (and possibly cycles we'd prefer to avoid), not reference leaks.
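
As a rough illustration of the timing point above, here is a small Python sketch, assuming CPython 3.4 or later. The `Node` class and the `log` list are hypothetical names used only for this example. The automatic collector is switched off temporarily so that the result does not depend on when an implicit collection happens to run:

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a reference cycle."""

    def __init__(self, log):
        self.log = log
        self.me = self  # self-reference: the refcount can never reach zero on its own

    def __del__(self):
        self.log.append("finalized")


def cycle_is_freed_only_by_gc():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()  # rule out an implicit collection in the middle of the demo
    try:
        Node(log)             # unreachable right away, but kept alive by its cycle
        before = list(log)    # reference counting alone did not finalize it
        gc.collect()          # the cycle collector finds and frees the garbage
        after = list(log)
    finally:
        if was_enabled:
            gc.enable()
    return before, after


print(cycle_is_freed_only_by_gc())
```

Nothing is leaked here in the strict sense: the object is freed as soon as a collection runs, which is why an explicit `gc.collect()` (or `PyGC_Collect()` from C) changes only when the finalizer is called.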
msg277092 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-21 05:04
It looks to me like there is NO reference cycle involving the Simple object in the Python test code. Why is it necessary to call PyGC_Collect() here?
msg277095 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-21 05:56
Please output Py_REFCNT(py_module) and Py_REFCNT(py_dict) before deleting the module from sys.modules. Is there a difference between 3.5 and 3.6?
msg277097 - (view) Author: Jack Liu (Jack Liu) Date: 2016-09-21 06:29
@serhiy.storchaka, The reference counts before PyDict_DelItemString are the same on Python 3.3, 3.5 and 3.6:
Py_REFCNT(py_module) = 1
Py_REFCNT(py_dict) = 4
msg277151 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-09-21 14:58
The most likely relevant difference here is that Python 3.4+ no longer forcibly breaks cycles through the module globals when the module is deallocated: https://docs.python.org/dev/whatsnew/3.4.html#whatsnew-pep-442

Due to the implicit cycles created between function definitions and their global namespace via the __globals__ attribute on the function, this means that embedding applications will need to explicitly run a GC collection cycle after deleting a module in order to fully finalise it.
msg277153 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-09-21 15:01
To be entirely clear about what's going on, the reference cycle seen in the example arises for *any* module level function, even if it's completely empty:

>>> def f():
...     pass
... 
>>> f.__globals__["f"] is f
True

The existence of that cycle will then keep other module globals alive until the next garbage collection run.
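
The whole scenario from this issue can be approximated in pure Python. The sketch below assumes CPython 3.4 or later; the module name `demo_script` and the `events` list (used in place of `print` so the order can be inspected) are hypothetical, and `exec` on a fresh module's namespace stands in for the C++ host's PyEval_EvalCode call:

```python
import gc
import sys
import types

# Same shape as the test script in this issue, recording events instead of printing.
SCRIPT = """
class Simple:
    def __init__(self):
        events.append("Simple__init__")
    def __del__(self):
        events.append("Simple__del__")

simple = None

def run():
    global simple
    simple = Simple()

run()
"""


def delete_module_then_collect():
    events = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep an implicit collection from hiding the effect
    try:
        mod = types.ModuleType("demo_script")  # hypothetical module name
        mod.events = events
        sys.modules["demo_script"] = mod
        exec(compile(SCRIPT, "demo_script.py", "exec"), mod.__dict__)

        # Drop every reference to the module object itself.
        del sys.modules["demo_script"]
        del mod
        before_gc = list(events)  # globals are still alive via run.__globals__

        gc.collect()              # breaks the function <-> globals cycle
        after_gc = list(events)
    finally:
        if was_enabled:
            gc.enable()
    return before_gc, after_gc


print(delete_module_then_collect())
```

In an embedding application, the corresponding step would be calling PyGC_Collect() after removing the module from sys.modules, as reported earlier in this thread.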
msg355199 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:29
> The problem is resolved if I call PyGC_Collect() after PyDict_DelItemString(). Is it expected to call PyGC_Collect() here?

Yeah sadly, to handle such reference cycles, you have to trigger an explicit garbage collection.

It doesn't sound like a bug to me.

Python 3.4 made this way better with PEP 442.

Anyway, that's an old issue with no activity since 2016. I'm closing it.
History
Date User Action Args
2019-10-23 00:29:57 vstinner set status: open -> closed
    nosy: + vstinner
    messages: + msg355199
    resolution: out of date
    stage: resolved
2016-09-22 08:59:20 Jack Liu set nosy: + pitrou
2016-09-21 15:01:53 ncoghlan set messages: + msg277153
2016-09-21 14:58:24 ncoghlan set messages: + msg277151
2016-09-21 06:29:48 Jack Liu set messages: + msg277097
2016-09-21 05:56:59 serhiy.storchaka set nosy: + serhiy.storchaka
    messages: + msg277095
2016-09-21 05:04:07 Jack Liu set messages: + msg277092
2016-09-21 04:25:49 josh.r set nosy: + josh.r
    messages: + msg277090
2016-09-21 03:44:45 Jack Liu set messages: + msg277086
2016-09-20 06:27:52 Jack Liu set messages: + msg277014
2016-09-20 05:48:35 Jack Liu set files: + PyTest.zip
    messages: + msg277011
2016-09-20 03:10:05 Jack Liu set messages: + msg276999
2016-09-19 15:07:51 eric.snow set messages: + msg276980
    versions: + Python 3.6
2016-09-19 09:18:00 xiang.zhang set nosy: + brett.cannon, ncoghlan, eric.snow
2016-09-19 08:56:37 Jack Liu set messages: + msg276960
2016-09-19 06:52:53 Jack Liu set components: + Extension Modules, - Library (Lib)
    title: Python 3.5.1 C API, the global available available is not destroyed when delete the module -> Python 3.5.1 C API, the global variable is not destroyed when delete the module
2016-09-19 05:02:27 Jack Liu create