Author eryksun
Recipients efiop, eryksun, gregory.p.smith, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Date 2019-06-26.03:43:45
Content
> One issue on Linux is that the zombie process keeps the pid used until 
> the parent reads the child exit status, and Linux pids are limited to
> 32768 by default.

Windows allocates Process and Thread IDs out of a kernel handle table, which can grow to about 2**24 entries (more than 16 million). So the practical resource limit for inactive Process and Thread objects is available memory, not running out of PID/TID values.

> Linux (for example) has the same design: the kernel doesn't keep a 
> "full process" alive, but a lightweight structure just for its parent
> process which gets the exit status. That's the concept of "zombie 
> process".

On Unix, the zombie remains visible in the task list (marked as <defunct> on Linux), but on Windows an exited process is removed from the process manager's active list, so it's no longer visible to users. Also, a Process object is reaped as soon as the last reference to it is closed, since clearly no one needs it anymore.

> The subprocess module uses a magic Handle object which calls 
> CloseHandle(handle) in its __del__() method. I dislike relying on 
> destructors. If an object is kept alive by a reference cycle, it's
> never released: CloseHandle() isn't called.

We could call self._handle.Close() in _wait(), right after calling GetExitCodeProcess(self._handle). With this change, __exit__ will ensure that _handle gets closed in a deterministic context. Code that needs the handle indefinitely can call _handle.Detach() before exiting the with-statement context, but that should rarely be necessary.
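A rough sketch of that, assuming the current shape of the Windows branch of Popen._wait (untested; _winapi and TimeoutExpired come from the surrounding module):

    def _wait(self, timeout):
        """Internal implementation of wait() on Windows (sketch)."""
        if timeout is None:
            timeout_millis = _winapi.INFINITE
        else:
            timeout_millis = int(timeout * 1000)
        if self.returncode is None:
            result = _winapi.WaitForSingleObject(self._handle,
                                                 timeout_millis)
            if result == _winapi.WAIT_TIMEOUT:
                raise TimeoutExpired(self.args, timeout)
            self.returncode = _winapi.GetExitCodeProcess(self._handle)
            # Proposed change: the exit status has been read, so close
            # the handle deterministically here instead of relying on
            # Handle.__del__.
            self._handle.Close()
        return self.returncode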

I don't understand emitting a resource warning in Popen.__del__ just because a process hasn't been waited on to completion (i.e. self.returncode is None). If a script wants to be strict about this, it can use a with statement, which is documented to wait on the process.
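For example:

    import subprocess, sys

    with subprocess.Popen([sys.executable, "-c", "pass"]) as proc:
        pass   # __exit__ waits on the child

    assert proc.returncode is not None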

I do understand emitting a resource warning in Popen.__del__ if self._internal_poll() is None. In that case (currently only on POSIX), the process gets added to the _active list to try to avoid leaking a zombie. The list gets polled in subprocess._cleanup, which is called in Popen.__init__. Shouldn't _cleanup also be registered as an atexit function?
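Something along these lines in subprocess.py itself, if the idea is sound (a sketch, assuming _cleanup keeps its current behavior):

    import atexit
    # Give any leftover entries in _active one last poll at interpreter
    # exit, instead of relying solely on future Popen.__init__ calls.
    atexit.register(_cleanup)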

There should be a way to indicate that a Popen instance is intended to keep running detached from our process, so scripts don't have to suppress an irrelevant resource warning.
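For example, something like this (a purely hypothetical API; nothing like it exists today):

    proc = subprocess.Popen([sys.executable, "service.py"])
    # Hypothetical call: declare that the child is meant to outlive us,
    # so Popen.__del__ skips the resource warning for this instance.
    proc.disown()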