classification
Title: "Restart Shell" command leaves pythonw.exe processes running
Type: resource usage Stage: resolved
Components: IDLE Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ned.deily Nosy List: Aneesh, Peter.Caven, brian.curtin, eli.bendersky, georg.brandl, kbk, loewis, ned.deily, python-dev, sunqiang, terry.reedy, tim.golden
Priority: high Keywords: 3.2regression, patch

Created on 2011-07-12 09:21 by Peter.Caven, last changed 2011-08-09 08:17 by Aneesh. This issue is now closed.

Files
File name Uploaded Description Edit
issue12540.1.patch eli.bendersky, 2011-07-30 04:31 review
issue12540_rev2.patch ned.deily, 2011-08-03 02:24 review
unnamed eli.bendersky, 2011-08-05 11:51
Messages (36)
msg140179 - (view) Author: Peter Caven (Peter.Caven) Date: 2011-07-12 09:21
On Windows Vista (x64) the IDLE "Restart Shell" command leaves a "pythonw.exe" process running each time that the command is used.
Observed in Python 3.2.1 release and RC2.
msg140474 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-15 20:53
I just 'upgraded' to 3.2.1 on my XP machine and I see the same with F5-run, which restarts before running the saved file. This appears to be a nasty regression from 3.2.0 that should have been a release blocker if caught earlier. I believe it merits a quick re-release, once fixed, of either the Windows binary or Python itself depending on whether this is windows specific or not.

The normal behavior when starting IDLE is 2 pythonw.exe processes -- one to run IDLE itself and the other for the attached process that runs user code. Restart/Run starts a third process with a fresh interpreter for user code and the old, orphaned user process should and usually does disappear in 2-3 seconds (on my old, slow machine). (There are tracker issues about this not always happening when a runaway process is stopped with ^C, but it has always worked otherwise.) Each process appears to take about 10+ MB, so anyone doing rapid code/test iterations, as I sometimes do, could easily overfill physical memory.
Closing the IDLE windows does not stop the detached processes, so they have to be killed 1 at a time with 3 clicks each or by rebooting. Neither is pleasant.

Although I should have 3.2.1 loaded at least for reviewing issues, I plan to revert to 3.2.0.

Victor: do you know of any way to test for process extinction on Windows?
msg140486 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-07-16 10:02
Terry, please don't overreact. Nobody has noticed it during the *long* rc period of 3.2.1, so it can't be that bad. Actually, I *did* notice, but didn't have the time to submit a bug report.
msg140487 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-07-16 10:02
FWIW, it only happens with IDLE; python.exe seems to terminate fine when done.
msg140555 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-07-18 02:03
This appears to be a Windows-only issue; I'm not able to readily reproduce it on either Linux or OS X.  Taking a quick look at diffs between 3.2 and 3.2.1, there aren't a lot of changes in IDLE (Lib/idlelib) and nothing obviously related.  There are a number of changes elsewhere in signal handling and process handling, though.  The restart_subprocess code is in Lib/idlelib/PyShell.py. If there isn't anything obvious elsewhere, perhaps someone can try to hg bisect it on Windows.
msg140773 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-21 00:01
This nasty bug really can cause severe problems. If a zombie process ran a tkinter (tk) window, then attempting to logout/restart/shutdown eventually brings up a window I have never seen before: End Process -- EmbeddedMenuWindow. The message window shows a countdown timer that implies that it will shutdown the process automatically when the timer reaches 0. But it does not because the process does not respond. It will sit there indefinitely. One has to click a shutdown button to get rid of it.  Then a few seconds later, another window for the next zombie appears. And so on. If there are 50 zombies, then one would have to repeat 50 times. Not acceptable to me.

Since IDLE is my Python workspace and since I plan to start running lots of tkinter tests and experiments soon, I am reverting to 3.2.0 until there is a fixed Windows binary.

Adding the two listed Windows experts.
msg140775 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-21 02:07
Restart is not required to create a zombie. Just start IDLE and quit, and the second, user process does not disappear.

Reverting completely does not seem possible. I first just ran the 3.2 installer and it did not complain, that I noticed, about replacing a newer version, but apparently only replaced .exe and .dll and a few other files but not Lib/*.py. Idle or command prompt window said it was still 3.2.1 and the bug remained.

So I uninstalled, deleted everything left except my .py files in site-packages and another subdirectory, and reinstalled 3.2. Now everything seems to predate 2/22/2011. BUT IDLE and command prompt window *still* report "Python 3.2.1 (default, Jul 10 2011, 21:51:15) [MSC v.1500 32 bit (Intel)] on win32". This is sys.version. Something somewhere (registry?) seems to not be deleted by uninstall.  And the bug remains. Could a registry entry possibly affect this?

My system is 7 years old with updated win xp 32 bit.

I then checked the never updated 64 bit Py 3.2 on an 18 month old 64 bit Win 7 laptop and detached user processes *do* disappear as I remember on this machine. It did, however, take 8 sec over restart and 12 after closing, which is longer than I remember for my older and definitely slower machine.
msg140786 - (view) Author: Aneesh (Aneesh) Date: 2011-07-21 05:47
We also noticed the same issue and reverted to Python 3.1. Whenever we run a script, two new pythonw.exe processes are started, and it is really irritating to see them all in Windows Task Manager.

Yesterday I killed around 14 pythonw.exe processes to clean up everything.
msg140872 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-22 12:10
Is the problem happening only on 64-bit Windows, or 32-bit as well?
msg140896 - (view) Author: Qiang Sun (sunqiang) Date: 2011-07-22 16:47
I can reproduce this on 32-bit Windows XP Pro SP3.
msg140900 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-22 18:05
Good question.
Peter, you said Vista x64. Are you running 32 or 64 bit Python?

My system with the apparently irreversible problem is 32 bit xp home.
I am reluctant to test on my daughter's 64 bit laptop as I do not know that I would be able to revert successfully and I want her to be able to write small programs for schoolwork without extra hassle.
msg140933 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 06:19
OK, I can reproduce the problem on a clean virtual machine running a pristine XP home SP2. No Python was previously installed.

Steps to reproduce:

1. Install Python 3.2.1 MSI from http://www.python.org/download/releases/3.2.1/
2. Run IDLE from start menu
3. Close IDLE
4. A pythonw.exe process stays running after IDLE has exited

Running steps 2-3 N times leaves N pythonw.exe processes alive
msg140938 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 06:45
Am I missing something, or is there no explicit command to kill the subprocess on Windows in PyShell.py?

The kill_subprocess method (which does get invoked) of ModifiedInterpreter is:

    def kill_subprocess(self):
        try:
            self.rpcclt.close()
        except AttributeError:  # no socket
            pass
        self.unix_terminate()
        self.tkconsole.executing = False
        self.rpcclt = None

The subprocess is started with:

    self.rpcpid = os.spawnv(os.P_NOWAIT, sys.executable, args)

Could it be that in earlier versions this ensured the subprocess exited with its parent, but this somehow got modified?

Note that the same code existed in PyShell.py for ages, so it's unlikely that the culprit is there.
msg140948 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-07-23 08:27
Doesn't unix_terminate() also get called on Windows?  If so, what does os.kill() do on Windows?  The docs for os.kill say "New in version 3.2: Windows support."  Perhaps this was being skipped before and now has some negative effect?
msg140954 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 09:44
Indeed, unix_terminate is invoked on Windows, and since Windows now has "os.kill" it runs. However, it appears that the actual os.kill call throws OSError, saying:

    [Error 87] The parameter is incorrect
msg140965 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 10:45
Here's a simple reproducer for the same problem, without the context of IDLE. As far as I understand, IDLE's spawn-then-kill sequence amounts to:

  import os
  from signal import SIGTERM

  pid = os.spawnv(os.P_NOWAIT, "notepad.exe", ['notepad.exe'])
  print('pid = %s, SIGTERM = %s' % (pid, SIGTERM))
  os.kill(pid, SIGTERM)

Running this, the notepad.exe subprocess stays alive after the script exits, and I get the error:

  pid = 1868, SIGTERM = 15
  Traceback (most recent call last):
    File "k.py", line 6, in <module>
      os.kill(pid, SIGTERM)
  WindowsError: [Error 87] The parameter is incorrect
msg140968 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-07-23 10:47
Hmm, the docs say "Any other value for sig will cause the process to be unconditionally killed by the TerminateProcess API [...]"

What happens if you try to use other signals (like signal.SIGKILL) instead of SIGTERM?
msg140970 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-07-23 10:49
The other question is if it is an access control problem.

win32_kill tries to open the process with PROCESS_ALL_ACCESS, while IMO PROCESS_TERMINATE would suffice.
msg140971 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 11:05
Georg, I'm now debugging into win32_kill, and it's an error in OpenProcess, so this *could* be a security issue.

The process is started with _spawnv, so maybe this causes problems opening it with OpenProcess later.
msg140973 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 11:14
According to http://msdn.microsoft.com/en-us/library/7zt1y878%28v=vs.80%29.aspx, on Windows _spawnv in async mode (P_NOWAIT) returns the process _handle_, not the process ID.

win32_kill uses OpenProcess, passing it pid to obtain the handle, but this pid is already the process handle. 

Removing the whole call to OpenProcess in win32_kill and passing pid (instead of handle) directly to TerminateProcess, solves the problem.

----

So this appears to be a mismatch between os.spawnv and os.kill on Windows. The first returns the process handle, the second expects a process ID.

Note that the documentation of os.spawnv mentions something about this:

  If mode is P_NOWAIT, this function returns the process id of the new process; [...] On Windows, the process id will actually be the process handle, so can be used with the waitpid() function.
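
A minimal sketch of the pairing that the quoted documentation describes: whatever os.spawnv returns in P_NOWAIT mode can be handed to os.waitpid. The child command here is a hypothetical stand-in, and the sketch has only been reasoned through for POSIX; on Windows it assumes an interpreter path without spaces, and the returned value would be a handle that os.kill cannot use.

```python
import os
import sys

# Hypothetical child: a Python process that exits at once with status 3.
# The -c payload is kept free of spaces to sidestep argument-quoting
# questions with os.spawnv on Windows.
CHILD_ARGS = [sys.executable, "-c", "__import__('sys').exit(3)"]


def spawn_and_wait(args):
    # On POSIX this value is a real process id. On Windows, per the
    # os.spawnv documentation quoted above, it is a process handle.
    value = os.spawnv(os.P_NOWAIT, args[0], args)
    # os.waitpid accepts whatever os.spawnv returned on either platform.
    _, status = os.waitpid(value, 0)
    # For a normal exit the exit code sits in the high byte of the status.
    return status >> 8


exit_code = spawn_and_wait(CHILD_ARGS)
print("child exit code:", exit_code)
```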
msg140974 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-07-23 11:21
Hmm, on the other hand there may be valid use cases for using os.kill() with a PID.  Argh.
msg140975 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-23 11:21
I don't think there's a problem with os.spawnv and os.kill - they do what their docs describe.

IMHO, the solution should be to change IDLE so that it uses subprocess.Popen for both starting and killing the child process.
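
As a rough illustration of that suggestion (not IDLE's actual code), the spawn and kill steps can both go through one Popen object, so no raw pid or handle is ever passed around. The sleeping child and the function names start_child and stop_child are hypothetical stand-ins for the IDLE user process.

```python
import subprocess
import sys


def start_child():
    # Hypothetical stand-in for the IDLE user process: a child that sleeps.
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def stop_child(proc):
    # Popen.kill() sends SIGKILL on POSIX and calls TerminateProcess on
    # Windows, using the handle that Popen keeps internally.
    proc.kill()
    # Popen.wait() reaps the child and returns its exit status.
    return proc.wait()


proc = start_child()
status = stop_child(proc)
print("child", proc.pid, "ended with status", status)
```

Because Popen tracks the process itself, the same two calls behave the same way whether the platform identifies children by pid or by handle.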
msg141010 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-23 19:43
Eli, nice detective work. What I understand is that there was a latent platform-dependent buglet that presumably got exposed by a recent change in process handling, as Ned suggested.

idlelib/PyShell.py, class ModifiedInterpreter(InteractiveInterpreter) has
    def spawn_subprocess(self):
        if self.subprocess_arglist is None:
            self.subprocess_arglist = self.build_subprocess_arglist()
        args = self.subprocess_arglist
        self.rpcpid = os.spawnv(os.P_NOWAIT, sys.executable, args)

so IDLE expects the return value to always be a pid, which it is not on Windows.

spawn_subprocess is called in both start_subprocess and restart_subprocess. Both now leave zombies on exit. I presume idlelib.run.main listens on the passed in port (in args) to make the connection. It appears to me that restart reuses the socket wrapped in self.rpcclt (rpc client).

Using subprocess.Popen seems like a good idea. The subprocess module is explicitly intended to replace low-level, fragile, difficult to get right, usage of os.spawn* and similar. If it does not work for this problem, *it* should be fixed.

On the other hand, IDLE uses sockets rather than pipes to communicate with subprocesses, perhaps because Windows pipes either are or were not as usable as unix pipes. Also, named or reusable pipes may not be universally available, so wrapping a pipe instead of a socket would, it seems to me, take more than simple replacement of spawnv by Popen.

Kurt, what do you think about possible fixes to this bug (critical for using IDLE on Windows)?
msg141037 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-24 03:05
Terry,

What still bugs me is that it isn't clear what change from 3.2 to 3.2.1 caused this problem to manifest.

Also, I'm not sure why you mention the sockets vs. pipes issue.
msg141060 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-24 22:09
I mentioned pipes because half of the subprocess chapter, it seems, talks about them. So I got the mistaken impression that they are special for subprocess-started processes. But if the subprocess gets the args it needs to connect to a socket, it should not care how it is started.

Anyway, some experiments:

3.1.3 and 3.1.4 freshly installed do not seem to have the zombie problem. This seems to rule out the possibility that the problem is due to recent patches from Microsoft.

I deleted the 3.2 installation again and re-installed 3.2. sys.version still mistakenly says 3.2.1, July 10, so the relationship between 3.2 and 3.2.1 differs somehow from that between 3.1.3 and 3.1.4. And the 3.2 re-install has the zombie problem while I do not believe the fresh 3.2 install did. (And it does not on my other machine.) But I do not see how something stuck in the registry could affect process killing.

In the notepad example, changing

pid = os.spawnv(os.P_NOWAIT, 'C:/WINDOWS/notepad.exe', ['notepad.exe'])
# (full path needed if not in /windows) to
pid = subprocess.Popen(['C:/WINDOWS/notepad.exe']).pid

changed the 'pid' from a constant (across multiple runs) to a variable (across multiple runs) and changed the result of the kill from a zombie and exception to proper termination.

When I tried the same fix in idlelib/PyShell.py, adding 'import subprocess' and changing
        self.rpcpid = os.spawnv(os.P_NOWAIT, sys.executable, args)
to
        self.rpcpid = subprocess.Popen(args).pid
(args begins with sys.executable) IDLE failed to start. The only evidence that it had been invoked was a brief (1/4 second?) appearance of 1 pythonw process in task manager. On subsequent tries, without touching the file, I do not see even that. Is there any obvious mistake in the above?
msg141330 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-29 06:04
Terry,

""" When I tried the same fix in idlelib/PyShell.py, adding 'import subprocess' and changing
        self.rpcpid = os.spawnv(os.P_NOWAIT, sys.executable, args)
to
        self.rpcpid = subprocess.Popen(args).pid
(args begins with sys.executable) IDLE failed to start. The only evidence that it had been invoked was a brief (1/4 second?) appearance of 1 pythonw process in task manager. On subsequent tries, without touching the file, I do not see even that. Is there any obvious mistake in the above? """

No, when I do the same, things seem to go fine. No zombie is left running after IDLE is closed, and even "Restart shell" works without leaving a zombie.

Maybe you had other modifications in your idlelib sources? 

Anyway, this wouldn't be a complete fix, because in:

    def unix_terminate(self):
        "UNIX: make sure subprocess is terminated and collect status"
        if hasattr(os, 'kill'):
            try:
                os.kill(self.rpcpid, SIGTERM)
            except OSError:
                # process already terminated:
                return
            else:
                try:
                    os.waitpid(self.rpcpid, 0)
                except OSError:
                    return

os.waitpid on Windows also expects a process handle, not pid. 

I think the complete solution, in addition to replacing os.spawnv by subprocess.Popen, would be to use Popen.kill and then Popen.wait instead of os.kill and then os.waitpid in the code above. This would require keeping the Popen object somewhere in self.
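
A sketch of what keeping the Popen object in self could look like. This is not the committed patch: the class name ChildOwner, the attribute subproc, and the method terminate_subprocess are hypothetical, chosen only to mirror the shape of unix_terminate above.

```python
import subprocess
import sys


class ChildOwner:
    """Hypothetical stand-in for the process handling in ModifiedInterpreter."""

    def __init__(self, args):
        self.args = args
        self.subproc = None  # would take the place of self.rpcpid

    def spawn_subprocess(self):
        self.subproc = subprocess.Popen(self.args)

    def terminate_subprocess(self):
        "Make sure the subprocess is terminated and collect its status."
        if self.subproc is None:
            return
        try:
            self.subproc.kill()
        except OSError:
            # process already terminated
            return
        else:
            try:
                self.subproc.wait()
            except OSError:
                return

    def restart_subprocess(self):
        # Kill and reap the old child before starting a fresh one.
        self.terminate_subprocess()
        self.spawn_subprocess()


# Usage: the child command is an assumption made for this example.
owner = ChildOwner([sys.executable, "-c", "import time; time.sleep(60)"])
owner.spawn_subprocess()
first = owner.subproc
owner.restart_subprocess()
second = owner.subproc
first_done = first.poll() is not None   # old child was reaped
second_alive = second.poll() is None    # new child is still running
owner.terminate_subprocess()
```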
msg141344 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-07-29 12:27
I was working with the freshly reinstalled 3.2 which is not the same as a pristine 3.2 install because it still had the problem that 3.2.1 has and the 3.2.1 sys.version. 3.2.1 uninstall is not complete (a different issue). So I should reinstall 3.2.1 again and try again.

I agree about fully switching to subprocess. If self.rpcpid is not used anywhere other than unix_terminate (which I think should be renamed), then self.subproc could replace self.rpcpid.
msg141420 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-30 04:31
Attaching an initial patch for Lib/idlelib/PyShell.py

It uses subprocess.Popen instead of spawn&kill, in the way discussed in earlier messages. 

As far as I can tell, IDLE opens and restarts shells successfully, without leaving zombies behind. I only tested it on Windows, however, and due to the lack of unit tests for idlelib there wasn't much verification done.
msg141422 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-07-30 04:49
I've now tested this on Ubuntu Linux as well. IDLE works, no zombies left behind.
msg141595 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-08-03 02:24
I've added a couple of review comments to the one Peter already made in Rietveld.  Here is an updated patch that addresses all of the comments.  I've tested it briefly on Windows and on OS X and it seems to work OK.  Eli, if you're OK with it, feel free to commit or I will do it.
msg141611 - (view) Author: Peter Caven (Peter.Caven) Date: 2011-08-03 14:34
Terry,  sorry about the delay in responding: I'm using 32bit Python. I haven't had a chance yet to try the 64 bit release.
msg141647 - (view) Author: Roundup Robot (python-dev) Date: 2011-08-05 06:39
New changeset cc86f4ca5020 by Ned Deily in branch '3.2':
Issue #12540: Prevent zombie IDLE processes on Windows due to changes
http://hg.python.org/cpython/rev/cc86f4ca5020

New changeset c2fd1ce1c6d4 by Ned Deily in branch 'default':
Issue #12540: Prevent zombie IDLE processes on Windows due to changes
http://hg.python.org/cpython/rev/c2fd1ce1c6d4
msg141648 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-08-05 06:43
With Eli's concurrence, I have applied the updated patch to 3.2 (for 3.2.2) and to default (for 3.3).
msg141656 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-08-05 11:51
Tested this on Windows XP with Python 3.2 installed into "Program Files".
Works fine (including shell restart).

It would be great to get more people to test it though, especially people
who ran into the problem originally - Peter? Terry? Aneesh? Qiang Sun?
msg141672 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-08-05 17:33
The replacement file, for anyone without a dev setup, is
http://hg.python.org/cpython/file/cc86f4ca5020/Lib/idlelib/PyShell.py

After renaming PyShell to PyShellBak and replacing with the above,
IDLE seems to run better than ever. On my XP system, the several second delay for the old process dying is gone and I never see 3 pythonw processes, even temporarily, as before. I can no longer tell from TaskManager that one process has been killed and another started. Testing included running tk windows. Will test later on new Win7 system.

Pending some unforeseen new problem, thank you Ned and Eli.
msg141810 - (view) Author: Aneesh (Aneesh) Date: 2011-08-09 08:17
I also retested this on Windows 7 32-bit and 64-bit machines, and it works fine when the provided PyShell.py is used.

As Terry mentioned, IDLE seems to be running better. The process in Task Manager disappears quickly after I close IDLE.
History
Date User Action Args
2011-12-22 06:18:22 ned.deily link issue8093 superseder
2011-08-09 08:17:30 Aneesh set messages: + msg141810
2011-08-05 17:33:33 terry.reedy set messages: + msg141672
2011-08-05 11:51:39 eli.bendersky set files: + unnamed
    messages: + msg141656
2011-08-05 06:43:12 ned.deily set status: open -> closed
    messages: + msg141648
    assignee: ned.deily
    resolution: fixed
    stage: commit review -> resolved
2011-08-05 06:39:19 python-dev set nosy: + python-dev
    messages: + msg141647
2011-08-03 14:34:32 Peter.Caven set messages: + msg141611
2011-08-03 02:24:47 ned.deily set files: + issue12540_rev2.patch
    versions: + Python 3.3
    messages: + msg141595
    keywords: + 3.2regression, patch, - buildbot
    stage: needs patch -> commit review
2011-07-30 04:49:37 eli.bendersky set keywords: - patch
    messages: + msg141422
2011-07-30 04:31:59 eli.bendersky set files: + issue12540.1.patch
    keywords: + patch
    messages: + msg141420
2011-07-29 12:27:26 terry.reedy set keywords: + buildbot
    messages: + msg141344
2011-07-29 07:50:05 haypo set nosy: - haypo
2011-07-29 06:04:13 eli.bendersky set messages: + msg141330
2011-07-24 22:09:02 terry.reedy set messages: + msg141060
2011-07-24 03:05:26 eli.bendersky set messages: + msg141037
2011-07-23 19:43:17 terry.reedy set nosy: loewis, georg.brandl, terry.reedy, kbk, haypo, tim.golden, ned.deily, eli.bendersky, brian.curtin, sunqiang, Peter.Caven, Aneesh
    messages: + msg141010
2011-07-23 11:21:45 eli.bendersky set messages: + msg140975
2011-07-23 11:21:26 georg.brandl set messages: + msg140974
2011-07-23 11:14:09 eli.bendersky set messages: + msg140973
2011-07-23 11:05:44 eli.bendersky set messages: + msg140971
2011-07-23 10:49:38 georg.brandl set messages: + msg140970
2011-07-23 10:47:03 georg.brandl set messages: + msg140968
2011-07-23 10:45:01 eli.bendersky set messages: + msg140965
2011-07-23 09:44:17 eli.bendersky set messages: + msg140954
2011-07-23 08:27:43 ned.deily set messages: + msg140948
2011-07-23 06:45:05 eli.bendersky set messages: + msg140938
2011-07-23 06:19:53 eli.bendersky set messages: + msg140933
2011-07-22 18:05:16 terry.reedy set messages: + msg140900
2011-07-22 16:47:05 sunqiang set nosy: + sunqiang
    messages: + msg140896
2011-07-22 12:10:11 eli.bendersky set messages: + msg140872
2011-07-21 05:47:30 Aneesh set nosy: + Aneesh
    messages: + msg140786
2011-07-21 02:30:45 eli.bendersky set nosy: + eli.bendersky
2011-07-21 02:08:01 terry.reedy set messages: + msg140775
2011-07-21 00:01:51 terry.reedy set nosy: + tim.golden, brian.curtin
    messages: + msg140773
2011-07-18 02:03:25 ned.deily set messages: + msg140555
2011-07-16 10:02:58 loewis set messages: + msg140487
2011-07-16 10:02:01 loewis set priority: critical -> high
    messages: + msg140486
2011-07-15 21:00:10 georg.brandl set nosy: + ned.deily
2011-07-15 20:53:12 terry.reedy set priority: normal -> critical
    nosy: + terry.reedy, loewis, georg.brandl, haypo, kbk
    messages: + msg140474
    stage: needs patch
2011-07-12 09:21:40 Peter.Caven create