classification
Title: kill_python sometimes fails to kill processes on buildbots
Type: behavior
Stage:
Components: Windows
Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: BreamoreBoy, db3l, ocean-city, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2010-12-07 00:15 by db3l, last changed 2015-02-19 16:23 by steve.dower.

Files
File name Uploaded Description
kill_python.log db3l, 2010-12-07 00:15
Messages (7)
msg123510 - Author: David Bolen (db3l) Date: 2010-12-07 00:15
On the XP and Win7 buildbots, kill_python sometimes fails to kill hung processes. I caught one instance recently and gathered some information, though not yet enough to identify the issue. I can say that no processes were killed and no error messages were displayed. I think that implies either a process ownership-related snapshot failure (which can exit without error) or a failure to identify the processes.

I noticed issue10136 and wondered whether it might be related, but in testing I found cases where exactly the same kill_python invocation as in this failing case worked fine, whereas if it were a path mismatch problem I would expect it to fail consistently.
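
To illustrate why a path mismatch would be expected to fail every time rather than intermittently, here is a minimal sketch of a path-based match. It assumes that kill_python selects its targets by comparing each process's executable path with the interpreter path in its own build directory. The function name, the example paths, and the use of `ntpath` are illustrative assumptions, not the actual kill_python code.

```python
import ntpath  # Windows path rules, importable on any platform


def _norm(path):
    """Normalise separators and case so equivalent spellings compare equal."""
    return ntpath.normcase(ntpath.normpath(path))


def is_target(process_exe, build_dir, exe_name="python_d.exe"):
    """Return True if process_exe is the interpreter built in build_dir.

    Hypothetical helper: the real tool may compare paths differently.
    """
    expected = ntpath.join(build_dir, exe_name)
    return _norm(process_exe) == _norm(expected)
```

The comparison depends only on the two paths, so the same invocation against the same process gives the same answer each time. An intermittent failure therefore points at something other than the path comparison, such as the process snapshot.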

I have attached a log showing the hung processes, the attempt to use kill_python, and the final successful operation with the pskill utility.

In this case it was important to restore the buildbot quickly, but if I can catch it again I'll try to add some debugging code to kill_python first.

One thing that confused me along the way is that kill_python is only run at the beginning of a build and not as part of the clean process.  So there are cases where I have hung processes around, but they turn out to be killable when the next build starts.  I'm wondering if kill_python shouldn't perhaps be used on every clean operation too.

-- David
msg123531 - Author: Hirokazu Yamamoto (ocean-city) (Python committer) Date: 2010-12-07 08:46
To kill python_d.exe, you should use kill_python_d.exe instead of
kill_python.exe.

> On the XP and Win7 buildbots, kill_python sometimes fails to kill hung
> processes.

Could you post the buildbot log url?
msg123532 - Author: Hirokazu Yamamoto (ocean-city) (Python committer) Date: 2010-12-07 08:47
I think #9973 is rather related.
msg123535 - Author: David Bolen (db3l) Date: 2010-12-07 09:06
> To kill python_d.exe, you should use kill_python_d.exe instead of
> kill_python.exe.

Crud, I thought I did.  Well, ok, so can't trust this test.

> Could you post the buildbot log url?

I think this is the last build in the sequence that was failing until I killed the processes:  http://www.python.org/dev/buildbot/all/builders/x86%20Windows7%203.x/builds/2297  ... the processes had been around for maybe 45 hours at that point, but I can't find the last working run in the waterfall display (I get back to 2252 and then it sort of goes blank).

Maybe the main problem is just the sequencing ... the fact that kill_python is only used at the start of a build (but after the svn step) and not during the clean step at the end of a test run?  In this case since the svn step was failing it probably never got as far as running kill_python[_d].
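
The sequencing concern can be sketched with a small model. It assumes that the build stops at the first failing step and that the steps run in the order svn, kill_python, build. Both the step list and the `run_steps` helper are hypothetical and do not reflect the real buildbot configuration.

```python
def run_steps(steps):
    """Run (name, action) steps in order, stopping at the first failure.

    Returns the names of the steps that actually ran.
    """
    ran = []
    for name, action in steps:
        ran.append(name)
        if not action():
            break  # later steps, including any kill step, are skipped
    return ran


# Hypothetical step order: svn update first, then kill_python, then build.
steps = [
    ("svn", lambda: False),  # simulate the failing svn step
    ("kill_python", lambda: True),
    ("build", lambda: True),
]
print(run_steps(steps))  # prints ['svn']
```

In this model the kill step is never reached while svn keeps failing, so hung processes from an earlier run survive across builds. Running the kill step during the clean step at the end of a run would not depend on the next build's svn step succeeding.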
msg123537 - Author: David Bolen (db3l) Date: 2010-12-07 09:12
> I think #9973 is rather related.

Certainly could be another artifact of a python_d process still executing. In particular though, the suggested patch in that issue agrees with what I was thinking might be needed, in terms of moving kill_python_d over to clean.bat.

Can't think of any downsides to applying that change offhand, so if it were made, we could then just watch and wait to see whether another hung condition occurs.
msg236217 - Author: Mark Lawrence (BreamoreBoy) Date: 2015-02-19 15:27
I don't recall seeing this flagged anywhere else so can we close it as out of date?
msg236223 - Author: Steve Dower (steve.dower) (Python committer) Date: 2015-02-19 16:23
Not sure if this still affects 2.7, but it doesn't affect 3.5 anymore. 

If one of the buildbot owners confirms that 2.7 is fine, I'll close.
History
Date User Action Args
2015-02-19 16:23:24 steve.dower set messages: + msg236223
2015-02-19 15:27:44 BreamoreBoy set nosy: + tim.golden, BreamoreBoy, zach.ware, steve.dower
messages: + msg236217
components: + Windows, - Build
2010-12-07 09:12:12 db3l set messages: + msg123537
2010-12-07 09:06:23 db3l set messages: + msg123535
2010-12-07 08:47:57 ocean-city set messages: + msg123532
2010-12-07 08:46:07 ocean-city set nosy: + ocean-city
messages: + msg123531
2010-12-07 00:15:39 db3l create