Message111124
> I thought the EOF errors would take care of that, at least this has
> been running in production on many platforms without that happening.
There are a lot of corner cases here, some more pedantic than others. For example, suppose a child dies while holding the queue read lock... that wouldn't trigger an EOF error anywhere. Would a child being OOM-killed raise an EOF error? (It very well could, but I seem to recall that it does not.)
I've said most of this before, but I still believe it's relevant, so here goes. In the context where I'm using this library, I'll often run jobs that should complete in O(10 minutes). I'll often start a job, realize I did something wrong, and hit C-c (which could catch the workers anywhere). I've also seen workers be OOM-killed, silently dropping the tasks they had. As we've established, at the moment any of these failures results in a hang; I'd be very happy to see any sort of patch that improves my chances of seeing the program terminate in a finite amount of time. (And I'd be happiest if that were guaranteed.)
It's possible that my use case isn't supported... but I just want to make sure I've made clear how I'm using the library. Does that make sense?
> How would you shut down the pool then?
A potential implementation is in termination.patch. Basically, try to shut down gracefully, but if you time out, just give up and kill everything.
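To make the idea concrete, here is a minimal sketch of the "graceful first, then forceful" shutdown. It is not the contents of termination.patch; the helper name `shutdown_pool` and the `timeout` parameter are made up for illustration. It assumes only that the pool object has `close()`, `join()`, and `terminate()` methods, as `multiprocessing.Pool` does.

```python
import threading


def shutdown_pool(pool, timeout=5.0):
    """Try a graceful shutdown; fall back to terminate() after `timeout` seconds.

    Hypothetical helper, not part of multiprocessing. Returns True if the
    pool shut down gracefully and False if it had to be terminated.
    """
    # Stop accepting new work; outstanding tasks may still finish.
    pool.close()

    # pool.join() takes no timeout, so wait for it in a helper thread
    # and bound the wait from here instead.
    joiner = threading.Thread(target=pool.join, daemon=True)
    joiner.start()
    joiner.join(timeout)

    if joiner.is_alive():
        # The graceful path did not finish in time: give up and kill everything.
        pool.terminate()
        return False
    return True
```

With a real pool this would be called as `shutdown_pool(multiprocessing.Pool(4), timeout=10)`. The trade-off is that tasks still in flight when the timeout expires are lost, but the program finishes in bounded time instead of hanging.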
> And why is that simpler?
It's a lot less code (one could write an even shorter patch that doesn't try to do any additional graceful error handling), doesn't add a new monitor thread, doesn't add any more IPC mechanisms, etc. FWIW, I don't see any of these changes as bad, but I don't feel like I have a sense of how generally useful they would be.
Date | User | Action | Args
2010-07-21 21:36:52 | gdb | set | recipients: + gdb, jnoller, asksol
2010-07-21 21:36:52 | gdb | set | messageid: <1279748212.17.0.628476102769.issue9205@psf.upfronthosting.co.za>
2010-07-21 21:36:50 | gdb | link | issue9205 messages
2010-07-21 21:36:49 | gdb | create |