Message 224419 - Python tracker

Author	amrith
Recipients	amrith, gps, pitrou, r.david.murray, vstinner
Date	2014-07-31.15:36:06
Message-id	<1406820967.14.0.537716756737.issue22114@psf.upfronthosting.co.za>
In-reply-to

Content
I see three comments, one from r.david.murray, one from haypo and one from pitrou. I'll try and address all three.

r.david.murray:

The code in question is in https://github.com/openstack/oslo-incubator/blob/master/openstack/common/processutils.py#L177-L189

Note that we're catching EAGAIN and EINTR.

I have not been able to isolate this down to a simple repro without the rest of this paraphernalia but I'm trying.

So, we are 'catching' EAGAIN or EINTR here and we're trying to handle it to the best of our ability. However, if the underlying layer is not set up to handle a retry, our best efforts will be fruitless.

That is what is happening here.

This code (ignoring the retry count of 20) was put in place precisely because a call to communicate() received an EAGAIN.

The issue, therefore, is that in order for the higher level to handle this properly, communicate() should be set up to handle a second call, which it currently is not.
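To make the pattern concrete, here is a minimal sketch of the kind of retry loop being described. It is not the oslo-incubator code; the function name `communicate_with_retry` and its parameters are made up for illustration, and `proc` is assumed to be any object with a `communicate()` method.

```python
import errno
import time

# errno values that conventionally mean "try the call again"
RETRYABLE = (errno.EAGAIN, errno.EINTR)

def communicate_with_retry(proc, input_data=None, attempts=20, delay=0.0):
    """Call proc.communicate(), retrying on EAGAIN/EINTR up to `attempts` times.

    Hypothetical helper: it only works if communicate() tolerates being
    called a second time, which is exactly the point in dispute here.
    """
    for remaining in range(attempts - 1, -1, -1):
        try:
            return proc.communicate(input_data)
        except OSError as exc:
            # Re-raise anything that is not retryable, or the last failure.
            if exc.errno not in RETRYABLE or remaining == 0:
                raise
            time.sleep(delay)
```

The retry is only as good as the layer underneath: if the first call already closed stdin, the second call starts from a state the loop knows nothing about.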

haypo and pitrou: that may be true; I'm not competent to comment on that.

But, as pointed out in an earlier comment (and modulo the possibility that this is eventlet specific), just catching more exceptions isn't the answer.

If the descriptor is closed, whatever communicate()/_communicate() calls should be able to handle that situation. And this bug illustrates that at least eventlet doesn't handle that.

However, I submit to you that this is NOT an eventlet issue. Here's why.

The failure here is that a flush call is being attempted on a closed descriptor. I believe that the implementation of flush (in eventlet) is legitimately throwing an exception indicating that the state machine was violated (cannot flush on closed descriptor).
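The same state-machine rule can be seen with the standard library alone. The sketch below uses `io.BytesIO` as a stand-in for the child's stdin pipe; that substitution is an assumption made for illustration, and eventlet's own file objects are not exercised here.

```python
import io

def flush_closed_stream():
    """Return the error message raised by flushing a closed stream, or None."""
    stream = io.BytesIO()   # stand-in for the child's stdin (assumption)
    stream.write(b"input")
    stream.close()          # what the first communicate() does after sending input
    try:
        stream.flush()      # what a second communicate() would attempt
    except ValueError as exc:
        return str(exc)
    return None
```

A closed stream refusing to flush is ordinary behavior, which supports the argument that the exception itself is legitimate.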

The close() was invoked by subprocess.py after it finished doing what it thought it had to do with stdin on the first invocation. Therefore I believe it must be the responsibility of subprocess.py to make sure that, when invoked again, it doesn't step on itself.
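One way to picture "not stepping on itself" is a guard that checks the stream's state before touching it again. This is only a sketch of the idea, not what subprocess.py does; `send_input_once` is a hypothetical helper and `stream` is assumed to be any file-like object with a `closed` attribute.

```python
import io

def send_input_once(stream, data):
    """Write `data` and close `stream`; do nothing if it is already closed.

    Returns True if the input was sent by this call, False if the stream
    was missing or had already been closed by an earlier call.
    """
    if stream is None or stream.closed:
        return False
    stream.write(data)
    stream.flush()
    stream.close()
    return True
```

With a guard like this, a repeated call becomes a no-op on stdin instead of a flush on a closed descriptor.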

Either that, or subprocess.py's communicate() implementation should document that it can only be called once, catch all exceptions that would prompt a caller to retry (such as EAGAIN and EINTR), mask them, and return some EFATAL instead.
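The second alternative could look roughly like the wrapper below. It is a sketch under stated assumptions: `OneShotProcess` and `CommunicateFatalError` are hypothetical names, and the custom exception stands in for the "EFATAL" idea, which is not an actual errno value.

```python
import errno

class CommunicateFatalError(Exception):
    """Hypothetical non-retryable error, standing in for 'EFATAL'."""

class OneShotProcess:
    """Wrap a process-like object so communicate() can be called only once."""

    def __init__(self, proc):
        self._proc = proc
        self._used = False

    def communicate(self, input_data=None):
        if self._used:
            raise CommunicateFatalError("communicate() may only be called once")
        self._used = True
        try:
            return self._proc.communicate(input_data)
        except OSError as exc:
            # Mask errors that would otherwise invite the caller to retry.
            if exc.errno in (errno.EAGAIN, errno.EINTR):
                raise CommunicateFatalError(
                    "communicate() failed and cannot be retried"
                ) from exc
            raise
```

The caller then never sees an error that suggests a retry is safe, so a loop keyed on EAGAIN or EINTR simply stops.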

History
Date	User	Action	Args
2014-07-31 15:36:07	amrith	set	recipients: + amrith, pitrou, vstinner, gps, r.david.murray
2014-07-31 15:36:07	amrith	set	messageid: <1406820967.14.0.537716756737.issue22114@psf.upfronthosting.co.za>
2014-07-31 15:36:07	amrith	link	issue22114 messages
2014-07-31 15:36:06	amrith	create