classification
Title: test_subprocess.test_communicate_timeout_large_ouput failure on select(): negative timeout?
Type: behavior Stage: resolved
Components: Library (Lib), Tests Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, gregory.p.smith, haypo, neologix, pitrou, python-dev, rnk
Priority: normal Keywords: patch

Created on 2011-04-03 22:28 by pitrou, last changed 2013-08-02 08:57 by neologix. This issue is now closed.

Files
File name Uploaded Description Edit
select_negative_timeout.diff neologix, 2011-04-07 19:14 review
Messages (13)
msg132898 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-03 22:28
In http://www.python.org/dev/buildbot/all/builders/AMD64%20OpenIndiana%203.x/builds/904/steps/test/logs/stdio :

test test_subprocess failed -- Traceback (most recent call last):
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_subprocess.py", line 452, in test_communicate_timeout_large_ouput
    self.assertRaises(subprocess.TimeoutExpired, p.communicate, timeout=0.4)
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/unittest/case.py", line 574, in assertRaises
    callableObj(*args, **kwargs)
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/subprocess.py", line 846, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/subprocess.py", line 1539, in _communicate
    orig_timeout)
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/subprocess.py", line 1670, in _communicate_with_select
    self._remaining_time(endtime))
select.error: (22, 'Invalid argument')

Traceback (most recent call last):
  File "<string>", line 1, in <module>
IOError: [Errno 32] Broken pipe
msg132900 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-03 22:35
According to POSIX, EINVAL in select() means either "an invalid timeout interval was specified" or "the nfds argument is less than 0 or greater than FD_SETSIZE".

(http://www.opengroup.org/onlinepubs/007904875/functions/pselect.html)

Additionally, Python's select.select() should raise ValueError if a file descriptor is outside of [0; FD_SETSIZE[, so the invalid timeout interval seems possible.
msg132920 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-04 07:23
_remaining_time doesn't check that endtime > current time and can return a negative number, which would trigger an EINVAL when passed to select (select_select doesn't seem to check for negative double).
Note that a check is performed through _check_timeout but after having called select, so there are at least two possible ways to get this error:
The process blocks a little before calling select for the first time. This can at least happen here:
    if self.stdin and not self._communication_started:
        # Flush stdio buffer.  This might block, if the user has
        # been writing to .stdin in an uncontrolled fashion.
        self.stdin.flush()
        if not input:
            self.stdin.close()

There's also a short race window if the endtime deadline expires between the call to _check_timeout and remaining_time.
msg133018 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-05 11:16
New changeset 3664fc29e867 by Victor Stinner in branch 'default':
Issue #11757: subprocess ensures that select() and poll() timeout >= 0
http://hg.python.org/cpython/rev/3664fc29e867
msg133246 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-07 19:14
It seems to have fixed the failure, no ?
I don't know what's the policy regarding syscall parameters check, but
I think it'd be better to check that the timeout passed to select is
not negative, and raise an exception otherwise, instead of silently
storing it into struct timeval (with an overflow) before passing it to
select.
Attached is a patch + test that does just that.
msg133261 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-04-07 21:16
Adding that check with an exception to selectmodule.c is a good idea.
i like your patch.
msg133264 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-07 22:00
You may also patch poll_poll().
msg133291 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-08 05:34
> You may also patch poll_poll().
>

Poll accepts negative timeout values, since it's the only way to
specify an infinite wait (contrarily to select which can be passed
NULL).
msg133297 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-08 07:20
Le vendredi 08 avril 2011 à 05:34 +0000, Charles-Francois Natali a
écrit :
> Charles-Francois Natali <neologix@free.fr> added the comment:
> 
> > You may also patch poll_poll().
> >
> 
> Poll accepts negative timeout values, since it's the only way to
> specify an infinite wait (contrarily to select which can be passed
> NULL).

Oh, I didn't know. In this case, is my commit 3664fc29e867 correct? I
think that it is, because without the patch, subprocess may call poll()
with a negative timeout, and so it is no more a timeout at all.

If I am correct, it is a real bug. Should it be fixed in Python 2.7, 3.1
and 3.2? ... Hum, it looks like communicate() timeout was introduced in
Python 3.3: c4a0fa6e687c. This commit has no reference to an issue: it
is the issue #5673. And as it was already written in msg130851, the doc
is wrong: the doc indicates that the feature was introduced in 3.2, but
it is 3.3 only. The change is not documented in Misc/NEWS.
msg133406 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-09 18:13
> Oh, I didn't know. In this case, is my commit 3664fc29e867 correct? I
> think that it is, because without the patch, subprocess may call poll()
> with a negative timeout, and so it is no more a timeout at all.
>

Yes, it looks correct.
But I think there are a couple places left where functions can be
called with a negative timeout, for example here :

  1537                 stdout, stderr =
self._communicate_with_select(input, endtime,
  1538
orig_timeout)
  1539
  1540             self.wait(timeout=self._remaining_time(endtime))

or here:

  1113             if self.stdout is not None:
  1114                 self.stdout_thread.join(self._remaining_time(endtime))
  1115                 if self.stdout_thread.isAlive():

Also, it might be simpler and cleaner to factorize the raising of the
TimeoutExpired exception inside _remaining_time, instead of scattering
this kind of checks around the file:

  1514                     remaining = self._remaining_time(endtime)
  1515                     if remaining <= 0:
  1516                         raise TimeoutExpired(self.args, timeout)

merging what's done in _check_timeout
msg133427 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-09 21:50
New changeset 3982be773b54 by Antoine Pitrou in branch 'default':
Issue #11757: select.select() now raises ValueError when a negative timeout
http://hg.python.org/cpython/rev/3982be773b54
msg133497 - (view) Author: Reid Kleckner (rnk) (Python committer) Date: 2011-04-11 02:29
I think the best behavior would be to go ahead and check one last time before raising the exception, so _remaining_time should turn a negative value into 0 (assuming that a timeout value of zero does the right thing for our use case).

If people don't feel that is best, refactoring _remaining_time to incorporate the check in _check_timeout would also be good.
msg192377 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-05 23:32
Is there anything left to do here?
History
Date User Action Args
2013-08-02 08:57:03neologixsetstatus: pending -> closed
2013-07-05 23:32:46christian.heimessetstatus: open -> pending

nosy: + christian.heimes
messages: + msg192377

resolution: fixed
stage: needs patch -> resolved
2011-04-11 02:29:48rnksetmessages: + msg133497
2011-04-09 21:50:03python-devsetmessages: + msg133427
2011-04-09 18:13:32neologixsetmessages: + msg133406
2011-04-08 07:20:17hayposetmessages: + msg133297
2011-04-08 05:34:55neologixsetmessages: + msg133291
2011-04-07 22:00:32hayposetmessages: + msg133264
2011-04-07 21:16:20gregory.p.smithsetmessages: + msg133261
2011-04-07 19:14:52neologixsetfiles: + select_negative_timeout.diff
keywords: + patch
messages: + msg133246
2011-04-05 11:16:13python-devsetnosy: + python-dev
messages: + msg133018
2011-04-05 10:57:18hayposettitle: test_subprocess failure -> test_subprocess.test_communicate_timeout_large_ouput failure on select(): negative timeout?
2011-04-04 08:55:43hayposetnosy: + haypo
2011-04-04 07:23:33neologixsetnosy: + neologix
messages: + msg132920
2011-04-03 22:35:34pitrousetmessages: + msg132900
2011-04-03 22:28:24pitroucreate