classification
Title: select module: loop if the timeout is too large (OverflowError "timeout is too large")
Type: Stage:
Components: asyncio Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: gvanrossum, haypo, neologix, python-dev, yselivanov, ziesemer
Priority: normal Keywords: patch

Created on 2014-02-03 00:37 by haypo, last changed 2017-04-19 04:50 by ziesemer.

Files
File name Uploaded Description
timeout_overflow.py haypo, 2014-02-03 00:37
asyncio_timeout_overflow.patch haypo, 2014-02-03 00:40
Messages (10)
msg210061 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-03 00:37
In asyncio, if the next event is 2^40 seconds in the future, epoll.poll() raises an OverflowError because the maximum timeout that epoll_wait() accepts is INT_MAX milliseconds.

Run the attached timeout_overflow.py to reproduce the issue.
msg210062 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-03 00:40
Attached patch fixes the issue, but it has no unit test :-(

On Windows, it looks like IocpProactor can also raise an error if the timeout is too large:

            # GetQueuedCompletionStatus() has a resolution of 1 millisecond,
            # round away from zero to wait *at least* timeout seconds.
            ms = math.ceil(timeout * 1e3)
            if ms >= INFINITE:
                raise ValueError("timeout too big")

with INFINITE = 0xffffffff.
msg210070 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-02-03 01:24
Shouldn't this be fixed in the C implementation of the select module or in selectors.py? It seems likely that the exact range might be different for each syscall and possibly per OS or even OS version.
msg210100 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-03 08:48
> Shouldn't this be fixed in the C implementation of the select module or
> in selectors.py? It seems likely that the exact range might be different
> for each syscall and possibly per OS or even OS version.

Agreed: if we want to fix this, it should be done in the select module.

I'm saying "if", because we could either consider such a large timeout as
an error and report it (like it's currently done), or silently cap the
timeout.
The latter approach is used by libevent, and makes sense, to a certain
extent (we just need to consider whether this can cause backward
compatibility issues).
msg210154 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-02-03 18:39
I guess whenever you have a timeout like that it's the product of a bad calculation in the app (wouldn't be the first time that someone multiplied by 1000 instead of dividing to go from milliseconds to seconds :-). So it would be good to catch this in call_later() / call_at() -- when it's caught in <selector>.select() the developer will have a harder time debugging which event had the bad time.

Is there a limit we can set on times that is clearly absurd for real times? E.g. past the year 9999? We should have two numbers: one value below which times will definitely be accepted (even if they may sound a bit absurd), another, higher value above which bad times will definitely be rejected. In between it may depend on the epoch of time.monotonic(), so I don't want to be too precise in the promises here.

I suppose all the related system calls have *some* limit, it's just that only poll and epoll internally use an integer (I suppose 64-bit?) expressing milliseconds, which is a little too close for comfort?

OK, I just played around with the various selector classes on OS X. I only tried values of t that were a power of 2 minus one (starting at 2**65 - 1).

For SelectSelector, I get OverflowError for t >= 18446744073709551615, OSError(EINVAL) for t >= 134217727.

For PollSelector, I get OverflowError for t >= 4194303, never OSError.

For KqueueSelector, I get OverflowError for t >= 18446744073709551615, OSError(EINVAL) for t >= 134217727.

Of all these, 4194303 is the smallest, it's only 2**22-1, i.e. 48 days in the future (and all I know is that 2**21-1 worked -- I don't know about values in between). But even 134217727 (2**27 - 1) is not that large, only about 4 years. I can easily see apps (e.g. calendars) manage real events that far in the future, knowing full well that they won't ever wait that long, but trying to treat all events uniformly.

I now think that the selector classes probably shouldn't have to deal with this (a selector can't really know when the syscall will raise OSError, and it shouldn't loop), but asyncio should be better behaved. Perhaps it should reject times that are close to the OverflowError limit in call_later() / call_at(), but silently wait for a shorter period when the selector's select() raises OSError? (If it weren't for the really low limit with poll(), I'd just substitute None, expecting the process to die long before the event fires, but it's not unreasonable to expect a server process to stay up for months.)
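
For reference, the thresholds quoted in this message convert to days and years as follows. This is only a sketch of the arithmetic; the dictionary keys and the helper name `as_days` are made up for illustration.

```python
SECONDS_PER_DAY = 86400

# Thresholds (in seconds) reported above for OS X.
limits = {
    "PollSelector OverflowError": 2**22 - 1,           # 4194303
    "Select/KqueueSelector OSError(EINVAL)": 2**27 - 1,  # 134217727
}


def as_days(seconds):
    """Convert a number of seconds to (fractional) days."""
    return seconds / SECONDS_PER_DAY


for name, seconds in limits.items():
    days = as_days(seconds)
    print(f"{name}: {seconds} s = {days:.1f} days = {days / 365.25:.2f} years")
```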
msg211384 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-17 01:39
> Of all these, 4194303 is the smallest, it's only 2**22-1, i.e. 48 days in the future 

Maybe asyncio can use a maximum timeout of 30 days? Or maybe even 1 day. Waking up once a day to recompute the timeout should not kill the battery of your laptop or of your phone.
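
A minimal sketch of the capping idea, assuming the event loop recomputes its select timeout on every wakeup. The names `MAX_SELECT_TIMEOUT`, `capped_timeout` and `wakeups_until` are hypothetical and are not taken from the attached patch; the loop is simulated with a fake clock rather than a real selector.

```python
MAX_SELECT_TIMEOUT = 24 * 3600  # one day, in seconds (hypothetical constant)


def capped_timeout(when, now, max_timeout=MAX_SELECT_TIMEOUT):
    """Seconds to wait before the next wakeup, never more than max_timeout."""
    return min(max(0.0, when - now), max_timeout)


def wakeups_until(when, now=0.0, max_timeout=MAX_SELECT_TIMEOUT):
    """Simulate the loop: count how many wakeups happen before `when`."""
    count = 0
    while now < when:
        # A real loop would block in select() here; we just advance the clock.
        now += capped_timeout(when, now, max_timeout)
        count += 1
    return count


print(capped_timeout(2**40, 0.0))   # the huge deadline is capped to one day
print(wakeups_until(10 * 86400))    # an event 10 days away costs 10 wakeups
```

With such a cap, the value passed to the selector stays far below every limit reported in this thread, at the cost of one extra wakeup per day.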
msg211474 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-02-18 04:12
For now, can we just add to the asyncio docs that timeouts shouldn't exceed one day? Then we can fix it later without breaking expectations.
msg211491 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-18 08:37
New changeset 79e5bb0d9b8e by Victor Stinner in branch 'default':
Issue #20493: Document that asyncio should not exceed one day
http://hg.python.org/cpython/rev/79e5bb0d9b8e
msg213813 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-17 06:30
New changeset 41c6c066feb2 by Victor Stinner in branch '3.4':
Issue #20493: Document that asyncio should not exceed one day
http://hg.python.org/cpython/rev/41c6c066feb2
msg291860 - (view) Author: Mark A. Ziesemer (ziesemer) Date: 2017-04-19 04:50
Not sure what may have changed here over the past 3 years, but some current findings:

For _UnixSelectorEventLoop, "/usr/lib/python3.5/selectors.py", line 445, in select, fd_event_list = self._epoll.poll(timeout, max_ev), Python 3.5.3 (or 3.6.1), Linux 4.10.0-19 x86_64 (or Cygwin 2.8.0):

2,147,483 is acceptable, 2,147,484 is not.  (Either side of 2**31/1000.)  With the seconds/milliseconds conversions, I suppose the previous testing here just didn't get this specific - but this is only 24.86 days, just ever so slightly above half of the "48 days" figure mentioned above.  So no, not even the "maximum timeout of 30 days" previously proposed would be sufficient - though the currently documented "one day" maximum is.
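
The boundary can be checked with a little arithmetic. The sketch below assumes the timeout in seconds is rounded up to whole milliseconds and must then fit in a signed 32-bit int; the rounding mode is an assumption and is not taken from the select module source. The helper name `fits_epoll_timeout` is hypothetical.

```python
import math

INT_MAX = 2**31 - 1  # largest signed 32-bit int, the millisecond limit


def fits_epoll_timeout(timeout_s):
    """True if timeout_s, converted to milliseconds, fits in a C int."""
    return math.ceil(timeout_s * 1000) <= INT_MAX


print(fits_epoll_timeout(2_147_483))   # True
print(fits_epoll_timeout(2_147_484))   # False
print(round(2_147_483 / 86400, 2))     # 24.86 (days)
```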

So if intending to use this as a serious scheduler, what to do?  I considered checking for values more than 86,400 (1 day) before scheduling - and if in excess, instead scheduling a proxy that would repeatedly reschedule the next interval as needed in < 1 day increments.  However, this would require some special handling of the asyncio.Handle objects that are returned to cancel the callback - as a new handle would be required for each renewal.
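
One way the proxy idea could look, as a rough sketch only: a small wrapper keeps track of whichever asyncio handle is current, so the caller has a single object to cancel. The class name `LongTimer` and its `max_step` parameter are hypothetical; only `loop.call_at()`, `loop.call_later()`, `loop.time()` and `Handle.cancel()` are real asyncio APIs.

```python
import asyncio


class LongTimer:
    """Schedule callback(*args) at loop time `when`, in steps of max_step."""

    def __init__(self, loop, when, callback, *args, max_step=86400.0):
        self._loop = loop
        self._when = when
        self._callback = callback
        self._args = args
        self._max_step = max_step
        self._cancelled = False
        self._handle = None
        self._schedule()

    def _schedule(self):
        if self._cancelled:
            return
        remaining = self._when - self._loop.time()
        if remaining <= self._max_step:
            # Close enough: schedule the real callback.
            self._handle = self._loop.call_at(
                self._when, self._callback, *self._args)
        else:
            # Still too far away: wake up later and look again.
            self._handle = self._loop.call_later(
                self._max_step, self._schedule)

    def cancel(self):
        """Cancel the timer, whichever intermediate handle is pending."""
        self._cancelled = True
        if self._handle is not None:
            self._handle.cancel()
```

The caller keeps the `LongTimer` instead of the raw handle, so cancellation works no matter how many renewals have happened.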

It would seem that a better and simpler approach could simply be to ensure that a recurring "dummy" task (or heartbeat) is scheduled to run every 1 day or less.  As long as another task is scheduled in the queue ahead of any tasks with "excessively long" delays, the longer delay will never be passed to poll() until it is reduced to within the smaller threshold.  I can do this within my own code - but this could maybe also be further considered to happen automagically within asyncio.  Am I missing any further considerations or gotchas here?
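
A minimal sketch of such a heartbeat, assuming a one-hour interval is acceptable. The class name `Heartbeat`, the `beats` counter and `HEARTBEAT_INTERVAL` are made up for illustration; the only asyncio calls used are `loop.call_later()` and `Handle.cancel()`.

```python
import asyncio

HEARTBEAT_INTERVAL = 3600.0  # seconds; hypothetical, anything <= 1 day


class Heartbeat:
    """Keep a short-delay callback queued so select() never waits long."""

    def __init__(self, loop, interval=HEARTBEAT_INTERVAL):
        self._loop = loop
        self._interval = interval
        self._handle = None
        self.beats = 0  # number of times the heartbeat has fired

    def start(self):
        self._handle = self._loop.call_later(self._interval, self._beat)

    def _beat(self):
        self.beats += 1
        self.start()  # re-arm, so one heartbeat is always pending

    def stop(self):
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None
```

Because the heartbeat is always the earliest scheduled callback once any other delay exceeds the interval, the timeout handed to the selector never exceeds `interval` seconds.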

(See also: http://stackoverflow.com/questions/27129037/why-is-there-a-limit-on-delayed-calls-like-asyncio-call-later-to-not-exceed-one)
History
Date                 User        Action  Args
2017-04-19 04:50:28  ziesemer    set     nosy: + ziesemer; messages: + msg291860
2014-06-19 15:49:31  haypo       set     title: asyncio: OverflowError('timeout is too large') -> select module: loop if the timeout is too large (OverflowError "timeout is too large")
2014-06-06 11:41:47  haypo       set     nosy: + yselivanov; components: + asyncio
2014-03-17 06:30:55  python-dev  set     messages: + msg213813
2014-02-18 08:37:53  python-dev  set     nosy: + python-dev; messages: + msg211491
2014-02-18 04:12:36  gvanrossum  set     messages: + msg211474
2014-02-17 01:40:00  haypo       set     messages: + msg211384
2014-02-03 18:39:35  gvanrossum  set     messages: + msg210154
2014-02-03 08:48:23  neologix    set     messages: + msg210100
2014-02-03 01:24:35  gvanrossum  set     messages: + msg210070
2014-02-03 00:40:40  haypo       set     files: + asyncio_timeout_overflow.patch; nosy: + gvanrossum, neologix; messages: + msg210062; keywords: + patch
2014-02-03 00:37:50  haypo       create