classification
Title: Document that sched.cancel() doesn't distinguish equal events and can break order
Type: enhancement Stage: patch review
Components: Documentation Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: bar.harel, docs@python, giampaolo.rodola, josh.r, pitrou, rhettinger, serhiy.storchaka, wocket
Priority: normal Keywords: easy, patch

Created on 2013-10-16 08:47 by serhiy.storchaka, last changed 2020-10-19 07:33 by serhiy.storchaka.

Files
File name Uploaded Description Edit
sched_test_stable.patch serhiy.storchaka, 2013-10-16 08:47 review
Pull Requests
URL Status Linked Edit
PR 22729 merged bar.harel, 2020-10-16 19:49
Messages (23)
msg200040 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-16 08:47
sched.cancel() breaks events order if events are scheduled on same time and have same priority.

Patch contains tests which expose this issue.
msg200048 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-16 10:01
Actually there are two bugs:

1. sched.cancel() can remove wrong event (because it uses equality instead identity).

2. sched.cancel() change order of equal (by time and priority) events.
msg200698 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-21 05:12
> 1. sched.cancel() can remove wrong event
> (because it uses equality instead identity).

I don't think this is a bug.  The user should be able to define events that can be compared on equality.  It is the user's responsibility to make sure that events are distinct (this is no different than how list.remove() works for example).

> 2. sched.cancel() change order of equal (by time and priority) events.

That is allowed.   We make no stability guarantees.  Plus it just makes sense that events with the same time and priority are non-deterministic.

Note, to add stability we would have to store the order that events were added.  This would slow down the code.

Add note that Tornado uses heapq in its ioloop in the same way that the schedule module does.  It too makes no effort to preserve input order.

I recommend closing this as "not a bug" or as an "further document the existing behaviors".
msg200708 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 07:49
If the order of events with the same time and priority doesn't matter, we can optimize the queue property more than 10 times by using sort() (see cancel_4.patch in issue13451).
msg200711 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-21 07:57
> That is allowed.   We make no stability guarantees.  Plus it just makes sense that events with the same time and priority are non-deterministic.

It would be nice to keep the insertion order. If it is not guaranteed, it must be well documented (big red warning).

Does it occur frequently to schedule two events at exactly the same time? On Linux, clocks have a good precision, even time.monotonic(). On Windows, the precision of time.monotonic() is worse (around 16 ms), so two events at the same time is more likely.

The issue is specific to events scheduled at an absolute time, right? Or does .enter() has a similar issue?
msg200714 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-21 08:14
> Does it occur frequently to schedule two events at exactly the same
> time? On Linux, clocks have a good precision, even time.monotonic().

It depends how you calculate your timestamps, I'd say :-) It's unlikely
for two calls to time.time() to give the exact same outcome. OTOH, if
you're using a fixed base time and add user-provided timedeltas to it,
collisions are quite possible.

(a special case is using a 0 delay, in order to schedule a call for
the next loop iteration)
msg200715 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-21 08:19
> ... for two calls to time.time() to give the exact same outcome

sched now uses time.monotonic() since Python 3.3. In Python 2.7, there is no default timer, but I get that users pick time.time(). (Oh, sched has no unit test in Python 2.7.)
msg200717 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-21 08:23
> > ... for two calls to time.time() to give the exact same outcome
> 
> sched now uses time.monotonic() since Python 3.3.

This seems quite irrelevant:

  >>> len(set(time.monotonic() for i in range(100)))
  100
msg200719 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-21 08:28
Victor: "On Windows, the precision of time.monotonic() is worse (around 16 ms), so two events at the same time is more likely."

Antoine: "This seems quite irrelevant:"

I would be interested of the same test on Windows.
msg200788 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-21 14:26
This module has been around for a long time and no users have reported any issues with respect to event ordering.  The logic is essentially the same as is used in popular event loops like Tornado.

Please don't just make-up a new invariant for this module.
msg200798 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 15:04
Well. In any case I have no ideas how fix it. Any behavior change will break existing code.

I requalify this issue as documentation enhancement. The documentation should specify that evens scheduled on same tame with same priority considered equal. That cancel() doesn't distinguish equal events and can cancel arbitrary of them. That cancel() doesn't preserve order of equal events.
msg200799 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-21 15:09
@Raymond: The logic is essentially the same as is used in popular event loops like Tornado.

Sorry, I didn't understand your sentence: do you mean that sched must keep the insertion order or the order is undefined?

Undefined order is fine if it's documented.
msg200828 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-21 18:53
I'm not strong on a new invariant, however I think bug #1 would deserve fixing:

> 1. sched.cancel() can remove wrong event (because it uses equality instead identity).
msg200830 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 19:43
> I'm not strong on a new invariant, however I think bug #1 would deserve fixing:

How would you do this?

1. Use identity instead equality to search canceled event. It will break code where an user cancels an event by time and priority: scheduler.cancel(Event(time, priority, ...)).

2. Always cancel chronologically last (first) of equals events. This requires popp-ing and push-ing all events. Too slooooow.

3. Add an ordered number to the event. This will slow down all too.

4. Mixed strategy. First use identity search, then equality search, and only if found several equals events fallback to slow variant. This is too complicated. It will work as expected in most normal cases, but in rare cases... This behavior would hard to document and understand.

If you have better idea, patch is welcome.
msg200831 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-21 19:46
> 4. Mixed strategy. First use identity search, then equality search,
> and only if found several equals events fallback to slow variant. This
> is too complicated. It will work as expected in most normal cases, but
> in rare cases... This behavior would hard to document and understand.

You're right. So perhaps this should simply be documented as is.
msg200832 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2013-10-21 19:49
> This module has been around for a long time and no users have reported 
> any issues with respect to event ordering.  The logic is essentially 
> the same as is used in popular event loops like Tornado.
> Please don't just make-up a new invariant for this module.

+1
msg244346 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2015-05-28 21:08
sched has been around for a long time, but it's been useless for so many purposes that it *should* handle (completely unsafe in threaded contexts until 3.3, still can't handle useful threaded scenarios today, e.g. scheduling tasks for short delays when draining the task queue, waiting on a task with a long delay, see #20126 ) that calling it acceptable is more about lack of available uses than acceptable design.

Saying "don't schedule two events for the same time and priority if expect to cancel either of them" is not a reasonable solution; the main reason to schedule two such events in that way would be to have the option to cancel one but not the other; as is, trying to cancel one will (pseudo-)randomly cancel either of them. I don't particularly care how it's fixed (though the proposed fix for #13451 seems like a good one), and the issue with changing event order isn't so important, but cancelling the wrong event is really bad.
msg327404 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2018-10-09 14:18
Victor: "I would be interested of the same test on Windows."

Looks like someone performed it by accident, and filed #34943 in response (because time.monotonic() did in fact return the exact same time twice in a row on Windows).
msg371895 - (view) Author: Bar Harel (bar.harel) * Date: 2020-06-19 17:12
I've just encountered this bug as well.

Although this issue is old, #1 should be fixed whether we like it or not. Cancelling an unintended event instead of the one we wanted is a bug, and prevents me from using the library at all (I schedule events based on absolute time).

To be honest, I do not believe anyone uses scheduler.cancel(Event(time, priority, ...)) in order to cancel an event based on the given time and priority.

Even if they do so, Event itself is undocumented and is being treated mostly as an ID for a currently scheduled event.

A good solution would be to add a DeprecationWarning to the ordering functions, so people won't use them, and remove __eq__ at 3.12.

The only issue that will arise is that an event1 might be >= than event2, <= than event 2, and != to event 2, as the other ordering will stay. We can fix this by implementing a slower lookup but I believe it isn't a real issue.
msg372139 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-06-23 04:27
> Cancelling an unintended event instead of the one we wanted is a bug,
> and prevents me from using the library at all

That problem is certainly worth fixing even if we have to break a few eggs to do it (this module has been around for a very long time and almost certainly every little detail is being relied on by some users).

The docstring for enter() and enterabs() only promises that it returns an ID for an event and that the ID can be used to remove the event.

The regular docs for enter() and enterabs() make a stronger promise that they return an "event" but doesn't specify what that means.

The Event class name doesn't has a leading underscore but is not listed in __all__ and is not in the main docs.

The *queue* property is documented to return an ordered list of upcoming events which are defined to be named tuples with fields for: time, priority, action, arguments, kwargs.  That is the most specific and restrictive of the documented promises, one that presumably is being relied on in a myriad of ways (a repr suitable for logging, a pretty printable list, a sortable list, named access to fields, unpackable tuples, etc).

If we respect those published promises, one way to go is to add a unique identifier field to the Event class.  If the that unique id is consecutively numbered, it would also allow us to guarantee FIFO ordering (not sure whether that is actually need though).

Here's a preliminary sketch:

   scheduler.__init__():
+      self.event_sequence = 0   # updated as events are created
        ...

   scheduler.enterabs():
-       event = Event(time, priority, action, argument, kwargs)
+       self.event_sequence += 1
+       event = Event(time, priority, self.event_sequence, action, argument, kwargs)

- class Event(namedtuple('Event', 'time, priority, action, argument, kwargs')):
-    __slots__ = []
-    def __eq__(s, o): return (s.time, s.priority) == (o.time, o.priority)
-    def __lt__(s, o): return (s.time, s.priority) <  (o.time, o.priority)
-    def __le__(s, o): return (s.time, s.priority) <= (o.time, o.priority)
-    def __gt__(s, o): return (s.time, s.priority) >  (o.time, o.priority)
-    def __ge__(s, o): return (s.time, s.priority) >= (o.time, o.priority)
+ Event = namedtuple('Event', 'time, priority, sequence, action, argument, kwargs')

The user visible effects are:

* A new field would be displayed.
* Any code that constructs Event objects manually would break instantly.
* Any code using tuple unpacking on Event objects would break instantly.
* Slightly slower Event creation time.
* Faster heap updates (because native tuple ordering is faster than pure python rich comparisons).

However, outside the mention in the documentation for the *queue* property, Event objects are not part of the public API.  Also, we do have precedent for adding new fields without a negative consequence (argument and kwargs were added in Python 3.3).

An alternative solution for removing exact events (not for FIFO stability) is to have enter() and enterabs() return an opaque, unique event identifier.  Then the code for cancel() would need to be changed to something like:

    self._queue = [e for e in self._queue if id(e) != event]
    heapq.heapify(self._queue)

The user visible effects are:

* enter() and enterabs() no longer return an event object
  (it was implied that they would, but not explictly promised).
  Any code relying on an actual Event object being returned
  would break.
* cancel() would only accept the opaque IDs
  (again, it was only implied that actual Event objects would be used).
  Code would break that took Event objects out of the *queue* listing
  and passed those into cancel().
* cancel() would run a bit faster (because calling id() is cheaper
  that calling the pure python rich comparisons and having them
  extract the time and priority fields).

Either way breaks a few eggs but makes the module safer to use.
msg372142 - (view) Author: Bar Harel (bar.harel) * Date: 2020-06-23 07:38
@Raymond your first idea sounds good and was the first thing that came to my mind.
I only worried about breaking things, so I gave the more conservative suggestion.

If breaking a few eggs isn't an issue and the implications of your idea are agreed upon, I'll patch and add a PR.
msg376985 - (view) Author: Jonas Norling (wocket) Date: 2020-09-16 10:40
@bar.harel: I didn't find a PR, so I'd like to encourage you to submit one :-)

I stumbled onto this bug when the scheduler would cancel the wrong event for me (Python 3.7, 3.8). Raymond's suggestion 1 sounds reasonable; it would be very unlikely to break code that doesn't depend on internals in sched, and it simplifies the implementation a bit.
msg378930 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-19 07:33
New changeset 5368c2b6e23660cbce7e38dc68f859c66ac349ee by Bar Harel in branch 'master':
bpo-19270: Fixed sched.scheduler.cancel to cancel correct event (GH-22729)
https://github.com/python/cpython/commit/5368c2b6e23660cbce7e38dc68f859c66ac349ee
History
Date User Action Args
2020-10-19 07:33:59serhiy.storchakasetmessages: + msg378930
2020-10-16 19:49:20bar.harelsetstage: needs patch -> patch review
pull_requests: + pull_request21694
2020-09-16 10:40:21wocketsetnosy: + wocket
messages: + msg376985
2020-06-23 07:38:23bar.harelsetmessages: + msg372142
2020-06-23 04:27:34rhettingersetmessages: + msg372139
2020-06-22 17:00:08vstinnersetnosy: - vstinner
2020-06-19 17:12:08bar.harelsetnosy: + bar.harel
messages: + msg371895
2018-10-09 14:18:09josh.rsetmessages: + msg327404
2015-05-28 21:08:43josh.rsetnosy: + josh.r
messages: + msg244346
2013-10-21 19:49:05giampaolo.rodolasetmessages: + msg200832
2013-10-21 19:46:15pitrousetmessages: + msg200831
2013-10-21 19:43:03serhiy.storchakasetmessages: + msg200830
2013-10-21 18:53:31pitrousetmessages: + msg200828
2013-10-21 15:09:20vstinnersetmessages: + msg200799
2013-10-21 15:05:32serhiy.storchakaunlinkissue13451 dependencies
2013-10-21 15:04:58serhiy.storchakasetassignee: docs@python
type: behavior -> enhancement

components: + Documentation, - Library (Lib)
title: sched.cancel() breaks events order -> Document that sched.cancel() doesn't distinguish equal events and can break order
keywords: + easy
nosy: + docs@python
versions: + Python 2.7, Python 3.3
messages: + msg200798
2013-10-21 14:26:33rhettingersetmessages: + msg200788
versions: - Python 2.7, Python 3.3
2013-10-21 08:28:11vstinnersetmessages: + msg200719
2013-10-21 08:23:53pitrousetmessages: + msg200717
2013-10-21 08:19:45vstinnersetmessages: + msg200715
2013-10-21 08:14:41pitrousetmessages: + msg200714
2013-10-21 07:57:09vstinnersetnosy: + vstinner
messages: + msg200711
2013-10-21 07:49:19serhiy.storchakasetmessages: + msg200708
2013-10-21 05:12:14rhettingersetmessages: + msg200698
2013-10-16 10:01:07serhiy.storchakasetmessages: + msg200048
2013-10-16 08:49:35serhiy.storchakalinkissue13451 dependencies
2013-10-16 08:47:32serhiy.storchakacreate