Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sched.cancel() doesn't distinguish equal events and can break order #63469

Closed
serhiy-storchaka opened this issue Oct 16, 2013 · 24 comments
Closed
Labels
3.10 only security fixes docs Documentation in the Doc dir easy type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 19270
Nosy @rhettinger, @pitrou, @giampaolo, @serhiy-storchaka, @MojoVampire, @bharel, @akulakov
PRs
  • bpo-19270: Fixed sched.scheduler.cancel to cancel correct event #22729
  • Files
  • sched_test_stable.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-11-22.10:43:52.847>
    created_at = <Date 2013-10-16.08:47:32.204>
    labels = ['easy', 'type-bug', '3.10', 'docs']
    title = "sched.cancel() doesn't distinguish equal events and can break order"
    updated_at = <Date 2021-11-22.10:43:52.846>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-11-22.10:43:52.846>
    actor = 'iritkatriel'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2021-11-22.10:43:52.847>
    closer = 'iritkatriel'
    components = ['Documentation']
    creation = <Date 2013-10-16.08:47:32.204>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['32142']
    hgrepos = []
    issue_num = 19270
    keywords = ['patch', 'easy']
    message_count = 24.0
    messages = ['200040', '200048', '200698', '200708', '200711', '200714', '200715', '200717', '200719', '200788', '200798', '200799', '200828', '200830', '200831', '200832', '244346', '327404', '371895', '372139', '372142', '376985', '378930', '398628']
    nosy_count = 9.0
    nosy_names = ['rhettinger', 'pitrou', 'giampaolo.rodola', 'docs@python', 'serhiy.storchaka', 'josh.r', 'bar.harel', 'wocket', 'andrei.avk']
    pr_nums = ['22729']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue19270'
    versions = ['Python 3.10']

    @serhiy-storchaka
    Copy link
    Member Author

    sched.cancel() breaks events order if events are scheduled on same time and have same priority.

    Patch contains tests which expose this issue.

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 16, 2013
    @serhiy-storchaka
    Copy link
    Member Author

    Actually there are two bugs:

    1. sched.cancel() can remove wrong event (because it uses equality instead identity).

    2. sched.cancel() change order of equal (by time and priority) events.

    @rhettinger
    Copy link
    Contributor

    1. sched.cancel() can remove wrong event
      (because it uses equality instead identity).

    I don't think this is a bug. The user should be able to define events that can be compared on equality. It is the user's responsibility to make sure that events are distinct (this is no different than how list.remove() works for example).

    1. sched.cancel() change order of equal (by time and priority) events.

    That is allowed. We make no stability guarantees. Plus it just makes sense that events with the same time and priority are non-deterministic.

    Note, to add stability we would have to store the order that events were added. This would slow down the code.

    Add note that Tornado uses heapq in its ioloop in the same way that the schedule module does. It too makes no effort to preserve input order.

    I recommend closing this as "not a bug" or as an "further document the existing behaviors".

    @serhiy-storchaka
    Copy link
    Member Author

    If the order of events with the same time and priority doesn't matter, we can optimize the queue property more than 10 times by using sort() (see cancel_4.patch in bpo-13451).

    @vstinner
    Copy link
    Member

    That is allowed. We make no stability guarantees. Plus it just makes sense that events with the same time and priority are non-deterministic.

    It would be nice to keep the insertion order. If it is not guaranteed, it must be well documented (big red warning).

    Does it occur frequently to schedule two events at exactly the same time? On Linux, clocks have a good precision, even time.monotonic(). On Windows, the precision of time.monotonic() is worse (around 16 ms), so two events at the same time is more likely.

    The issue is specific to events scheduled at an absolute time, right? Or does .enter() has a similar issue?

    @pitrou
    Copy link
    Member

    pitrou commented Oct 21, 2013

    Does it occur frequently to schedule two events at exactly the same
    time? On Linux, clocks have a good precision, even time.monotonic().

    It depends how you calculate your timestamps, I'd say :-) It's unlikely
    for two calls to time.time() to give the exact same outcome. OTOH, if
    you're using a fixed base time and add user-provided timedeltas to it,
    collisions are quite possible.

    (a special case is using a 0 delay, in order to schedule a call for
    the next loop iteration)

    @vstinner
    Copy link
    Member

    ... for two calls to time.time() to give the exact same outcome

    sched now uses time.monotonic() since Python 3.3. In Python 2.7, there is no default timer, but I get that users pick time.time(). (Oh, sched has no unit test in Python 2.7.)

    @pitrou
    Copy link
    Member

    pitrou commented Oct 21, 2013

    > ... for two calls to time.time() to give the exact same outcome

    sched now uses time.monotonic() since Python 3.3.

    This seems quite irrelevant:

      >>> len(set(time.monotonic() for i in range(100)))
      100

    @vstinner
    Copy link
    Member

    Victor: "On Windows, the precision of time.monotonic() is worse (around 16 ms), so two events at the same time is more likely."

    Antoine: "This seems quite irrelevant:"

    I would be interested of the same test on Windows.

    @rhettinger
    Copy link
    Contributor

    This module has been around for a long time and no users have reported any issues with respect to event ordering. The logic is essentially the same as is used in popular event loops like Tornado.

    Please don't just make-up a new invariant for this module.

    @serhiy-storchaka
    Copy link
    Member Author

    Well. In any case I have no ideas how fix it. Any behavior change will break existing code.

    I requalify this issue as documentation enhancement. The documentation should specify that evens scheduled on same tame with same priority considered equal. That cancel() doesn't distinguish equal events and can cancel arbitrary of them. That cancel() doesn't preserve order of equal events.

    @serhiy-storchaka serhiy-storchaka added docs Documentation in the Doc dir easy and removed stdlib Python modules in the Lib dir labels Oct 21, 2013
    @serhiy-storchaka serhiy-storchaka changed the title sched.cancel() breaks events order Document that sched.cancel() doesn't distinguish equal events and can break order Oct 21, 2013
    @serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Oct 21, 2013
    @vstinner
    Copy link
    Member

    @Raymond: The logic is essentially the same as is used in popular event loops like Tornado.

    Sorry, I didn't understand your sentence: do you mean that sched must keep the insertion order or the order is undefined?

    Undefined order is fine if it's documented.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 21, 2013

    I'm not strong on a new invariant, however I think bug #1 would deserve fixing:

    1. sched.cancel() can remove wrong event (because it uses equality instead identity).

    @serhiy-storchaka
    Copy link
    Member Author

    I'm not strong on a new invariant, however I think bug #1 would deserve fixing:

    How would you do this?

    1. Use identity instead equality to search canceled event. It will break code where an user cancels an event by time and priority: scheduler.cancel(Event(time, priority, ...)).

    2. Always cancel chronologically last (first) of equals events. This requires popp-ing and push-ing all events. Too slooooow.

    3. Add an ordered number to the event. This will slow down all too.

    4. Mixed strategy. First use identity search, then equality search, and only if found several equals events fallback to slow variant. This is too complicated. It will work as expected in most normal cases, but in rare cases... This behavior would hard to document and understand.

    If you have better idea, patch is welcome.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 21, 2013

    1. Mixed strategy. First use identity search, then equality search,
      and only if found several equals events fallback to slow variant. This
      is too complicated. It will work as expected in most normal cases, but
      in rare cases... This behavior would hard to document and understand.

    You're right. So perhaps this should simply be documented as is.

    @giampaolo
    Copy link
    Contributor

    This module has been around for a long time and no users have reported
    any issues with respect to event ordering. The logic is essentially
    the same as is used in popular event loops like Tornado.
    Please don't just make-up a new invariant for this module.

    +1

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented May 28, 2015

    sched has been around for a long time, but it's been useless for so many purposes that it *should* handle (completely unsafe in threaded contexts until 3.3, still can't handle useful threaded scenarios today, e.g. scheduling tasks for short delays when draining the task queue, waiting on a task with a long delay, see bpo-20126 ) that calling it acceptable is more about lack of available uses than acceptable design.

    Saying "don't schedule two events for the same time and priority if expect to cancel either of them" is not a reasonable solution; the main reason to schedule two such events in that way would be to have the option to cancel one but not the other; as is, trying to cancel one will (pseudo-)randomly cancel either of them. I don't particularly care how it's fixed (though the proposed fix for bpo-13451 seems like a good one), and the issue with changing event order isn't so important, but cancelling the wrong event is really bad.

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Oct 9, 2018

    Victor: "I would be interested of the same test on Windows."

    Looks like someone performed it by accident, and filed bpo-34943 in response (because time.monotonic() did in fact return the exact same time twice in a row on Windows).

    @bharel
    Copy link
    Mannequin

    bharel mannequin commented Jun 19, 2020

    I've just encountered this bug as well.

    Although this issue is old, #1 should be fixed whether we like it or not. Cancelling an unintended event instead of the one we wanted is a bug, and prevents me from using the library at all (I schedule events based on absolute time).

    To be honest, I do not believe anyone uses scheduler.cancel(Event(time, priority, ...)) in order to cancel an event based on the given time and priority.

    Even if they do so, Event itself is undocumented and is being treated mostly as an ID for a currently scheduled event.

    A good solution would be to add a DeprecationWarning to the ordering functions, so people won't use them, and remove __eq__ at 3.12.

    The only issue that will arise is that an event1 might be >= than event2, <= than event 2, and != to event 2, as the other ordering will stay. We can fix this by implementing a slower lookup but I believe it isn't a real issue.

    @rhettinger
    Copy link
    Contributor

    Cancelling an unintended event instead of the one we wanted is a bug,
    and prevents me from using the library at all

    That problem is certainly worth fixing even if we have to break a few eggs to do it (this module has been around for a very long time and almost certainly every little detail is being relied on by some users).

    The docstring for enter() and enterabs() only promises that it returns an ID for an event and that the ID can be used to remove the event.

    The regular docs for enter() and enterabs() make a stronger promise that they return an "event" but doesn't specify what that means.

    The Event class name doesn't has a leading underscore but is not listed in __all__ and is not in the main docs.

    The *queue* property is documented to return an ordered list of upcoming events which are defined to be named tuples with fields for: time, priority, action, arguments, kwargs. That is the most specific and restrictive of the documented promises, one that presumably is being relied on in a myriad of ways (a repr suitable for logging, a pretty printable list, a sortable list, named access to fields, unpackable tuples, etc).

    If we respect those published promises, one way to go is to add a unique identifier field to the Event class. If the that unique id is consecutively numbered, it would also allow us to guarantee FIFO ordering (not sure whether that is actually need though).

    Here's a preliminary sketch:

    scheduler.__init__():
    + self.event_sequence = 0 # updated as events are created
    ...

       scheduler.enterabs():
    -       event = Event(time, priority, action, argument, kwargs)
    +       self.event_sequence += 1
    +       event = Event(time, priority, self.event_sequence, action, argument, kwargs)
    
    - class Event(namedtuple('Event', 'time, priority, action, argument, kwargs')):
    -    __slots__ = []
    -    def __eq__(s, o): return (s.time, s.priority) == (o.time, o.priority)
    -    def __lt__(s, o): return (s.time, s.priority) <  (o.time, o.priority)
    -    def __le__(s, o): return (s.time, s.priority) <= (o.time, o.priority)
    -    def __gt__(s, o): return (s.time, s.priority) >  (o.time, o.priority)
    -    def __ge__(s, o): return (s.time, s.priority) >= (o.time, o.priority)
    + Event = namedtuple('Event', 'time, priority, sequence, action, argument, kwargs')

    The user visible effects are:

    • A new field would be displayed.
    • Any code that constructs Event objects manually would break instantly.
    • Any code using tuple unpacking on Event objects would break instantly.
    • Slightly slower Event creation time.
    • Faster heap updates (because native tuple ordering is faster than pure python rich comparisons).

    However, outside the mention in the documentation for the *queue* property, Event objects are not part of the public API. Also, we do have precedent for adding new fields without a negative consequence (argument and kwargs were added in Python 3.3).

    An alternative solution for removing exact events (not for FIFO stability) is to have enter() and enterabs() return an opaque, unique event identifier. Then the code for cancel() would need to be changed to something like:

        self._queue = [e for e in self._queue if id(e) != event]
        heapq.heapify(self._queue)

    The user visible effects are:

    • enter() and enterabs() no longer return an event object
      (it was implied that they would, but not explictly promised).
      Any code relying on an actual Event object being returned
      would break.
    • cancel() would only accept the opaque IDs
      (again, it was only implied that actual Event objects would be used).
      Code would break that took Event objects out of the *queue* listing
      and passed those into cancel().
    • cancel() would run a bit faster (because calling id() is cheaper
      that calling the pure python rich comparisons and having them
      extract the time and priority fields).

    Either way breaks a few eggs but makes the module safer to use.

    @bharel
    Copy link
    Mannequin

    bharel mannequin commented Jun 23, 2020

    @Raymond your first idea sounds good and was the first thing that came to my mind.
    I only worried about breaking things, so I gave the more conservative suggestion.

    If breaking a few eggs isn't an issue and the implications of your idea are agreed upon, I'll patch and add a PR.

    @wocket
    Copy link
    Mannequin

    wocket mannequin commented Sep 16, 2020

    @bar.harel: I didn't find a PR, so I'd like to encourage you to submit one :-)

    I stumbled onto this bug when the scheduler would cancel the wrong event for me (Python 3.7, 3.8). Raymond's suggestion 1 sounds reasonable; it would be very unlikely to break code that doesn't depend on internals in sched, and it simplifies the implementation a bit.

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 5368c2b by Bar Harel in branch 'master':
    bpo-19270: Fixed sched.scheduler.cancel to cancel correct event (GH-22729)
    5368c2b

    @akulakov
    Copy link
    Contributor

    This can be closed as fixed.

    @iritkatriel iritkatriel added the 3.10 only security fixes label Nov 22, 2021
    @iritkatriel iritkatriel changed the title Document that sched.cancel() doesn't distinguish equal events and can break order sched.cancel() doesn't distinguish equal events and can break order Nov 22, 2021
    @iritkatriel iritkatriel added type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Nov 22, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes docs Documentation in the Doc dir easy type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants