classification
Title: sched.py: speedup cancel() method
Type: performance Stage: patch review
Components: Library (Lib) Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: giampaolo.rodola, josiah.carlson, josiahcarlson, pitrou, rhettinger, serhiy.storchaka, stutzbach, vstinner
Priority: low Keywords: patch

Created on 2011-11-22 01:03 by giampaolo.rodola, last changed 2016-06-08 05:08 by serhiy.storchaka.

Files
File name Uploaded Description Edit
sched-cancel-speedup.patch giampaolo.rodola, 2011-11-22 01:03 review
cancel-later-approach.patch giampaolo.rodola, 2011-11-22 09:03 review
cancel.patch giampaolo.rodola, 2011-11-26 13:35 review
bench.py giampaolo.rodola, 2011-11-26 13:36
cancel_2.patch serhiy.storchaka, 2012-10-08 12:08 review
cancel3.patch giampaolo.rodola, 2013-10-15 20:13 review
cancel_4.patch serhiy.storchaka, 2013-10-15 21:37 review
cancel_4b.patch serhiy.storchaka, 2013-10-21 15:33 review
Messages (19)
msg148103 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-11-22 01:03
<snippet>
# bench.py
import sched, time
events = []
scheduler = sched.scheduler(time.time, time.sleep)
for x in range(4000):
    scheduler.enter(1, 1, lambda: None, ())
t = time.time()
for x in scheduler._queue:
    scheduler.cancel(x)
print(time.time() - t)
</snippet>

Before the patch:
9.433167934417725

After the patch:
1.3120810985565186


I have another approach in mind, which avoids removing the element from the queue immediately, and which should be an order of magnitude faster, but I'll provide that as a separate patch since it poses questions about API and backward compatibility.
msg148105 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-11-22 02:13
Can you post your other patch too?  I would like to review both at the same time.
msg148108 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-11-22 09:03
In attachment. 

Before the patch:
9.433167934417725

After the patch:
0.0016150474548339844

scheduler.queue and scheduler.empty should be modified in accordance (which I haven't done, it's just to give you an idea).
msg148403 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-11-26 13:35
New patch in attachment takes care of modifying empty() and queue property according with the new implementation.
With this, the API behaves the same as before (this was my main concern).
Also, it's smarter when it comes to cleaning up too many pending cancelled items:

if self._cancellations > 50 \
  and self._cancellations > (len(self._queue) >> 1):
     ...

Also, I made a little benchmark script (in attachment) to make sure that the speed of the rest of the API hasn't been significantly affected by this change:


BEFORE THE PATCH

test_cancel                    : time=0.66648 : calls=1 : stdev=0.00000
test_empty                     : time=0.00026 : calls=1 : stdev=0.00000
test_enter                     : time=0.00309 : calls=1 : stdev=0.00000
test_queue                     : time=6.20777 : calls=1 : stdev=0.00000
test_run                       : time=0.00746 : calls=1 : stdev=0.00000


AFTER THE PATCH

test_cancel                    : time=0.00054 : calls=1 : stdev=0.00000
test_empty                     : time=0.00031 : calls=1 : stdev=0.00000
test_enter                     : time=0.00375 : calls=1 : stdev=0.00000
test_queue                     : time=6.30314 : calls=1 : stdev=0.00000
test_run                       : time=0.00716 : calls=1 : stdev=0.00000
msg149292 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-12-12 11:50
Thread locks introduced in issue8684 should make this change more robust.
If this patch is reasonable, I'd like to commit it before the one in issue8684 for simplicity.
Raymond?
msg172377 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-08 12:08
In principle this is the right approach. But the time of enter() is increased by 20%. Here is updated and slightly optimized patch that restores enter() performance (but a little slow down cancel()). Because enter() is executed for each event and cancel() is not, and enter's gain greater cancel's loss, I think it is a very profitable exchange.
msg199978 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-15 06:26
Giampaolo, Raymond? What are your opinions?
msg199988 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-15 10:23
Serhiy, perhaps it would be useful to see if such optimizations can apply to Tulip's (or asyncio's) event loop, since it will probably be the new standard in 3.4.
msg200012 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-15 18:26
Yes, I will see.
msg200015 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2013-10-15 18:52
I don't have time to look into Serhiy's changes right now but here's a brief summary:

- there's a (I think) *minor* downside in terms of backward compatibility because scheduler._queue won't be updated after cancel() (basically this is the reason why this issue was waiting for Raymond's approval)

- the upside (other than the great speedup) is that I doubt anyone relies on that and scheduler._queue is not supposed to be used in the first place

- tulip's scheduler already provides something very similar to what this patch proposes so no action should be taken on that front


I personally think this should go in but I'd like to hear an OK from Raymond first.
msg200016 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2013-10-15 19:08
Sorry, I failed to notice there's a scheduler.queue property which exposes the underlying _queue attribute so the patch should take that into account and return the updated list.
msg200017 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2013-10-15 20:13
Patch in attachment applies cleanly with the current 3.4 code (last one wasn't) and returns an updated list on scheduler.queue.

I rebased my work starting from my original patch (cancel.patch) not 	Serhiy's because it wasn't clear to me *where* exactly the enter() speedup was introduced (enter() method apparently is untouched by cancel_2.patch).
msg200023 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-15 21:37
> it wasn't clear to me *where* exactly the enter() speedup was introduced

Constructing Event object. You introduced __init__().

Here is a patch which is based on my patch and new Giampaolo's patch. In additional it fixes a performance for the queue property (perhaps regression was introduced in issue18432).

Unpatched:

test_cancel                    : time=4.05863 : calls=1 : stdev=0.00000
test_empty                     : time=0.00499 : calls=1 : stdev=0.00000
test_enter                     : time=0.03537 : calls=1 : stdev=0.00000
test_queue                     : time=37.82003 : calls=1 : stdev=0.00000
test_run                       : time=0.05289 : calls=1 : stdev=0.00000

cancel3.patch:

test_cancel                    : time=0.00649 : calls=1 : stdev=0.00000
test_empty                     : time=0.00704 : calls=1 : stdev=0.00000
test_enter                     : time=0.03959 : calls=1 : stdev=0.00000
test_queue                     : time=45.34278 : calls=1 : stdev=0.00000
test_run                       : time=0.05477 : calls=1 : stdev=0.00000

cancel_4.patch:

test_cancel                    : time=0.00889 : calls=1 : stdev=0.00000
test_empty                     : time=0.00636 : calls=1 : stdev=0.00000
test_enter                     : time=0.03092 : calls=1 : stdev=0.00000
test_queue                     : time=3.93284 : calls=1 : stdev=0.00000
test_run                       : time=0.05294 : calls=1 : stdev=0.00000
msg200043 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-16 09:10
All patches have problem with stable order. Rehashifying change it. But there are even more serious problems with current code (see issue19270).
msg200792 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 14:51
In updated patch I have reverted queue optimization (this should be separated issue) and made some minor changes.
msg200802 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 15:33
Fixed a bug in previous patch.
msg219316 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-28 22:28
> Serhiy, perhaps it would be useful to see if such optimizations can apply to Tulip's (or asyncio's) event loop, since it will probably be the new standard in 3.4.

asyncio was designed differently. Cancelling a task doesn't remove it from a list of pending tasks. Cancelled tasks are just skipped when the event loop executes tasks.

If you look more closely, a "task" can be a Handle, Future or Task object. A Handle object has a _cancelled attribute, its cancel() method just sets this attribute to True. It's almost the same for a Future object. In the context of a Task object, cancel() is very different because it sends a CancelledError exception into the running code.

I see no possible optimization here.
msg228045 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-01 00:46
> I see no possible optimization here.

The asyncio was just optimized to handle cancellation of many callbacks, see issue #22448.
msg267780 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-08 05:08
Ping.
History
Date User Action Args
2016-06-08 05:08:52serhiy.storchakasetmessages: + msg267780
2014-10-01 00:46:50vstinnersetmessages: + msg228045
2014-05-28 22:28:54vstinnersetnosy: + vstinner
messages: + msg219316
2013-10-21 15:33:35serhiy.storchakasetfiles: + cancel_4b.patch

messages: + msg200802
2013-10-21 15:32:24serhiy.storchakasetfiles: - cancel_4a.patch
2013-10-21 15:05:32serhiy.storchakasetdependencies: - Document that sched.cancel() doesn't distinguish equal events and can break order
2013-10-21 14:51:30serhiy.storchakasetfiles: + cancel_4a.patch

messages: + msg200792
2013-10-16 09:10:25serhiy.storchakasetmessages: + msg200043
2013-10-16 08:49:35serhiy.storchakasetdependencies: + Document that sched.cancel() doesn't distinguish equal events and can break order
2013-10-15 21:37:46serhiy.storchakasetfiles: + cancel_4.patch

messages: + msg200023
2013-10-15 20:13:24giampaolo.rodolasetfiles: + cancel3.patch

messages: + msg200017
2013-10-15 19:08:36giampaolo.rodolasetmessages: + msg200016
2013-10-15 18:52:38giampaolo.rodolasetmessages: + msg200015
2013-10-15 18:26:39serhiy.storchakasetmessages: + msg200012
2013-10-15 10:23:01pitrousetnosy: + pitrou
messages: + msg199988
2013-10-15 06:26:56serhiy.storchakasetmessages: + msg199978
2012-10-24 09:34:07serhiy.storchakasetstage: patch review
2012-10-08 12:08:29serhiy.storchakasetfiles: + cancel_2.patch
nosy: + serhiy.storchaka
messages: + msg172377

2012-10-08 06:14:58giampaolo.rodolasetversions: + Python 3.4, - Python 3.3
2011-12-12 11:50:25giampaolo.rodolasetmessages: + msg149292
2011-11-26 13:36:03giampaolo.rodolasetfiles: + bench.py
2011-11-26 13:35:32giampaolo.rodolasetfiles: + cancel.patch
nosy: + josiahcarlson, josiah.carlson
messages: + msg148403

2011-11-22 09:03:42giampaolo.rodolasetfiles: + cancel-later-approach.patch

messages: + msg148108
2011-11-22 02:13:48rhettingersetpriority: normal -> low
messages: + msg148105

assignee: rhettinger
components: + Library (Lib)
type: performance
2011-11-22 01:03:42giampaolo.rodolacreate