This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Expand concurrent.futures.Future's public API
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: aeros Nosy List: Big Stone, aeros, bquinlan, gvanrossum, iforapsy, jakirkham, pitrou
Priority: normal Keywords:

Created on 2020-02-16 07:38 by aeros, last changed 2022-04-11 14:59 by admin.

Messages (10)
msg362047 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-02-16 07:38
Based on the following python-ideas thread: https://mail.python.org/archives/list/python-ideas@python.org/thread/LMTQ2AI6A7UXEFVHRGHKWD33H24FGM6G/#ICJKHZ4BPIUMOPIT2TDTBIW2EH4CPNCP.

In the above ML thread, the author proposed adding a new cf.SerialExecutor class, which seems to be not a great fit for the standard library (based on the current state of the discussion, as of writing this). But, Guido mentioned the following:

> IOW I'm rather lukewarm about this -- even if you (Jonathan) have found use for it, I'm not sure how many other people would use it, so I doubt it's worth adding it to the stdlib. (The only thing the stdlib might grow could be a public API that makes implementing this feasible without overriding private methods.)

Specifically, the OPs proposal should be reasonably possible to implement (either just locally for themselves or a potential PyPI package) with a few minor additions to cf.Future's public API:

1) Add a means of *publicly* accessing the future's state (future._state) without going through the internal condition's RLock.

This would allow the developer to implement their own condition or other synchronization primitive to access the state of the future. IMO, this would best be implemented as a separate ``future.state()`` and ``future.set_state()``. 

2) Add a means of *publicly* accessing the future's result (future._result) without going through the internal condition's RLock.

This would be similar to the above, but since there's already a ``future.result()`` and ``future.set_result()``, I think it would be best implemented as an optional *sync* parameter that defaults to True. When set to False, it directly accesses future._result without the condition; when set to True, it has the current behavior. 

3) Add public global constants for the different possible future states: PENDING, RUNNING, CANCELLED, CANCELLED_AND_NOTIFIED, and FINISHED. 

This would be useful to serve as a template of possible future states for custom implementations. I also find that ``fut.set_state(cf.RUNNING)`` looks better than ``fut.state("running")`` from an API design perspective. 

Optional addition: To make ``fut.state()`` and ``fut.set_state()`` more useful for general purposes, it could have a single *sync* boolean parameter (feel free to bikeshed over the name), which changes whether it directly accesses future._state or does so safely through the condition. Presumably, the documentation would explicitly state that with sync=False, the future's state will not be synchronized across separate threads or processes. This would also allow it to have the same API as ``future.result()`` and ``future.set_result()``.

Also, as for my own personal motivation in expanding upon the public API for cf.Future, I've found that directly accessing the state of the future can be incredibly useful for debugging purposes. I made significant use of it while implementing the new *cancel_futures* parameter for executor.shutdown(). But, since future._state is a private member, there's no guarantee that it will continue to behave the same or that it can be relied upon in the long-term. This may not be a huge concern for quick debugging sessions, but could easily result in breakage when used in logging or unit tests.
msg362110 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-02-16 22:33
I'll leave it to Brian and/or Antoine to review this. Good luck!
msg362163 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-02-17 23:22
Upon further consideration and based on recent developments in the python-ideas thread (mostly feedback from Antoine), I've decided to reduce the scope of this issue to remove future.set_state() and the *sync* parameters.

This leaves just future.state() and having the states as publicly accessible module-level constants.
msg363114 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-03-02 05:05
> This leaves just future.state() and having the states as publicly accessible module-level constants.

After reading over the python-ideas again and having some time to reflect, I think we can start with just adding future.state(), as that had the most value. Since future.state() is primarily intended for debugging/informational purposes as an approximation (similar to `queue.qsize()`) rather than something to be consistently relied upon, I don't see a strong practical use case for returning the states as enums (instead of a string).

If we consider adding a means to directly modify the state of the future in the future or providing an option to safely read the state of the future (through its RLock) later down the road, it may be worth considering. But not at the moment, IMO.

I'll update the name of the issue accordingly, and should have time to open a PR in the next few days. It should be rather straightforward, with the main emphasis being on the documentation to ensure that it clearly communicates the purpose of future.state(); so that users don't assume it's anything more than an approximation.
msg363117 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-03-02 05:27
I’m a bit disappointed, since it looks like this won’t allow implementing the OP’s classes without using private APIs. The debugging and loggin use cases aren’t very compelling to me.
msg363119 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-03-02 05:54
> I’m a bit disappointed, since it looks like this won’t allow implementing the OP’s classes without using private APIs. The debugging and loggin use cases aren’t very compelling to me.

Yeah, I had every intention when I initially proposed the idea on the python-ideas thread to provide an extensive means of implementing custom Future and Executor classes, but Antoine brought up a very valid concern about users shooting themselves in the foot with it:

> I'm much more lukewarm on set_state().  How hard is it to reimplement
> one's own Future if someone wants a different implementation?  By
> allowing people to change the future's internal state, we're also
> giving them a (small) gun to shoot themselves with.

In terms of a cost-benefit analysis, I'd imagine that it's going to be a rather small portion of the concurrent.futures users that would actually have a genuine use case for implementing their own custom Future or Executor. 

He was still approving of a `future.state()` though, which is why I considered implementing it alone:

> That sounds useful to me indeed.  I assume you mean something like a
> state() method?  We already have Queue.qsize() which works a bit like
> this (unlocked and advisory).

Although not nearly as significant (or interesting), I've personally encountered situations where it would be useful for logging purposes to be able to read the approximate state of the future. But if the consensus ends up being that it's not useful enough to justify adding compared to the original purpose of the issue, I would certainly understand.
msg363121 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-03-02 06:01
But note my response to Antoine at the time, mentioning that implementing ‘as_completed()’ is impossible that way. Antoine then backtracked somewhat.
msg363124 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-03-02 06:17
> But note my response to Antoine at the time, mentioning that implementing
> ‘as_completed()’ is impossible that way. Antoine then backtracked somewhat.

Ah, I had seen that but for some reason hadn't considered that Antoine might have also changed his stance on a means of modifying the state of the future:

> > It's actually really hard to implement your own
> > Future class that works
> > well with concurrent.futures.as_completed() -- this is basically what
> > complicated the OP's implementation. Maybe it would be useful to look into
> > a protocol to allow alternative Future implementations to hook into that?

> Ah, I understand the reasons then.  Ok, it does sound useful to explore
> the space of solutions.  But let's decouple it from simply querying the
> current Future state.

In that case, I'll revert the title and leave the issue open for further discussion; but I'll hold off on any PRs until we have some consensus regarding the direction we want to go in with regards to potential new future protocols. Apologies for the misunderstanding, thanks for clarifying. :)

I'd also be interested in hearing Brian Quinlain's thoughts on the matter.
msg363205 - (view) Author: Brian Quinlan (bquinlan) * (Python committer) Date: 2020-03-02 19:51
I'll try to take a look at this before the end of the week, but I'm currently swamped with other life stuff :-(
msg368440 - (view) Author: Big Stone (Big Stone) Date: 2020-05-08 14:19
it seems this feature would interest Dask team.
 https://github.com/dask/distributed/issues/3695
History
Date User Action Args
2022-04-11 14:59:26adminsetgithub: 83826
2020-06-12 21:00:12jakirkhamsetnosy: + jakirkham
2020-05-08 14:19:05Big Stonesetnosy: + Big Stone
messages: + msg368440
2020-04-24 01:14:15iforapsysetnosy: + iforapsy
2020-03-02 19:51:44bquinlansetmessages: + msg363205
2020-03-02 06:17:02aerossetmessages: + msg363124
title: Read approximate state of concurrent.futures.Future -> Expand concurrent.futures.Future's public API
2020-03-02 06:01:32gvanrossumsetmessages: + msg363121
2020-03-02 05:54:14aerossetmessages: + msg363119
2020-03-02 05:27:00gvanrossumsetmessages: + msg363117
2020-03-02 05:05:41aerossetmessages: + msg363114
title: Expand concurrent.futures.Future's public API -> Read approximate state of concurrent.futures.Future
2020-02-17 23:22:57aerossetmessages: + msg362163
2020-02-16 22:33:33gvanrossumsetmessages: + msg362110
2020-02-16 07:38:05aeroscreate