classification
Title: Allow multiprocessing Pool initializer to return values
Type: enhancement
Stage:
Components:
Versions: Python 3.4
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: noxdafox, sbt
Priority: normal
Keywords: patch

Created on 2013-10-06 20:14 by noxdafox, last changed 2013-10-21 20:28 by noxdafox.

Files
File name Uploaded Description Edit
pool_initializer.patch noxdafox, 2013-10-06 20:14 review
Messages (7)
msg199116 - (view) Author: Matteo Cafasso (noxdafox) Date: 2013-10-06 20:14
This patch allows the pool initializer function to return the initialized values. The returned values are passed to the called function as its first positional argument.

Previously, the common pattern was to store the initialized objects in global variables, which makes the code more difficult to manage.

The patch does not break backward compatibility, because existing initializers were not supposed to return any value. If the initializer does not return anything, the behavior is the same as before.
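
For readers unfamiliar with the pattern being discussed, here is a minimal sketch of how initializer state is typically shared today through a module-level global. The names `init_worker`, `square` and `_lookup` are made up for illustration, and the commented-out variant only shows how the same code might look under the proposed patch, which is not part of any released Python.

```python
import multiprocessing

# Per-process state, filled in by the initializer. This is the
# global-variable pattern the patch is trying to offer an alternative to.
_lookup = None


def init_worker(size):
    """Runs once in each worker process; Pool ignores its return value."""
    global _lookup
    # Stand-in for an expensive, stateful object.
    _lookup = {n: n * n for n in range(size)}


def square(n):
    """Task function; reads the state the initializer left in the global."""
    return _lookup[n]


# Under the proposed patch (an assumption based on the description above),
# the same code might instead read:
#     def init_worker(size): return {n: n * n for n in range(size)}
#     def square(lookup, n): return lookup[n]

if __name__ == "__main__":
    with multiprocessing.Pool(processes=2, initializer=init_worker,
                              initargs=(10,)) as pool:
        print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
```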
msg199119 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-10-06 22:41
> the previous initializers were not supposed to return any value

Previously, any returned value would have been ignored.  But the documentation does not say that the function has to return None.  So I don't think we can assume there is no compatibility issue.
msg199131 - (view) Author: Matteo Cafasso (noxdafox) Date: 2013-10-07 06:53
I agree with your point; I probably reached my conclusion too quickly.

My reasoning was that returning a value from the initializer was previously a misuse (without consequences) of the initializer itself.

With the new implementation that misuse would be exposed, while probably also meeting the need that led to it in the first place.

The aim of the patch is to give an alternative to the use of global variables.
Global variables are a pattern that can lead to code errors, and many developers discourage their use.
I do believe that forcing such a pattern on users in order to accomplish the desired goal is quite restrictive for an API.
msg199136 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-10-07 10:32
I think "misuse" is an exaggeration.  Various functions change some state and return a value that is usually ignored, e.g. os.umask(), signal.signal().

> Global variables usage is a pattern which might lead to code errors and many 
> developers discourage from following it.

What sort of code errors?  This really seems a stylistic point.  Maybe such developers would be happier using class methods and class variables rather than functions and global variables.

Out of interest, what do you usually do in your initializer functions?
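
The class-based alternative suggested above could be sketched roughly as follows. The class `Scaler` and its methods are hypothetical names chosen for illustration. The state still lives once per worker process, but it is kept in a class variable next to the code that uses it instead of in a bare module global.

```python
import multiprocessing


class Scaler:
    """Groups per-process state and the code that uses it in one namespace."""

    factor = None  # class variable, set once in each worker process

    @classmethod
    def setup(cls, factor):
        """Used as the Pool initializer; stores state on the class."""
        cls.factor = factor

    @classmethod
    def scale(cls, value):
        """Task function; reads the state stored by setup()."""
        return cls.factor * value


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2, initializer=Scaler.setup,
                              initargs=(3,)) as pool:
        print(pool.map(Scaler.scale, [1, 2, 3]))  # prints [3, 6, 9]
```

This relies on bound class methods being picklable, which holds on Python 3 as long as the class is defined at module level.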
msg199331 - (view) Author: Matteo Cafasso (noxdafox) Date: 2013-10-09 18:56
On 07/10/13 13:32, Richard Oudkerk wrote:
> I think "misuse" is an exaggeration.  Various functions change some state and return a value that is usually ignored, e.g. os.umask(), signal.signal().
These functions comply with POSIX standards and their return values are actually useful: they return the previously set masks and handlers. These values are often ignored, but in complex cases it is good to know the previous state.

The problem here is quite different: the interface offers the opportunity to execute a function but ignores its return value, which is pretty limiting from an API point of view. It is also counterintuitive and undocumented, as shown by the number of questions on how to use the initializer (just a couple of examples):
http://stackoverflow.com/questions/10117073/how-to-use-initializer-to-set-up-my-multiprocess-pool
http://stackoverflow.com/questions/9944370/use-of-initialize-in-python-multiprocessing-worker-pool
>
>> Global variables usage is a pattern which might lead to code errors and many
>> developers discourage from following it.
> What sort of code errors?  This really seems a stylistic point.  Maybe such developers would be happier using class methods and class variables rather than functions and global variables.
http://c2.com/cgi/wiki?GlobalVariablesAreBad

It is pretty common coding practice to avoid global variables whenever possible. As always, it is the way a tool is used that makes it evil, not the tool itself. Still, I agree that a change to a global variable is hard to track down in the code, and as the code grows it can lead to very tricky errors.
>
> Out of interest, what do you usually do in your initializer functions?
I mainly develop back-end systems which benefit greatly from the Worker Pool pattern. These are services which use third-party libraries to execute CPU-bound tasks, trying to scale with the number of CPU cores. Many of these libraries, unfortunately, are stateful (I would say "state-full") and their initialization is time-consuming.

Typically a worker initializes some of those objects (which are currently stored in global variables) and starts crunching data. Meanwhile the state of these objects keeps changing, and this is where the global-variable pattern shows its worst side.
msg199339 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-10-09 19:59
> These functions are compliant with POSIX standards and the return values 
> are actually useful, they return the previously set masks and handlers, 
> often are ignored but in complex cases it's good to know their previous 
> state.

Yes.  But my point was that somebody might have used such a function as the initializer argument.  The proposed change would break a program which does

    with Pool(initializer=os.nice, initargs=(incr,)) as p:
        ...
msg200838 - (view) Author: Matteo Cafasso (noxdafox) Date: 2013-10-21 20:28
On 09/10/13 22:59, Richard Oudkerk wrote:
> Yes.  But my point was that somebody might have used such a function as the initializer argument.  The proposed change would break a program which does
>
>      with Pool(initializer=os.nice, initargs=(incr,)) as p:
>          ...
Indeed, in cases like that backward compatibility would break if the passed function accepts a fixed number of positional arguments.
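
The breakage discussed above can be modelled without a real pool. The sketch below is only an in-process approximation of the proposed behaviour, not the patch itself. `run_task` is a hypothetical stand-in for the patched worker loop, `fake_nice` stands in for `os.nice()` so that the example does not change the process priority, and treating a `None` result as "no value returned" is an assumption taken from the patch description.

```python
def fake_nice(increment):
    """Stand-in for os.nice(): changes some state and returns an int."""
    return increment


def run_task(init_result, func, args):
    """Hypothetical model of a patched worker.

    A non-None initializer result is passed to the task as its first
    positional argument; None keeps the old calling convention.
    """
    if init_result is None:
        return func(*args)
    return func(init_result, *args)


def double(x):
    """Existing task function that accepts exactly one positional argument."""
    return 2 * x


# An initializer that returns None keeps the old behaviour.
print(run_task(None, double, (21,)))  # prints 42

# An initializer whose return value used to be ignored now changes the
# call signature, so a task with a fixed number of parameters fails.
try:
    run_task(fake_nice(5), double, (21,))
except TypeError as exc:
    print("task call failed:", exc)
```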
History
Date User Action Args
2013-10-21 20:28:28 noxdafox set messages: + msg200838
2013-10-09 19:59:24 sbt set messages: + msg199339
2013-10-09 18:56:03 noxdafox set messages: + msg199331
2013-10-07 10:32:58 sbt set messages: + msg199136
2013-10-07 06:53:51 noxdafox set messages: + msg199131
2013-10-06 22:41:53 sbt set nosy: + sbt, messages: + msg199119
2013-10-06 20:14:30 noxdafox create