Author josh.r
Recipients josh.r
Date 2019-08-14.16:58:35
Message-id <1565801916.64.0.686845199304.issue37852@roundup.psfhosted.org>
Content
Inspired by this Stack Overflow question, where it prevented using multiprocessing.Pool.map with a private method: https://stackoverflow.com/q/57497370/364696

The __name__ of a private method remains the unmangled form, even though only the mangled form exists in the class dictionary for lookup. The __reduce__ for bound methods doesn't handle private names specially, so it serializes the method such that the other end performs getattr(method.__self__, method.__func__.__name__). On deserializing, it tries to perform that lookup, but of course only the mangled name exists, so it dies with an AttributeError.

Minimal repro:

import pickle

class Spam:
    def __eggs(self):
        pass
    def eggs(self):
        return pickle.dumps(self.__eggs)

spam = Spam()
pkl = spam.eggs()                       # Succeeds via implicit mangling (but pickles unmangled name)
pickle.loads(pkl)                       # Fails with AttributeError (tries to look up __eggs)

Explicitly mangling via pickle.dumps(spam._Spam__eggs) fails too, and in the same way.
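
The name mismatch behind both failures can be seen without pickle at all. The following is a minimal sketch, reusing the Spam class from the repro above; lookup_like_unpickler is a hypothetical helper name introduced only for illustration, and it merely replays the getattr() call described earlier rather than invoking pickle itself:

```python
class Spam:
    def __eggs(self):
        return "eggs"


spam = Spam()
method = spam._Spam__eggs  # explicit mangled access from outside the class

# Only the mangled name is stored in the class dictionary...
in_class_dict = sorted(name for name in vars(Spam) if "eggs" in name)

# ...but the underlying function still reports the unmangled name.
reported_name = method.__func__.__name__


def lookup_like_unpickler(obj, name):
    """Hypothetical helper: replay the getattr() lookup done at load time."""
    try:
        return getattr(obj, name)
    except AttributeError as exc:
        return exc


unmangled_result = lookup_like_unpickler(spam, reported_name)
mangled_result = lookup_like_unpickler(spam, "_Spam__eggs")

print(in_class_dict)                    # ['_Spam__eggs']
print(reported_name)                    # __eggs
print(type(unmangled_result).__name__)  # AttributeError
print(mangled_result())                 # eggs
```

Because the function's __name__ is the same however the method was reached, the explicit spam._Spam__eggs spelling yields the same unmangled lookup as the implicit self.__eggs one.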

A similar problem occurs (on the serializing end) when you do:

pkl = pickle.dumps(Spam._Spam__eggs)    # Pickling function in Spam class, not bound method of Spam instance

though that failure occurs at serialization time, because pickle itself tries to look up <module>.Spam.__eggs (which doesn't exist), instead of <module>.Spam._Spam__eggs (which does).

That second failure is at least less harmful than the bound method case, for two reasons:

1. It fails at serialization time (so it doesn't silently produce pickles that can never be unpickled)
2. It's an explicit PicklingError, with a message that explains what it tried to do, and why it failed ("Can't pickle <function Spam.__eggs at 0xdeadbeef>: attribute lookup Spam.__eggs on __main__ failed")

In the use case on Stack Overflow, it was the implicit case; a public method of a class created a multiprocessing.Pool, and tried to call Pool.map with a private method on the same class as the mapper function. While normally pickling methods seems odd, for multiprocessing, it's pretty standard.

I think the correct fix here is to make method_reduce in classobject.c (the __reduce__ implementation for bound methods) perform the mangling itself. meth_reduce in methodobject.c has the same bug, but it's less critical: only private methods of built-in/extension types would be affected, and most of the time such private methods aren't exposed to Python at all; they're just static methods for direct calling in C.
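
As a rough Python-level illustration of that idea (not the C change itself), the sketch below registers a reducer for bound methods that mangles the name before emitting the getattr() call. The names mangle, reduce_bound_method, ManglingPickler and dumps are all hypothetical, and deriving the owning class from the next-to-last component of __qualname__ is an assumption made for this sketch; a real fix might determine the class differently.

```python
import copyreg
import io
import pickle
import types


def mangle(class_name, attr_name):
    """Apply Python's private-name mangling rule to attr_name."""
    stripped = class_name.lstrip("_")
    if (not attr_name.startswith("__") or attr_name.endswith("__")
            or "." in attr_name or not stripped):
        return attr_name  # not a private name, or class name is all underscores
    return "_" + stripped + attr_name


def reduce_bound_method(method):
    """Hypothetical reducer: same shape as the default one, but mangles the name."""
    func = method.__func__
    # Assumption: the defining class is the next-to-last __qualname__ component.
    parts = func.__qualname__.split(".")
    owner = parts[-2] if len(parts) >= 2 and parts[-2].isidentifier() else ""
    return getattr, (method.__self__, mangle(owner, func.__name__))


class ManglingPickler(pickle.Pickler):
    """Pickler with a private dispatch table that routes bound methods to the reducer."""
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[types.MethodType] = reduce_bound_method


def dumps(obj):
    """Serialize obj with ManglingPickler and return the bytes."""
    buffer = io.BytesIO()
    ManglingPickler(buffer).dump(obj)
    return buffer.getvalue()


class Spam:
    def __eggs(self):
        return "eggs"


spam = Spam()
rebuild, args = reduce_bound_method(spam._Spam__eggs)
restored = rebuild(*args)  # the call an unpickler would make on load
print(args[1], restored())
```

With a reducer like this in place, pickle.loads(dumps(spam._Spam__eggs)) should look up _Spam__eggs on the restored instance instead of __eggs, provided Spam itself is importable on the loading side.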

This would handle all bound methods. For "unbound methods" (read: functions defined in a class), it might also be good to update save_global/get_deep_attribute in _pickle.c to recognize the case where a component of a dotted name begins with two underscores (and doesn't end with them) and the prior component is a class. That way, pickling a private unbound method (i.e. a plain function that happens to be defined on a class) also works, instead of dying with a lookup error.
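
The dotted-name rule described above could look roughly like this in Python. This is only a sketch of the lookup logic, not the _pickle.c code: get_deep_attribute_mangled and mangle are hypothetical names, and a SimpleNamespace stands in for the defining module.

```python
import types


def mangle(class_name, attr_name):
    """Apply Python's private-name mangling rule to attr_name."""
    stripped = class_name.lstrip("_")
    if (not attr_name.startswith("__") or attr_name.endswith("__")
            or not stripped):
        return attr_name
    return "_" + stripped + attr_name


def get_deep_attribute_mangled(root, dotted_name):
    """Hypothetical dotted lookup that mangles private parts found under a class."""
    obj = root
    for part in dotted_name.split("."):
        if isinstance(obj, type):
            # The prior component is a class, so a private name is stored mangled.
            part = mangle(obj.__name__, part)
        obj = getattr(obj, part)
    return obj


class Spam:
    def __eggs(self):
        return "eggs"

    @staticmethod
    def __helper(x):
        return x * 2


# Stand-in for the module that pickle would import and search.
module_like = types.SimpleNamespace(Spam=Spam)

found = get_deep_attribute_mangled(module_like, "Spam.__eggs")
helper = get_deep_attribute_mangled(module_like, "Spam.__helper")
print(found is Spam._Spam__eggs, helper(21))
```

The same walk covers a private @staticmethod helper, since it is stored under the mangled name in the class dictionary just like any other private function.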

The fix is most important, and least costly, for bound methods, but I think doing it for plain functions is still worthwhile, since I could easily see Pool.map operations using an @staticmethod utility function defined privately in the class for encapsulation purposes, and it seems silly to force them to make it more public and/or remove it from the class.
History
Date                 User    Action  Args
2019-08-14 16:58:36  josh.r  set     recipients: + josh.r
2019-08-14 16:58:36  josh.r  set     messageid: <1565801916.64.0.686845199304.issue37852@roundup.psfhosted.org>
2019-08-14 16:58:36  josh.r  link    issue37852 messages
2019-08-14 16:58:35  josh.r  create