Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unpickling vs. __getattr__ #49620

Closed
mwm mannequin opened this issue Feb 25, 2009 · 11 comments
Closed

unpickling vs. __getattr__ #49620

mwm mannequin opened this issue Feb 25, 2009 · 11 comments
Assignees
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mwm
Copy link
Mannequin

mwm mannequin commented Feb 25, 2009

BPO 5370
Nosy @birkenfeld
Files
  • pickletest.py: Simple python 2.5/2.6/3.0 example showing problem.
  • pp
  • pickle.diff: Doc update
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/birkenfeld'
    closed_at = <Date 2009-04-05.14:40:15.376>
    created_at = <Date 2009-02-25.21:26:14.167>
    labels = ['type-bug', 'library', 'docs']
    title = 'unpickling vs. __getattr__'
    updated_at = <Date 2009-04-05.14:40:15.375>
    user = 'https://bugs.python.org/mwm'

    bugs.python.org fields:

    activity = <Date 2009-04-05.14:40:15.375>
    actor = 'georg.brandl'
    assignee = 'georg.brandl'
    closed = True
    closed_date = <Date 2009-04-05.14:40:15.376>
    closer = 'georg.brandl'
    components = ['Documentation', 'Library (Lib)']
    creation = <Date 2009-02-25.21:26:14.167>
    creator = 'mwm'
    dependencies = []
    files = ['13179', '13290', '13327']
    hgrepos = []
    issue_num = 5370
    keywords = ['patch']
    message_count = 11.0
    messages = ['82721', '82889', '82941', '82964', '83413', '83415', '83542', '83544', '83550', '83569', '85503']
    nosy_count = 3.0
    nosy_names = ['georg.brandl', 'mwm', 'ggenellina']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue5370'
    versions = ['Python 2.6', 'Python 3.0', 'Python 3.1', 'Python 2.7']

    @mwm
    Copy link
    Mannequin Author

    mwm mannequin commented Feb 25, 2009

    The attached short file causes all of python 2.5, 2.6 and 3.0 to drop
    into infinite recursion trying to unpickle the created object, but the
    2.5 version has the cleanest The problem appears to be that the unpickle
    code (C in 3.0; Python in 2.5 - I assume the others are similar) uses
    getattr to try and get the __setstate__ method (if any) of the opject
    being built before it's dictionary is populated. This causes the
    invocation of the __getattr__ method, which tries to use the
    non-existent "attrs" attribute, thus starting the recursion.

    A simple workaround is to provide this:

      def __setstate__(self, data):
          self.__dict__.update(data)

    but methinks that either 1) the interpreter shouldn't be falling into
    the infinite loop in this case, or 2) the pickle code should be using
    hasattr to check for the attribute (which won't trigger this problem)
    before trying getattr on it.

    @mwm mwm mannequin added the type-bug An unexpected behavior, bug, or error label Feb 25, 2009
    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Feb 28, 2009

    The __getattr__ method, as written, assumes that an attribute 'args'
    already exists. That's unsafe - a more robust approach would be:

        def __getattr__(self, name):
            if 'attrs' in self.__dict__ and name in self.attrs:
                return self.attrs[name]
            raise AttributeError(name)

    The suggestion of using hasattr in pickle code doesn't work, because
    hasattr is implemented using getattr.

    I could not reproduce it with 2.6, nor the trunk -- there is another
    recursion problem involving __subclasscheck__ but unpickling is
    unaffected. 3.0 does show this problem. FWIW, 2.2 did not.

    @mwm
    Copy link
    Mannequin Author

    mwm mannequin commented Feb 28, 2009

    The args attribute gets created by __init__ and nothing in the class
    removes it. I don't think it's unreasonable for the class to expect the
    attribute to not vanish on it. Possibly it should be spelled __args (or
    declared private :-), but neither of those would make a difference in
    this case.

    Any feature access can be made more robust by checking for it in
    __dict__ first, but such a practice is neither practical nor pragmatic.
    It may be different for __getargs__, in which case the bug is in the
    documentation for __getargs__, which should mention this issue.

    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Mar 1, 2009

    Perhaps this should be made more clear in the
    documentation for the pickle module. Probably here:

    http://docs.python.org/library/pickle.html#the-pickle-
    protocol

    Could you come with some enhancements?

    (Note that it already states that __init__ is not
    called when unpickling)

    @mwm
    Copy link
    Mannequin Author

    mwm mannequin commented Mar 10, 2009

    I don't believe in documenting bugs instead of fixing them. If this bug
    is going to stay in the code, I can either fix my install of Python to
    have non-broken Pickle modules, or I can fix every third-party libraries
    objects I use whose authors didn't worry about pickling them enough to
    find this. The fornmer is almost certainly easier, given a little time.
    Probably even easier than agreeing on proper document wording.

    The __init__ case is different - there isn't a common use case (this one
    comes from the standard library) for __init__ that breaks pickling, and
    the problems tend to manifest in the resulting pickles, not in the
    unpickling process.

    @mwm
    Copy link
    Mannequin Author

    mwm mannequin commented Mar 10, 2009

    QAD patch for 2.6 pickle.py to fix this bug. Passes the 2.6 pickle unit
    tests.

    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Mar 13, 2009

    Are you sure you uploaded the right patch? I've not tested it, but I
    don't think this actually fixes the reported bug.

    __setstate__ is *very* unlikely to be found in the instance's dict, so
    you end up not calling __setstate__ at all.

    If no test failed, this may indicate bad test coverage.

    @mwm
    Copy link
    Mannequin Author

    mwm mannequin commented Mar 13, 2009

    True. But hat's why it was a QAD hack - all I did was make sure it didn't blow up on the
    test code, and that it passed the provided unit tests.

    It really needs to walk the class tree. So something like hasattr(inst.__class__,
    '__setstate__') is more likely to be correct.

    And yes, the pickle unit tests apparently don't test the __*state__ routines, but that's a
    different bug.

    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Mar 14, 2009

    I think that trying to emulate all "getattr" details in the middle of
    unpickling (but omiting calls to __getattr__) isn't the right thing to
    do. What if in some cases __getattr__ (or __getattribute__) *does* the
    right thing, but pickle doesn't call it? Other people would be
    rightfully upset.

    There should be a warning note about __getattr__ in the pickle
    documentation, and people should be encouraged to write __getattr__ in
    a more robust way, if class instances are meant to be pickled.

    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Mar 14, 2009

    This doc update should warn people about the need to implement
    __getinitargs__ / __getnewargs__ when the instance relies on some
    internal invariants that must be preserved, and reiterates the
    important fact that __init__ / __new__ are NOT normally invoked.

    The patch is against trunk; wording for 3.x should omit references to
    __getinitargs__ and __init__.

    @ggenellina ggenellina mannequin added docs Documentation in the Doc dir stdlib Python modules in the Lib dir labels Mar 14, 2009
    @ggenellina ggenellina mannequin assigned birkenfeld Mar 14, 2009
    @birkenfeld
    Copy link
    Member

    Updated docs in r71240.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant