Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pickling of typing types #77054

Closed
serhiy-storchaka opened this issue Feb 19, 2018 · 15 comments
Closed

Pickling of typing types #77054

serhiy-storchaka opened this issue Feb 19, 2018 · 15 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes deferred-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 32873
Nosy @gvanrossum, @ned-deily, @serhiy-storchaka, @ilevkivskyi, @miss-islington, @wrmsr
PRs
  • bpo-32873: Treat type variables and special typing forms as immutable by copy and pickle #6216
  • [3.7] bpo-32873: Treat type variables and special typing forms as immutable by copy and pickle (GH-6216) #6264
  • bpo-32873: Remove a name hack for generic aliases in typing module #6376
  • [3.7] bpo-32873: Remove a name hack for generic aliases in typing module (GH-6376) #6378
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-03-26.22:37:30.986>
    created_at = <Date 2018-02-19.10:46:45.002>
    labels = ['3.7', '3.8', 'deferred-blocker', 'type-bug', 'library']
    title = 'Pickling of typing types'
    updated_at = <Date 2018-04-05.00:46:42.937>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2018-04-05.00:46:42.937>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-03-26.22:37:30.986>
    closer = 'levkivskyi'
    components = ['Library (Lib)']
    creation = <Date 2018-02-19.10:46:45.002>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32873
    keywords = ['patch']
    message_count = 15.0
    messages = ['312346', '312354', '312355', '312357', '312377', '312952', '312953', '312954', '312955', '314482', '314483', '314952', '314955', '314963', '314964']
    nosy_count = 6.0
    nosy_names = ['gvanrossum', 'ned.deily', 'serhiy.storchaka', 'levkivskyi', 'miss-islington', 'wrmsr']
    pr_nums = ['6216', '6264', '6376', '6378']
    priority = 'deferred blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue32873'
    versions = ['Python 3.7', 'Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    In 3.6 typing types are pickled by names:

    >>> import pickle, pickletools, typing
    >>> pickletools.optimize(pickle.dumps(typing.List))
    b'\x80\x03ctyping\nList\n.'
    >>> pickletools.dis(pickletools.optimize(pickle.dumps(typing.List)))
        0: \x80 PROTO      3
        2: c    GLOBAL     'typing List'
       15: .    STOP
    highest protocol among opcodes = 2

    The side effect of this is that they are considered atomic by the copy module.

    In 3.7 the pickle data contains all private attributes.

    >>> pickletools.optimize(pickle.dumps(typing.List))
    b'\x80\x03ctyping\n_GenericAlias\n)\x81}(X\x05\x00\x00\x00_inst\x89X\x08\x00\x00\x00_special\x88X\x05\x00\x00\x00_nameX\x04\x00\x00\x00ListX\n\x00\x00\x00__origin__cbuiltins\nlist\nX\x08\x00\x00\x00__args__ctyping\nTypeVar\n)\x81q\x00}(X\x04\x00\x00\x00nameX\x01\x00\x00\x00TX\x05\x00\x00\x00boundNX\x0b\x00\x00\x00constraints)X\x02\x00\x00\x00co\x89X\x06\x00\x00\x00contra\x89ub\x85X\x0e\x00\x00\x00__parameters__h\x00\x85X\t\x00\x00\x00__slots__Nub.'
    >>> pickletools.dis(pickletools.optimize(pickle.dumps(typing.List)))
        0: \x80 PROTO      3
        2: c    GLOBAL     'typing _GenericAlias'
       24: )    EMPTY_TUPLE
       25: \x81 NEWOBJ
       26: }    EMPTY_DICT
       27: (    MARK
       28: X        BINUNICODE '_inst'
       38: \x89     NEWFALSE
       39: X        BINUNICODE '_special'
       52: \x88     NEWTRUE
       53: X        BINUNICODE '_name'
       63: X        BINUNICODE 'List'
       72: X        BINUNICODE '__origin__'
       87: c        GLOBAL     'builtins list'
      102: X        BINUNICODE '__args__'
      115: c        GLOBAL     'typing TypeVar'
      131: )        EMPTY_TUPLE
      132: \x81     NEWOBJ
      133: q        BINPUT     0
      135: }        EMPTY_DICT
      136: (        MARK
      137: X            BINUNICODE 'name'
      146: X            BINUNICODE 'T'
      152: X            BINUNICODE 'bound'
      162: N            NONE
      163: X            BINUNICODE 'constraints'
      179: )            EMPTY_TUPLE
      180: X            BINUNICODE 'co'
      187: \x89         NEWFALSE
      188: X            BINUNICODE 'contra'
      199: \x89         NEWFALSE
      200: u            SETITEMS   (MARK at 136)
      201: b        BUILD
      202: \x85     TUPLE1
      203: X        BINUNICODE '__parameters__'
      222: h        BINGET     0
      224: \x85     TUPLE1
      225: X        BINUNICODE '__slots__'
      239: N        NONE
      240: u        SETITEMS   (MARK at 27)
      241: b    BUILD
      242: .    STOP
    highest protocol among opcodes = 2

    Unpickling it creates a new object. And I'm not sure all invariants are satisfied.

    In additional to lesses efficiency and lost of preserving identity, such pickle can be incompatible with old Python versions and future Python versions if the internal representation of typing types will be changed.

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Feb 19, 2018
    @gvanrossum
    Copy link
    Member

    I think it would be nice it would be pickled by name so the pickles are
    compatible between Python versions.

    What would we do for List[int]?

    How are regular ABCs pickled?

    @ilevkivskyi
    Copy link
    Member

    Here is the situation for 3.6 and before:

    Generic classes are all actual class objects, so they are pickled as immutable. However this creates a problem, parameterized generics, such as List[int] cannot be pickled in 3.6 and before, see python/typing#511 (and this is something very hard to fix).

    Here is the situation for 3.7:

    Almost no generics are actual class objects, so they are pickled as usual. This also fixes the pickling problems in 3.6. However, there is one problematic thing, type variables, they should be pickled as immutable (i.e. by name reference), but I didn't have time to fix this, this is tracked in python/typing#512

    What is interesting this issue adds here is an idea that we can treat special typing aliases that are conceptually "unique" also as immutable. For example, typing.List will be pickled as "typing.List", while typing.List[int] will be pickled as _GenericAlias(<builtins.list>, args=(<builtins.int>,), ...)

    Conveniently, all the special typing aliases are already marked with _special=True, so the potential solution would be like this:

    class _GenericAlias:
        ...
        def __reduce__(self):
            if self._special:
                return 'typing.' + self._name
            return super().__reduce__()

    @serhiy-storchaka
    Copy link
    Member Author

    I think it would be better to pickle typing.List[int] as operator.getitem(typing.List, int).

        def __reduce__(self):
            if self._special:
                return self._name # __module__ = 'typing'
            index = self._args
            if len(index) == 1:
                index, = index
            return operator.getitem, (self._unparametrized, index)

    And there may be a special case for Union. I tried to implement this, but it seems to me that parametrized type doesn't have a reference to unparametrized type, and I don't know this code enough for writing idiomatic code.

    @gvanrossum
    Copy link
    Member

    I'm honestly not too concerned about what happens with List[int] (though
    doing a sensible thing here is not wrong :-), but I feel strongly that a
    pickle containing a reference to typing.List should be compatible between
    Python 3.6 and 3.7.

    @ned-deily
    Copy link
    Member

    So we need a decision on this about what, if anything, to do for 3.7. The 3.7.0 ABI freeze is in 3.7.0b3; it would be better to get it resolved for 3.7.0b2.

    @ilevkivskyi
    Copy link
    Member

    I am sick now, so can't work on this. There is a small chance I will be able to work on this issue this week. Is it possible to fix this in 3.7b3?

    @ned-deily
    Copy link
    Member

    Is it possible to fix this in 3.7b3?

    Yes. Get well first!

    @ilevkivskyi
    Copy link
    Member

    Thank you, Ned!

    @ilevkivskyi
    Copy link
    Member

    New changeset 8349403 by Ivan Levkivskyi in branch 'master':
    bpo-32873: Treat type variables and special typing forms as immutable by copy and pickle (GH-6216)
    8349403

    @miss-islington
    Copy link
    Contributor

    New changeset d0e04c8 by Miss Islington (bot) in branch '3.7':
    bpo-32873: Treat type variables and special typing forms as immutable by copy and pickle (GH-6216)
    d0e04c8

    @wrmsr
    Copy link
    Mannequin

    wrmsr mannequin commented Apr 4, 2018

    I believe I hit a bug with this fix (just pulled the code a few min ago):

    In [10]: pickle.loads(pickle.dumps(typing.List))
    Out[10]: typing.List
    
    In [11]: pickle.loads(pickle.dumps(typing.FrozenSet))
    \---------------------------------------------------------------------------
    
        PicklingError                             Traceback (most recent call last)
        <ipython-input-11-be060c6090e3> in <module>()
        ----> 1 pickle.loads(pickle.dumps(typing.FrozenSet))
    PicklingError: Can't pickle typing.Frozenset: attribute lookup Frozenset on typing failed
    

    The cause is in _GenericAlias.__init__

                name = orig_name[0].title() + orig_name[1:]

    Maybe just pass the name explicitly?

    For context I originally hit this trying to explicitly getattr(typing, alias_name) not by pickling but I'm pleased to see that's at least apparently intended to be valid use (I need to get the underlying special's parameter variance which is lost when you give it args).

    @ilevkivskyi
    Copy link
    Member

    Apparently there is another type with a similar problem -- DefaultDict.

    Will fix this now.

    @ilevkivskyi
    Copy link
    Member

    New changeset 2a363d2 by Ivan Levkivskyi in branch 'master':
    bpo-32873: Remove a name hack for generic aliases in typing module (GH-6376)
    2a363d2

    @miss-islington
    Copy link
    Contributor

    New changeset 04eac02 by Miss Islington (bot) in branch '3.7':
    bpo-32873: Remove a name hack for generic aliases in typing module (GH-6376)
    04eac02

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes deferred-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants