Pickling of ipaddress classes #67322

Closed
serhiy-storchaka opened this issue Dec 30, 2014 · 10 comments
Labels: stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@serhiy-storchaka
Member

BPO 23133
Nosy @ncoghlan, @pitrou, @serhiy-storchaka
Files
  • ipaddress_pickle.patch
  • ipaddress_pickle_2.patch: Pickle addresses as ints
  • ipaddress_pickle_3.patch: + optimization
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-01-18.20:53:56.670>
    created_at = <Date 2014-12-30.10:52:49.594>
    labels = ['type-feature', 'library']
    title = 'Pickling of ipaddress classes'
    updated_at = <Date 2015-01-18.20:57:32.046>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2015-01-18.20:57:32.046>
    actor = 'python-dev'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-01-18.20:53:56.670>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2014-12-30.10:52:49.594>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['37564', '37727', '37728']
    hgrepos = []
    issue_num = 23133
    keywords = ['patch']
    message_count = 10.0
    messages = ['233201', '234118', '234121', '234123', '234124', '234264', '234270', '234274', '234276', '234277']
    nosy_count = 5.0
    nosy_names = ['ncoghlan', 'pitrou', 'pmoody', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue23133'
    versions = ['Python 3.5']

    @serhiy-storchaka
    Member Author

    Currently the ipaddress classes support pickling, but the pickling is inefficient and implementation-dependent. The proposed patch makes pickling more compact and implementation-agnostic.

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Dec 30, 2014
    @pitrou
    Member

    pitrou commented Jan 16, 2015

    Patch looks good to me. For further efficiency, addresses could be pickled as ints (but beware of interfaces and networks).

    @serhiy-storchaka
    Member Author

    Yes, pickling (and especially unpickling) ints is more efficient, but the code will be more complex. Interfaces should be pickled as strings for backward compatibility, and interfaces are subclasses of addresses.
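
To illustrate the trade-off discussed here, the following is a minimal sketch and not the contents of any of the attached patches. The `Address` and `Interface` classes are hypothetical stand-ins for the stdlib classes. The base class reduces to an int, while the subclass has to override `__reduce__` to keep reducing to a string, which is where the extra complexity comes from.

```python
import pickle


class Address:
    """Hypothetical IPv4-like address; not the stdlib implementation."""

    def __init__(self, value):
        # Accept either an int or a dotted-quad string.
        if isinstance(value, int):
            self._ip = value
        else:
            octets = bytes(int(part) for part in value.split("."))
            self._ip = int.from_bytes(octets, "big")

    def __str__(self):
        return ".".join(str(b) for b in self._ip.to_bytes(4, "big"))

    def __eq__(self, other):
        return type(self) is type(other) and str(self) == str(other)

    def __reduce__(self):
        # Reduce to an int: unpickling then skips string parsing.
        return self.__class__, (self._ip,)


class Interface(Address):
    """Hypothetical interface: an address plus a prefix length."""

    def __init__(self, value):
        addr, _, prefix = value.partition("/")
        super().__init__(addr)
        self.prefixlen = int(prefix) if prefix else 32

    def __str__(self):
        return "%s/%d" % (super().__str__(), self.prefixlen)

    def __reduce__(self):
        # Must override the inherited int-based reduction:
        # an interface is reduced to its string form instead.
        return self.__class__, (str(self),)


if __name__ == "__main__":
    for obj in (Address("192.0.2.1"), Interface("192.0.2.1/27")):
        assert pickle.loads(pickle.dumps(obj)) == obj
        print(type(obj).__name__, obj.__reduce__()[1])
```

Without the override in the subclass, an `Interface` would inherit the int-based reduction and lose its prefix length on unpickling.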

    Here are microbenchmarks:

    ./python -m timeit -s "import ipaddress, pickle; ips = [ipaddress.ip_address('192.0.2.%s'%i) for i in range(1, 101)]" -- "pickle.dumps(ips)"
    ./python -m timeit -s "import ipaddress, pickle; ips = [ipaddress.ip_address('2001:db8::%x'%i) for i in range(1, 101)]" -- "pickle.dumps(ips)"
    ./python -m timeit -s "import ipaddress, pickle; ips = [ipaddress.ip_address('192.0.2.%s'%i) for i in range(1, 101)]; pickled = pickle.dumps(ips)" -- "pickle.loads(pickled)"
    ./python -m timeit -s "import ipaddress, pickle; ips = [ipaddress.ip_address('2001:db8::%x'%i) for i in range(1, 101)]; pickled = pickle.dumps(ips)" -- "pickle.loads(pickled)"

    Results for unpatched module:
    1000 loops, best of 3: 1.56 msec per loop
    1000 loops, best of 3: 1.62 msec per loop
    1000 loops, best of 3: 1.08 msec per loop
    1000 loops, best of 3: 1.09 msec per loop

    With ipaddress_pickle.patch:
    100 loops, best of 3: 3.43 msec per loop
    100 loops, best of 3: 10.6 msec per loop
    100 loops, best of 3: 7.76 msec per loop
    100 loops, best of 3: 8.58 msec per loop

    With ipaddress_pickle_2.patch:
    1000 loops, best of 3: 1.11 msec per loop
    1000 loops, best of 3: 1.16 msec per loop
    1000 loops, best of 3: 1.88 msec per loop
    100 loops, best of 3: 2.05 msec per loop

    With ipaddress_pickle_3.patch:
    1000 loops, best of 3: 1.12 msec per loop
    1000 loops, best of 3: 1.15 msec per loop
    1000 loops, best of 3: 1.13 msec per loop
    1000 loops, best of 3: 1.15 msec per loop
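
For anyone who wants to repeat these measurements from a script rather than the command line, here is a rough equivalent using the `timeit` module's Python API. The helper name `bench` and its parameters are made up for this sketch; timings will differ between machines and Python versions, so no expected numbers are given.

```python
import ipaddress
import pickle
import timeit


def bench(template, number=1000, repeat=3):
    """Return (dump_seconds, load_seconds) per call for 100 addresses.

    `template` is a %-format string such as '192.0.2.%s' or '2001:db8::%x'.
    """
    ips = [ipaddress.ip_address(template % i) for i in range(1, 101)]
    pickled = pickle.dumps(ips)
    # Take the best of `repeat` runs, as the timeit command line does.
    dump = min(timeit.repeat(lambda: pickle.dumps(ips),
                             number=number, repeat=repeat)) / number
    load = min(timeit.repeat(lambda: pickle.loads(pickled),
                             number=number, repeat=repeat)) / number
    return dump, load


if __name__ == "__main__":
    for template in ("192.0.2.%s", "2001:db8::%x"):
        dump, load = bench(template)
        print("%-14s dumps: %.3f msec  loads: %.3f msec"
              % (template, dump * 1e3, load * 1e3))
```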

    @serhiy-storchaka
    Member Author

    ipaddress_pickle_3.patch breaks one test (testMissingAddressVersion). Is this test needed?

    @pitrou
    Member

    pitrou commented Jan 16, 2015

    I don't understand what the test is for. I think it's safe to remove it.

    @serhiy-storchaka serhiy-storchaka self-assigned this Jan 16, 2015
    @serhiy-storchaka
    Member Author

    Then I'll remove it. Could you please review the optimized patch?

    @pitrou
    Member

    pitrou commented Jan 18, 2015

    The patch looks fine to me.

    @python-dev
    Mannequin

    python-dev mannequin commented Jan 18, 2015

    New changeset 781b54f7bccc by Serhiy Storchaka in branch 'default':
    Issue bpo-23133: Pickling of ipaddress objects now produces more compact and
    https://hg.python.org/cpython/rev/781b54f7bccc

    @serhiy-storchaka
    Member Author

    Thank you Antoine.

    And here is a comparison of pickle sizes.

    Unpatched:
    >>> len(pickle.dumps([ipaddress.ip_address('192.0.2.%s'%i) for i in range(1, 101)]))
    2971
    >>> len(pickle.dumps([ipaddress.ip_address('2001:db8::%x'%i) for i in range(1, 101)]))
    4071
    >>> len(pickle.dumps([ipaddress.ip_interface('192.0.2.%s/27'%i) for i in range(1, 101)]))
    19341
    >>> len(pickle.dumps([ipaddress.ip_interface('2001:db8::%x/124'%i) for i in range(1, 101)]))
    22741
    >>> len(pickle.dumps([ipaddress.ip_network('192.0.2.%s/27'%(i&-32)) for i in range(1, 101)]))
    10614
    >>> len(pickle.dumps([ipaddress.ip_interface('2001:db8::%x/124'%(i&-32)) for i in range(1, 101)]))
    22741
    
    Patched:
    >>> len(pickle.dumps([ipaddress.ip_address('192.0.2.%s'%i) for i in range(1, 101)]))
    1531
    >>> len(pickle.dumps([ipaddress.ip_address('2001:db8::%x'%i) for i in range(1, 101)]))
    2631
    >>> len(pickle.dumps([ipaddress.ip_interface('192.0.2.%s/27'%i) for i in range(1, 101)]))
    2963
    >>> len(pickle.dumps([ipaddress.ip_interface('2001:db8::%x/124'%i) for i in range(1, 101)]))
    3256
    >>> len(pickle.dumps([ipaddress.ip_network('192.0.2.%s/27'%(i&-32)) for i in range(1, 101)]))
    2938
    >>> len(pickle.dumps([ipaddress.ip_interface('2001:db8::%x/124'%(i&-32)) for i in range(1, 101)]))
    3209
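
The same size comparison can be scripted, which makes it easier to rerun on another interpreter. This is only a sketch: the helper `pickled_size` and its `transform` parameter are names invented here, and the byte counts depend on the Python version and the default pickle protocol, so they will not necessarily match the figures above.

```python
import ipaddress
import pickle


def pickled_size(factory, template, transform=lambda i: i):
    """Return the pickle length of a list of 100 objects built by `factory`."""
    objs = [factory(template % transform(i)) for i in range(1, 101)]
    return len(pickle.dumps(objs))


def align(i):
    # Clear the low five bits so the host part of a /27 network is zero.
    return i & -32


if __name__ == "__main__":
    cases = [
        (ipaddress.ip_address, "192.0.2.%s", lambda i: i),
        (ipaddress.ip_address, "2001:db8::%x", lambda i: i),
        (ipaddress.ip_interface, "192.0.2.%s/27", lambda i: i),
        (ipaddress.ip_interface, "2001:db8::%x/124", lambda i: i),
        (ipaddress.ip_network, "192.0.2.%s/27", align),
    ]
    for factory, template, transform in cases:
        size = pickled_size(factory, template, transform)
        print("%-13s %-18s %d" % (factory.__name__, template, size))
```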

    @python-dev
    Mannequin

    python-dev mannequin commented Jan 18, 2015

    New changeset 712ac77b772b by Serhiy Storchaka in branch 'default':
    Fixed tests for issue bpo-23133 (pickling of IPv4Network was not tested).
    https://hg.python.org/cpython/rev/712ac77b772b

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022