Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make dict.setdefault() atomic #57730

Closed
rhettinger opened this issue Dec 3, 2011 · 33 comments
Closed

Make dict.setdefault() atomic #57730

rhettinger opened this issue Dec 3, 2011 · 33 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@rhettinger
Copy link
Contributor

BPO 13521
Nosy @rhettinger, @jcea, @pitrou, @merwok, @meadori
Files
  • 13521.patch
  • 13521_2.patch
  • 13521_27_1.patch
  • 13521_27_3.patch
  • 13521_27_4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2012-02-27.00:06:32.686>
    created_at = <Date 2011-12-03.00:30:28.386>
    labels = ['interpreter-core', 'type-bug']
    title = 'Make dict.setdefault() atomic'
    updated_at = <Date 2012-02-27.00:06:32.685>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2012-02-27.00:06:32.685>
    actor = 'pitrou'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2012-02-27.00:06:32.686>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2011-12-03.00:30:28.386>
    creator = 'rhettinger'
    dependencies = []
    files = ['23917', '23918', '24035', '24244', '24652']
    hgrepos = []
    issue_num = 13521
    keywords = ['patch']
    message_count = 33.0
    messages = ['148782', '148784', '148815', '149142', '149164', '149195', '149245', '149246', '149247', '149249', '149255', '149324', '149328', '149329', '149330', '149332', '149333', '149335', '149350', '149367', '149375', '149785', '150878', '150886', '150908', '150912', '150914', '151288', '151296', '154384', '154421', '154422', '154423']
    nosy_count = 9.0
    nosy_names = ['rhettinger', 'jcea', 'pitrou', 'eric.araujo', 'gruszczy', 'meador.inge', 'python-dev', 'jcon', 'Ramchandra Apte']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue13521'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @rhettinger
    Copy link
    Contributor Author

    A dialog on python-dev suggests that dict.setdefault() was intended to be atomic and that a number of experienced developers have deployed code relying on its atomicity: http://mail.python.org/pipermail/python-3000/2007-July/008693.html

    The actual implementation shows a second call to PyDict_Setitem() which can call PyObject_Hash() which can call arbitrary Python code. http://hg.python.org/cpython/file/2.7/Objects/dictobject.c#l1937

    This fix is straight-forward, use the results of the initial lookup to insert the default object. This will make the operation atomic and it will make it faster.

    @rhettinger rhettinger added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Dec 3, 2011
    @pitrou
    Copy link
    Member

    pitrou commented Dec 3, 2011

    +1.

    @rhettinger rhettinger self-assigned this Dec 3, 2011
    @jcea
    Copy link
    Member

    jcea commented Dec 3, 2011

    Atomic and faster... +1.

    @RamchandraApte
    Copy link
    Mannequin

    RamchandraApte mannequin commented Dec 10, 2011

    +1 for atomic and more robust

    @merwok
    Copy link
    Member

    merwok commented Dec 10, 2011

    maniram: We try to keep bug reports focused on one thing, and we don’t generally collect votes (we prefer that people vote with patches, tests and messages with content—for example, saying exactly what “more robust” should be). Here I think Antoine and Jesús said +1 to agree that this should be changed even in stable versions.

    Raymond: I think I could make the patch, but I’m not sure about testing the new behavior.

    @jcea
    Copy link
    Member

    jcea commented Dec 10, 2011

    Eric, overload "__hash__()" and check that is only called once, while now would be called twice.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 11, 2011

    I have written a patch and a test, but since it's changing C code, I am far from being sure if it's achieve the expected behavior in the right way. There are also tests and running whole test suite didn't bring any errors.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 11, 2011

    Also: I'll be happy to work further on this patch, if I get some comments and advice.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 11, 2011

    If _PyDict_SetItemUsingHash is module-private, it should be declared "static". Also, better if it follows the usual naming of static functions inside that C file (i.e. "dict_some_lowercase_name").

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 11, 2011

    Done.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 11, 2011

    The patch looks ok to me. I'll let Raymond make the final call.

    @merwok
    Copy link
    Member

    merwok commented Dec 12, 2011

    Thanks for the idea Jesús, even though I didn’t get the change to use it :)

    Filip: Is there a reasons for using a nonlocal count rather than e.g. self.count? Otherwise the test looks good.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 12, 2011

    Filip: Is there a reasons for using a nonlocal count rather than e.g.
    self.count? Otherwise the test looks good.

    Eric, please, could we stop such pointless nitpicking about one's
    stylistic preferences? "nonlocal" is as good as any other solution here,
    and reviewing is not supposed to be a pedantry contest. There are tons
    of serious issues to be solved where your energy would be better spent,
    IM(NSH)O.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 12, 2011

    I haven't given it much thought, when I was making the choice of using nonlocal rather than self.count. I was rather excited to see, if the change will work as I wanted it to. If you believe it would be better to use an attribute, I'll be happy to change it.

    @jcea
    Copy link
    Member

    jcea commented Dec 12, 2011

    Well, if this is going to be applied to 2.7, "nonlocal" is invalid syntax there. I guess that being able to use the same test in 2.7 and 3.2/3.3 would be nice. Style, but justified.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 12, 2011

    Well, if this is going to be applied to 2.7, "nonlocal" is invalid
    syntax there.

    Ah, good point.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 12, 2011

    I'll send a patch, when I get home from work.

    @rhettinger
    Copy link
    Contributor Author

    Let's just start with a working 2.7 patch and go from there.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 12, 2011

    I have tried to port patch to python2.7, but apparently I must be doing something wrong, because while most tests pass, several fail (including tests for unittest itself).

    I would like to continue working on this test, but incoming week is pretty intensive for me. Is it ok if I returned to working on this during the weekend?

    @pitrou
    Copy link
    Member

    pitrou commented Dec 13, 2011

    I would like to continue working on this test, but incoming week is
    pretty intensive for me. Is it ok if I returned to working on this
    during the weekend?

    Of course. There's no rush.
    (I'm really not sure it should go in 2.7 and 3.2; this looks more like a feature than a bug, since atomicity has never been documented)

    @rhettinger
    Copy link
    Contributor Author

    Thanks, I'll look at it over the next few days.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Dec 18, 2011

    Patch for 2.7.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Jan 8, 2012

    Bump! There was no activity here for two weeks. Is my patch for 2.7 ok or should I do something more about it?

    @pitrou
    Copy link
    Member

    pitrou commented Jan 8, 2012

    Looking at the patch again, I think this isn't enough.
    setdefault() will still call the lookup routine twice which, in the general case (i.e. lookdict() not lookdict_unicode()), can call arbitrary Python code through e.g. __eq__ methods.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Jan 8, 2012

    I understand you are talking about this call:

    mp->ma_lookup(mp, key, hash);

    I haven't noticed that earlier. I'll try to provide a better fix (for 2.7 first, after we agree it's good enough, I will provide one for 3.3).

    Do you have any idea, how I might this part? I mean how to check, that this is called only once?

    @pitrou
    Copy link
    Member

    pitrou commented Jan 8, 2012

    Do you have any idea, how I might this part? I mean how to check, that
    this is called only once?

    Checking that __eq__ is called only once should be a good proxy.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Jan 8, 2012

    Thanks, I'll try that. Like before I will probably find time next weekend.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Jan 15, 2012

    No more double lookup.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 15, 2012

    No more double lookup.

    Your patch doesn't check hashed2.eq_count.
    Since the dict specification doesn't say on which instance eq will
    be called when doing a lookup, the patch should either check
    hashed1.eq_count + hashed2.eq_count, or make eq_count a class
    attribute.

    A nit: be careful not to use tabs in C files.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Feb 26, 2012

    Fixed both issues.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 26, 2012

    New changeset 5c52e7c6d868 by Antoine Pitrou in branch '2.7':
    Issue bpo-13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
    http://hg.python.org/cpython/rev/5c52e7c6d868

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 27, 2012

    New changeset 90572ccda12c by Antoine Pitrou in branch '3.2':
    Issue bpo-13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
    http://hg.python.org/cpython/rev/90572ccda12c

    New changeset 3dfa98cf2e26 by Antoine Pitrou in branch 'default':
    Issue bpo-13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
    http://hg.python.org/cpython/rev/3dfa98cf2e26

    @pitrou
    Copy link
    Member

    pitrou commented Feb 27, 2012

    Thank you Filip! I've now committed the patch.

    @pitrou pitrou closed this as completed Feb 27, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants