msg148782 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-12-03 00:30 |
A dialog on python-dev suggests that dict.setdefault() was intended to be atomic and that a number of experienced developers have deployed code relying on its atomicity: http://mail.python.org/pipermail/python-3000/2007-July/008693.html
The actual implementation shows a second call to PyDict_Setitem() which can call PyObject_Hash() which can call arbitrary Python code. http://hg.python.org/cpython/file/2.7/Objects/dictobject.c#l1937
This fix is straight-forward, use the results of the initial lookup to insert the default object. This will make the operation atomic and it will make it faster.
|
msg148784 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-03 01:27 |
+1.
|
msg148815 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2011-12-03 18:27 |
Atomic and faster... +1.
|
msg149142 - (view) |
Author: Ramchandra Apte (Ramchandra Apte) * |
Date: 2011-12-10 04:28 |
+1 for atomic and more robust
|
msg149164 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2011-12-10 16:08 |
maniram: We try to keep bug reports focused on one thing, and we don’t generally collect votes (we prefer that people vote with patches, tests and messages with content—for example, saying exactly what “more robust” should be). Here I think Antoine and Jesús said +1 to agree that this should be changed even in stable versions.
Raymond: I think I could make the patch, but I’m not sure about testing the new behavior.
|
msg149195 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2011-12-10 22:31 |
Eric, overload "__hash__()" and check that is only called once, while now would be called twice.
|
msg149245 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-11 19:53 |
I have written a patch and a test, but since it's changing C code, I am far from being sure if it's achieve the expected behavior in the right way. There are also tests and running whole test suite didn't bring any errors.
|
msg149246 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-11 19:53 |
Also: I'll be happy to work further on this patch, if I get some comments and advice.
|
msg149247 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-11 19:59 |
If _PyDict_SetItemUsingHash is module-private, it should be declared "static". Also, better if it follows the usual naming of static functions inside that C file (i.e. "dict_some_lowercase_name").
|
msg149249 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-11 20:03 |
Done.
|
msg149255 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-11 22:43 |
The patch looks ok to me. I'll let Raymond make the final call.
|
msg149324 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2011-12-12 16:09 |
Thanks for the idea Jesús, even though I didn’t get the change to use it :)
Filip: Is there a reasons for using a nonlocal count rather than e.g. self.count? Otherwise the test looks good.
|
msg149328 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-12 16:48 |
> Filip: Is there a reasons for using a nonlocal count rather than e.g.
> self.count? Otherwise the test looks good.
Eric, please, could we stop such pointless nitpicking about one's
stylistic preferences? "nonlocal" is as good as any other solution here,
and reviewing is not supposed to be a pedantry contest. There are tons
of serious issues to be solved where your energy would be better spent,
IM(NSH)O.
|
msg149329 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-12 16:53 |
I haven't given it much thought, when I was making the choice of using nonlocal rather than self.count. I was rather excited to see, if the change will work as I wanted it to. If you believe it would be better to use an attribute, I'll be happy to change it.
|
msg149330 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2011-12-12 16:55 |
Well, if this is going to be applied to 2.7, "nonlocal" is invalid syntax there. I guess that being able to use the same test in 2.7 and 3.2/3.3 would be nice. Style, but justified.
|
msg149332 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-12 17:09 |
> Well, if this is going to be applied to 2.7, "nonlocal" is invalid
> syntax there.
Ah, good point.
|
msg149333 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-12 17:17 |
I'll send a patch, when I get home from work.
|
msg149335 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-12-12 17:20 |
Let's just start with a working 2.7 patch and go from there.
|
msg149350 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-12 19:43 |
I have tried to port patch to python2.7, but apparently I must be doing something wrong, because while most tests pass, several fail (including tests for unittest itself).
I would like to continue working on this test, but incoming week is pretty intensive for me. Is it ok if I returned to working on this during the weekend?
|
msg149367 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-12-13 00:37 |
> I would like to continue working on this test, but incoming week is
> pretty intensive for me. Is it ok if I returned to working on this
> during the weekend?
Of course. There's no rush.
(I'm really not sure it should go in 2.7 and 3.2; this looks more like a feature than a bug, since atomicity has never been documented)
|
msg149375 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-12-13 02:26 |
Thanks, I'll look at it over the next few days.
|
msg149785 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2011-12-18 18:00 |
Patch for 2.7.
|
msg150878 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2012-01-08 16:50 |
Bump! There was no activity here for two weeks. Is my patch for 2.7 ok or should I do something more about it?
|
msg150886 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-01-08 17:58 |
Looking at the patch again, I think this isn't enough.
setdefault() will still call the lookup routine twice which, in the general case (i.e. lookdict() not lookdict_unicode()), can call arbitrary Python code through e.g. __eq__ methods.
|
msg150908 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2012-01-08 21:30 |
I understand you are talking about this call:
mp->ma_lookup(mp, key, hash);
I haven't noticed that earlier. I'll try to provide a better fix (for 2.7 first, after we agree it's good enough, I will provide one for 3.3).
Do you have any idea, how I might this part? I mean how to check, that this is called only once?
|
msg150912 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-01-08 21:42 |
> Do you have any idea, how I might this part? I mean how to check, that
> this is called only once?
Checking that __eq__ is called only once should be a good proxy.
|
msg150914 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2012-01-08 21:46 |
Thanks, I'll try that. Like before I will probably find time next weekend.
|
msg151288 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2012-01-15 14:01 |
No more double lookup.
|
msg151296 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-01-15 17:20 |
> No more double lookup.
Your patch doesn't check hashed2.eq_count.
Since the dict specification doesn't say on which instance __eq__ will
be called when doing a lookup, the patch should either check
``hashed1.eq_count + hashed2.eq_count``, or make eq_count a class
attribute.
A nit: be careful not to use tabs in C files.
|
msg154384 - (view) |
Author: Filip Gruszczyński (gruszczy) |
Date: 2012-02-26 20:40 |
Fixed both issues.
|
msg154421 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-02-26 23:48 |
New changeset 5c52e7c6d868 by Antoine Pitrou in branch '2.7':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/5c52e7c6d868
|
msg154422 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-02-27 00:05 |
New changeset 90572ccda12c by Antoine Pitrou in branch '3.2':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/90572ccda12c
New changeset 3dfa98cf2e26 by Antoine Pitrou in branch 'default':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/3dfa98cf2e26
|
msg154423 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-02-27 00:06 |
Thank you Filip! I've now committed the patch.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:24 | admin | set | github: 57730 |
2012-05-16 16:49:58 | pitrou | link | issue5730 superseder |
2012-02-27 00:06:32 | pitrou | set | status: open -> closed resolution: fixed messages:
+ msg154423
stage: patch review -> resolved |
2012-02-27 00:05:45 | python-dev | set | messages:
+ msg154422 |
2012-02-26 23:48:58 | python-dev | set | nosy:
+ python-dev messages:
+ msg154421
|
2012-02-26 20:40:41 | gruszczy | set | files:
+ 13521_27_4.patch
messages:
+ msg154384 |
2012-01-15 17:20:03 | pitrou | set | messages:
+ msg151296 |
2012-01-15 14:01:29 | gruszczy | set | files:
+ 13521_27_3.patch
messages:
+ msg151288 |
2012-01-08 21:46:34 | gruszczy | set | messages:
+ msg150914 |
2012-01-08 21:42:19 | pitrou | set | messages:
+ msg150912 |
2012-01-08 21:30:00 | gruszczy | set | messages:
+ msg150908 |
2012-01-08 17:58:43 | pitrou | set | messages:
+ msg150886 |
2012-01-08 16:50:39 | gruszczy | set | messages:
+ msg150878 |
2011-12-18 18:00:55 | gruszczy | set | files:
+ 13521_27_1.patch
messages:
+ msg149785 |
2011-12-13 02:26:57 | rhettinger | set | messages:
+ msg149375 |
2011-12-13 00:37:30 | pitrou | set | messages:
+ msg149367 |
2011-12-12 19:43:48 | gruszczy | set | messages:
+ msg149350 |
2011-12-12 17:20:10 | rhettinger | set | messages:
+ msg149335 |
2011-12-12 17:17:52 | gruszczy | set | messages:
+ msg149333 |
2011-12-12 17:09:34 | pitrou | set | messages:
+ msg149332 |
2011-12-12 16:55:40 | jcea | set | messages:
+ msg149330 |
2011-12-12 16:53:35 | gruszczy | set | messages:
+ msg149329 |
2011-12-12 16:48:16 | pitrou | set | messages:
+ msg149328 |
2011-12-12 16:09:19 | eric.araujo | set | messages:
+ msg149324 |
2011-12-11 22:43:20 | pitrou | set | messages:
+ msg149255 |
2011-12-11 20:03:27 | gruszczy | set | files:
+ 13521_2.patch
messages:
+ msg149249 |
2011-12-11 19:59:12 | pitrou | set | messages:
+ msg149247 stage: needs patch -> patch review |
2011-12-11 19:53:59 | gruszczy | set | messages:
+ msg149246 |
2011-12-11 19:53:36 | gruszczy | set | files:
+ 13521.patch
nosy:
+ gruszczy messages:
+ msg149245
keywords:
+ patch |
2011-12-10 22:31:30 | jcea | set | messages:
+ msg149195 |
2011-12-10 16:08:15 | eric.araujo | set | nosy:
+ eric.araujo
messages:
+ msg149164 versions:
- Python 3.1 |
2011-12-10 04:28:36 | Ramchandra Apte | set | nosy:
+ Ramchandra Apte messages:
+ msg149142
|
2011-12-10 04:01:31 | jcon | set | nosy:
+ jcon
|
2011-12-03 18:27:44 | jcea | set | nosy:
+ jcea messages:
+ msg148815
|
2011-12-03 18:02:26 | meador.inge | set | nosy:
+ meador.inge
stage: needs patch |
2011-12-03 02:15:57 | rhettinger | set | assignee: rhettinger |
2011-12-03 01:27:27 | pitrou | set | nosy:
+ pitrou messages:
+ msg148784
|
2011-12-03 00:30:28 | rhettinger | create | |