Issue14205
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012-03-05 21:19 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
dict_lookup.patch | vstinner, 2012-03-05 21:19 | review | ||
nomodify.patch | vstinner, 2012-03-05 21:20 | review |
Messages (29) | |||
---|---|---|---|
msg154976 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-05 21:19 | |
Lib/test/crashers/nasty_eq_vs_dict.py does crash Python because of an infinite loop in the dictionary lookup. The script modifies the dictionary at each lookup, whereas Python tries a new lookup each time that the dictionary is modified. I proposed to make the lookup fail with a RuntimeError if the dictionary has been modified during a lookup. It should not occur if you are not doing something evil. |
|||
msg154977 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-05 21:20 | |
Another possibility is to avoid the modification of a dictionary during a lookup: nomodify.patch. |
|||
msg154989 - (view) | Author: Mark Shannon (Mark.Shannon) * | Date: 2012-03-05 23:28 | |
I much prefer dict_lookup.patch to nomodify.patch. It doesn't increase the memory use of dict. One extra word per dict could be a lot of memory for a large application. There is a very subtle semantic change, but I think it is a positive one. Raising a runtimne seesm sensible as the dict iterators already raise a RuntimeError if the size of the dict changes. |
|||
msg154991 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-05 23:43 | |
> I much prefer dict_lookup.patch to nomodify.patch. > It doesn't increase the memory use of dict. One extra word > per dict could be a lot of memory for a large application. nomodify.patch is the correct fix, but I agree that dict_lookup.patch is better (and sufficient) in practive. > Raising a runtimne seesm sensible as the dict iterators already > raise a RuntimeError if the size of the dict changes. Yes, that's how I chose the exception. >>> d={k: str(k) for k in range(10)} >>> for k in d: ... del d[k] ... RuntimeError: dictionary changed size during iteration >>> d={k for k in range(10)} >>> for k in d: ... d.remove(k) ... RuntimeError: Set changed size during iteration >>> d=list(range(10)) >>> def keyfunc(x): ... d.append(1) ... return x ... >>> d.sort(key=keyfunc) ValueError: list modified during sort |
|||
msg154993 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-03-05 23:50 | |
Yeah, dict_lookup.patch seems fine. |
|||
msg154994 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-06 00:13 | |
New changeset 934aaf2191d0 by Victor Stinner in branch 'default': Close #14205: dict lookup raises a RuntimeError if the dict is modified during http://hg.python.org/cpython/rev/934aaf2191d0 |
|||
msg155021 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-06 16:35 | |
Can't this be triggered by non-malicious code that just happened to have a python comparison and get hit with a thread switch? I'm not sure how often it happens, but today it would not be visible to the user; after the patch, users will see a sporadic failure that they can't easily defend against. Would it be worth adding a counter to lookdict, so that one modification is OK, but 5 aren't? |
|||
msg155022 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-06 16:54 | |
> Can't this be triggered by non-malicious code that just happened to have a python comparison and get hit with a thread switch? The issue was triggered without threads.If the __eq__ method of the objects used for keys use C functions releasing the GIL, you may trigger the issue. > Would it be worth adding a counter to lookdict, so that one modification is OK, but 5 aren't? If you implement a special type modifying the dict that contains the object, you should implement your own retry function on lookup failure (catch RuntimeError). Honestly, if you get RuntimeError, you should change your program, not retry the lookup! |
|||
msg155023 - (view) | Author: Mark Shannon (Mark.Shannon) * | Date: 2012-03-06 16:56 | |
Jim Jewett wrote: > Jim Jewett <jimjjewett@gmail.com> added the comment: > > Can't this be triggered by non-malicious code that just happened to have a python comparison and get hit with a thread switch? So, they are writing to a dict in one thread while reading from the same dict in another thread, without any external locks and with keys written in Python. > > I'm not sure how often it happens, but today it would not be visible to the user; after the patch, users will see a sporadic failure that they can't easily defend against. > I suspect, they are already seeing sporadic failures. I think raising an exception is better than weird failures. We should document the new behaviour, as it is a change in semantics. |
|||
msg155024 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-06 16:59 | |
> If the __eq__ method of the > objects used for keys use C functions releasing the GIL, you may > trigger the issue. Oh, I mean: trigger the issue with threads. I hope that your objects don't call C functions like open() in their __eq__() method! |
|||
msg155026 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-03-06 17:21 | |
On Tue, Mar 6, 2012 at 8:56 AM, Mark Shannon <report@bugs.python.org> wrote: > > Mark Shannon <mark@hotpy.org> added the comment: > > Jim Jewett wrote: >> Jim Jewett <jimjjewett@gmail.com> added the comment: >> >> Can't this be triggered by non-malicious code that just happened to have a python comparison and get hit with a thread switch? > > So, they are writing to a dict in one thread while reading from the same > dict in another thread, without any external locks and with keys written > in Python. >> >> I'm not sure how often it happens, but today it would not be visible to the user; after the patch, users will see a sporadic failure that they can't easily defend against. >> > I suspect, they are already seeing sporadic failures. > I think raising an exception is better than weird failures. > > We should document the new behaviour, as it is a change in semantics. +1 on everything you said. |
|||
msg155027 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-06 17:43 | |
On Tue, Mar 6, 2012 at 11:56 AM, Mark Shannon wrote: > Jim Jewett: >> Can't this be triggered by non-malicious code that just happened >> to have a python comparison and get hit with a thread switch? > So, they are writing to a dict in one thread while reading from the > same dict in another thread, without any external locks and with > keys written in Python. Correct. For example, it could be a configuration manager, or a cache, or even a worklist. If they're just adding new keys, or even deleting other (==> NOT the one being looked up) keys, why should that keep them from finding the existing, unchanged keys? >> I'm not sure how often it happens, but today it would not be visible >> to the user; after the patch, users will see a sporadic failure that >> they can't easily defend against. > I suspect, they are already seeing sporadic failures. How? The chain terminates as soon as the dict doesn't resize; without malicious intent, the odds of hitting several resizes in a row are so miniscule that it probably hasn't even slowed them down. |
|||
msg155029 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-03-06 18:05 | |
On Tue, Mar 6, 2012 at 9:43 AM, Jim Jewett <report@bugs.python.org> wrote: > > Jim Jewett <jimjjewett@gmail.com> added the comment: > > On Tue, Mar 6, 2012 at 11:56 AM, Mark Shannon wrote: > >> Jim Jewett: >>> Can't this be triggered by non-malicious code that just happened >>> to have a python comparison and get hit with a thread switch? > >> So, they are writing to a dict in one thread while reading from the >> same dict in another thread, without any external locks and with >> keys written in Python. > > Correct. For example, it could be a configuration manager, or a > cache, or even a worklist. If they're just adding new keys, or even > deleting other (==> NOT the one being looked up) keys, why should that > keep them from finding the existing, unchanged keys? Use a lock or a built-in key class. I realize that neither is ideal, but then, neither was the old situation. >>> I'm not sure how often it happens, but today it would not be visible >>> to the user; after the patch, users will see a sporadic failure that >>> they can't easily defend against. > >> I suspect, they are already seeing sporadic failures. > > How? > > The chain terminates as soon as the dict doesn't resize; without > malicious intent, the odds of hitting several resizes in a row are so > miniscule that it probably hasn't even slowed them down. Now I'm torn. If we'd have this RuntimeError from the start, would we consider it a flaw in the dict implementation or a feature? The RuntimeError when changing a dict's size while iterating over it is definitely a feature (so as to allow the implementation to rehash everything upon insert/delete). But this is not quite the same. Or is it? On the one hand I think the scenario is pretty unlikely (mostly because it involves a user-defined comparison); OTOH it would be quite nasty to debug. Or would it? You do get a decent error message... Note that Victor's alternate fix (nomodify.diff) has the same problem -- it just raises RuntimeError in the mutating thread rather than in the lookup thread. |
|||
msg155037 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-06 20:33 | |
On Tue, Mar 6, 2012 at 1:05 PM, Guido van Rossum Jim Jewett: >>... If they're just adding new keys, or even deleting other >> (==> NOT the one being looked up) keys, why should that >> keep them from finding the existing, unchanged keys? >> ... The chain terminates as soon as the dict doesn't resize; without >> malicious intent, the odds of hitting several resizes in a row are so >> miniscule that it probably hasn't even slowed them down. > Now I'm torn. If we'd have this RuntimeError from the start, would we > consider it a flaw in the dict implementation or a feature? I would personally have considered it a flaw. Wrapping all dict access just in case some other code added a weird key seems ugly and inefficient. > RuntimeError when changing a dict's size while iterating over it is > definitely a feature (so as to allow the implementation to rehash > everything upon insert/delete). In that case, you've explicitly asked for the whole set (or at least a snapshot); the RuntimeError indicates that you can't get it. Maybe there should be a way to say "I don't need the set to be perfect", but there isn't, and you aren't getting what you actually did ask for, so an Error is appropriate. But in this case, you're asking for a specific key, and the key is there. It may even be a perfectly normal string, that just happened to hash-collide with something some other code inserted. The RuntimeError exposes a race condition to user code -- and it may not be the code that stuck the oddball class in the dict in the first place. > Note that Victor's alternate fix (nomodify.diff) has the same problem > -- it just raises RuntimeError in the mutating thread rather than in > the lookup thread. Right; I'm saying that raising an error is the wrong thing to do. But there is an analogy to the hash-code change vs collision counting. Breaking an application that got unlucky once is wrong. But if the bad luck *continues* to the point where it would only occur one time in a zillion, RuntimeError is better than a likely infinite loop. (It really was a question initially, but I've now convinced at least myself.) I would rather see something like replacing 349 else { 350 /* The compare did major nasty stuff to the 351 * dict: start over. 352 * XXX A clever adversary could prevent this 353 * XXX from terminating. 354 */ 355 return lookdict(mp, key, hash); with 349 else { 350 /* The compare did major nasty stuff to the 351 * dict: start over. 352 * XXX A clever adversary could prevent this 353 * XXX from terminating. 354 */ 355 return lookdict_paranoid(mp, key, hash, 1); where lookdict_paranoid is a near-copy of lookdict that adds a paranoia parameter and replaces those same lines with else { /* The compare did major nasty stuff *again*. */ if (paranoia < PARANOIA_LEVEL) { return lookdict_paranoid(mp, key, hash, paranoia+1); } /* Something is wrong; raise a RuntimeError. */ } |
|||
msg155147 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-08 01:50 | |
New changeset 0255bafbccf2 by Victor Stinner in branch 'default': Issue #14205: document the change of dict[key] behaviour if dict is modified http://hg.python.org/cpython/rev/0255bafbccf2 |
|||
msg155194 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-09 00:43 | |
Guido: So, what do you think? Do you like the new behaviour, or would you prefer a softer solution like retrying N times? Python < 3.3 retries the lookup an unlimited number of times which lead to a crash, that's the issue fixed by my change. |
|||
msg155196 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-03-09 00:50 | |
I'm still torn. Can you implement the counter without adding an extra field to the dict object? I worry that we'll get the same objection we had when MAL proposed collision counting as a solution to the hash DoS attack. The numbers 0, 1 and infinity are special; all others are to be treated with skepticism. I'm prepared to make the current semantic change a 3.3 feature, but I'm not sure it's safe to backport. |
|||
msg155197 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-09 00:56 | |
> Can you implement the counter without adding an extra field to the dict object? Add a counter requires to change the prototype of the C lookup function: PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, Py_hash_t hash); I don't know if it is a problem for ABI compatibility or extension modules. I suppose that it is safer and simpler to not change it :-) > I worry that we'll get the same objection we had when MAL proposed collision counting as a solution to the hash DoS attack. > > The numbers 0, 1 and infinity are special; all others are to be treated with skepticism. Is it really useful to only retry once? Retrying once means that it would work in most cases, but not in some corner cases. For example, retrying once may hide the problem if you have only 1 client (1 thread), but you get the exception when you have more clients (more threads). Or do I misunderstand the problem? |
|||
msg155199 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-03-09 01:01 | |
On Thu, Mar 8, 2012 at 4:56 PM, STINNER Victor <report@bugs.python.org> wrote: > > STINNER Victor <victor.stinner@gmail.com> added the comment: > >> Can you implement the counter without adding an extra field to the dict object? > > Add a counter requires to change the prototype of the C lookup function: > PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, Py_hash_t hash); > > I don't know if it is a problem for ABI compatibility or extension > modules. I suppose that it is safer and simpler to not change it :-) Agreed. >> I worry that we'll get the same objection we had when MAL proposed collision counting as a solution to the hash DoS attack. >> >> The numbers 0, 1 and infinity are special; all others are to be treated with skepticism. > > Is it really useful to only retry once? Retrying once means that it > would work in most cases, but not in some corner cases. For example, > retrying once may hide the problem if you have only 1 client (1 > thread), but you get the exception when you have more clients (more > threads). > > Or do I misunderstand the problem? Sorry, I was joking; anyway, 1 in this context would mean do the lookup exactly once (so 0 would mean not to do it at all). One retry would effectively mean try it twice, which would have to be treated skeptically. :-) Given all this I think we should keep it as you have committed and add it to the docs and whatsnew. |
|||
msg155205 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-09 04:28 | |
On Thu, Mar 8, 2012 at 7:43 PM, STINNER Victor wrote: > Python < 3.3 retries the lookup an unlimited number of times which > lead to a crash, that's the issue fixed by my change. How does it lead to a crash? I think it just leads to an infinite loop (which is worse) if the data is actually malicious, but eventually fixes itself if the data is merely unlucky or even ill-behaved. Guido van Rossume: >> Can you implement the counter without adding an extra field to the dict object? Yes. The logic does add complexity, but there is no per-dict charge, and the extra logic can be isolated to a function that will normally not be paged in. > Add a counter requires to change the prototype of the C lookup > function: > PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, Py_hash_t hash); > I don't know if it is a problem for ABI compatibility or extension > modules. I suppose that it is safer and simpler to not change it :-) That should still be the signature for the primary lookup; it can do its own delegation to a private function with an extra parameter if/when needed. > The numbers 0, 1 and infinity are special; all others are to be > treated with skepticism. Yes; it isn't clear whether to cut off after 1 retry or 3 or 10... > Is it really useful to only retry once? Retrying once means that it > would work in most cases, but not in some corner cases. For example, > retrying once may hide the problem if you have only 1 client (1 > thread), but you get the exception when you have more clients (more > threads). Without threads, there shouldn't be any problems for innocent code. Even with threads, the number of resets is not (for innocent code) tied to the number of alternative threads; only to the fact that they can run mid-lookup. -jJ |
|||
msg155214 - (view) | Author: Mark Shannon (Mark.Shannon) * | Date: 2012-03-09 09:00 | |
Unmodified CPython (without the patch) already passes the new test in the patch. The unmodified code already raises a Runtime error; a recursion limit exceeded error! |
|||
msg155219 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-09 09:44 | |
> Unmodified CPython (without the patch) already passes the new test in the patch. You should try Lib/test/crashers/nasty_eq_vs_dict.py, not my test. |
|||
msg155230 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-09 13:04 | |
New changeset a428f85de29c by Victor Stinner in branch 'default': Issue #14205: Document the dict lookup change in What's New in Python 3.3 http://hg.python.org/cpython/rev/a428f85de29c |
|||
msg155231 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-09 13:05 | |
Guido> Given all this I think we should keep it as you have committed Guido> and add it to the docs and whatsnew. I updated What's New in Python 3.3 document. I also wrote an email to python-dev to notify all developers of this change. Let's close the issue. |
|||
msg155232 - (view) | Author: Mark Shannon (Mark.Shannon) * | Date: 2012-03-09 13:52 | |
STINNER Victor wrote: > STINNER Victor <victor.stinner@gmail.com> added the comment: > > Guido> Given all this I think we should keep it as you have committed > Guido> and add it to the docs and whatsnew. > > I updated What's New in Python 3.3 document. I also wrote an email to python-dev to notify all developers of this change. Let's close the issue. > The tests still need to fixed test_mutating_lookuptest doesn't fail on the unmodified version of cpython. Lib/test/crashers/nasty_eq_vs_dict.py has been removed, but no equivalent test has been added to Lib/test/test_dict.py There should be at least one failing test for this issue. |
|||
msg155266 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-09 21:59 | |
New changeset 70fbb02d588c by Victor Stinner in branch 'default': Issue #14205: Fix test_dict.test_mutating_lookup() http://hg.python.org/cpython/rev/70fbb02d588c |
|||
msg156858 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-26 19:35 | |
See http://bugs.python.org/issue14417 |
|||
msg157334 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2012-04-01 23:26 | |
Was the crasher ever converted into a unittest? I think that should be done regardless of the outcome of the ongoing discussion about this issue. |
|||
msg160543 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-05-13 18:50 | |
New changeset 93748e2d64e3 by Antoine Pitrou in branch 'default': Issue #14417: Mutating a dict during lookup now restarts the lookup instead of raising a RuntimeError (undoes issue #14205). http://hg.python.org/cpython/rev/93748e2d64e3 |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:27 | admin | set | github: 58413 |
2012-05-13 18:50:18 | python-dev | set | messages: + msg160543 |
2012-04-01 23:26:20 | gvanrossum | set | messages: + msg157334 |
2012-03-26 19:35:43 | Jim.Jewett | set | messages: + msg156858 |
2012-03-09 21:59:09 | python-dev | set | messages: + msg155266 |
2012-03-09 13:52:47 | Mark.Shannon | set | messages: + msg155232 |
2012-03-09 13:05:00 | vstinner | set | status: open -> closed resolution: fixed messages: + msg155231 |
2012-03-09 13:04:09 | python-dev | set | messages: + msg155230 |
2012-03-09 09:44:08 | vstinner | set | messages: + msg155219 |
2012-03-09 09:00:08 | Mark.Shannon | set | messages: + msg155214 |
2012-03-09 04:28:09 | Jim.Jewett | set | messages: + msg155205 |
2012-03-09 01:01:47 | gvanrossum | set | messages: + msg155199 |
2012-03-09 00:56:16 | vstinner | set | messages: + msg155197 |
2012-03-09 00:50:02 | gvanrossum | set | messages: + msg155196 |
2012-03-09 00:43:10 | vstinner | set | messages: + msg155194 |
2012-03-08 01:50:33 | python-dev | set | messages: + msg155147 |
2012-03-07 15:06:02 | rhettinger | set | assignee: rhettinger -> |
2012-03-06 20:33:48 | Jim.Jewett | set | messages: + msg155037 |
2012-03-06 18:05:09 | gvanrossum | set | messages: + msg155029 |
2012-03-06 17:43:10 | Jim.Jewett | set | messages: + msg155027 |
2012-03-06 17:29:08 | vstinner | set | status: closed -> open resolution: fixed -> (no value) |
2012-03-06 17:21:44 | gvanrossum | set | messages: + msg155026 |
2012-03-06 16:59:19 | vstinner | set | messages: + msg155024 |
2012-03-06 16:56:28 | Mark.Shannon | set | messages: + msg155023 |
2012-03-06 16:54:11 | vstinner | set | messages: + msg155022 |
2012-03-06 16:35:37 | Jim.Jewett | set | nosy:
+ Jim.Jewett messages: + msg155021 |
2012-03-06 00:13:02 | python-dev | set | status: open -> closed nosy: + python-dev messages: + msg154994 resolution: fixed stage: resolved |
2012-03-05 23:50:47 | gvanrossum | set | messages: + msg154993 |
2012-03-05 23:49:20 | rhettinger | set | priority: normal -> low assignee: rhettinger |
2012-03-05 23:43:12 | vstinner | set | nosy:
+ gvanrossum messages: + msg154991 |
2012-03-05 23:28:56 | Mark.Shannon | set | nosy:
+ Mark.Shannon messages: + msg154989 |
2012-03-05 22:14:40 | pitrou | set | nosy:
+ rhettinger |
2012-03-05 21:20:58 | vstinner | set | files:
+ nomodify.patch messages: + msg154977 |
2012-03-05 21:19:54 | vstinner | create |