msg286849 - (view) |
Author: (lamby) |
Date: 2017-02-03 09:37 |
Due to implementation changes, since CPython 3.6 dict keys are returned
in insertion order. However, in order to test for reproducible builds [0],
it would be convenient to be able to reverse this ordering; we would then
run a build of an arbitrary package both with and without this flag and
compare the resulting binary.
(We already run such a testing framework, so specifying this environment
variable would be trivial. Note that this "reverse" would actually find
more issues than simply relying on the pre-3.6 non-deterministic
behaviour.)
This patch changes the behaviour of:
* for x in d:
* d.popitem()
* d.items()
* _PyDict_Next
[0] https://reproducible-builds.org/
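A minimal sketch of the comparison described above, assuming a toy "build" that writes a mapping out in iteration order. The names `fake_build` and `digest` are hypothetical, and the reversed order is simulated by hand because the proposed flag does not exist in CPython:

```python
import hashlib


def fake_build(settings):
    # Hypothetical build step: serialise the mapping in iteration order.
    return "\n".join(f"{k}={v}" for k, v in settings.items()).encode()


def digest(data):
    # Hash the artifact so two builds can be compared cheaply.
    return hashlib.sha256(data).hexdigest()


settings = {"name": "demo", "version": "1.0", "arch": "all"}

# Simulate the proposed flag by rebuilding the dict in reverse insertion order.
reversed_settings = dict(reversed(list(settings.items())))

normal = digest(fake_build(settings))
flipped = digest(fake_build(reversed_settings))

# Differing digests mean the output depends on dict iteration order.
order_dependent = normal != flipped
```

Here the two digests differ, which is the signal the testing framework would look for.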
|
msg286850 - (view) |
Author: Inada Naoki (methane) * |
Date: 2017-02-03 09:40 |
Why don't you use OrderedDict and reversed()?
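For reference, a small sketch of what this suggestion looks like in one's own code (the sample keys are made up for illustration):

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Forward iteration follows insertion order.
forward = list(od)

# reversed() walks the same keys from last inserted to first.
backward = list(reversed(od))

# The items view of an OrderedDict also supports reversed().
backward_items = list(reversed(od.items()))
```

This only affects code that explicitly opts in to OrderedDict.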
|
msg286851 - (view) |
Author: (lamby) |
Date: 2017-02-03 09:48 |
> Why don't you use OrderedDict and reversed()?
This isn't for my own code; I want to change the behaviour of CPython itself so it affects arbitrary third-party code - this is what we are testing when we are testing for reproducibility :)
|
msg286852 - (view) |
Author: Inada Naoki (methane) * |
Date: 2017-02-03 09:58 |
I can't understand what the problem is.
If the package produces the same binary when dict keeps insertion order,
isn't it a "reproducible build"?
|
msg286853 - (view) |
Author: (lamby) |
Date: 2017-02-03 10:02 |
> If the package produces the same binary when dict keeps insertion order,
> isn't it a "reproducible build"?
No, as that's a CPython-specific (and 3.6+) implementation detail. Hence "forcing" a test for it :)
|
msg286855 - (view) |
Author: Inada Naoki (methane) * |
Date: 2017-02-03 10:20 |
For checking compatibility with other implementations, I want to wait
until there is another implementation that is compatible with 3.6+ and
doesn't keep the insertion order of dict.
For now, there is no 3.6+ compatible Python implementation except CPython.
For checking compatibility with Python 3.5 and earlier, I'm -1 on adding such a flag.
Python 3.6 has many new features. You should test with 3.5 instead.
|
msg286872 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2017-02-03 14:50 |
Inada: we haven't 100% decided that this is going to become a language feature. However it is likely to become so, so adding such a flag is probably wasted effort. Further, if the goal is to test compatibility with other python implementations, shouldn't you actually be testing against those other implementations? You are likely to catch more problems than just dict order that way. So I vote -1 on this.
|
msg286879 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2017-02-03 17:01 |
I concur with David and Inada on this one (it is likely to become a wasted effort and it impacts maintainability to try to support this even for the short run).
|
msg286886 - (view) |
Author: (lamby) |
Date: 2017-02-03 20:16 |
I think we are misunderstanding each other regarding our goals here :)
I'm not trying to test against other Python implementations or versions of CPython itself but rather "flush out" reproducibility issues in third-party Python code that (incorrectly) relies on dict ordering being relatively stable and/or in insertion order, etc. etc.
(The only reason I mention 3.6 is because the insertion-order behaviour there simply makes it easier to have a 'reverse' order)
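A short illustration of the kind of reliance in question, assuming a program that serialises a dict to JSON. The data is made up; `sort_keys` is a standard `json.dumps` parameter:

```python
import json

# Two equal mappings that were populated in a different order.
a = {"name": "pkg", "version": "1.0"}
b = {"version": "1.0", "name": "pkg"}

# Without sorting, the serialised text follows iteration order.
unstable_differs = json.dumps(a) != json.dumps(b)

# Sorting the keys makes the output independent of iteration order.
stable_same = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)
```

Code written the first way produces different output if the iteration order changes; code written the second way does not.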
|
msg286888 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2017-02-03 20:50 |
But that reliance/reproducibility-error would be an issue only on interpreters that don't preserve insertion order, and we're expecting we'll make that a language requirement. So for now, or for as long as you think it is warranted, just test against interpreters that randomize the order.
Note that this is different from the pre-randomization dict behavior, where lots of programs depended on the accident-of-the-implementation order in which keys were returned. What we think is coming is a guaranteed ordering, which is, thus, reproducible.
|
msg286938 - (view) |
Author: (lamby) |
Date: 2017-02-04 09:37 |
> we're expecting we'll make that a language requirement
Mmm, but only for (at least) 3.7+. It would still be very useful to find software that is relying on (currently) undefined behaviour, no?
|
msg286958 - (view) |
Author: Inada Naoki (methane) * |
Date: 2017-02-04 11:33 |
At least the ordering of the namespace dict and the kwargs dict is part of the language spec for 3.6.
This option breaks that. When this option is set, CPython 3.6 is not Python 3.6.
|
msg286988 - (view) |
Author: (lamby) |
Date: 2017-02-04 20:46 |
> ordering of namespace dict and kwargs dict are language spec for 3.6
Are they really _specced_ for 3.6? I was under the impression that it was just an implementation detail.
|
msg287034 - (view) |
Author: Inada Naoki (methane) * |
Date: 2017-02-05 02:51 |
see https://mail.python.org/pipermail/python-dev/2016-September/146348.html
The kwargs dict, the class __dict__, and the namespace passed to a metaclass are ordered by language design.
The order of other dicts is an implementation detail.
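A brief sketch of the three guaranteed cases, assuming Python 3.6 or later. The names `collect`, `Recorder`, and `Example` are hypothetical:

```python
def collect(**kwargs):
    # Keyword arguments arrive in the order they were passed.
    return list(kwargs)


class Recorder(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # The namespace handed to the metaclass keeps definition order.
        cls.defined = [k for k in namespace if not k.startswith("__")]
        return cls


class Example(metaclass=Recorder):
    first = 1
    second = 2
    third = 3


kwargs_order = collect(z=1, a=2, m=3)

# The class __dict__ also preserves definition order.
attrs = [k for k in Example.__dict__ if not k.startswith("__")]
```

A flag that reversed every dict would change all three results.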
|
msg287066 - (view) |
Author: (lamby) |
Date: 2017-02-05 23:38 |
> order of other dicts are implementation detail.
Right, exactly :)
|
msg287067 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-02-05 23:58 |
While the use case makes sense (testing whether an application relies on dictionary iteration order), I'm not sure about adding an option to change the order.
For me, it's a rare and very specific use case, whereas your option is public and "too easy" to find and use. For example, what if a developer decides that their application now requires this option to run?
Moreover, your patch changes performance-critical code. I don't want a slowdown here for a rare use case, since we spent a lot of time optimizing these functions!
I suggest you try to implement your feature as a dict subtype in a third-party module, and try to monkey-patch applications to use your type. The attached hack_dict.py is an example, but it only handles code that explicitly calls the "dict()" type to create a dictionary.
Another option for you is to maintain your downstream CPython patch, sorry.
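The dict-subtype approach could look roughly like the sketch below. This is not the attached hack_dict.py, whose contents are not shown here; `ReversedDict` is a hypothetical name and only a few methods are overridden:

```python
import builtins


class ReversedDict(dict):
    """Hypothetical dict subtype that iterates in reverse insertion order."""

    def __iter__(self):
        return reversed(list(super().__iter__()))

    def keys(self):
        return list(self)

    def values(self):
        return [self[k] for k in self]

    def items(self):
        return [(k, self[k]) for k in self]


original_dict = builtins.dict
builtins.dict = ReversedDict  # monkey-patch the builtin name
try:
    # An explicit dict() call now picks up the subtype.
    via_call = list(dict(a=1, b=2, c=3))
    # A dict literal is built by the interpreter and is not affected.
    via_literal = list({"a": 1, "b": 2, "c": 3})
finally:
    builtins.dict = original_dict  # always restore the real type
```

The literal case shows the limitation mentioned above: only code that looks up the name `dict` is affected.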
|
Date | User | Action | Args |
2022-04-11 14:58:42 | admin | set | github: 73617 |
2017-02-05 23:58:10 | vstinner | set | files: + hack_dict.py; nosy: + vstinner; messages: + msg287067 |
2017-02-05 23:38:17 | lamby | set | messages: + msg287066 |
2017-02-05 02:51:53 | methane | set | messages: + msg287034 |
2017-02-04 20:46:49 | lamby | set | messages: + msg286988 |
2017-02-04 11:33:44 | methane | set | messages: + msg286958 |
2017-02-04 09:37:50 | lamby | set | messages: + msg286938 |
2017-02-03 20:50:27 | r.david.murray | set | messages: + msg286888 |
2017-02-03 20:16:57 | lamby | set | messages: + msg286886 |
2017-02-03 17:01:32 | rhettinger | set | status: open -> closed; nosy: + rhettinger; messages: + msg286879; resolution: rejected; stage: resolved |
2017-02-03 14:50:36 | r.david.murray | set | nosy: + r.david.murray; messages: + msg286872 |
2017-02-03 10:20:15 | methane | set | messages: + msg286855 |
2017-02-03 10:02:18 | lamby | set | messages: + msg286853 |
2017-02-03 09:58:46 | methane | set | messages: + msg286852 |
2017-02-03 09:48:39 | lamby | set | messages: + msg286851 |
2017-02-03 09:40:41 | methane | set | nosy: + methane; messages: + msg286850 |
2017-02-03 09:37:45 | lamby | create | |