Issue 16991: Add OrderedDict written in C

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/61195

classification

Title:	Add OrderedDict written in C
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.5

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	eric.snow	Nosy List:	Arfrever, BreamoreBoy, Jim.Jewett, Mark.Shannon, alex, asvetlov, benjamin.peterson, christian.heimes, eric.araujo, eric.smith, eric.snow, ezio.melotti, flox, gregory.p.smith, introom, josh.r, larry, mrabarnett, ncoghlan, ned.deily, pitrou, python-dev, refi64, rhettinger, scoder, serhiy.storchaka, skrah, tonn81, westurner, yselivanov
Priority:	release blocker	Keywords:	patch

Created on 2013-01-18 04:08 by eric.snow, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
odict-speed.diff	eric.snow, 2013-06-04 06:08	pybench addition for OrderedDict	review
cOrderedDict.diff	eric.snow, 2013-06-21 06:42		review
cOrderedDict-2.diff	serhiy.storchaka, 2013-10-20 18:32		review
cOrderedDict-3.diff	eric.snow, 2015-03-31 05:17		review
cOrderedDict-4.diff	eric.snow, 2015-05-21 04:05		review
3b2a9026d48e.diff	eric.snow, 2015-05-25 22:24		review
c3fab329aa7f.diff	eric.snow, 2015-05-26 01:11		review
ba1c6d40ca63.diff	eric.snow, 2015-05-26 01:53		review
0813b1a88171.diff	eric.snow, 2015-05-30 02:52		review
odict-windows.diff	eric.snow, 2015-05-30 18:26
crash-1.py	skrah, 2015-06-01 09:36
crash-2.py	skrah, 2015-06-01 09:37

Repositories containing patches
https://hg.python.org/features/cordereddict#cordereddict

Messages (134)
msg180170 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-01-18 04:08
Here's an initial stab at writing OrderedDict in C. Though, the implementation is not heavily optimized and isn't super subclass-friendly, my hope is that it's relatively close to the final version. I'm getting this up now to get some eyes on it. The spot in the builtins is mostly for convenience, but I expect it will need to be exposed somewhere (perhaps _collections?). My experience with the C-API is relatively limited and my C-fu is not at a professional level. However, I'm pretty sure that I have most everything correct. The ultimate goal for this type is to use it for **kwargs. Note: this first patch has some reference leaks that I'm tracking down.
msg180176 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2013-01-18 05:22
Nit: This really not be exposed through _collections rather than hacked into builtins.
msg180202 - (view)	Author: Ezio Melotti (ezio.melotti) *	Date: 2013-01-18 18:06
The tests should probably be updated to use the PEP 399 idiom in order to test both the implementations.
msg180236 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-01-19 08:37
@Benjamin: Yeah, I've fixed that. @Ezio: Good point. I've touched that up. Once I have cleared up reference counting issues I'll put up a new patch.
msg180640 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-01-26 05:48
Here's a cleanup of test.test_collections that helps keep the subsequent patch (still forthcoming) cleaner.
msg180682 - (view)	Author: Ezio Melotti (ezio.melotti) *	Date: 2013-01-26 16:37
What's the reason for moving the OrderedDict tests in a separate file?
msg180688 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-01-26 17:19
> What's the reason for moving the OrderedDict tests in a separate file? Following the precedent of collections.deque and collections.defaultdict: * a big chunk of code * the default implementation will be coming via _collections.
msg180690 - (view)	Author: Ezio Melotti (ezio.melotti) *	Date: 2013-01-26 17:27
These are indeed good reasons.
msg181421 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-05 06:53
Here's an updated patch. I still have some ref-counting issues, but the patch is much closer to what I expect will be the final version. At this point it passes all the main unit tests (segfaults in some of the supplemental Mapping tests). One key thing is that for now the various iterators are faked using lists. Once everything is sorted out I'll plug my implementation of the iterators back in (which I'd already mostly finished). I see room for efficiency improvements in a couple spots. As well, I plan on improving the subclass-ability of the type once it's otherwise happy. Any feedback at the point would be helpful. Regardless, I'm being careful to get this right and I'm no expert with the C-API, so it's taking a while. :) Some other considerations: * ensure that __init__() can reset and populate an existing dictionary: d=OrderedDict(); d[0]=1; d.__init__() # resets * avoid any reference cycles so that dicts can clean-up right-away when their ref-count drops to zero rather than waiting on GC. * possibly use the GIL to make the link updates atomic, trying to make the overall class thread safe. * methods like get() pop() or setdefault() shouldn't rely on __getitem__ raising KeyError because __missing__ may be present in a subclass
msg181422 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-05 07:17
Looks like I didn't get the patch lined up to tip so the review link isn't showing up. I'll have to fix that tomorrow.
msg182132 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-15 09:21
Are there any objections to this unittest cleanup patch? I'd like to commit it separately prior to anything that will go in for the C OrderedDict.
msg182134 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-15 09:27
Incidently, I'm just about done with the OrdereDict patch. I'm attaching the current one. At present it passes the unit tests, but I have two memory-related issues and a number of open questions (that I don't think are critical). The patch has the unittest cleanup merged in just so it will show up in reviewboard.
msg182145 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2013-02-15 15:03
Why did you copy test_collections.py instead of just creating a new file?
msg182147 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-15 16:56
> Why did you copy test_collections.py instead of just creating > a new file? To preserve the history of changes to the bulk of the code in test_ordereddict.py.
msg182214 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-02-16 06:46
I should clarify, odict.diff passes test_ordereddict. Regardless, it's pretty close. I'm going to figure out why the review link hates me on this issue, but until then any extra eyes here would be appreciated. The memory-related issues are pushing well past my experience. My goal is to get this in before PyCon and to have a reasonable plan at least for implementing **kwargs as OrderedDict. My intuition is that writing OrderedDict in C is the hard part.
msg184801 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-03-20 21:28
My current patch ends up with O(n) deletion, which won't fly, so I'm refactoring. While I'm at it I'm also planning on using the BSD queue.h doubly-linked list rather than the one that I rolled. I'm also going to pull the ordered dict implementation into it's own source file. However, these things should not have much of an impact on most of the code I've already written. I anticipate that the changes won't translate into a further large volume of work. In talking to Raymond, he emphasized the importance of making sure we avoid reentrancy problems. I'll be double-checking that and likely making use of the GIL in a couple spots. While the bulk of the implementation is complete, the remaining work to do here is what I've described above, along with more testing. An orthogonal problem is addressing the problem of the concrete dict API. I'll bring that up separately when this issue is basically done.
msg184835 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2013-03-21 03:04
I just looked at the test-collections patch and don't want it committed. It is full of trivial edits that make it hard to review and doesn't make the test suite better. It is okay to add some new tests, but I don't want to rearrange the existing code with minor edits. That will just make it more difficult to maintain the code across versions (it is no fun to backport a fix and then have unnecessary merge conflicts).
msg189891 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-05-24 04:04
It's worth noting that issue10977 is pretty closely related to this isssue.
msg189893 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-05-24 04:38
> I just looked at the test-collections patch and don't want it > committed. It is full of trivial edits that make it hard to > review and doesn't make the test suite better. Thanks for the feedback, Raymond. I agree that there are a number of trivial edits there that shouldn't be there. I'll fix those.
msg189895 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-05-24 05:10
Raymond, do you have any objections to splitting the OrderedDict-related tests into their own module. Doing so allows for testing all the those test classes at once, which is handy when working on OrderedDict. This is more relevant now with the extra PEP-399-motivated tests.
msg189897 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-05-24 05:56
I finally had some time so here's an updated patch (including O(1) deletions). I've also added a bunch of notes at the top of odictobject.c. Some notable things left to do: * address two recently failing tests due to r83881 (issue 17900) * check for any reference cycles (should be fine) * validate reentrancy (make sure everything is thread-safe around calls into Python) * make sure subclassing is okay * compare performance to dict, pure Python OrderedDict My goal here is to get an effective OrderedDict that we are happy with, while recognizing that there is room for optimization. However, I won't be okay with faultiness, so the implementation must be complete. This has been my mindset throughout.
msg190589 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-04 06:08
Without too many optimzations, the C implementation of OrderedDict is basically between 1x and 4x the performance of raw dict. This contrasts with the pure Python OrderedDict, which falls in roughly between 4x and 100x. I've attached an addition to pybench, not intended to be committed, that compares the 3. Run any of these commands to get timing: ./python Tools/pybench/pybench.py -t ODict ./python Tools/pybench/pybench.py -t ODict -t _Py ./python Tools/pybench/pybench.py -t ODict -t _C ./python Tools/pybench/pybench.py -t ODict -t _Dict I tuned the # rounds for each test such that the raw dict results all come out to just about the same value (2ms on my computer). That way it's easy to compare the pure Python or C results to the dict results.
msg190590 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-04 06:13
Here's an updated patch that has fixes for recursive pickles and for a couple memory-related segfaults that I missed before. That leaves the following to be done: * handle the case where a node is deleted during iteration * check for any reference cycles (should be fine) * validate reentrancy (make sure everything is thread-safe around calls into Python) * make sure subclassing is okay * address any remaining TODOs in the code
msg190639 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-05 02:26
Here's one solution to the deletion-during-iteration problem. It makes the iterators track the key rather than the node, at the expense of a sliver of speed on __iter__() and keys().
msg190640 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-05 02:57
The concern about reference cycles here lies in their existence in the linked-list. To see what I mean, check out the use of weakrefs in the pure Python implementation at Lib/collections/__init__.py. As the current implementation does not use PyObjects for the linked-list, I'm going to call this a non-issue.
msg190643 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-05 06:49
Here's what I feel is a nearly complete patch. The only outstanding issues that I feel need to be answered are 4 spots where calls into the interpreter may result in unexpected changes to the object or make the current function state out-of-date. 1. _odict_clear_nodes() while iterating over the nodes. 2. _odict_keys_equal() while iterating over the nodes. 3. odict_sizeof() -- I'm not too worried about this one. 4. _odict_copy() while iterating over the nodes. Once I feel comfortable with some resolution for those, I'm going to consider the patch ready to go, pending reviews.
msg191557 - (view)	Author: Eric Snow (eric.snow) *	Date: 2013-06-21 06:42
I figured out what I hope were the last memory-related issues. Apparently tp_traverse is not inherited if tp_flags is set. I had it set on all the view types and all the iterator types. So during GC it would blow up when it tried to call tp_traverse. Everything is looking pretty good so I'm attaching the latest diff.
msg200607 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2013-10-20 18:32
Rebase patch to tip.
msg209700 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2014-01-30 02:21
Can we still merge this in 3.4?
msg209744 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-01-31 02:17
It's much too late for such a disruptive change in 3.4.
msg209831 - (view)	Author: Eric Snow (eric.snow) *	Date: 2014-01-31 21:21
I agree with Antoine. It's first on my todo list for 3.5. My goal is that this and a couple of related features will land during the PyCon sprints.
msg215971 - (view)	Author: Eric Snow (eric.snow) *	Date: 2014-04-12 13:13
After pulling up to tip and sticking in the basic **kwargs change, the results on the benchmark suite imply a refleak somewhere. As soon as I sort that out I'll push up the updated patch.
msg231087 - (view)	Author: Артём Скорецкий (tonn81)	Date: 2014-11-12 17:32
Any progress? It was planned for 3.5 release
msg232412 - (view)	Author: Eric Snow (eric.snow) *	Date: 2014-12-10 05:12
I haven't had time for a while to do much work on this. At the last posting this patch was basically done. I'm sure it no longer applies cleanly. The biggest obstacle to wrapping this issue up is the size of the patch. It's hard to get a thorough review because someone would have to dedicate the time for it. Furthermore, adding this much code introduces a debatable risk which needs to be minimized. When I'm able to get back to this I'll likely split the patch up. I'd really like to introduce some helpers (perhaps macros) to the C API for writing iterators, since the boilerplate involved is massive; just take a look at how much code in my implementation involves the boilerplate of creating iterators. Unfortunately I don't know when my time will free up in the near future. I still hope to land this for 3.5.
msg232803 - (view)	Author: Eric Snow (eric.snow) *	Date: 2014-12-17 06:31
I'm open to suggestions on how to help make the patch more reviewable. Currently at roughly +2500 lines of code it's a lot to ask someone to review. While probably not the only way to help reviewers, I expect the best thing I could do is to split the patch up into several smaller parts. However, I could use some insight into the best way to approach that. I keep considering factoring out the iterator/view boilerplate into helpers (macros?) in a separate patch. That would go a long way to making the actual cOrderedDict implementation more reasonably sized. I'm just not convinced yet it's the right approach.
msg232825 - (view)	Author: Wes Turner (westurner) *	Date: 2014-12-17 16:43
* Macros could be useful. * Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault(attr, [])? * There may be some value in looking at https://pypi.python.org/pypi/cyordereddict * (I'm really not qualified to review this. Valgrind?)
msg232828 - (view)	Author: Eric Snow (eric.snow) *	Date: 2014-12-17 21:27
Suggestions from Antoine on python-dev: * do not make cOrderedDict "builtin"; drop changes to: - Include/Python.h - Objects/object.c - Makefile.pre.in * do not add a C-API * include the linked list pointers in the hash entries? (may simplify things) * drop the shared/split keys mechanism? * do not subclass dict? (can of worms)
msg239665 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-03-31 05:17
I've opened a feature clone to better track the work (features/cordereddict). Here's the updated patch against default from that clone. I haven't added any argument client stuff. I also haven't addressed any of the feedback from Antoine. I'd rather leave OrderedDict as a "builtin" since the whole point is to be able to use OrderedDict in the interpreter.
msg239666 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-03-31 05:18
s/argument client/Argument Clinic/
msg239740 - (view)	Author: Stefan Behnel (scoder) *	Date: 2015-03-31 17:35
> do not add a C-API what speaks against it?
msg239761 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-03-31 23:16
I expect Antoine is trying to limit the scope of the change, which makes sense. In this case, though, once (if?) the patch lands I plan on using it in the interpreter, so the C-API bits will be necessary.
msg239867 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2015-04-02 01:13
From a cursory glance, I still think this is the wrong approach. There's a lot of complication due to the simple fact that it is subclassing from dict (and therefore trying to be ABI-compatible, kind of). The linked list scheme should be much simpler (perhaps it shouldn't be a linked list at all, actually: see Raymond's recent proposal). I understand a lot of work has gone into it, but I'm against the current proposal.
msg239871 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-04-02 02:12
Thanks for speaking up, Antoine. Keep in mind that a C implementation of OrderedDict has a strict compatibility requirement with the existing pure Python one. The Python implementation subclasses dict so the C implementation must as well. I agree that it would be simpler otherwise. It would have saved me a lot of headache! As to the linked list implementation, it does the job. I'm not trying to blaze any trails with this code. Furthermore, my implementation does not preclude a better one replacing it. I would welcome a better implementation. Until that appears, my implementation is what there is. :) Frankly, my interest does not lie in the a C OrderedDict, but in what I want to do one we have one. Since no one was willing to do that for me <wink>, I did it myself. That said, I feel like my implementation satisfies what is required and does so in a way that we can be satisfied with it indefinitely. It sounds like you feel it would be worse to land it than to not have a C implementation of OrderedDict. Obviously I don't feel the same way. :) Regardless, I think we both feel the same way that the amount of time I've spent on this shouldn't factor into determining that. Aside from subclassing dict and my linked list approach, do you have any other concerns?
msg243096 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-13 15:07
Eric, is there any chance this can land in 3.5? OrderedDict is a heavily used thing, everyone will benefit from a fast implementation. It's OK if we have an imperfect (but fully compatible with existing OrderedDict) implementation in 3.5. We can optimize it in 3.6.
msg243121 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-13 21:47
@Yury, I'm mostly just waiting for Raymond to give it at least a quick sanity-check. I know there is at least 1 ref leak, but that can be sorted out.
msg243123 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-13 21:48
@Eric, can you set priority to "release blocker"?
msg243343 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-16 17:49
Raymond, is there any chance you can review the patch before beta1? Sorry, for bugging you with this, I just really hope we can have fast OrderedDict in 3.5.
msg243358 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-16 19:06
As for refleak -- I looked at it briefly: simple "a = OrderedDict(); a = None" code leaks, so it must be somewhere in tp_new/tp_dealloc. And I know what object is leaking - it's None.
msg243371 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-16 22:16
I must have used a better example :) >>> import sys; from collections import OrderedDict as OD >>> sys.getrefcount(None) 2053 >>> OD();OD();OD();OD(); OrderedDict() OrderedDict() OrderedDict() OrderedDict() >>> sys.getrefcount(None) 2057
msg243379 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-16 23:32
Started to look at the patch in more detail. Found one leak. There are more. Eric, maybe you can hunt them?
msg243492 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-18 15:54
Thanks for taking a look, Yury. I'll follow up on the ref leak soon.
msg243699 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-20 22:23
I've updated the cordereddict branch on the feature repo to fix the ref leak on None. I'm still seeing other leaks (regrtest -R 3:3), but I don't think that should block landing before the feature freeze. However, I'm concerned about a segfault I'm seeing roughly 3% of the time when running the following: ./python -m unittest test.test_configparser.ConfigParserTestCase.test_basic Presumably there is a dealloc bug there which manifests intermittently due to a GC race. I'm going work on tracking it down but help is appreciated. :) I'm also seeing consistent test failures in test_enum.
msg243702 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2015-05-20 22:57
A patch shouldn't be integrated if there are known bugs with it (especially segfaults).
msg243705 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-20 23:15
Agreed. I wanted to be clear about what is blocking landing the patch.
msg243706 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-20 23:35
These are the leaks I'm seeing in test_collections (14/24 tests): test_copying : 178 test_iterators : 40 test_update : 29 test_init : 22 test_move_to_end : 21 test_sorted_iterators: 20 test_setdefault : 16 test_clear : 5 test_setitem : 5 test_delitem : 4 test_repr_recursive : 4 test_repr : 3 test_override_update : 2 test_reinsert : 1
msg243714 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-21 02:48
I've fixed most of the leaks now. The only remaining ones are: test_clear : 5 test_repr_recursive : 3
msg243715 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-21 02:58
Regarding the segfault, the following does not fail: ./python -m test.regrtest --forever -m test_basic test_configparser but this does: for i in `seq 1 20`; do ./python -m test.regrtest -m test_basic test_configparser ; done
msg243716 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-21 02:59
Eric, could you please upload your latest patch?
msg243726 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-21 04:05
Here's a diff from the current cordereddict branch of the feature repo.
msg243733 - (view)	Author: Ned Deily (ned.deily) *	Date: 2015-05-21 06:23
FWIW, the random segfault seems to be triggered by hash randomization. If I disable randomization, it does not seem to fail: for i in `seq 1 20`; do PYTHONHASHSEED=1 ./python -m test.regrtest -m test_basic test_configparser ; done Presumably, the --forever option does not vary the hash seed value? Also, FWIW, clang issues a warning: ../../source/Objects/odictobject.c:550:15: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare] if (i < 0) { ~ ^ ~ for this: _odict_FOREACH(od, node) { size_t i = _odict_get_index(od, _odictnode_KEY(node)); if (i < 0) { od->od_size = prev_size; PyMem_FREE(fast_nodes); return -1; } fast_nodes[i] = node; I can supply OS X crash tracebacks if they would be of help.
msg243749 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-21 13:47
Thanks for looking into this, Ned. I've changed that size_t to ssize_t which I expect will quiet that clang warning you saw. I'm glad you pointed it out because it means that that branch was never executing! Unfortunately fixing that does not solve all my problems. <wink> I'll look into the hash randomization connection. Thanks for pointing that out. Hopefully it doesn't complicate matters. My guess is that the segfaults only happen for certain seeds, so I'll check that first.
msg243753 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-21 14:09
Cool. The following gives consistent failures at certain seed values: for i in `seq 1 100`; do echo $i; PYTHONHASHSEED=$i ./python -m test.regrtest -m test_basic test_configparser ; done Through 100 I get segfaults with 7, 15, 35, 37, 39, 40, 42, 47, 50, 66, 67, 85, 87, 88, and 92. The distribution is probably essentially uniform across the full set of seeds, even if the exact pattern of failures isn't precisely uniform. I'll try to see why those hashes are significant here. If I can't figured it out quickly then I'll post about it on python-dev. My hunch is that the hash randomization impacts either dict/odict resizing or dict/odict lookup (or both). TBH, I've been pretty sure for a while that the segfault is coming out of one of those two.
msg243799 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 02:41
I've spent a bit of time exploring the segfault. Here's some data that might help relative to the configparser test. I put the following at the beginning of _odict_resize: Py_ssize_t len = PyObject_Size((PyObject )od); if (len == 0) PySys_FormatStdout("."); else { if (((PyDictObject )od)->ma_keys->dk_size < od->od_size) PySys_FormatStdout("-"); if (len < 10) PySys_FormatStdout("%d", len); else PySys_FormatStdout("+"); } if (len >= 10) PySys_FormatStdout("%d\n", len); If the current item count is 0 then it prints a dot. If the resize is shrinking then it prints a - (this did not happen). Otherwise the odict is growing and it prints the current item count. Multi-digit numbers are preceded by + and followed by a newline. I've included the results of different hash seeds (0/no randomization, 1, and 7). You'll notice how the results are subtly different. In the case of 7, it matches no randomization up to the point that it segfaults. I got the same results with 15. However, 35 fails right after the second +22. $ PYTHONHASHSEED=0 ./python -m test.regrtest -m test_basic test_configparser ...6+12 +22 ....6+12 +22 ........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 .+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6...............6..6............+10 ....6.6.6.6.66.66.6.6.6.6.+12 6.+12 6.6.6.6.6.6.6.6.6.+22 6.+22 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+44 6.+44 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+86 6.+86 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 .6.. $ PYTHONHASHSEED=1 ./python -m test.regrtest -m test_basic test_configparser ...6+12 +22 ....6+12 +22 ........6.6................6.6................6.6................6.6................6.6.........+11 ........6.6................6.6................6.6................6.6...............6..6................6.6.6.6.66.66.6.6.6.6.+12 6.+12 6.6.6.6.6.6.6.6.6.+22 6.+22 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+44 6.+44 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+86 6.+86 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6........6.6................6.6................6.6................6.6.........6.. $ PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser ...6+12 +22 ....6+12 +22 ........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 ........6.6........+10 .+10 Fatal Python error: Segmentation fault
msg243801 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 03:53
As far as I can tell there aren't any patterns that repeat across multiple seeds (which makes sense).
msg243802 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 04:00
Here are the last 10 frames from the backtrace (gdb): #0 0x00000000005b15d0 in odictiter_iternext (di=<optimized out>) at Objects/odictobject.c:1888 #1 0x0000000000453179 in PyIter_Next (iter=<optimized out>) at Objects/abstract.c:2760 #2 0x00000000005881e7 in chain_next (lz=0x7ffff2b8d878) at ./Modules/itertoolsmodule.c:1866 #3 0x0000000000537db6 in PyEval_EvalFrameEx (f=f@entry=0xcab188, throwflag=throwflag@entry=0) at Python/ceval.c:2975 #4 0x000000000053afee in fast_function (func=func@entry=0x7ffff2bc88f8, pp_stack=pp_stack@entry=0x7fffffff99f8, n=n@entry=1, na=na@entry=1, nk=nk@entry=0) at Python/ceval.c:4752 #5 0x000000000053b655 in call_function (pp_stack=pp_stack@entry=0x7fffffff99f8, oparg=oparg@entry=0) at Python/ceval.c:4679 #6 0x000000000053891d in PyEval_EvalFrameEx (f=f@entry=0xcbd588, throwflag=throwflag@entry=0) at Python/ceval.c:3198 #7 0x000000000053afee in fast_function (func=func@entry=0x7ffff2bc8840, pp_stack=pp_stack@entry=0x7fffffff9bd8, n=n@entry=3, na=na@entry=3, nk=nk@entry=0) at Python/ceval.c:4752 #8 0x000000000053b655 in call_function (pp_stack=pp_stack@entry=0x7fffffff9bd8, oparg=oparg@entry=2) at Python/ceval.c:4679 #9 0x000000000053891d in PyEval_EvalFrameEx (f=f@entry=0xc7d458, throwflag=throwflag@entry=0) at Python/ceval.c:3198 and from Python: Current thread 0x00007f3ad9260740 (most recent call first): File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 1115 in _join_multiline_values File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 1109 in _read File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 716 in read_file File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 721 in read_string File "/home/esnow/projects/cordereddict/Lib/test/test_configparser.py", line 354 in test_basic File "/home/esnow/projects/cordereddict/Lib/unittest/case.py", line 597 in run
msg243803 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 04:03
The segfault happens at line 1888 of odictobject.c when it tries to Py_INCREF a NULL value. The problem is that the value that gets looked up for a presumably valid key is returned as NULL. So either the value is messed up, lookup is broken, or the wrong key is being used.
msg243806 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2015-05-22 05:11
FTR, in "_odict_resize" there's this: size_t i = _odict_get_index(od, _odictnode_KEY(node)); if (i < 0) { Note that "i" is declared as unsigned and then tested whether it's negative.
msg243807 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 05:26
Yeah, Ned pointed that one out. I fixed it but haven't pushed the change.
msg243843 - (view)	Author: Jim Jewett (Jim.Jewett) *	Date: 2015-05-22 18:05
Why does generated file Include/opcode.h show up as changed? It looks like pure whitespace, but I wonder if a similar whitespace change might be making something a space too long somewhere.
msg243849 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 20:06
Include/opcode.h shouldn't be in the change (and won't be when committed). I'm guessing it being there is related to one of the recent merges I did from default into the feature branch.
msg243863 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 23:13
Just to narrow things down a bit further, the failure happens specifically under ConfigParserTestCaseNoValue: PYTHONHASHSEED=7 ./python -m unittest test.test_configparser.ConfigParserTestCaseNoValue.test_basic -v
msg243868 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-22 23:38
If I still the following at Lib/test/test_configparser.py:328: print(len(cf._proxies), len(list(cf._proxies)), list(cf._proxies)) print(len(cf._sections), len(list(cf._sections)), list(cf._sections)) I get unexpected results that are a clear indication of a problem: 9 8 ['DEFAULT', 'Foo Bar', 'Spacey Bar', 'Spacey Bar From The Beginning', 'Commented Bar', 'Long Line', 'Section\\with$weird%characters[\t', 'Internationalized Stuff'] 8 7 ['Foo Bar', 'Spacey Bar', 'Spacey Bar From The Beginning', 'Commented Bar', 'Long Line', 'Section\\with$weird%characters[\t', 'Internationalized Stuff'] Note that OrderedDict.__len__ is just dict.__len__, so the first number (the actual length) should be the same as the second number (the number of keys). Also note that OrderedDict.keys() is working (even if not exactly correct). Between the two observations, this implies that for this one test the keys are not only off, but at least one of the keys in the linked list is not in the underlying dict. I'm sure this is consequence of resizing. At the last point that we resize this is the state of the cf._proxies and cf._sections (ignore the trailing <NULL> (?)): OrderedDict([('DEFAULT', <Section: DEFAULT>), ('Foo Bar', <Section: Foo Bar>), ('Spacey Bar', <Section: Spacey Bar>), ('Spacey Bar From The Beginning', <Section: Spacey Bar From The Beginning>), ('Commented Bar', <Section: Commented Bar>), <NULL>]) OrderedDict([('Foo Bar', OrderedDict([('foo', ['bar1'])])), ('Spacey Bar', OrderedDict([('foo', ['bar2'])])), ('Spacey Bar From The Beginning', OrderedDict([('foo', ['bar3']), ('baz', ['qwe'])])), ('Commented Bar', OrderedDict([('foo', ['bar4']), ('baz', ['qwe'])])), ('Long Line', OrderedDict([('foo', ['this line is much, much longer than my editor', 'likes it.'])])), <NULL>]) I haven't had time to analyze this but I'll be taking a closer look in later tonight. I'm not giving up on 3.5. :)
msg243903 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2015-05-23 09:17
I'd suggest also taking a look into whether or not the PEP 412 keysharing might be causing problems.
msg243915 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-23 14:38
Good point, Nick. I'd checked that earlier but did not see any relationship. At this point it's worth checking again. :)
msg243959 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2015-05-24 01:53
In "odict_new", if "_odict_initialize" fails, will the dict pointed to by "od_inst_dict" be deallocated? In "odictiter_new", there's "Py_INCREF(od);". If "di->di_result == NULL" fails, "od" isn't DECREFed.
msg243960 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-24 02:14
Good catch. I've fixed odictiter_new in the feature branch. However, I'm not sure there's anything to be fixed in odict_new. It follows the same pattern as dict_new (over in Objects/dictobject.c).
msg243964 - (view)	Author: Jim Jewett (Jim.Jewett) *	Date: 2015-05-24 03:18
Eric, unless I'm misreading your debugging info, it is the other way around -- something is in the dict, but not in the list that you iterate over. And since the list that you iterate over looks right, I have to wonder if it was something internal-to-configparser (or its various converters and proxies). Perhaps the __root used by the collections.OrderedDict that it uses by default? Can you iterate over it as a dict, instead of as an ordered dict, to find the discrepancy?
msg243965 - (view)	Author: Jim Jewett (Jim.Jewett) *	Date: 2015-05-24 03:21
(Just putting my review summary in the main ticket) I'm going to echo the previous comment that maybe trying to emulate the existing dict implementation too carefully just adds complexity. The split-keys implementation shows that there is at least some flexibility to the implementation. Enough that the keys could map to an array offset, rather than directly to the values? A simple array of key/value pairs (hashing only to ensure hashability, but ignoring it otherwise) should be faster for the really small dicts you care about, like keyword dicts. (Admittedly, those timing data are fairly out of date, but I would be surprised if it weren't still true.)
msg243966 - (view)	Author: Jim Jewett (Jim.Jewett) *	Date: 2015-05-24 03:53
Should dictobject.h get a bit more changes? In particular, should the following be expanded? #define PyDictKeys_Check(op) (Py_TYPE(op) == &PyDictKeys_Type) #define PyDictItems_Check(op) (Py_TYPE(op) == &PyDictItems_Type) #define PyDictValues_Check(op) (Py_TYPE(op) == &PyDictValues_Type) /* This excludes Values, since they are not sets. */ # define PyDictViewSet_Check(op) \ (PyDictKeys_Check(op) \|\| PyDictItems_Check(op))
msg244005 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2015-05-24 22:03
First some background: when you put a new entry into a dict, it looks for an empty slot, the key's hash determining the initial choice, but if that slot's occupied, it picks another, etc, so the key might not be in the slot of first choice. In "PyODict_DelItem", it deletes the key from the dict and then calls "_odict_clear_node" to delete the node. If the key that it's looking for wasn't in the slot of first choice, but that slot is empty, that'll be the one it'll return, ie, the wrong one. The solution, therefore, is to delete the node _before_ deleting the key from the dict. When I tried that, the test ran to completion.
msg244039 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 15:46
@mrab gah! I could swear I originally had the _odict_clear_node first and had switched them due to a segfault. It even crossed my mind on Friday but I didn't pursue it. I'm guessing I did put the _odict_clear_node call first at some point but lost that fix along the way. :( That's the "benefit" of having a patch languish for so long; you end up working on it from multiple hosts and sometimes lose bits and pieces. Unfortunately only recently did I think to create a feature repo on which to track the on-going work. Anyway, thanks for helping with the investigation. The patch should be just about ready to commit at this point. :) I'm going to give it a once over, check for any lingering ref leaks, and double-check with Raymond. So I'm hopeful we can land this in the next few days.
msg244040 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 15:47
@Jim, You've got some good questions. I'll look into them today.
msg244044 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2015-05-25 18:40
I realise that I am bit late to the party, but I would like to point out that a smaller, arguably simpler, and almost certainly faster alternative design exists. This design simply consists of an array of (prev, next, key) nodes attached to the base dict. The linked list is maintained using the prev & next pointers of the nodes as normal. The mapping of keys to nodes is maintained by ensuring that the index of the node in the node array matches the index of the key in dict-key table. When the size of the array matches that of the keys table, this should be a very fast operation. When the dict is resized, the array will need to resized. (Possibly lazily if PyDict_* functions are used directly on the ordered dict.) Size analysis ------------- For an an occupancy of X, Eric's design takes 7/X + 3 slots per key/value pair. The alternative design takes 6/X slots per key/value pair. For an occupancy of 50%, half way between the minimum of 1/3 and maximum of 2/3, on a 64bit machine: The design proposed in this issue takes 17 slots or 136 bytes per key/value pair. The alternative would take 12 slots or 96 bytes per pair, about 70% of the size.
msg244046 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 18:52
> I'm going to echo the previous comment that maybe trying to emulate the existing dict implementation too carefully just adds complexity. The whole dance with _odict_get_index and _odict_resize is due to the requirement that OrderedDict maintain O(1) operation for deletion operations. Due to using a linked list, we needed a secondary mechanism for efficiently indexing into the list. There is a note at the top of the file explaining the alternatives I considered and the rationale for mirroring dict's hash table. > > The split-keys implementation shows that there is at least some flexibility to the implementation. Enough that the keys could map to an array offset, rather than directly to the values? What do you mean by this? If you are suggesting changes to dict or its accessory types then it is something I considered and rejected. Personally I don't want to change anything on dict itself for the sake of OrderedDict. If others would like to pursue that they're welcome to do so. :)
msg244047 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 18:53
> Should dictobject.h get a bit more changes? In particular, should the following be expanded? > > #define PyDictKeys_Check(op) (Py_TYPE(op) == &PyDictKeys_Type) > #define PyDictItems_Check(op) (Py_TYPE(op) == &PyDictItems_Type) > #define PyDictValues_Check(op) (Py_TYPE(op) == &PyDictValues_Type) > /* This excludes Values, since they are not sets. */ > # define PyDictViewSet_Check(op) \ > (PyDictKeys_Check(op) \|\| PyDictItems_Check(op)) I'm missing some context here. I'm not sure how this relates to OrderedDict.
msg244048 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 18:58
@Mark, great idea. I wish we'd discussed it more at PyCon 2013 when I was working on preserving OrderedDict's O(1) deletion. :) TBH, I don't have any problems with improvements. In fact, I'd be quite happy if folks jumped in and improved what I've done or even replaced it entirely. But at the point (with the feature freeze exception Larry gave) we should land what I have in 3.5 and then target 3.6 for any improvements.
msg244053 - (view)	Author: Jim Jewett (Jim.Jewett) *	Date: 2015-05-25 22:38
Eric .... I realize that O (1) deletion is hard, and don't see a good way around it without changing the implementation ... I just think that the preserving the current C layout may be forcing an even more complicated solution than neccessary. I am nervous about pushing this to 3.5 because of the complexity. I agree that a simpler solution should (also?) wait for 3.6. The question about dictheaher.h boils down to: if someone asks whether something is a dictview (not even using "exact", though that isn't an option), an odictview will say false ... is that really what you want?
msg244054 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 22:44
At present the only remaining issues with the patch are: * 10 leaked refs in test_collections * a failing test in test_enum =========================================================================== key: __members__ result: OrderedDict([('red', <Color.red: 1>), ('green', <Color.green: 2>), ('blue', <Color.blue: 3>)]) expected: OrderedDict([('red', <Color.red: 1>), ('green', <Color.green: 2>), ('blue', <Color.blue: 3>)]) =========================================================================== test test_enum failed -- Traceback (most recent call last): File "/home/esnow/projects/cordereddict/Lib/test/test_enum.py", line 1668, in test_inspect_getmembers self.fail("result does not equal expected, see print above") AssertionError: result does not equal expected, see print above I'm not sure what to make of that test. By all appearances it should be passing. My guess is that there's some wackiness afoot over in Enum, but I'll take a closer look after I get the ref counts sorted out.
msg244067 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-25 23:41
I've cleaned up all the ref leaks so now just the failing test_enum test remains to be resolved.
msg244070 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-26 01:11
The failing test is not passing so I don't see any further blockers to committing this.
msg244071 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-26 01:12
rather, it is passing now :)
msg244073 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-26 01:28
> rather, it is passing now :) Eric, thanks for working on this! Could you please go through your patch and replace "//" comments with "/* .. */" ones? It would also be great if you can clean-up XXX comments.
msg244074 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-26 01:29
Ah, good point. I'll take care of all those.
msg244076 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-26 01:52
I've cleaned up the patch. I still want to make one last pass to check re-entrancy concerns.
msg244122 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-26 18:40
> Eric .... I realize that O (1) deletion is hard, and don't see a good way around it without changing the implementation ... I just think that the preserving the current C layout may be forcing an even more complicated solution than neccessary. I am nervous about pushing this to 3.5 because of the complexity. I agree that a simpler solution should (also?) wait for 3.6. Noted (and thanks for the feedback). I'm still comfortable with moving ahead for 3.5 with what we have. The code is documented and structured in such a way that it should be clear what's going on and relatively straightforward to adjust. There's a decent chance we will find a bug or two in corner cases, but nothing at a scale that would give me pause for a 3.5 release. Furthermore, the test suite for OrderedDict is pretty thorough so strict compatibility with the pure Python OrderedDict allows us to derive a lot of confidence about the C implementation. > > The question about dictheaher.h boils down to: if someone asks whether something is a dictview (not even using "exact", though that isn't an option), an odictview will say false ... is that really what you want? Ah. I misunderstood your question to imply what should be added. Instead, you were just indicating what is already there. I don't think anything needs to be changed though. Those checks don't pass for the pure Python OrderedDict so I would not expect them to do so for the C implementation.
msg244380 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-29 14:27
Can we merge this patch before new beta2? https://mail.python.org/pipermail/python-committers/2015-May/003410.html
msg244401 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-29 19:39
Planning on committing today after I address some review comments.
msg244402 - (view)	Author: Wes Turner (westurner) *	Date: 2015-05-29 19:44
> * Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault ? On May 29, 2015 2:39 PM, "Eric Snow" <report@bugs.python.org> wrote: > > Eric Snow added the comment: > > Planning on committing today after I address some review comments. > > ---------- > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue16991> > _______________________________________ >
msg244421 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-29 21:39
> * Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault ? Not in 3.5.
msg244436 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 02:48
@Jim, I believe I've addressed all the review comments that have indicate a risk. I've also answered basically all the rest. Thanks for the great review. Unless there are any objections, I'll likely commit the patch in the next hour or two.
msg244437 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 04:28
New changeset e7af362b78df by Eric Snow in branch 'default': Issue #16991: Add a C implementation of collections.OrderedDict. https://hg.python.org/cpython/rev/e7af362b78df
msg244438 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2015-05-30 04:32
@Eric: I think you also want to commit it to 3.5
msg244439 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 04:34
New changeset b85028c9d4b9 by Eric Snow in branch '3.5': Issue #16991: Add a C implementation of collections.OrderedDict. https://hg.python.org/cpython/rev/b85028c9d4b9
msg244440 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 04:35
Yep. :)
msg244441 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 04:35
I'll keep an eye out for trouble on the buildbots.
msg244442 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-05-30 05:02
PyObject_TypeCheck() should be used instead of PyObject_IsInstance() (see issue24257). Perhaps Py_ODict_GetItemId() should be private API as relevant dict function.
msg244462 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-05-30 15:19
I agree with all Stephan's comments on Rietveld. See also issue24115 that fixed bugs similar to found in C implementation of OrderedDict.
msg244463 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 15:36
New changeset 0a7380984e37 by Eric Snow in branch '3.5': Issue #16991: Use PyObject_TypeCheck instead of PyObject_IsInstance. https://hg.python.org/cpython/rev/0a7380984e37
msg244464 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 15:40
> PyObject_TypeCheck() should be used instead of PyObject_IsInstance() (see > issue24257). Thanks for pointing this out. I've fixed both dictobject.h and odictobject.h. > > Perhaps Py_ODict_GetItemId() should be private API as relevant dict function. This isn't needed for 3.5, right? I'm not opposed to adding more functions to the C API for OrderedDict, but I wanted to start out as small as possible. My main motivation for the C implementation was for use in the interpreter and I hadn't given much thought to the direct use of C OrderedDict by extension modules. That can be addressed in 3.6 if it's important. I supposed you could double-check with Larry if you want to add more odict support to the C-API for 3.5.
msg244465 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2015-05-30 15:40
I'm getting the following linker errors on Windows 8.1 for 32 and 64 bit debug and release builds. unresolved external symbol _PyODict_Type in C:\cpython\PCbuild\_collectionsmodule.obj, and _PyODictIter_Type, _PyODictValues_Type, _PyODictKeys_Type,_PyODictItems_Type in C:\cpython\PCbuild\object.obj
msg244466 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 15:41
> New changeset 0a7380984e37 by Eric Snow in branch '3.5': > Issue #16991: Use PyObject_TypeCheck instead of PyObject_IsInstance. > https://hg.python.org/cpython/rev/0a7380984e37 I've also merged this into default: changeset: 96384:10eabadba316b6f2034db4c076a60c63d25d8fc6
msg244467 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 15:50
> I'm getting the following linker errors on Windows 8.1 for 32 and 64 bit debug and release builds. unresolved external symbol _PyODict_Type in C:\cpython\PCbuild\_collectionsmodule.obj, and _PyODictIter_Type, > _PyODictValues_Type, _PyODictKeys_Type,_PyODictItems_Type in C:\cpython\PCbuild\object.obj Hmm. I'm not too familiar with how things work for Windows. I'm also not clear on where the leading underscore is coming from.
msg244468 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-05-30 16:00
> This isn't needed for 3.5, right? I'm not opposed to adding more > functions to the C API for OrderedDict, but I wanted to start out as > small as possible. You already added public name Py_ODict_GetItemId. It uses private _Py_Identifier API and shouldn't be public.
msg244469 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-05-30 16:06
As for Windows issue, perhaps these names should be enumerated in PC/python3.def? See also issue23903.
msg244475 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 17:57
> You already added public name Py_ODict_GetItemId. It uses private _Py_Identifier API and shouldn't be public. Ah. I'll remove it.
msg244476 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 18:04
New changeset c9404fba02ba by Eric Snow in branch '3.5': Issue #16991: Properly handle return values in several places. https://hg.python.org/cpython/rev/c9404fba02ba New changeset 951a3ef82180 by Eric Snow in branch '3.5': Issue #16991: Do not return None from OrderedDict.__reversed__. https://hg.python.org/cpython/rev/951a3ef82180
msg244478 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 18:08
New changeset 7117e9b0f595 by Eric Snow in branch '3.5': Issue #16991: Drop Py_ODict_GetItemId. https://hg.python.org/cpython/rev/7117e9b0f595
msg244480 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 18:26
If it's just a matter of adding the definitions then here's a patch. Does that look correct?
msg244481 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 18:26
That last message is about building on Windows.
msg244484 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 18:55
New changeset 030205f1e716 by Eric Snow in branch '3.5': Issue #16991: Fix a few leaks and other memory-related concerns in OrderedDict. https://hg.python.org/cpython/rev/030205f1e716
msg244486 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 19:25
New changeset fbe74badb0c6 by Eric Snow in branch 'default': Issue #16991: Ensure that the proper OrderedDict is used in tests. https://hg.python.org/cpython/rev/fbe74badb0c6
msg244491 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 20:27
New changeset 1d851112922f by Eric Snow in branch '3.5': Issue #16991: Add PyODict* to Windows builds. https://hg.python.org/cpython/rev/1d851112922f
msg244492 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 20:29
I'm checking a fix for Windows against a buildbot (http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x).
msg244493 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 20:35
New changeset 3ba1fa925f17 by Eric Snow in branch 'default': Issue #16991: Use the correct version for master. https://hg.python.org/cpython/rev/3ba1fa925f17
msg244496 - (view)	Author: Roundup Robot (python-dev)	Date: 2015-05-30 20:56
New changeset aea27164d207 by Eric Snow in branch '3.5': Issue #16991: Add odictobject.h on Windows. https://hg.python.org/cpython/rev/aea27164d207
msg244499 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-05-30 21:08
Well, that last one has everything compiling again. I expect it should be okay now. I'll watch the results.
msg244535 - (view)	Author: Stefan Krah (skrah) *	Date: 2015-05-31 11:09
Opening again. I have too many questions. :)
msg244574 - (view)	Author: Stefan Krah (skrah) *	Date: 2015-06-01 10:11
crash-1.py is due to an unchecked return value from _odictnode_VALUE(). We should probably use PyDict_GetItemWithError(), also in other places. I normally try to steer clear of stylistic remarks, but the _odictnode* macros are hiding too many things. As of now, they were hiding that an assert() is always true and that a return value was unchecked. Also, they're very inconvenient in a debugger.
msg244575 - (view)	Author: Stefan Krah (skrah) *	Date: 2015-06-01 10:19
crash-2.py is due to the fact that _PyDict_Pop() deletes a reference to 'key' in _odict_popkey(). The INCREF(key) in popitem should take place before calling _odict_popkey(). Again, I don't see the point of INCREF/DECREF inside of _odict_popkey().
msg244586 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-06-01 14:54
@Jim and Stefan, Thanks for thorough reviews! @Stefan, I'll take a look at those crashers and other suggestions ASAP. I really appreciate you taking the time. Now that the patch has been landed, would you mind opening new issues for each problem you find? That will help keep individual problems from getting lost. Thanks!
msg244587 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2015-06-01 15:00
Coverity has found an issue in odict, too: *** CID 1302699: Null pointer dereferences (NULL_RETURNS) /Objects/odictobject.c: 1316 in odict_copy() 1310 od_copy = PyObject_CallFunctionObjArgs((PyObject )Py_TYPE(od), NULL); 1311 if (od_copy == NULL) 1312 return NULL; 1313 1314 if (PyODict_CheckExact(od)) { 1315 _odict_FOREACH(od, node) { >>> CID 1302699: Null pointer dereferences (NULL_RETURNS) >>> Dereferencing a pointer that might be null "PyDict_GetItem((PyObject )(PyObject )od, node->key)" when calling "PyODict_SetItem". 1316 int res = PyODict_SetItem((PyObject )od_copy, 1317 _odictnode_KEY(node), 1318 _odictnode_VALUE(node, od)); 1319 if (res != 0) 1320 goto fail; 1321 }
msg244600 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-06-01 16:26
I've opened the following issues to address the 3 last comments: issue24347 issue24348 issue24349 I'll be opening a separate issue for outstanding review comments.
msg244664 - (view)	Author: Stefan Krah (skrah) *	Date: 2015-06-02 10:43
It's fine to open other issues, but I'm not happy with the resize()/get_index() situation. Currently I can't come up even with an informal "proof" that it'll always work (and see #24361). I think these functions really need 100% code coverage.
msg244668 - (view)	Author: Eric Snow (eric.snow) *	Date: 2015-06-02 14:01
Addressing the concerns with resize()/get_index() is next on my list. I had meant to open up an issue on it last night but it was getting pretty late for me and it slipped my mind. I've opened issue24362 to track that work.

History
Date	User	Action	Args
2022-04-11 14:57:40	admin	set	nosy: + larry github: 61195
2015-06-02 14:01:03	eric.snow	set	messages: + msg244668
2015-06-02 10:43:10	skrah	set	messages: + msg244664
2015-06-01 18:20:11	steve.dower	set	nosy: - steve.dower
2015-06-01 16:26:38	eric.snow	set	status: open -> closed messages: + msg244600
2015-06-01 15:00:25	christian.heimes	set	nosy: + christian.heimes messages: + msg244587
2015-06-01 14:54:19	eric.snow	set	messages: + msg244586
2015-06-01 10:19:03	skrah	set	messages: + msg244575
2015-06-01 10:11:33	skrah	set	messages: + msg244574
2015-06-01 09:37:20	skrah	set	files: + crash-2.py
2015-06-01 09:36:51	skrah	set	files: + crash-1.py
2015-05-31 11:09:56	skrah	set	status: pending -> open nosy: + skrah messages: + msg244535
2015-05-30 21:08:29	eric.snow	set	status: open -> pending
2015-05-30 21:08:22	eric.snow	set	messages: + msg244499
2015-05-30 20:56:38	python-dev	set	messages: + msg244496
2015-05-30 20:35:26	python-dev	set	messages: + msg244493
2015-05-30 20:29:44	eric.snow	set	messages: + msg244492
2015-05-30 20:27:10	python-dev	set	messages: + msg244491
2015-05-30 19:25:22	python-dev	set	messages: + msg244486
2015-05-30 18:55:47	python-dev	set	messages: + msg244484
2015-05-30 18:26:56	eric.snow	set	nosy: + steve.dower messages: + msg244481
2015-05-30 18:26:07	eric.snow	set	files: + odict-windows.diff messages: + msg244480
2015-05-30 18:08:19	python-dev	set	messages: + msg244478
2015-05-30 18:04:25	python-dev	set	messages: + msg244476
2015-05-30 17:57:31	eric.snow	set	messages: + msg244475
2015-05-30 16:06:33	serhiy.storchaka	set	messages: + msg244469
2015-05-30 16:00:50	serhiy.storchaka	set	messages: + msg244468
2015-05-30 15:50:45	eric.snow	set	messages: + msg244467
2015-05-30 15:41:32	eric.snow	set	messages: + msg244466
2015-05-30 15:40:15	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg244465
2015-05-30 15:40:11	eric.snow	set	messages: + msg244464
2015-05-30 15:36:07	python-dev	set	messages: + msg244463
2015-05-30 15:19:22	serhiy.storchaka	set	messages: + msg244462
2015-05-30 05:02:40	serhiy.storchaka	set	status: pending -> open messages: + msg244442
2015-05-30 04:35:44	eric.snow	set	status: open -> pending resolution: fixed messages: + msg244441 stage: patch review -> resolved
2015-05-30 04:35:03	eric.snow	set	messages: + msg244440
2015-05-30 04:34:22	python-dev	set	messages: + msg244439
2015-05-30 04:32:55	yselivanov	set	messages: + msg244438
2015-05-30 04:28:36	python-dev	set	nosy: + python-dev messages: + msg244437
2015-05-30 02:52:53	eric.snow	set	files: + 0813b1a88171.diff
2015-05-30 02:48:41	eric.snow	set	messages: + msg244436
2015-05-29 21:39:42	yselivanov	set	messages: + msg244421
2015-05-29 19:44:03	westurner	set	messages: + msg244402
2015-05-29 19:39:04	eric.snow	set	messages: + msg244401
2015-05-29 14:27:25	yselivanov	set	messages: + msg244380
2015-05-26 18:40:31	eric.snow	set	messages: + msg244122
2015-05-26 01:53:20	eric.snow	set	files: + ba1c6d40ca63.diff
2015-05-26 01:52:49	eric.snow	set	messages: + msg244076
2015-05-26 01:29:43	eric.snow	set	messages: + msg244074
2015-05-26 01:28:45	yselivanov	set	messages: + msg244073
2015-05-26 01:12:26	eric.snow	set	messages: + msg244071
2015-05-26 01:12:05	eric.snow	set	files: + c3fab329aa7f.diff
2015-05-26 01:11:19	eric.snow	set	messages: + msg244070
2015-05-25 23:41:56	eric.snow	set	messages: + msg244067
2015-05-25 22:44:21	eric.snow	set	messages: + msg244054
2015-05-25 22:38:37	Jim.Jewett	set	messages: + msg244053
2015-05-25 22:24:18	eric.snow	set	files: + 3b2a9026d48e.diff
2015-05-25 18:58:49	eric.snow	set	messages: + msg244048
2015-05-25 18:53:53	eric.snow	set	messages: + msg244047
2015-05-25 18:52:54	eric.snow	set	messages: + msg244046
2015-05-25 18:40:05	Mark.Shannon	set	nosy: + Mark.Shannon messages: + msg244044
2015-05-25 15:47:04	eric.snow	set	messages: + msg244040
2015-05-25 15:46:20	eric.snow	set	messages: + msg244039
2015-05-24 22:03:49	mrabarnett	set	messages: + msg244005
2015-05-24 03:53:49	Jim.Jewett	set	messages: + msg243966
2015-05-24 03:21:34	Jim.Jewett	set	messages: + msg243965
2015-05-24 03:18:09	Jim.Jewett	set	messages: + msg243964
2015-05-24 02:17:36	eric.snow	set	hgrepos: + hgrepo309
2015-05-24 02:14:53	eric.snow	set	messages: + msg243960
2015-05-24 01:53:49	mrabarnett	set	messages: + msg243959
2015-05-23 14:38:54	eric.snow	set	messages: + msg243915
2015-05-23 09:17:00	ncoghlan	set	nosy: + ncoghlan messages: + msg243903
2015-05-23 09:10:00	ncoghlan	link	issue24254 dependencies
2015-05-22 23:38:31	eric.snow	set	messages: + msg243868
2015-05-22 23:13:35	eric.snow	set	messages: + msg243863
2015-05-22 20:06:07	eric.snow	set	messages: + msg243849
2015-05-22 18:05:01	Jim.Jewett	set	nosy: + Jim.Jewett messages: + msg243843
2015-05-22 05:26:42	eric.snow	set	messages: + msg243807
2015-05-22 05:11:06	mrabarnett	set	nosy: + mrabarnett messages: + msg243806
2015-05-22 04:03:32	eric.snow	set	messages: + msg243803
2015-05-22 04:00:12	eric.snow	set	messages: + msg243802
2015-05-22 03:53:21	eric.snow	set	messages: + msg243801
2015-05-22 02:41:03	eric.snow	set	messages: + msg243799
2015-05-21 14:09:56	eric.snow	set	messages: + msg243753
2015-05-21 13:47:30	eric.snow	set	messages: + msg243749
2015-05-21 06:23:05	ned.deily	set	nosy: + ned.deily messages: + msg243733
2015-05-21 04:06:19	eric.snow	set	files: + cOrderedDict-4.diff messages: + msg243726
2015-05-21 02:59:47	yselivanov	set	messages: + msg243716
2015-05-21 02:58:39	eric.snow	set	messages: + msg243715
2015-05-21 02:48:51	eric.snow	set	messages: + msg243714
2015-05-20 23:35:01	eric.snow	set	messages: + msg243706
2015-05-20 23:15:46	eric.snow	set	messages: + msg243705
2015-05-20 22:57:49	pitrou	set	messages: + msg243702
2015-05-20 22:23:31	eric.snow	set	messages: + msg243699
2015-05-18 15:54:13	eric.snow	set	messages: + msg243492
2015-05-16 23:32:52	yselivanov	set	messages: + msg243379
2015-05-16 22:16:30	yselivanov	set	messages: + msg243371
2015-05-16 19:06:28	yselivanov	set	messages: + msg243358
2015-05-16 17:49:22	yselivanov	set	messages: + msg243343
2015-05-13 21:49:42	eric.snow	set	priority: normal -> release blocker
2015-05-13 21:48:57	yselivanov	set	messages: + msg243123
2015-05-13 21:47:26	eric.snow	set	messages: + msg243121
2015-05-13 15:07:49	yselivanov	set	messages: + msg243096
2015-04-09 00:17:33	eric.smith	set	nosy: + eric.smith
2015-04-02 13:56:55	introom	set	nosy: + introom
2015-04-02 02:12:05	eric.snow	set	messages: + msg239871
2015-04-02 01:13:42	pitrou	set	messages: + msg239867
2015-03-31 23:16:59	eric.snow	set	messages: + msg239761
2015-03-31 17:35:08	scoder	set	nosy: + scoder messages: + msg239740
2015-03-31 05:18:55	eric.snow	set	messages: + msg239666
2015-03-31 05:17:44	eric.snow	set	files: + cOrderedDict-3.diff messages: + msg239665
2014-12-17 21:27:03	eric.snow	set	messages: + msg232828
2014-12-17 16:43:54	westurner	set	nosy: + westurner messages: + msg232825
2014-12-17 06:31:12	eric.snow	set	messages: + msg232803
2014-12-10 05:56:05	Arfrever	set	nosy: + Arfrever
2014-12-10 05:12:11	eric.snow	set	messages: + msg232412
2014-11-12 17:32:13	tonn81	set	nosy: + tonn81 messages: + msg231087
2014-10-14 15:16:08	skrah	set	nosy: - skrah
2014-04-16 01:02:55	josh.r	set	nosy: + josh.r
2014-04-15 21:29:53	gregory.p.smith	set	nosy: + gregory.p.smith
2014-04-12 13:20:53	skrah	set	nosy: + skrah
2014-04-12 13:13:23	eric.snow	set	messages: + msg215971
2014-01-31 21:21:10	eric.snow	set	messages: + msg209831
2014-01-31 02:17:24	pitrou	set	nosy: + pitrou messages: + msg209744
2014-01-31 02:17:08	pitrou	set	versions: + Python 3.5, - Python 3.4
2014-01-30 02:21:58	yselivanov	set	nosy: + yselivanov messages: + msg209700
2013-12-06 17:06:06	refi64	set	nosy: + refi64
2013-10-20 18:32:28	serhiy.storchaka	set	files: + cOrderedDict-2.diff messages: + msg200607
2013-06-21 06:43:12	eric.snow	set	files: - cOrderedDict.diff
2013-06-21 06:42:48	eric.snow	set	files: + cOrderedDict.diff messages: + msg191557
2013-06-05 06:49:43	eric.snow	set	files: + cOrderedDict.diff messages: + msg190643
2013-06-05 06:48:58	eric.snow	set	files: - cOrderedDict.diff
2013-06-05 06:48:54	eric.snow	set	files: - cOrderedDict-iterators-by-key.diff
2013-06-05 02:57:47	eric.snow	set	messages: + msg190640
2013-06-05 02:26:36	eric.snow	set	files: + cOrderedDict-iterators-by-key.diff messages: + msg190639
2013-06-04 06:13:24	eric.snow	set	files: + cOrderedDict.diff messages: + msg190590
2013-06-04 06:11:02	eric.snow	set	files: - cOrderedDict.diff
2013-06-04 06:08:50	eric.snow	set	files: + odict-speed.diff messages: + msg190589
2013-05-24 05:57:22	eric.snow	set	files: - odict.diff
2013-05-24 05:57:15	eric.snow	set	files: - cleanup-test-collections.diff
2013-05-24 05:57:00	eric.snow	set	files: + cOrderedDict.diff messages: + msg189897
2013-05-24 05:10:54	eric.snow	set	messages: + msg189895
2013-05-24 04:38:31	eric.snow	set	messages: + msg189893
2013-05-24 04:04:09	eric.snow	set	messages: + msg189891
2013-03-21 03:04:45	rhettinger	set	messages: + msg184835
2013-03-20 21:28:53	eric.snow	set	messages: + msg184801
2013-02-16 06:46:49	eric.snow	set	messages: + msg182214
2013-02-16 06:29:13	eric.snow	set	files: - cleanup-test-collections.diff
2013-02-16 06:25:52	eric.snow	set	files: + cleanup-test-collections.diff
2013-02-16 05:28:10	eric.araujo	set	nosy: + eric.araujo
2013-02-15 16:56:40	eric.snow	set	messages: + msg182147
2013-02-15 15:03:20	benjamin.peterson	set	messages: + msg182145
2013-02-15 09:27:49	eric.snow	set	files: - odict.diff
2013-02-15 09:27:41	eric.snow	set	files: - cleanup-test-collections.diff
2013-02-15 09:27:24	eric.snow	set	files: + odict.diff messages: + msg182134
2013-02-15 09:21:28	eric.snow	set	files: + cleanup-test-collections.diff messages: + msg182132
2013-02-05 23:22:34	flox	set	nosy: + flox
2013-02-05 23:19:01	eric.snow	set	files: - odict.diff
2013-02-05 23:18:55	eric.snow	set	files: - cleanup-test-collections.diff
2013-02-05 23:06:39	eric.snow	set	files: + odict.diff
2013-02-05 23:06:24	eric.snow	set	files: + cleanup-test-collections.diff
2013-02-05 07:17:14	eric.snow	set	messages: + msg181422
2013-02-05 06:54:08	eric.snow	set	files: - odict.diff
2013-02-05 06:53:58	eric.snow	set	files: + odict.diff messages: + msg181421
2013-01-26 17:27:44	ezio.melotti	set	messages: + msg180690
2013-01-26 17:19:19	eric.snow	set	messages: + msg180688
2013-01-26 16:37:58	ezio.melotti	set	messages: + msg180682
2013-01-26 05:48:25	eric.snow	set	files: + cleanup-test-collections.diff messages: + msg180640
2013-01-26 05:08:17	asvetlov	set	nosy: + asvetlov
2013-01-26 02:04:27	ezio.melotti	link	issue16651 dependencies
2013-01-19 08:37:05	eric.snow	set	messages: + msg180236
2013-01-18 18:12:40	alex	set	nosy: + alex
2013-01-18 18:06:39	ezio.melotti	set	nosy: + ezio.melotti messages: + msg180202
2013-01-18 08:07:34	serhiy.storchaka	set	nosy: + serhiy.storchaka
2013-01-18 05:22:41	benjamin.peterson	set	nosy: + benjamin.peterson messages: + msg180176
2013-01-18 04:08:30	eric.snow	create