classification
Title: Add OrderedDict written in C
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: eric.snow Nosy List: Arfrever, Ryan.Gonzalez, alex, asvetlov, benjamin.peterson, eric.araujo, eric.smith, eric.snow, ezio.melotti, flox, gregory.p.smith, introom, josh.r, mrabarnett, ned.deily, pitrou, rhettinger, scoder, serhiy.storchaka, tonn81, westurner, yselivanov
Priority: release blocker Keywords: patch

Created on 2013-01-18 04:08 by eric.snow, last changed 2015-05-22 05:26 by eric.snow.

Files
File name Uploaded Description Edit
odict-speed.diff eric.snow, 2013-06-04 06:08 pybench addition for OrderedDict review
cOrderedDict.diff eric.snow, 2013-06-21 06:42 review
cOrderedDict-2.diff serhiy.storchaka, 2013-10-20 18:32 review
cOrderedDict-3.diff eric.snow, 2015-03-31 05:17 review
cOrderedDict-4.diff eric.snow, 2015-05-21 04:05 review
Messages (68)
msg180170 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-01-18 04:08
Here's an initial stab at writing OrderedDict in C.  Though, the implementation is not heavily optimized and isn't super subclass-friendly, my hope is that it's relatively close to the final version.  I'm getting this up now to get some eyes on it.

The spot in the builtins is mostly for convenience, but I expect it will need to be exposed somewhere (perhaps _collections?).

My experience with the C-API is relatively limited and my C-fu is not at a professional level.  However, I'm pretty sure that I have most everything correct.

The ultimate goal for this type is to use it for **kwargs.

Note: this first patch has some reference leaks that I'm tracking down.
msg180176 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-01-18 05:22
Nit: This really not be exposed through _collections rather than hacked into builtins.
msg180202 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-18 18:06
The tests should probably be updated to use the PEP 399 idiom in order to test both the implementations.
msg180236 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-01-19 08:37
@Benjamin: Yeah, I've fixed that.

@Ezio: Good point.  I've touched that up.

Once I have cleared up reference counting issues I'll put up a new patch.
msg180640 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-01-26 05:48
Here's a cleanup of test.test_collections that helps keep the subsequent patch (still forthcoming) cleaner.
msg180682 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-26 16:37
What's the reason for moving the OrderedDict tests in a separate file?
msg180688 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-01-26 17:19
> What's the reason for moving the OrderedDict tests in a separate file?

Following the precedent of collections.deque and collections.defaultdict:

* a big chunk of code
* the default implementation will be coming via _collections.
msg180690 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-26 17:27
These are indeed good reasons.
msg181421 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-05 06:53
Here's an updated patch.  I still have some ref-counting issues, but the patch is much closer to what I expect will be the final version.  At this point it passes all the main unit tests (segfaults in some of the supplemental Mapping tests).

One key thing is that for now the various iterators are faked using lists.  Once everything is sorted out I'll plug my implementation of the iterators back in (which I'd already mostly finished).

I see room for efficiency improvements in a couple spots.  As well, I plan on improving the subclass-ability of the type once it's otherwise happy.

Any feedback at the point would be helpful.  Regardless, I'm being careful to get this right and I'm no expert with the C-API, so it's taking a while.  :)

Some other considerations:

   * ensure that __init__() can reset and populate an existing dictionary:
             d=OrderedDict(); d[0]=1; d.__init__() # resets
   * avoid any reference cycles so that dicts can clean-up right-away when their ref-count drops to zero
     rather than waiting on GC.
   * possibly use the GIL to make the link updates atomic, trying to make the overall class thread safe.
   * methods like get() pop() or setdefault() shouldn't rely on __getitem__ raising KeyError
     because __missing__ may be present in a subclass
msg181422 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-05 07:17
Looks like I didn't get the patch lined up to tip so the review link isn't showing up.  I'll have to fix that tomorrow.
msg182132 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-15 09:21
Are there any objections to this unittest cleanup patch?  I'd like to commit it separately prior to anything that will go in for the C OrderedDict.
msg182134 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-15 09:27
Incidently, I'm just about done with the OrdereDict patch.  I'm attaching the current one.  At present it passes the unit tests, but I have two memory-related issues and a number of open questions (that I don't think are critical).

The patch has the unittest cleanup merged in just so it will show up in reviewboard.
msg182145 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-02-15 15:03
Why did you copy test_collections.py instead of just creating a new file?
msg182147 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-15 16:56
> Why did you copy test_collections.py instead of just creating
> a new file?

To preserve the history of changes to the bulk of the code in test_ordereddict.py.
msg182214 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-16 06:46
I should clarify, odict.diff passes test_ordereddict.  Regardless, it's pretty close.  I'm going to figure out why the review link hates me on this issue, but until then any extra eyes here would be appreciated.  The memory-related issues are pushing well past my experience.

My goal is to get this in before PyCon and to have a reasonable plan at least for implementing **kwargs as OrderedDict.  My intuition is that writing OrderedDict in C is the hard part.
msg184801 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-03-20 21:28
My current patch ends up with O(n) deletion, which won't fly, so I'm refactoring.  While I'm at it I'm also planning on using the BSD queue.h doubly-linked list rather than the one that I rolled.  I'm also going to pull the ordered dict implementation into it's own source file.  However, these things should not have much of an impact on most of the code I've already written.  I anticipate that the changes won't translate into a further large volume of work.

In talking to Raymond, he emphasized the importance of making sure we avoid reentrancy problems.  I'll be double-checking that and likely making use of the GIL in a couple spots.

While the bulk of the implementation is complete, the remaining work to do here is what I've described above, along with more testing.  An orthogonal problem is addressing the problem of the concrete dict API.  I'll bring that up separately when this issue is basically done.
msg184835 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-03-21 03:04
I just looked at the test-collections patch and don't want it committed.  It is full of trivial edits that make it hard to review and doesn't make the test suite better.

It is okay to add some new tests, but I don't want to rearrange the existing code with minor edits.  That will just make it more difficult to maintain the code across versions (it is no fun to backport a fix and then have unnecessary merge conflicts).
msg189891 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-05-24 04:04
It's worth noting that issue10977 is pretty closely related to this isssue.
msg189893 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-05-24 04:38
> I just looked at the test-collections patch and don't want it
> committed.  It is full of trivial edits that make it hard to
> review and doesn't make the test suite better.

Thanks for the feedback, Raymond.  I agree that there are a number of trivial edits there that shouldn't be there.  I'll fix those.
msg189895 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-05-24 05:10
Raymond, do you have any objections to splitting the OrderedDict-related tests into their own module.  Doing so allows for testing all the those test classes at once, which is handy when working on OrderedDict.  This is more relevant now with the extra PEP-399-motivated tests.
msg189897 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-05-24 05:56
I finally had some time so here's an updated patch (including O(1) deletions).  I've also added a bunch of notes at the top of odictobject.c.  Some notable things left to do:

* address two recently failing tests due to r83881 (issue 17900)
* check for any reference cycles (should be fine)
* validate reentrancy (make sure everything is thread-safe around calls into Python)
* make sure subclassing is okay
* compare performance to dict, pure Python OrderedDict

My goal here is to get an effective OrderedDict that we are happy with, while recognizing that there is room for optimization.  However, I won't be okay with faultiness, so the implementation must be complete.  This has been my mindset throughout.
msg190589 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-04 06:08
Without too many optimzations, the C implementation of OrderedDict is basically between 1x and 4x the performance of raw dict.  This contrasts with the pure Python OrderedDict, which falls in roughly between 4x and 100x.

I've attached an addition to pybench, not intended to be committed, that compares the 3.  Run any of these commands to get timing:

  ./python Tools/pybench/pybench.py -t ODict
  ./python Tools/pybench/pybench.py -t ODict -t _Py
  ./python Tools/pybench/pybench.py -t ODict -t _C
  ./python Tools/pybench/pybench.py -t ODict -t _Dict

I tuned the # rounds for each test such that the raw dict results all come out to just about the same value (2ms on my computer).  That way it's easy to compare the pure Python or C results to the dict results.
msg190590 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-04 06:13
Here's an updated patch that has fixes for recursive pickles and for a couple memory-related segfaults that I missed before.  That leaves the following to be done:

* handle the case where a node is deleted during iteration
* check for any reference cycles (should be fine)
* validate reentrancy (make sure everything is thread-safe around calls into Python)
* make sure subclassing is okay
* address any remaining TODOs in the code
msg190639 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-05 02:26
Here's one solution to the deletion-during-iteration problem.  It makes the iterators track the key rather than the node, at the expense of a sliver of speed on __iter__() and keys().
msg190640 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-05 02:57
The concern about reference cycles here lies in their existence in the linked-list.  To see what I mean, check out the use of weakrefs in the pure Python implementation at Lib/collections/__init__.py.  As the current implementation does not use PyObjects for the linked-list, I'm going to call this a non-issue.
msg190643 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-05 06:49
Here's what I feel is a nearly complete patch.  The only outstanding issues that I feel need to be answered are 4 spots where calls into the interpreter may result in unexpected changes to the object or make the current function state out-of-date.

1. _odict_clear_nodes() while iterating over the nodes.
2. _odict_keys_equal() while iterating over the nodes.
3. odict_sizeof() -- I'm not too worried about this one.
4. _odict_copy() while iterating over the nodes.

Once I feel comfortable with some resolution for those, I'm going to consider the patch ready to go, pending reviews.
msg191557 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-21 06:42
I figured out what I hope were the last memory-related issues.  Apparently tp_traverse is not inherited if tp_flags is set.  I had it set on all the view types and all the iterator types.  So during GC it would blow up when it tried to call tp_traverse.

Everything is looking pretty good so I'm attaching the latest diff.
msg200607 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-20 18:32
Rebase patch to tip.
msg209700 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2014-01-30 02:21
Can we still merge this in 3.4?
msg209744 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-01-31 02:17
It's much too late for such a disruptive change in 3.4.
msg209831 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-01-31 21:21
I agree with Antoine.  It's first on my todo list for 3.5.  My goal is that this and a couple of related features will land during the PyCon sprints.
msg215971 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-04-12 13:13
After pulling up to tip and sticking in the basic **kwargs change, the results on the benchmark suite imply a refleak somewhere.  As soon as I sort that out I'll push up the updated patch.
msg231087 - (view) Author: Артём Скорецкий (tonn81) Date: 2014-11-12 17:32
Any progress? It was planned for 3.5 release
msg232412 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-12-10 05:12
I haven't had time for a while to do much work on this.  At the last posting this patch was basically done.  I'm sure it no longer applies cleanly.  The biggest obstacle to wrapping this issue up is the size of the patch.  It's hard to get a thorough review because someone would have to dedicate the time for it.  Furthermore, adding this much code introduces a debatable risk which needs to be minimized.

When I'm able to get back to this I'll likely split the patch up.  I'd really like to introduce some helpers (perhaps macros) to the C API for writing iterators, since the boilerplate involved is massive; just take a look at how much code in my implementation involves the boilerplate of creating iterators.  Unfortunately I don't know when my time will free up in the near future.  I still hope to land this for 3.5.
msg232803 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-12-17 06:31
I'm open to suggestions on how to help make the patch more reviewable.  Currently at roughly +2500 lines of code it's a lot to ask someone to review.  While probably not the only way to help reviewers, I expect the best thing I could do is to split the patch up into several smaller parts.

However, I could use some insight into the best way to approach that.  I keep considering factoring out the iterator/view boilerplate into helpers (macros?) in a separate patch.  That would go a long way to making the actual cOrderedDict implementation more reasonably sized.  I'm just not convinced yet it's the *right* approach.
msg232825 - (view) Author: Wes Turner (westurner) * Date: 2014-12-17 16:43
* Macros could be useful.
* Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault(attr, [])?
* There may be some value in looking at https://pypi.python.org/pypi/cyordereddict
* (I'm really not qualified to review this. Valgrind?)
msg232828 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-12-17 21:27
Suggestions from Antoine on python-dev:

* do not make cOrderedDict "builtin"; drop changes to:
  - Include/Python.h
  - Objects/object.c
  - Makefile.pre.in
* do not add a C-API
* include the linked list pointers in the hash entries? (may simplify things)
* drop the shared/split keys mechanism?
* do not subclass dict? (can of worms)
msg239665 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-03-31 05:17
I've opened a feature clone to better track the work (features/cordereddict).  Here's the updated patch against default from that clone.  I haven't added any argument client stuff.  I also haven't addressed any of the feedback from Antoine.  I'd rather leave OrderedDict as a "builtin" since the whole point is to be able to use OrderedDict in the interpreter.
msg239666 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-03-31 05:18
s/argument client/Argument Clinic/
msg239740 - (view) Author: Stefan Behnel (scoder) * Date: 2015-03-31 17:35
> do not add a C-API

what speaks against it?
msg239761 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-03-31 23:16
I expect Antoine is trying to limit the scope of the change, which makes sense.  In this case, though, once (if?) the patch lands I plan on using it in the interpreter, so the C-API bits will be necessary.
msg239867 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-02 01:13
From a cursory glance, I still think this is the wrong approach. There's a lot of complication due to the simple fact that it is subclassing from dict (and therefore trying to be ABI-compatible, kind of). The linked list scheme should be much simpler (perhaps it shouldn't be a linked list at all, actually: see Raymond's recent proposal). I understand a lot of work has gone into it, but I'm against the current proposal.
msg239871 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-04-02 02:12
Thanks for speaking up, Antoine.  Keep in mind that a C implementation of OrderedDict has a strict compatibility requirement with the existing pure Python one.  The Python implementation subclasses dict so the C implementation must as well.  I agree that it would be simpler otherwise.  It would have saved me a lot of headache!

As to the linked list implementation, it does the job.  I'm not trying to blaze any trails with this code.  Furthermore, my implementation does not preclude a better one replacing it.  I would welcome a better implementation.  Until that appears, my implementation is what there is. :)  Frankly, my interest does not lie in the a C OrderedDict, but in what I want to do one we have one.  Since no one was willing to do that for me <wink>, I did it myself.

That said, I feel like my implementation satisfies what is required and does so in a way that we can be satisfied with it indefinitely.  It sounds like you feel it would be worse to land it than to not have a C implementation of OrderedDict.  Obviously I don't feel the same way. :)  Regardless, I think we both feel the same way that the amount of time I've spent on this shouldn't factor into determining that.

Aside from subclassing dict and my linked list approach, do you have any other concerns?
msg243096 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-13 15:07
Eric, is there any chance this can land in 3.5?  OrderedDict is a heavily used thing, everyone will benefit from a fast implementation.  It's OK if we have an imperfect (but fully compatible with existing OrderedDict) implementation in 3.5.  We can optimize it in 3.6.
msg243121 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-13 21:47
@Yury, I'm mostly just waiting for Raymond to give it at least a quick sanity-check.  I know there is at least 1 ref leak, but that can be sorted out.
msg243123 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-13 21:48
@Eric, can you set priority to "release blocker"?
msg243343 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-16 17:49
Raymond, is there any chance you can review the patch before beta1?  Sorry, for bugging you with this, I just really hope we can have fast OrderedDict in 3.5.
msg243358 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-16 19:06
As for refleak -- I looked at it briefly: simple "a = OrderedDict(); a = None" code leaks, so it must be somewhere in tp_new/tp_dealloc.  And I know what object is leaking - it's None.
msg243371 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-16 22:16
I must have used a better example :)

>>> import sys; from collections import OrderedDict as OD
>>> sys.getrefcount(None)
2053
>>> OD();OD();OD();OD();
OrderedDict()
OrderedDict()
OrderedDict()
OrderedDict()
>>> sys.getrefcount(None)
2057
msg243379 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-16 23:32
Started to look at the patch in more detail. Found one leak. There are more. Eric, maybe you can hunt them?
msg243492 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-18 15:54
Thanks for taking a look, Yury.  I'll follow up on the ref leak soon.
msg243699 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-20 22:23
I've updated the cordereddict branch on the feature repo to fix the ref leak on None.  I'm still seeing other leaks (regrtest -R 3:3), but I don't think that should block landing before the feature freeze.

However, I'm concerned about a segfault I'm seeing roughly 3% of the time when running the following:

  ./python -m unittest test.test_configparser.ConfigParserTestCase.test_basic

Presumably there is a dealloc bug there which manifests intermittently due to a GC race.  I'm going work on tracking it down but help is appreciated. :)

I'm also seeing consistent test failures in test_enum.
msg243702 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-05-20 22:57
A patch shouldn't be integrated if there are known bugs with it (especially segfaults).
msg243705 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-20 23:15
Agreed.  I wanted to be clear about what is blocking landing the patch.
msg243706 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-20 23:35
These are the leaks I'm seeing in test_collections (14/24 tests):

 test_copying         : 178
 test_iterators       : 40
 test_update          : 29
 test_init            : 22
 test_move_to_end     : 21
 test_sorted_iterators: 20
 test_setdefault      : 16
 test_clear           : 5
 test_setitem         : 5
 test_delitem         : 4
 test_repr_recursive  : 4
 test_repr            : 3
 test_override_update : 2
 test_reinsert        : 1
msg243714 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-21 02:48
I've fixed most of the leaks now.  The only remaining ones are:

 test_clear           : 5
 test_repr_recursive  : 3
msg243715 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-21 02:58
Regarding the segfault, the following does not fail:

  ./python -m test.regrtest --forever -m test_basic test_configparser

but this does:

  for i in `seq 1 20`; do ./python -m test.regrtest -m test_basic test_configparser ; done
msg243716 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-05-21 02:59
Eric, could you please upload your latest patch?
msg243726 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-21 04:05
Here's a diff from the current cordereddict branch of the feature repo.
msg243733 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2015-05-21 06:23
FWIW, the random segfault seems to be triggered by hash randomization.  If I disable randomization, it does not seem to fail:

for i in `seq 1 20`; do PYTHONHASHSEED=1 ./python -m test.regrtest -m test_basic test_configparser ; done

Presumably, the --forever option does not vary the hash seed value?

Also, FWIW, clang issues a warning:

../../source/Objects/odictobject.c:550:15: warning: comparison of unsigned expression < 0
      is always false [-Wtautological-compare]
        if (i < 0) {
            ~ ^ ~

for this:

    _odict_FOREACH(od, node) {
        size_t i = _odict_get_index(od, _odictnode_KEY(node));
        if (i < 0) {
            od->od_size = prev_size;
            PyMem_FREE(fast_nodes);
            return -1;
        }
        fast_nodes[i] = node;

I can supply OS X crash tracebacks if they would be of help.
msg243749 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-21 13:47
Thanks for looking into this, Ned.  I've changed that size_t to ssize_t which I expect will quiet that clang warning you saw.  I'm glad you pointed it out because it means that that branch was never executing!  Unfortunately fixing that does not solve all my problems. <wink>

I'll look into the hash randomization connection.  Thanks for pointing that out.  Hopefully it doesn't complicate matters.  My guess is that the segfaults only happen for certain seeds, so I'll check that first.
msg243753 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-21 14:09
Cool.  The following gives consistent failures at certain seed values:

  for i in `seq 1 100`; do echo $i; PYTHONHASHSEED=$i ./python -m test.regrtest -m test_basic test_configparser ; done

Through 100 I get segfaults with 7, 15, 35, 37, 39, 40, 42, 47, 50, 66, 67, 85, 87, 88, and 92.  The distribution is probably essentially uniform across the full set of seeds, even if the exact pattern of failures isn't precisely uniform.  I'll try to see why those hashes are significant here.  If I can't figured it out quickly then I'll post about it on python-dev.

My hunch is that the hash randomization impacts either dict/odict resizing or dict/odict lookup (or both).  TBH, I've been pretty sure for a while that the segfault is coming out of one of those two.
msg243799 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-22 02:41
I've spent a bit of time exploring the segfault.  Here's some data that might help relative to the configparser test.

I put the following at the beginning of _odict_resize:

    Py_ssize_t len = PyObject_Size((PyObject *)od);
    if (len == 0)
        PySys_FormatStdout(".");
    else {
        if (((PyDictObject *)od)->ma_keys->dk_size < od->od_size)
            PySys_FormatStdout("-");
        if (len < 10)
            PySys_FormatStdout("%d", len);
        else
            PySys_FormatStdout("+");
    }
    if (len >= 10)
        PySys_FormatStdout("%d\n", len);

If the current item count is 0 then it prints a dot.  If the resize is shrinking then it prints a - (this did not happen).  Otherwise the odict is growing and it prints the current item count.  Multi-digit numbers are preceded by + and followed by a newline.

I've included the results of different hash seeds (0/no randomization, 1, and 7).  You'll notice how the results are subtly different.  In the case of 7, it matches no randomization up to the point that it segfaults.  I got the same results with 15.  However, 35 fails right after the second +22.

$ PYTHONHASHSEED=0 ./python -m test.regrtest -m test_basic test_configparser
...6+12
+22
....6+12
+22
........6.6........+10
........6.6........+10
........6.6........+10
........6.6........+10
........6.6........+10
.+10
........6.6........+10
........6.6........+10
........6.6........+10
........6.6...............6..6............+10
....6.6.6.6.66.66.6.6.6.6.+12
6.+12
6.6.6.6.6.6.6.6.6.+22
6.+22
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+44
6.+44
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+86
6.+86
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6........6.6........+10
........6.6........+10
........6.6........+10
........6.6........+10
.6..

$ PYTHONHASHSEED=1 ./python -m test.regrtest -m test_basic test_configparser
...6+12
+22
....6+12
+22
........6.6................6.6................6.6................6.6................6.6.........+11
........6.6................6.6................6.6................6.6...............6..6................6.6.6.6.66.66.6.6.6.6.+12
6.+12
6.6.6.6.6.6.6.6.6.+22
6.+22
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+44
6.+44
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.+86
6.+86
6.6.6.6.6.6.6.6.6.6.6.6.6.6.6........6.6................6.6................6.6................6.6.........6..

$ PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser
...6+12
+22
....6+12
+22
........6.6........+10
........6.6........+10
........6.6........+10
........6.6........+10
........6.6........+10
.+10
Fatal Python error: Segmentation fault
msg243801 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-22 03:53
As far as I can tell there aren't any patterns that repeat across multiple seeds (which makes sense).
msg243802 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-22 04:00
Here are the last 10 frames from the backtrace (gdb):

#0  0x00000000005b15d0 in odictiter_iternext (di=<optimized out>) at Objects/odictobject.c:1888
#1  0x0000000000453179 in PyIter_Next (iter=<optimized out>) at Objects/abstract.c:2760
#2  0x00000000005881e7 in chain_next (lz=0x7ffff2b8d878) at ./Modules/itertoolsmodule.c:1866
#3  0x0000000000537db6 in PyEval_EvalFrameEx (f=f@entry=0xcab188, throwflag=throwflag@entry=0) at Python/ceval.c:2975
#4  0x000000000053afee in fast_function (func=func@entry=0x7ffff2bc88f8, pp_stack=pp_stack@entry=0x7fffffff99f8, n=n@entry=1, na=na@entry=1, nk=nk@entry=0) at Python/ceval.c:4752
#5  0x000000000053b655 in call_function (pp_stack=pp_stack@entry=0x7fffffff99f8, oparg=oparg@entry=0) at Python/ceval.c:4679
#6  0x000000000053891d in PyEval_EvalFrameEx (f=f@entry=0xcbd588, throwflag=throwflag@entry=0) at Python/ceval.c:3198
#7  0x000000000053afee in fast_function (func=func@entry=0x7ffff2bc8840, pp_stack=pp_stack@entry=0x7fffffff9bd8, n=n@entry=3, na=na@entry=3, nk=nk@entry=0) at Python/ceval.c:4752
#8  0x000000000053b655 in call_function (pp_stack=pp_stack@entry=0x7fffffff9bd8, oparg=oparg@entry=2) at Python/ceval.c:4679
#9  0x000000000053891d in PyEval_EvalFrameEx (f=f@entry=0xc7d458, throwflag=throwflag@entry=0) at Python/ceval.c:3198

and from Python:

Current thread 0x00007f3ad9260740 (most recent call first):
  File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 1115 in _join_multiline_values
  File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 1109 in _read
  File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 716 in read_file
  File "/home/esnow/projects/cordereddict/Lib/configparser.py", line 721 in read_string
  File "/home/esnow/projects/cordereddict/Lib/test/test_configparser.py", line 354 in test_basic
  File "/home/esnow/projects/cordereddict/Lib/unittest/case.py", line 597 in run
msg243803 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-22 04:03
The segfault happens at line 1888 of odictobject.c when it tries to Py_INCREF a NULL value.  The problem is that the value that gets looked up for a presumably valid key is returned as NULL.  So either the value is messed up, lookup is broken, or the wrong key is being used.
msg243806 - (view) Author: Matthew Barnett (mrabarnett) * Date: 2015-05-22 05:11
FTR, in "_odict_resize" there's this:

    size_t i = _odict_get_index(od, _odictnode_KEY(node));
    if (i < 0) {

Note that "i" is declared as unsigned and then tested whether it's negative.
msg243807 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-05-22 05:26
Yeah, Ned pointed that one out.  I fixed it but haven't pushed the change.
History
Date User Action Args
2015-05-22 05:26:42eric.snowsetmessages: + msg243807
2015-05-22 05:11:06mrabarnettsetnosy: + mrabarnett
messages: + msg243806
2015-05-22 04:03:32eric.snowsetmessages: + msg243803
2015-05-22 04:00:12eric.snowsetmessages: + msg243802
2015-05-22 03:53:21eric.snowsetmessages: + msg243801
2015-05-22 02:41:03eric.snowsetmessages: + msg243799
2015-05-21 14:09:56eric.snowsetmessages: + msg243753
2015-05-21 13:47:30eric.snowsetmessages: + msg243749
2015-05-21 06:23:05ned.deilysetnosy: + ned.deily
messages: + msg243733
2015-05-21 04:06:19eric.snowsetfiles: + cOrderedDict-4.diff

messages: + msg243726
2015-05-21 02:59:47yselivanovsetmessages: + msg243716
2015-05-21 02:58:39eric.snowsetmessages: + msg243715
2015-05-21 02:48:51eric.snowsetmessages: + msg243714
2015-05-20 23:35:01eric.snowsetmessages: + msg243706
2015-05-20 23:15:46eric.snowsetmessages: + msg243705
2015-05-20 22:57:49pitrousetmessages: + msg243702
2015-05-20 22:23:31eric.snowsetmessages: + msg243699
2015-05-18 15:54:13eric.snowsetmessages: + msg243492
2015-05-16 23:32:52yselivanovsetmessages: + msg243379
2015-05-16 22:16:30yselivanovsetmessages: + msg243371
2015-05-16 19:06:28yselivanovsetmessages: + msg243358
2015-05-16 17:49:22yselivanovsetmessages: + msg243343
2015-05-13 21:49:42eric.snowsetpriority: normal -> release blocker
2015-05-13 21:48:57yselivanovsetmessages: + msg243123
2015-05-13 21:47:26eric.snowsetmessages: + msg243121
2015-05-13 15:07:49yselivanovsetmessages: + msg243096
2015-04-09 00:17:33eric.smithsetnosy: + eric.smith
2015-04-02 13:56:55introomsetnosy: + introom
2015-04-02 02:12:05eric.snowsetmessages: + msg239871
2015-04-02 01:13:42pitrousetmessages: + msg239867
2015-03-31 23:16:59eric.snowsetmessages: + msg239761
2015-03-31 17:35:08scodersetnosy: + scoder
messages: + msg239740
2015-03-31 05:18:55eric.snowsetmessages: + msg239666
2015-03-31 05:17:44eric.snowsetfiles: + cOrderedDict-3.diff

messages: + msg239665
2014-12-17 21:27:03eric.snowsetmessages: + msg232828
2014-12-17 16:43:54westurnersetnosy: + westurner
messages: + msg232825
2014-12-17 06:31:12eric.snowsetmessages: + msg232803
2014-12-10 05:56:05Arfreversetnosy: + Arfrever
2014-12-10 05:12:11eric.snowsetmessages: + msg232412
2014-11-12 17:32:13tonn81setnosy: + tonn81
messages: + msg231087
2014-10-14 15:16:08skrahsetnosy: - skrah
2014-04-16 01:02:55josh.rsetnosy: + josh.r
2014-04-15 21:29:53gregory.p.smithsetnosy: + gregory.p.smith
2014-04-12 13:20:53skrahsetnosy: + skrah
2014-04-12 13:13:23eric.snowsetmessages: + msg215971
2014-01-31 21:21:10eric.snowsetmessages: + msg209831
2014-01-31 02:17:24pitrousetnosy: + pitrou
messages: + msg209744
2014-01-31 02:17:08pitrousetversions: + Python 3.5, - Python 3.4
2014-01-30 02:21:58yselivanovsetnosy: + yselivanov
messages: + msg209700
2013-12-06 17:06:06Ryan.Gonzalezsetnosy: + Ryan.Gonzalez
2013-10-20 18:32:28serhiy.storchakasetfiles: + cOrderedDict-2.diff

messages: + msg200607
2013-06-21 06:43:12eric.snowsetfiles: - cOrderedDict.diff
2013-06-21 06:42:48eric.snowsetfiles: + cOrderedDict.diff

messages: + msg191557
2013-06-05 06:49:43eric.snowsetfiles: + cOrderedDict.diff

messages: + msg190643
2013-06-05 06:48:58eric.snowsetfiles: - cOrderedDict.diff
2013-06-05 06:48:54eric.snowsetfiles: - cOrderedDict-iterators-by-key.diff
2013-06-05 02:57:47eric.snowsetmessages: + msg190640
2013-06-05 02:26:36eric.snowsetfiles: + cOrderedDict-iterators-by-key.diff

messages: + msg190639
2013-06-04 06:13:24eric.snowsetfiles: + cOrderedDict.diff

messages: + msg190590
2013-06-04 06:11:02eric.snowsetfiles: - cOrderedDict.diff
2013-06-04 06:08:50eric.snowsetfiles: + odict-speed.diff

messages: + msg190589
2013-05-24 05:57:22eric.snowsetfiles: - odict.diff
2013-05-24 05:57:15eric.snowsetfiles: - cleanup-test-collections.diff
2013-05-24 05:57:00eric.snowsetfiles: + cOrderedDict.diff

messages: + msg189897
2013-05-24 05:10:54eric.snowsetmessages: + msg189895
2013-05-24 04:38:31eric.snowsetmessages: + msg189893
2013-05-24 04:04:09eric.snowsetmessages: + msg189891
2013-03-21 03:04:45rhettingersetmessages: + msg184835
2013-03-20 21:28:53eric.snowsetmessages: + msg184801
2013-02-16 06:46:49eric.snowsetmessages: + msg182214
2013-02-16 06:29:13eric.snowsetfiles: - cleanup-test-collections.diff
2013-02-16 06:25:52eric.snowsetfiles: + cleanup-test-collections.diff
2013-02-16 05:28:10eric.araujosetnosy: + eric.araujo
2013-02-15 16:56:40eric.snowsetmessages: + msg182147
2013-02-15 15:03:20benjamin.petersonsetmessages: + msg182145
2013-02-15 09:27:49eric.snowsetfiles: - odict.diff
2013-02-15 09:27:41eric.snowsetfiles: - cleanup-test-collections.diff
2013-02-15 09:27:24eric.snowsetfiles: + odict.diff

messages: + msg182134
2013-02-15 09:21:28eric.snowsetfiles: + cleanup-test-collections.diff

messages: + msg182132
2013-02-05 23:22:34floxsetnosy: + flox
2013-02-05 23:19:01eric.snowsetfiles: - odict.diff
2013-02-05 23:18:55eric.snowsetfiles: - cleanup-test-collections.diff
2013-02-05 23:06:39eric.snowsetfiles: + odict.diff
2013-02-05 23:06:24eric.snowsetfiles: + cleanup-test-collections.diff
2013-02-05 07:17:14eric.snowsetmessages: + msg181422
2013-02-05 06:54:08eric.snowsetfiles: - odict.diff
2013-02-05 06:53:58eric.snowsetfiles: + odict.diff

messages: + msg181421
2013-01-26 17:27:44ezio.melottisetmessages: + msg180690
2013-01-26 17:19:19eric.snowsetmessages: + msg180688
2013-01-26 16:37:58ezio.melottisetmessages: + msg180682
2013-01-26 05:48:25eric.snowsetfiles: + cleanup-test-collections.diff

messages: + msg180640
2013-01-26 05:08:17asvetlovsetnosy: + asvetlov
2013-01-26 02:04:27ezio.melottilinkissue16651 dependencies
2013-01-19 08:37:05eric.snowsetmessages: + msg180236
2013-01-18 18:12:40alexsetnosy: + alex
2013-01-18 18:06:39ezio.melottisetnosy: + ezio.melotti
messages: + msg180202
2013-01-18 08:07:34serhiy.storchakasetnosy: + serhiy.storchaka
2013-01-18 05:22:41benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg180176
2013-01-18 04:08:30eric.snowcreate