Author larry
Date 2007-01-14.10:42:55
Content
Thanks for taking the time!

> - Style: you set your tab stops to 4 spaces.  That is an absolute
> no-no!

Sorry about that; I'll fix it if I resubmit.


> - Segfault in test_array. It seems that it's receiving a unicode
> slice object and treating it like a "classic" unicode object.

I tested on Windows and Linux, and I haven't seen that behavior.

Which test_array, by the way?  In Lib/test, or Lib/ctypes/test?
I'm having trouble with most of the DLL extensions on Windows;
they complain that the module uses the incompatible python26.dll
or python26_d.dll.  So I haven't tested ctypes/test_array.py
on Windows, but I have tested the other three permutations of
Linux vs Windows and Lib/test/test_array vs
Lib/ctypes/test/test_array.

Can you give me a stack trace to the segfault?  With that I bet I
can fix it even without a reproducible test case.


> - I got it to come to a grinding halt with the following worst-case
> scenario:
> 
>   a = []
>   while True:
>       x = u"x"*1000000
>       x = x[30:60]  # Short slice of long string
>       a.append(x)
> 
> If you can't do better than that, I'll have to reject it.
> 
> PS I used your combined patch, if it matters.

It matters.  The combined patch has "lazy slices", the other
patch does not.


When you say "grind to a halt" I'm not sure what you mean.
Was it thrashing?  How much CPU was it using?

When I ran that test, my Windows computer got to 1035 iterations
and then raised a MemoryError.  My Linux box behaved the same, except
it got to 1605 iterations.


Adding a call to .simplify() on the slice defeats this worst-case
scenario:

a = []
while True:
    x = u"x"*1000000
    x = x[30:60].simplify()  # Short slice of long string
    a.append(x)

.simplify() forces lazy strings to render themselves.  With that
change, this test will run until the cows come home.  Is that
acceptable?
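
For readers without the patch, here is a toy pure-Python model of why the
short slice is expensive and what rendering it buys.  It is only a sketch:
the real change lives in C in unicodeobject.c, and the LazySlice class and
its retained_chars() helper are hypothetical names invented for illustration.
Only the .simplify() name comes from the patch.

```python
class LazySlice:
    """Toy stand-in for a lazy slice (hypothetical, not the patch's code)."""

    def __init__(self, parent, start, stop):
        self.parent = parent  # this reference keeps the whole parent alive
        self.start = start
        self.stop = stop

    def __len__(self):
        # Length the user sees: just the sliced range.
        return self.stop - self.start

    def retained_chars(self):
        # Characters actually pinned in memory by this object.
        return len(self.parent)

    def simplify(self):
        # "Render" the slice: copy only the visible characters, so the
        # result no longer refers to the parent string.
        return self.parent[self.start:self.stop]


big = "x" * 1000000
lazy = LazySlice(big, 30, 60)   # 30 visible chars, 1000000 retained
rendered = lazy.simplify()      # plain 30-char string, parent not needed
```

In this model, appending `lazy` to a list keeps every million-character
parent reachable, while appending `rendered` keeps only 30 characters per
iteration.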


Failing that, is there any sort of last-ditch garbage collection
pass that gets called when a memory allocation fails but before
it returns NULL?  If so, I could hook into that and try to render
some slices.  (I don't see such a pass, but maybe I missed it.)

Failing that, I could add garbage-collect-and-retry-once logic to
memory allocation myself, either just for unicodeobject.c or as a
global change.  But I'd be shocked if you were interested in that
approach; if Python doesn't have such a thing by now, you probably
don't want it.
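
The shape of that retry-once logic, sketched in Python rather than C purely
for illustration.  The function name and both callables are hypothetical;
in the real thing the release step would be whatever renders outstanding
lazy slices, and the allocation would be the C-level allocator call.

```python
import gc


def allocate_with_retry(allocate, release_memory):
    """Call allocate(); on MemoryError, free what we can and retry once."""
    try:
        return allocate()
    except MemoryError:
        release_memory()   # hypothetical hook, e.g. render lazy slices
        gc.collect()       # run the cyclic collector in case it frees more
        return allocate()  # a second failure propagates to the caller
```

The retry happens exactly once, so a genuine out-of-memory condition still
surfaces as MemoryError instead of looping.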

And failing that, "lazy slices" are probably toast.  It always was
a tradeoff of speed for worst-case memory use, and I always knew
it might not fly.  If that's the case, please take a look at the
other patch, and in the meantime I'll see if anyone can come up with
other ways to mitigate the worst-case scenario.
History
Date User Action Args
2007-08-23 15:56:04 admin link issue1629305 messages
2007-08-23 15:56:04 admin create