classification
Title: start testing strings > 2GB
Components: Tests
Versions: Python 2.5
process
Status: closed
Resolution: fixed
Nosy List: nnorwitz, twouters
Priority: normal
Keywords: patch

Created on 2006-04-17 06:01 by nnorwitz, last changed 2006-04-26 15:54 by twouters. This issue is now closed.

Files
File name | Uploaded | Description
test_str.py | nnorwitz, 2006-04-17 06:01 | nn v1
test_bigstr.py | twouters, 2006-04-19 20:00 | v2 (twouters)
test_bigmem.diff | twouters, 2006-04-25 22:29 | v3 (twouters)
Messages (7)
msg50023 - Author: Neal Norwitz (nnorwitz) (Python committer) Date: 2006-04-17 06:01
Incomplete patch.  Would be great if someone picked
this up and ran with it.  Need to be conservative and
try not to use too much memory in these tests.

Should start with strings and unicode.  Those will be
the most important things to test first.  Then can move
to arrays, mmap and other sequences.  Eventually on to
lists, sets, and dicts.

Right now, I'm sticking this in
Lib/tests/bigmem/test_str.py. 
msg50024 - Author: Thomas Wouters (twouters) (Python committer) Date: 2006-04-19 15:57

FWIW, I have a couple of 16GB AMD64 machines at work that
I'm preparing to be database servers. As long as they aren't
in production, I'm using them in off-hours to run these
tests, although I don't know how far I'll get. (On the other
hand, it already found a couple of bugs ;)
msg50025 - Author: Neal Norwitz (nnorwitz) (Python committer) Date: 2006-04-19 17:00

Cool.  I hadn't gotten a chance to run this yet.  In fact, I
never even verified it was syntactically correct. :-) 
Hopefully, this gives others the idea and they can help beef
up the tests.

Thanks for testing it out!
msg50026 - Author: Thomas Wouters (twouters) (Python committer) Date: 2006-04-19 20:00

It was syntactically correct, but many of the tests were
broken ;) Here's an updated version. I arranged it so
there's a 1K test version for easy debugging of tests, as
well as the 2GB and 4GB versions. There are still tests to be
added, but I'm bored with it now. (Python trunk on my
debian-amd64 box ran the 1K and 2GB versions without any new
issues; the 4GB one is still chugging away.)
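
For readers following along, the "same tests at several sizes" arrangement described above could be sketched roughly as below. This is not the code from test_bigstr.py; the names StrTestMixin, make_case and run_case are hypothetical, and the sketch is written for a current Python rather than the 2.5 trunk. Only the 1K variant is instantiated here, so that running the example does not allocate gigabytes.

```python
import unittest


class StrTestMixin:
    """Tests written once against self.size; subclasses pick the size."""

    size = None  # overridden per variant (hypothetical convention)

    def test_len(self):
        s = "." * self.size
        self.assertEqual(len(s), self.size)

    def test_count(self):
        s = "." * self.size + "x"
        self.assertEqual(s.count("."), self.size)
        self.assertEqual(s.count("x"), 1)


def make_case(name, size):
    """Build a TestCase class that runs the mixin's tests at `size`."""
    return type(name, (StrTestMixin, unittest.TestCase), {"size": size})


# Cheap variant for debugging the tests themselves.
Str1KTest = make_case("Str1KTest", 1024)

# The large variants would be built the same way, for example
# make_case("Str2GBTest", 2 * 1024 ** 3), on a machine with enough memory.


def run_case(case):
    """Run one TestCase class and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling run_case(Str1KTest) exercises the test logic in a few milliseconds, which is the point of keeping a small variant next to the large ones.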
msg50027 - Author: Thomas Wouters (twouters) (Python committer) Date: 2006-04-25 22:29

Well, I got unbored long enough to finish most (byte)string
tests, and do some tuple testing too. But, boy, do those
need a lot of memory. The 16GB box I was testing on took 4
days to finish the tuple tests -- and I had to quickly add a
spare disk as swap, as the 20GB I had allocated was just not
enough :)

Here's a completely reworked test file. It adds a few things:

 - Support for 'bigmem tests' in test_support, and a
command-line option to regrtest. If bigmem tests aren't
explicitly enabled in regrtest, the tests are still run,
just with very small sizes (in order to test the tests
themselves). Bigmem tests are enabled by passing an upper
memory limit, and the framework tries to keep tests
inside that limit.

 - Instead of 'size' as a class attribute, each test is
decorated with a minimum size and expected memory use (in
bytes per unit of size). The decorator calculates the allowable
size from the memory use and the limit, and skips the test if
it would take too much memory even at the minimum
size. In verbose mode, it warns that it skipped a test.

As I said, the tuple tests take quite a lot of memory, and
you probably don't want the tests to start swapping (swap
is quite, quite slow). However, not many of these tests can
be run with less than 8GB of memory, and that'll just cover
the string tests and a few of the simpler tuple tests.

On the plus side, while the string tests found several bugs
in the Py_ssize_tification, the tuple tests found none. Or
perhaps that's a downside? I'm not sure. Anyway,
everything's here to build tests for more datatypes: dicts,
lists, sets and unicode strings come to mind (where testing
unicode strings is the most likely to discover bugs, and
should only take 2 to 4 times as much memory as the string
tests), but there are also array, buffer and probably more
types in modules that could do with similar tests.
msg50028 - Author: Neal Norwitz (nnorwitz) (Python committer) Date: 2006-04-26 04:03

Thomas, let's move this to either the sandbox or actual test
suite.  Everything you wrote makes sense.  A lot of it
probably should be added to comments and/or some sort of README.

My hope is that if this is somewhere in the core (or
sandbox), we might be able to attract some people with some
real big mem machines.

Thanks for doing all this work. 
msg50029 - Author: Thomas Wouters (twouters) (Python committer) Date: 2006-04-26 15:54

Checked in my latest version (which has list tests on top of
the string/tuple tests) so it'd make it into alpha 2.

History
Date | User | Action | Args
2006-04-17 06:01:55 | nnorwitz | create |