Message113060
No, I'm not simply running out of system memory (8 GB, x64, Linux); in my test cases I've only seen ~25% of memory utilized. And good idea, I'll try playing with the cyclic garbage collector.
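(For concreteness, a minimal sketch of that GC experiment; the build callable is a placeholder for whatever bulk-insertion step is under test, not code from this issue:)

import gc
import time

def time_build(build, label="build"):
    # Time a bulk build twice: with the cyclic GC on, then off.
    for gc_off in (False, True):
        gc.collect()                       # start from a clean slate
        if gc_off:
            gc.disable()
        try:
            start = time.time()
            build()
            elapsed = time.time() - start
        finally:
            gc.enable()                    # never leave the collector off
        print("%s (gc %s): %.2fs" % (label, "off" if gc_off else "on", elapsed))

Something like time_build(lambda: set(words), "set build") then shows whether the cyclic collector contributes to the slowdown.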
It is harder than I thought to make a solid synthetic test case addressing that issue. The trouble is that you need to generate data (e.g. 100,000,000 words, 5,000,000 unique) with a distribution close to the real-life scenario (e.g. the word lengths, frequencies, and uniqueness of English text). If somebody has a good idea on how to do this nicely, it would be very welcome.
My best shot so far is in the attachment.
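For anyone looking for a starting point, here is a minimal sketch of one way to generate such a corpus, assuming word frequencies follow a Zipf-like rank-frequency law (a common approximation for English); the function names and the scaled-down parameters are illustrative, not taken from the attachment:

import random
import string
from itertools import accumulate

def make_vocab(n_unique, min_len=3, max_len=12, seed=0):
    # Build n_unique distinct pseudo-words with plausible lengths.
    rng = random.Random(seed)
    vocab = set()
    while len(vocab) < n_unique:
        length = rng.randint(min_len, max_len)
        vocab.add(''.join(rng.choices(string.ascii_lowercase, k=length)))
    return sorted(vocab)  # sort for run-to-run reproducibility

def generate_corpus(n_words, n_unique, s=1.0, seed=0):
    # Yield n_words words whose rank-frequency follows ~1/rank**s (Zipf).
    rng = random.Random(seed)
    vocab = make_vocab(n_unique, seed=seed)
    # Precompute cumulative weights once; choices() then bisects them.
    cum = list(accumulate(1.0 / (rank ** s) for rank in range(1, n_unique + 1)))
    remaining = n_words
    while remaining > 0:  # draw in chunks so memory stays bounded
        chunk = min(remaining, 100_000)
        yield from rng.choices(vocab, cum_weights=cum, k=chunk)
        remaining -= chunk

# Scaled-down example; the 100,000,000/5,000,000 case is the same idea.
counts = {}
for word in generate_corpus(n_words=1_000_000, n_unique=50_000):
    counts[word] = counts.get(word, 0) + 1
print(len(counts), "unique words seen")

Since common English words tend to be short, one refinement would be to give the lowest (most frequent) ranks to the shortest pseudo-words, e.g. vocab.sort(key=len) before weighting.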