Message 113887
So, here is the modified benchmark. It first creates a cache of the random wordlist, because the wordlist takes quite a while to compute (for N up to 10000000). The cache is reused in subsequent runs. This takes some memory, though (you probably can't run it with less than 2 GB of RAM), but the cache itself is quite small on disk (around 80 MB).
I'm outputting the total dict size and the space occupied per individual key, and I've also timed lookups separately. Here are the results on my machine:
    1000 words (     961 keys),  3269137 words/s, 16980987 lookups/s, 51 bytes/key (  0.0 MB)
   10000 words (    9042 keys),  2697648 words/s, 13680052 lookups/s, 87 bytes/key (  0.8 MB)
  100000 words (   83168 keys),  2462269 words/s,  6956074 lookups/s, 37 bytes/key (  3.0 MB)
  500000 words (  389442 keys),  1802431 words/s,  4733774 lookups/s, 64 bytes/key ( 24.0 MB)
 1000000 words (  755372 keys),  1702130 words/s,  4485229 lookups/s, 66 bytes/key ( 48.0 MB)
 2000000 words ( 1463359 keys),  1616658 words/s,  4251021 lookups/s, 68 bytes/key ( 96.0 MB)
 5000000 words ( 3501140 keys),  1608889 words/s,  3909212 lookups/s, 57 bytes/key (192.0 MB)
10000000 words ( 6764089 keys),  1531136 words/s,  3600395 lookups/s, 59 bytes/key (384.0 MB)
As you can see, O(1) behaviour is almost observed up to the 10000000-word limit (given the way the data is computed, I'm afraid I can't go further than that). Of course, very small dicts are faster because everything fits in the CPU caches.
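The shape of such a benchmark, timing dict insertion and lookup separately over a list of random words, can be sketched as follows (a minimal stand-in, not the attached script: the word length, the fixed seed, and the use of `sys.getsizeof` on the dict alone, which ignores the keys themselves, are all assumptions):

```python
# Hypothetical sketch of a dict scaling micro-benchmark: build a dict
# from N random words, timing insertions and lookups separately, and
# report a rough per-key size from the dict structure itself.
import random
import string
import sys
import time

def random_words(n, length=8, seed=42):
    # Fixed seed so repeated runs measure the same key set.
    rng = random.Random(seed)
    letters = string.ascii_lowercase
    return [''.join(rng.choice(letters) for _ in range(length))
            for _ in range(n)]

def bench(n):
    words = random_words(n)
    t0 = time.perf_counter()
    d = {}
    for w in words:
        d[w] = None          # insertion phase
    t1 = time.perf_counter()
    for w in words:
        d[w]                 # lookup phase (every probe is a hit)
    t2 = time.perf_counter()
    # Rough estimate: dict struct size only, keys not included.
    bytes_per_key = sys.getsizeof(d) // max(len(d), 1)
    return len(d), n / (t1 - t0), n / (t2 - t1), bytes_per_key

if __name__ == '__main__':
    for n in (1000, 10000, 100000):
        keys, ins, lk, bpk = bench(n)
        print(f"{n} words ({keys} keys), {ins:.0f} words/s, "
              f"{lk:.0f} lookups/s, {bpk} bytes/key")
```

Duplicate words collapse onto existing keys, which is why the key count in the results above is lower than the word count at large N.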
Date                | User   | Action | Args
2010-08-14 11:25:51 | pitrou | set    | recipients: + pitrou, rhettinger, terry.reedy, mark.dickinson, ezio.melotti, eric.araujo, daniel.urban, dmtr
2010-08-14 11:25:51 | pitrou | set    | messageid: <1281785151.62.0.78409418245.issue9520@psf.upfronthosting.co.za>
2010-08-14 11:25:49 | pitrou | link   | issue9520 messages
2010-08-14 11:25:48 | pitrou | create |