Author serhiy.storchaka
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, haypo, koniiiik, lemburg, mark.dickinson, serhiy.storchaka
Date 2012-11-07.12:55:50
Message-id <201211071455.35522.storchaka@gmail.com>
In-reply-to <1352289191.91.0.972472753019.issue14621@psf.upfronthosting.co.za>
Content
> Serhiy, the performance of hash() for long strings isn't very relevant for the general performance of a Python program.

It exposes the raw speed of the hashing algorithm.  That makes it a good first estimate, since more realistic cases require more sophisticated measurements.

> Short strings dominate. I've modified the timeit to create a new string object every time.

timeit is absolutely not suitable for this.  You need to write a C program that uses the Python C API.

> for I in 5 10 15 20 30 40 50 60; do echo -ne "$I\t"; ./python -m timeit -n100000 -r30 -s "h = hash; x = 'ä' * $I" -- "h(x + 'a')" | awk '{print $6}' ; done

Please do not be fooled by wrong measurements. You are measuring the height of the building together with the hill on which it stands. Use "-n1" and you will see completely different numbers.
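
To make the building-and-hill point concrete, here is a rough sketch (not a substitute for a proper C-level benchmark) that times the quoted statement next to a baseline doing the same concatenation without hashing. The helper names per_loop and estimate_hash_cost are made up for illustration, and using id() as a cheap stand-in for the function call overhead is an assumption; the difference is only an estimate and can be dominated by noise for short strings.

```python
import timeit


def per_loop(stmt, setup, number, repeat):
    """Best-of-`repeat` time for one execution of `stmt`, in seconds."""
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(timings) / number


def estimate_hash_cost(length, number=100000, repeat=5):
    """Return (total, baseline, difference) in seconds per loop.

    total:      h(x + 'a'), i.e. concatenation + call + hashing
    baseline:   f(x + 'a') with f = id, i.e. concatenation + call only
                (assumption: id() costs about as much as the call overhead)
    difference: rough estimate of the hashing part alone
    """
    setup = "h = hash; f = id; x = 'ä' * %d" % length
    total = per_loop("h(x + 'a')", setup, number, repeat)
    baseline = per_loop("f(x + 'a')", setup, number, repeat)
    return total, baseline, total - baseline


if __name__ == "__main__":
    print("len\ttotal(ns)\tbaseline(ns)\tdiff(ns)")
    for length in (5, 10, 15, 20, 30, 40, 50, 60):
        total, baseline, diff = estimate_hash_cost(length)
        print("%d\t%.1f\t%.1f\t%.1f"
              % (length, total * 1e9, baseline * 1e9, diff * 1e9))
```

If the baseline column is close to the total column, most of what the original command reports is the hill (string creation and call overhead), not the hash function.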
History
Date User Action Args
2012-11-07 12:55:50 serhiy.storchaka set recipients: + serhiy.storchaka, lemburg, arigo, mark.dickinson, haypo, christian.heimes, benjamin.peterson, Arfrever, alex, dmalcolm, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, koniiiik, camara
2012-11-07 12:55:50 serhiy.storchaka link issue14621 messages
2012-11-07 12:55:50 serhiy.storchaka create