Author serhiy.storchaka
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, koniiiik, lemburg, mark.dickinson, serhiy.storchaka, vstinner
Date 2012-11-07.12:55:50
Message-id <201211071455.35522.storchaka@gmail.com>
In-reply-to <1352289191.91.0.972472753019.issue14621@psf.upfronthosting.co.za>
Content
> Serhiy, the performance of hash() for long strings isn't very relevant for the general performance of a Python program.

It exposes the raw speed of the hashing algorithm.  That makes it a good first estimate, because more realistic cases require more sophisticated measurements.

> Short strings dominate. I've modified the timeit to create a new string object every time.

timeit is absolutely not suitable for this.  You need to write a C program that uses the Python C API.

> for I in 5 10 15 20 30 40 50 60; do echo -ne "$I\t"; ./python -m timeit -n100000 -r30 -s "h = hash; x = 'ä' * $I" -- "h(x + 'a')" | awk '{print $6}' ; done

Please do not be fooled by wrong measurements. You are measuring the height of the building together with the hill on which it stands. Use "-n1" and you will see completely different numbers.
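
A rough way to see how large the "hill" is: time the string concatenation alone and compare it with the full statement from the quoted command. This is only a sketch. The function name `measure` and its parameters are made up for illustration, and it still relies on timeit, so it shows the size of the overhead rather than giving a trustworthy figure for the hash itself.

```python
import timeit


def measure(length, number=100000, repeat=5):
    """Time the concatenation alone and the concatenation plus hash().

    Hypothetical helper, not part of any patch on this issue.
    Returns per-call times in microseconds.
    """
    setup = "h = hash; x = 'ä' * %d" % length
    # The hill: only build the new string object.
    baseline = min(timeit.repeat("x + 'a'", setup=setup,
                                 number=number, repeat=repeat))
    # The hill plus the building: build the string, then hash it.
    combined = min(timeit.repeat("h(x + 'a')", setup=setup,
                                 number=number, repeat=repeat))
    return {
        "length": length,
        "baseline_usec": baseline / number * 1e6,
        "combined_usec": combined / number * 1e6,
    }


if __name__ == "__main__":
    for length in (5, 10, 15, 20, 30, 40, 50, 60):
        r = measure(length)
        print("%d\t%.4f\t%.4f" % (r["length"],
                                  r["baseline_usec"],
                                  r["combined_usec"]))
```

The gap between the two columns is at best an upper bound on the cost of the hash call, since it also includes the function call overhead.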
History
Date User Action Args
2012-11-07 12:55:50 serhiy.storchaka set recipients: + serhiy.storchaka, lemburg, arigo, mark.dickinson, vstinner, christian.heimes, benjamin.peterson, Arfrever, alex, dmalcolm, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, koniiiik, camara
2012-11-07 12:55:50 serhiy.storchaka link issue14621 messages
2012-11-07 12:55:50 serhiy.storchaka create