classification
Title: improve performance of libSSL usage on hashing
Type: enhancement Stage:
Components: SSL Versions: Python 3.7, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: christian.heimes, gut, haypo
Priority: normal Keywords:

Created on 2017-04-19 18:16 by gut, last changed 2017-04-20 15:13 by gut.

Pull Requests
URL Status Linked Edit
PR 1181 open gut, 2017-04-19 18:17
Messages (5)
msg291892 - (view) Author: Gustavo Serra Scalet (gut) Date: 2017-04-19 18:16
To correctly pick the best algorithm for the current architecture,
libssl needs to have OPENSSL_config(NULL) called as described on:
https://wiki.openssl.org/index.php/Libcrypto_API

This short change lead to a speedup of 50% on POWER8 when using
hashlib.sha256 digest functionality as it now uses a SIMD approach that was already existing but not used by cpython.
msg291895 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-04-19 19:08
This small change also changes behavior of OpenSSL dramatically. What are your Python and OpenSSL versions? How did you profile the performance of the hashing functions?

If OpenSSL_config() really improves performance on relevant platforms, then we should be consistent and call it in _ssl.c, too.
msg291897 - (view) Author: Gustavo Serra Scalet (gut) Date: 2017-04-19 19:27
> Christian Heimes added the comment:
> 
> This small change also changes behavior of OpenSSL dramatically. What
> are your Python and OpenSSL versions? 

I tested by compiling my own python3 (8aaf499) against its parent (d6d344d) For python2, I made the same by compiling the 2.7.13 as a baseline and results were consistent with the python I have on Ubuntu 17.10 (3.5.3 and 2.7.13).

> How did you profile the
> performance of the hashing functions?

I compared the timings between my patched vs patchless python versions by using a 630MB file to calculate sha256 as:

perf stat -r 10 Python-2.7.13/python -c "import hashlib; print(hashlib.sha256(open('ubuntu-16.10-server-ppc64el.iso','rb').read()).hexdigest())"

The speedup of ~50% is given due to patchless taking 3.082156387s and patched taking 2.027484800s (errors are less than 1%)

I also noticed what changed on a code level by checking with gdb what is being executed on the sha loop. After my patch I see altivec (SIMD) functions being used on SHA main loop.

> If OpenSSL_config() really improves performance on relevant platforms,
> then we should be consistent and call it in _ssl.c, too.

I'll check that then. Thanks for the hint.

Ps: please note this behavior is noticed on a POWER8 machine. I'm not an OpenSSL expert so I don't know if there are now more optimizations enabled on other architectures as well.
msg291928 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 01:09
> This small change also changes behavior of OpenSSL dramatically.

What do you mean by "dramatically"? What does a openssl.cnf configuration contain?

Do other applications using OpenSSL call OPENSSL_config(NULL)?

Since OPENSSL_config() accepts a filename, maybe a first step would be to expose the function as ssl.OPENSSL_config(filename) to allow user to load *explicitly* a configuration file? ssl.OPENSSL_config() would call OPENSSL_config(NULL). Would it work for you, Gustavo?

--

More links:

- https://wiki.openssl.org/index.php/Manual:OPENSSL_config(3)
- https://en.wikibooks.org/wiki/OpenSSL/Initialization
msg291980 - (view) Author: Gustavo Serra Scalet (gut) Date: 2017-04-20 15:13
> Since OPENSSL_config() accepts a filename, maybe a first step would be
> to expose the function as ssl.OPENSSL_config(filename) to allow user to
> load *explicitly* a configuration file? ssl.OPENSSL_config() would call
> OPENSSL_config(NULL). Would it work for you, Gustavo?

It would work, I would just wait for such a decision.

BTW, that interface is deprecated as we're discussing on the issue on GitHub:
https://github.com/openssl/openssl/blob/cda3ae5bd0798c56fef5a5c1462d51ca1776504e/doc/crypto/OPENSSL_config.pod#notes

I'll continue analyzing this issue there.
History
Date User Action Args
2017-04-20 15:13:57gutsetmessages: + msg291980
2017-04-20 01:09:36hayposetnosy: + haypo
messages: + msg291928
2017-04-19 19:27:41gutsetmessages: + msg291897
2017-04-19 19:08:41christian.heimessetmessages: + msg291895
2017-04-19 18:17:35gutsetpull_requests: + pull_request1309
2017-04-19 18:16:08gutcreate