This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Windows pyd slower when not loaded via load_dynamic
Type: performance Stage: resolved
Components: Extension Modules, Windows Versions: Python 2.7
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: Nosy List: Safihre, paul.moore, skrah, steve.dower, terry.reedy, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2017-09-22 20:24 by Safihre, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (9)
msg302768 - (view) Author: Safihre (Safihre) * Date: 2017-09-22 20:24
I have a Python 2.7 C-extension that was made for performance, however, I noticed something strange on Windows:
When I run locally python setup.py install and it installs it the "egg" way, the performance of calling the function in the module 500x in a loop is:

     yEnc C New  took   579 ms
     yEnc C New  took   580 ms
     yEnc C New  took   580 ms

But when I install it as a wheel locally or via pip without a single change to the C-code, this is the result:

     yEnc C New  took   702 ms
     yEnc C New  took   694 ms
     yEnc C New  took   691 ms

That's a 10-15% difference. I also checked macOS and Ubuntu, here it does not seem to be the case.
By investigating (https://github.com/pypa/setuptools/issues/1154) I found that the difference is only that the "egg" way the module gets loaded via imp.load_dynamic 

I cannot test under Python 3, since the module is (not yet) converted to Python 3-style.


To reproduce run and look at the yEnc C New score between the 2 ways (on Windows only of course):

git clone https://github.com/sabnzbd/sabyenc.git
cd sabyenc
python setup.py install
python .\tests\speed_compare.py; python .\tests\speed_compare.py;python .\tests\speed_compare.py;
pip uninstall sabyenc -y

pip install sabyenc
# Or this:
# python setup.py install bdist_wheel
# pip install .\dist\sabyenc-3.3.1-cp27-cp27m-win_amd64.whl
python .\tests\speed_compare.py; python .\tests\speed_compare.py;python .\tests\speed_compare.py;
msg303363 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-09-29 22:23
This issue is more likely to get attention if there is a difference in 3.6, or even better, 3.7.
msg303370 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-09-29 23:05
The only difference between your two tests is the install tool, and neither setuptools nor wheel bugs are tracked here.
msg303394 - (view) Author: Safihre (Safihre) * Date: 2017-09-30 05:25
No, the difference is not wheel vs setuptools. They both generate the identical pyd file. 
The difference is python: if the module is loaded by just being available on the path or if the module is loaded via  imp.load_dynamic

I understand if it's closed because it's not python 3, but it's not external tool related.
msg303412 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-09-30 14:45
Does the difference stay at 10-15% if you run it 50000 times or is it
a one time cost of around 100 ms?
msg303414 - (view) Author: Safihre (Safihre) * Date: 2017-09-30 16:30
Very good question!

5000 times via imp.load_dynamic:
     yEnc C New  took  5870 ms
     yEnc C New  took  5878 ms
     yEnc C New  took  5835 ms
5000 times via "pip: having the .pyd in site-packages"
     yEnc C New  took  6489 ms
     yEnc C New  took  6369 ms
     yEnc C New  took  6390 ms

Difference ~450 ms

10000 times via imp.load_dynamic:
     yEnc C New  took  11775 ms
     yEnc C New  took  11720 ms
     yEnc C New  took  11695 ms

10000 times via "pip: having the .pyd in site-packages"
     yEnc C New  took  12338 ms
     yEnc C New  took  12281 ms
     yEnc C New  took  12345 ms

Difference ~500 ms

10000 times via imp.load_dynamic:
     yEnc C New  took  23431 ms
     yEnc C New  took  23406 ms
     yEnc C New  took  23283 ms

10000 times via "pip: having the .pyd in site-packages"
     yEnc C New  took  24401 ms
     yEnc C New  took  24177 ms
     yEnc C New  took  24482 ms

Difference ~1100 ms

So not very linearly scaling but still increasing. Odd.
msg303435 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-09-30 23:33
The difference is on startup in how they generate sys.path entries.

Skip the install step and directly run the library in a clean virtual environment, so that no .pth or .egg files are in use. If the difference still occurs, show that it also occurs on a recent version of Python as well and we can look at it

I promise you, there is only one way that extension modules get loaded. The performance difference is not because of anything in core.
msg303459 - (view) Author: Safihre (Safihre) * Date: 2017-10-01 12:21
If you know the problem, would you also know the solution?
What is the difference in how they generate sys.path entries and how does this affect performance during execution of the module function?
Just want to understand what is going on :)
msg303462 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-10-01 14:30
I don't really care, since you're comparing multiple unsupported technologies, but my guess is that the egg install injects the library much earlier on sys.path, which avoids most of the relatively expensive file system scan.
History
Date User Action Args
2022-04-11 14:58:52adminsetgithub: 75736
2017-10-01 14:30:29steve.dowersetmessages: + msg303462
2017-10-01 12:21:26Safihresetmessages: + msg303459
2017-09-30 23:33:01steve.dowersetstatus: open -> closed
resolution: third party
messages: + msg303435
2017-09-30 16:30:32Safihresetmessages: + msg303414
2017-09-30 14:45:48skrahsetnosy: + skrah
messages: + msg303412
2017-09-30 05:26:45Safihresetresolution: third party -> (no value)
2017-09-30 05:25:42Safihresetstatus: closed -> open

messages: + msg303394
2017-09-29 23:05:46steve.dowersetstatus: open -> closed
resolution: third party
messages: + msg303370

stage: resolved
2017-09-29 22:23:36terry.reedysetnosy: + terry.reedy
messages: + msg303363
2017-09-22 20:24:22Safihrecreate