Author brett.cannon
Recipients Trundle, benjamin.peterson, brett.cannon, eric.snow, haypo, merwok, ncoghlan
Date 2012-02-24.01:06:38
OK, so I have now streamlined the common case so that only C code runs when the module is already in sys.modules, level == 0, and fromlist is empty.
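As a rough illustration of that shortcut, here is a pure-Python sketch of the condition being tested. The function name import_with_fast_path is hypothetical and this is not the actual C code from the patch; it only mirrors the three checks described above and otherwise defers to builtins.__import__().

```python
import builtins
import sys


def import_with_fast_path(name, globals=None, locals=None, fromlist=(), level=0):
    """Hypothetical illustration of the shortcut; not the real C code."""
    if level == 0 and not fromlist and name in sys.modules:
        # A plain "import a.b.c" binds the top-level package "a",
        # so that is what __import__ returns when fromlist is empty.
        top_level = sys.modules.get(name.partition(".")[0])
        if top_level is not None:
            return top_level
    # Every other case goes through the full import machinery.
    return builtins.__import__(name, globals, locals, fromlist, level)


import os.path  # make sure "os" and "os.path" are cached in sys.modules

print(import_with_fast_path("os.path") is os)
```

With a non-empty fromlist the shortcut is skipped and the call falls through to the normal machinery, which is the part that still involves Python code.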

Using the normal_startup benchmark, bootstrapped importlib is 1.12x slower. A fast run of startup_nosite puts it at 1.2x slower.

Using importlib.test.benchmark, the results are:

Comparing new vs. old (imports per second; the percentage is new/old)

sys.modules                       : 331,097 vs. 383,180  (86.407694%)
Built-in module                   :  14,906 vs.  33,071  (45.072722%)
Source writing bytecode: small    :   1,244 vs.   2,128  (58.458647%)
Source w/o bytecode: small        :   2,420 vs.   4,784  (50.585284%)
Source w/ bytecode: small         :   2,757 vs.   5,221  (52.805976%)
Source writing bytecode: tabnanny :     129 vs.     133  (96.992481%)
Source w/o bytecode: tabnanny     :     144 vs.     147  (97.959184%)
Source w/ bytecode: tabnanny      :   1,004 vs.   1,120  (89.642857%)
Source writing bytecode: decimal  :       9 vs.       9  (100.000000%)
Source w/o bytecode: decimal      :       9 vs.       9  (100.000000%)
Source w/ bytecode: decimal       :      96 vs.      98  (97.959184%)

Where does that leave us? Well, for medium-sized modules and up, the I/O involved, along with parsing and execution, clearly outweighs importlib's overhead to the point that the overhead is negligible. Importing from sys.modules is also now fast enough (I don't care what people say; being able to import from sys.modules at 0.0026 ms vs. 0.003 ms is not important in the grand scheme of things).
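For reference, the per-import times quoted above follow directly from the sys.modules row of the benchmark, assuming those figures are imports per second. The helper name ms_per_import is only for illustration.

```python
def ms_per_import(imports_per_second):
    """Convert a throughput figure into milliseconds per import."""
    return 1000.0 / imports_per_second


new_rate = 331_097  # sys.modules row, new
old_rate = 383_180  # sys.modules row, old

print(f"old: {ms_per_import(old_rate):.4f} ms per import")
print(f"new: {ms_per_import(new_rate):.4f} ms per import")
print(f"new/old throughput: {new_rate / old_rate:.2%}")
```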

Built-in modules and the like could be faster, but (a) there are only so many of them and they end up in sys.modules quickly anyway, and (b) even at their current speed they are still fast. So the real hold-up is small modules and whether they matter. The tabnanny benchmark represents the median module size in the stdlib, so roughly half of the modules will see no meaningful slowdown while the other half will see some degree of slowdown.

The most critical thing is to stop using _gcd_import() in places where the import function is currently passed in, and to use builtins.__import__() instead so as to pick up the C speed-ups (I measured to confirm that there is a speed-up in every case). After that, I think unrolling the functions mentioned below would help speed things up; beyond that, it is time to profile the Python code to see where the inefficiencies lie.
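One way to check this kind of difference on a given interpreter is a small timeit comparison for a module that is already cached. This is only a sketch: it uses importlib.import_module() as a stand-in for the pure-Python entry point (on the assumption that it is a thin wrapper around _gcd_import() in your CPython), and the helper name time_cached_import is hypothetical. Results vary by version and machine, so no numbers are claimed here.

```python
import builtins
import importlib
import timeit

import json  # noqa: F401  (ensures "json" is already in sys.modules)


def time_cached_import(func, number=100_000):
    """Return the average seconds per call of func("json")."""
    return timeit.timeit(lambda: func("json"), number=number) / number


builtin_s = time_cached_import(builtins.__import__)
importlib_s = time_cached_import(importlib.import_module)

print(f"builtins.__import__:     {builtin_s * 1000:.6f} ms per call")
print(f"importlib.import_module: {importlib_s * 1000:.6f} ms per call")
```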

Some more C unrolling could be done. The case where fromlist is empty could be written in C. Both _calc___package__() and _resolve_name() could probably be written in C if someone cared.
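To give a sense of how small these two helpers are, here is a sketch of the logic they implement, written from memory rather than copied from importlib._bootstrap. The names calc_package and resolve_name are illustrative, and details such as warnings and error messages may differ from the real functions.

```python
def calc_package(module_globals):
    """Work out the package a relative import should be resolved against."""
    package = module_globals.get("__package__")
    if package is None:
        package = module_globals["__name__"]
        # A plain module (no __path__) lives inside its parent package.
        if "__path__" not in module_globals:
            package = package.rpartition(".")[0]
    return package


def resolve_name(name, package, level):
    """Turn a relative name such as "..c" (level=2) into an absolute one."""
    bits = package.rsplit(".", level - 1)
    if len(bits) < level:
        raise ValueError("attempted relative import beyond top-level package")
    base = bits[0]
    return f"{base}.{name}" if name else base


print(calc_package({"__name__": "a.b.c"}))
print(resolve_name("c", "a.b", 2))
```

Both are short string manipulations with no I/O, which is why they look like reasonable candidates for a C rewrite.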
Date                 User          Action  Args
2012-02-24 01:06:39  brett.cannon  set     recipients: + brett.cannon, ncoghlan, haypo, benjamin.peterson, merwok, Trundle, eric.snow
2012-02-24 01:06:39  brett.cannon  set     messageid: <>
2012-02-24 01:06:38  brett.cannon  link    issue2377 messages
2012-02-24 01:06:38  brett.cannon  create