
Author brett.cannon
Recipients brett.cannon, pitrou
Date 2012-02-17.20:24:21
Message-id <CAP1=2W5Soph9v775fW+YHRQiD7wETJ8TPCoJMv=1hgt7-Dr0Hg@mail.gmail.com>
In-reply-to <1329506921.3678.22.camel@localhost.localdomain>
Content
On Fri, Feb 17, 2012 at 14:31, Antoine Pitrou <report@bugs.python.org> wrote:

>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> > Why pre-calculate everything? In the most common case any single
> > module will be imported once, if at all. And once it is imported it
> > will get cached in sys.modules, alleviating the need to hit the finder
> > again. So from a performance standpoint wouldn't it be better not to
> > do all of the pre-calculation and instead do that as needed assuming
> > that sys.modules will shield the finder from having to do repetitive
> > things like figuring out what loader is needed?
>
> I figured it would avoid repetitive tests for all 10 suffixes.
> That said, I have now tried the alternative: find_module() is around 50%
> slower, but building the cache is 10x faster. Perhaps this is a winner.
>

What is the time increase for find_module() vs. the speed-up of building
the cache? I.e. how many imports are needed before doing the full
calculation is a benefit? And would it make sense to have a hybrid of
caching the contents for fast start-up but then caching full details after
a successful find? That would mean no work is ever simply tossed out and
forgotten.
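
To make the hybrid idea concrete, here is a minimal sketch. It is not the code from either patch: the class name, the suffix tuple, and the attribute names are all made up for illustration, and a real finder would return a loader rather than a path. The directory listing is the cheap cache built on first use; full details are only stored after a successful find.

```python
import os

# Hypothetical suffix list; the real set of suffixes is platform-dependent.
SUFFIXES = (".py", ".pyc", ".so")


class HybridDirCache:
    """Illustrative only: not the finder from either patch."""

    def __init__(self, path):
        self.path = path
        self._entries = None   # cheap cache: raw directory listing
        self._resolved = {}    # full details, stored after a successful find

    def _listing(self):
        # Build the cheap cache on first use instead of pre-calculating
        # details for every file in the directory.
        if self._entries is None:
            self._entries = set(os.listdir(self.path))
        return self._entries

    def find(self, name):
        # A previous successful find is answered without re-testing suffixes.
        if name in self._resolved:
            return self._resolved[name]
        entries = self._listing()
        for suffix in SUFFIXES:
            filename = name + suffix
            if filename in entries:
                found = os.path.join(self.path, filename)
                self._resolved[name] = found
                return found
        return None
```

Misses are deliberately not remembered here, so a failed lookup pays for the suffix tests each time; whether that matters depends on how long sys.path is.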

> It would depend on the situation (short or long sys.path, few or many
> imports, etc.). Perhaps you can try both patches on your bootstrap repo?
>

Yep, that's not hard (and it will only get faster as I replace the bodies
of __import__() and _gcd_import() with C code so that sys.modules lookups
are C-fast again). The question is what to benchmark against. I should
probably get the standard benchmarks up and running and see how those are
affected (especially the start-up ones).
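
Once both patches have been timed, the "how many imports" question above reduces to a break-even calculation. A rough sketch follows; the function name is made up, and the numbers in the example are placeholders in arbitrary units that only mirror the ratios quoted in this thread (cache build 10x faster, find_module() around 50% slower), not measurements.

```python
def break_even_imports(eager_build, lazy_build, eager_find, lazy_find):
    """Return the number of find_module() calls per directory at which
    full pre-calculation starts to pay off, or None if it never does.

    All arguments are costs in the same (arbitrary) time unit.
    """
    extra_build = eager_build - lazy_build     # extra up-front cost of pre-calculating
    saved_per_find = lazy_find - eager_find    # what each later lookup saves
    if saved_per_find <= 0:
        return None
    return max(extra_build, 0) / saved_per_find


# Placeholder costs: eager build 10 units, lazy build 1 unit,
# eager lookup 1.0 unit, lazy lookup 1.5 units.
print(break_even_imports(10.0, 1.0, 1.0, 1.5))  # 18.0
```

With those placeholder ratios, pre-calculation only wins once a directory's cache serves more than 18 lookups; the real figure depends on the measured absolute costs.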

>
> > Plus if the finder gets its cache invalidated frequently it will
> > simply be wasting its time.
>
> Well, in real-world situations I don't think the cache will ever get
> invalidated: because imports are mostly done at startup, and because
> invalidating the cache means you are installing new libraries or
> updating existing ones while a running program is about to import
> something.
>

I agree, but it was just something to consider.
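
For reference, one way invalidation could work is sketched below. The thread does not say how either patch detects a stale cache, so the assumption here is that the directory's mtime is compared on each lookup; the class name is made up.

```python
import os


class MtimeCheckedListing:
    """Illustrative only: re-list a directory when its mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None      # mtime seen when the listing was last built
        self._entries = set()

    def entries(self):
        # One stat() per call; the listing is rebuilt only if the
        # directory's modification time differs from the cached one.
        mtime = os.stat(self.path).st_mtime
        if mtime != self._mtime:
            self._entries = set(os.listdir(self.path))
            self._mtime = mtime
        return self._entries
```

In the common case described above the mtime never changes, so the cost is one stat() per lookup. A change made within the filesystem's mtime granularity could be missed by a check like this.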

>
> > Otherwise it's good to know three of us now have independently come up
> > with fundamentally the same idea for speeding up imports. =)
>
> Yup :)
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue14043>
> _______________________________________
>
History
Date                 User          Action  Args
2012-02-17 20:24:22  brett.cannon  set     recipients: + brett.cannon, pitrou
2012-02-17 20:24:21  brett.cannon  link    issue14043 messages
2012-02-17 20:24:21  brett.cannon  create