
classification
Title: urllib2: resolves extremely slow import (of "everything")
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: georg.brandl, jimjjewett, kxroberto
Priority: normal Keywords: patch

Created on 2006-05-09 15:59 by kxroberto, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
urllib2_py24_fastload.patch kxroberto, 2006-05-09 15:59
Messages (4)
msg50228 - (view) Author: kxroberto (kxroberto) Date: 2006-05-09 15:59
This supersedes the old patch #1053150 (for an older
Python; it was stopped: "Jeremy doesn't like the idea"),
which aimed to import the expensive modules behind
urllib2 late.

I'm recommending this again, because the situation has
become almost unacceptable in the meantime.

In Py24, simply importing the original urllib2 costs up
to a second on my slower machines. The startup time of
some of my bigger apps/scripts goes mainly to importing
urllib2. More than half of that time goes into importing
cookielib (according to profiler runs). It's almost
unusable in CGI scripts now.
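
A rough way to reproduce this kind of measurement is sketched below. It is written for Python 3, where urllib2 lives on as urllib.request and cookielib as http.cookiejar; the helper name time_import is hypothetical and not part of the patch. The timing only reflects the real import cost when the module has not been loaded yet, which is why the helper also reports whether it was already cached.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import module_name and return (seconds_elapsed, was_already_loaded).

    Hypothetical helper: if the module was already in sys.modules, the
    elapsed time only measures a cache lookup, not the real import cost.
    """
    already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, already_loaded


if __name__ == "__main__":
    # Python 3 names for the modules discussed in this issue.
    for name in ("http.cookiejar", "urllib.request"):
        seconds, cached = time_import(name)
        print("%-16s %.4f s (already loaded: %s)" % (name, seconds, cached))
```

On Python 3.7 and later, running the interpreter with -X importtime gives a per-module breakdown without any helper code.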

New modules have been added to urllib2 in the meantime,
and worst of all, cookielib was inserted into urllib2 in
the same old style: "import everything at the top of the
file in a kind of C-#include manner".

Python offers excellent dynamic modularization of code.
That should be exploited for an expensive virtualization
module like urllib2. There are usually only very few
locations where the sub-modules are referenced.
This patch also makes it possible to strip off
unnecessary modules (down to _MozillaCookieJar!) for
cx_freeze/py2exe distribution.
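
The general idea of late importing can be sketched as follows. This is a minimal illustration, not code from the patch: the function name guess_content_type is hypothetical, and mimetypes is used only because it is one of the sub-modules urllib2 pulls in. The import statement moves from the top of the file into the one function that needs it, so the cost is paid on first use instead of at startup.

```python
def guess_content_type(filename):
    """Guess a MIME type, importing mimetypes only when first called.

    Hypothetical example of a deferred import: the first call pays the
    import cost, later calls find the module cached in sys.modules.
    """
    import mimetypes  # deferred import, not executed at module load time
    return mimetypes.guess_type(filename)[0]


if __name__ == "__main__":
    print(guess_content_type("index.html"))
```

Programs that never call the function never import the module, which is also what lets freezing tools leave it out.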

(I have had this patch on my list for a long time, and
apply it regularly after each Python installation.)

--

As a side effect of this import-all practice, a lazy
cookielib dependency crept into the normal Request
constructor code:
"origin_req_host = cookielib.request_host(self)"

I'd recommend copying/moving this simple tool function
request_host into urllib2 in order to resolve the
cookielib dependency completely (not done so far in
the patch).
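
To show how small such a helper can be, here is a self-contained Python 3 sketch of a request_host-style function. It is an assumption about the behaviour, not a copy of the cookielib code: the real function takes a request object, while this version takes the URL and an optional Host header value as plain strings so that it runs on its own.

```python
import re
from urllib.parse import urlparse

# Matches a trailing ":port" on a host name.
_cut_port_re = re.compile(r":\d+$")


def request_host(full_url, host_header=""):
    """Return the lower-cased host of a request URL, without the port.

    Sketch only: the (full_url, host_header) signature is an assumption
    made for illustration. Falls back to host_header when the URL has no
    host part.
    """
    host = urlparse(full_url)[1]  # index 1 is the netloc component
    if host == "":
        host = host_header
    host = _cut_port_re.sub("", host, 1)  # drop the port, if present
    return host.lower()


if __name__ == "__main__":
    print(request_host("http://WWW.Example.com:8080/path"))
```

A function of this size needs only re and the URL parser, so keeping a copy in urllib2 would avoid loading cookielib just to construct a Request.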



-robert






msg50229 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-05-17 15:17
Fixed in rev. 46029.
msg50230 - (view) Author: Jim Jewett (jimjjewett) Date: 2006-05-17 22:44
Note that lazy importing can interact very badly with 
threads.

Why did you change the signature of OpenerDirector._open?
The base class ignores the data, but subclasses may not.

Removing the SSL guard "if hasattr(httplib, 'HTTPS')" is 
questionable, since the ssl library is external and must be 
compiled separately, and therefore may not exist on some 
platforms even without other source customizations.
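
For illustration, the kind of guard in question can be sketched like this in Python 3, where httplib is http.client and the attribute to test is HTTPSConnection rather than HTTPS (that mapping, and the names HAVE_HTTPS and supported_schemes, are assumptions made for this example). The class only exists when the interpreter was built with SSL support, so code that registers HTTPS handling checks for it first.

```python
import http.client

# True only when http.client could import ssl and define HTTPSConnection.
HAVE_HTTPS = hasattr(http.client, "HTTPSConnection")


def supported_schemes():
    """Return the URL schemes this interpreter can open (hypothetical helper)."""
    schemes = ["http"]
    if HAVE_HTTPS:
        # Only advertise https when the SSL-backed class is available.
        schemes.append("https")
    return schemes


if __name__ == "__main__":
    print(supported_schemes())
```

Without the check, a build lacking SSL would fail with an AttributeError at the point where the HTTPS class is referenced.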

msg50231 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-05-18 05:54
Jim: Note that I didn't apply the patch from here, but only
added lazy-loading of ftplib, cookielib and mimetypes.
History
Date User Action Args
2022-04-11 14:56:17 admin set github: 43339
2006-05-09 15:59:48 kxroberto create