Author vstinner
Recipients Arfrever, amaury.forgeotdarc, benjamin.peterson, brett.cannon, eric.araujo, georg.brandl, r.david.murray, terry.reedy, vstinner
Date 2011-01-19.01:22:03
Message-id <1295400133.63.0.413112936466.issue3080@psf.upfronthosting.co.za>
Content
Here is a work-in-progress patch: issue3080-3.patch. The patch is HUGE and written for Python 3.3.

$ diffstat issue3080-3.patch 
 Doc/c-api/module.rst   |   24 
 Include/import.h       |   73 +
 Include/moduleobject.h |    2 
 Include/pycapsule.h    |    4 
 Modules/zipimport.c    |  272 +++---
 Objects/moduleobject.c |   52 -
 PC/import_nt.c         |   84 +-
 Python/dynload_aix.c   |    2 
 Python/dynload_dl.c    |    2 
 Python/dynload_hpux.c  |    2 
 Python/dynload_next.c  |    4 
 Python/dynload_os2.c   |    2 
 Python/dynload_shlib.c |    2 
 Python/dynload_win.c   |    2 
 Python/import.c        | 1910 +++++++++++++++++++++++++++----------------------
 Python/importdl.c      |   79 +-
 Python/importdl.h      |    2 
 issue3080.py           |   29 
 18 files changed, 1484 insertions(+), 1063 deletions(-)

As expected, most of the work is done in import.c.

Decode the module name earlier and encode it later. Try to manipulate PyUnicodeObject objects instead of char* buffers (so the string length is directly available).
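
To illustrate the "decode early, encode late" idea, here is a minimal Python-level sketch. It is only an analogy, not the C code of the patch: the helper names (decode_name, encode_name, parent_package) are hypothetical, and the codec is fixed to UTF-8 with the surrogateescape error handler only to make the example deterministic (the real code would use the filesystem encoding).

```python
# Hypothetical helpers: an analogy for the approach, not code from the patch.
# Assumption: UTF-8 + surrogateescape stands in for the filesystem encoding.
FS_ENCODING = "utf-8"


def decode_name(raw: bytes) -> str:
    """Decode once, at the boundary; undecodable bytes become lone surrogates."""
    return raw.decode(FS_ENCODING, "surrogateescape")


def encode_name(name: str) -> bytes:
    """Encode as late as possible, only where a byte string is required."""
    return name.encode(FS_ENCODING, "surrogateescape")


def parent_package(name: str) -> str:
    """In between, work on str: length and slicing are per character."""
    head, _, _ = name.rpartition(".")
    return head


raw = b"pkg.caf\xe9"      # ISO-8859-1 bytes, not valid UTF-8
name = decode_name(raw)   # the 0xE9 byte is kept as the surrogate U+DCE9
```

The length of name is known without scanning for a NUL byte, and encode_name(name) gives back the original bytes unchanged.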

Split the huge and very complex find_module() function into 3 functions (find_module(), find_module_filename() and find_module2()) and document them. Drop OS/2 support in find_module() (it could be kept, but it was easier for me to drop it, and the OS/2 maintainer wrote that Python 3 is far from being compatible with OS/2).

The patch adds several functions: PyModule_GetNameObject(), PyImport_ExecCodeModuleUnicode(), PyImport_AddModuleUnicode(), PyImport_ImportFrozenModuleUnicode(), PyModule_NewUnicode(), ...

Use the "U" format to parse a module name, and "%R" to format a module name (to escape surrogate characters and add quotes, instead of "... '%.200s' ...").
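
As a rough Python-level illustration of why "%R" helps: "%R" in PyUnicode_FromFormat() inserts the repr() of the object, which is what %r does in Python string formatting. The function name format_missing and the message text are hypothetical; only the escaping behaviour is the point.

```python
def format_missing(name: str) -> str:
    # %r calls repr(), like "%R" in PyUnicode_FromFormat(): it adds quotes
    # and escapes lone surrogates, so the message is always printable.
    return "No module named %r" % name


# A name with an undecodable byte, kept as a lone surrogate (U+DCE9).
name = b"caf\xe9".decode("utf-8", "surrogateescape")
message = format_missing(name)

# The raw name cannot be encoded strictly, but the formatted message can.
try:
    name.encode("utf-8")
    raw_name_encodable = True
except UnicodeEncodeError:
    raw_name_encodable = False
```

The message contains only ASCII characters, so it can be written to any terminal or log, whatever the locale encoding is.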

PyWin_FindRegisteredModule() is now private. Remove the unused fqname argument from _PyImport_GetDynLoadFunc().

Replace open_exclusive() with fopen(name, "wb") on Windows: is it correct?

TODO:

 - rename xxxobj => xxx to keep the original names and shorten the patch (e.g. I renamed name to nameobj during the transition to detect bugs)
 - catch encoding errors in case_ok()
 - don't encode in case_ok() if case_ok() does nothing (e.g. on Linux)
 - find a better name for find_module2()
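
The case_ok() item about skipping needless work can be sketched in Python as follows. This is a simplified assumption about what such a check does (compare the requested file name against the exact names in the directory), not the C implementation in import.c; the case_insensitive_fs parameter is hypothetical and stands in for the platform test.

```python
import os
import tempfile


def case_ok(directory: str, filename: str, case_insensitive_fs: bool) -> bool:
    """Return True if filename matches a directory entry with the exact case."""
    if not case_insensitive_fs:
        # Nothing to check (e.g. on Linux), so nothing needs to be encoded.
        return True
    # os.listdir() returns the names as stored, with their real case.
    return filename in os.listdir(directory)


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "Spam.py"), "w").close()
    exact = case_ok(tmp, "Spam.py", True)       # same case as on disk
    wrong_case = case_ok(tmp, "spam.py", True)  # differs only by case
    skipped = case_ok(tmp, "spam.py", False)    # check disabled: early return
```

With the early return first, the name is never converted when the check is a no-op, so no encoding error can occur on that path.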

The patch contains a tiny script, issue3080.py, to test the patch using an ISO-8859-1 locale.

I will open a thread on the mailing list (python-dev) to decide if this patch is needed or not. If we agree that this issue should be fixed, I will split the patch into smaller parts and start a review process.