classification
Title: crash and SystemError in case of a bad zipimport._zip_directory_cache
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 3.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Oren Milman, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-09-20 13:33 by Oren Milman, last changed 2018-09-19 07:22 by serhiy.storchaka. This issue is now closed.

Messages (3)
msg302612 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-20 13:33
The following code causes the interpreter to crash (in case 'foo.zip' exists):

import zipimport
zipimport._zip_directory_cache['foo.zip'] = None
importer = zipimport.zipimporter('foo.zip')
importer.find_loader('bar')

This is because zipimport_zipimporter___init___impl() (in Modules/zipimport.c)
looks for the zipfile in _zip_directory_cache, and in case it is found, stores
its item in the new ZipImporter. Later, check_is_directory() assumes the stored
item is a dictionary, and passes it to PyDict_Contains(), which crashes.


Similarly, the following code causes a 'SystemError: new style getargs format
but argument is not a tuple':

import zipimport
importer = zipimport.zipimporter('foo.zip')
zipimport._zip_directory_cache['foo.zip']['foo\\__init__.py'] = None
importer.load_module('foo')

The same would happen if we replace the last line with
"importer.get_data('foo\\__init__.py')" or "importer.get_source('foo')".

This is because various parts of the code assume that the zipfile's item in
_zip_directory_cache is a dictionary, and that the module's item in this
dictionary is a tuple, which is eventually passed to get_data(), which passes
it to PyArg_ParseTuple(), which raises the SystemError.


Actually, I should have found this as part of #28261. ISTM that the fix for
this issue can easily also fix the issue described in #28261, by checking in
get_data() whether toc_entry is an 8-tuple.


Also, PyDict_GetItem() suppresses all errors, so in some places, e.g. in 
get_module_info(), a bad _zip_directory_cache would probably just be ignored.
But ISTM that we should raise an error saying 'invalid _zip_directory_cache'
in any place where _zip_directory_cache is accessed (in case it is invalid).

What do you think?
msg303166 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-27 17:44
Yet another code that causes a SystemError:
import zipimport
importer = zipimport.zipimporter('foo.zip')
tup_as_list = list(zipimport._zip_directory_cache['foo.zip']['foo\\__init__.py'])
tup_as_list[0] = None
zipimport._zip_directory_cache['foo.zip']['foo\\__init__.py'] = tuple(tup_as_list)
importer.load_module('foo')

This could be fixed by checking in get_code_from_data() whether modpath is a string.
msg325720 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-19 07:22
zipimport have been rewritten in pure Python (issue25711). All crashes are gone with the C code.
History
Date User Action Args
2018-09-19 07:22:32serhiy.storchakasetstatus: open -> closed

nosy: + serhiy.storchaka
messages: + msg325720

resolution: out of date
stage: resolved
2017-09-27 17:44:14Oren Milmansetmessages: + msg303166
2017-09-20 13:33:29Oren Milmancreate