classification
Title: ResourceReader for FileLoader inconsistently handles path separators
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mekk, barry, brett.cannon, indygreg
Priority: normal Keywords:

Created on 2019-02-26 21:05 by indygreg, last changed 2019-11-12 03:18 by indygreg.

Messages (6)
msg336712 - (view) Author: Gregory Szorc (indygreg) * Date: 2019-02-26 21:05
The implementation of the ResourceReader API for the FileLoader class in importlib/_bootstrap_external.py is inconsistent with regards to handling of path separators.

Specifically, "is_resource()" returns False if "resource" has a path separator. But "open_resource()" will happily open resources containing a path separator.

I would think the two would agree about whether a path with separators is a resource or not. The documentation at https://docs.python.org/3.7/library/importlib.html#importlib.abc.ResourceReader implies that resources in subdirectories should not be allowed.

One can easily demonstrate this behavior oddity with Mercurial:

(Pdb) p sys.modules['mercurial'].__spec__.loader.get_resource_reader('mercurial').open_resource('help/config.txt')
<_io.FileIO name='/home/gps/src/hg/mercurial/help/config.txt' mode='rb' closefd=True>
(Pdb) p sys.modules['mercurial'].__spec__.loader.get_resource_reader('mercurial').is_resource('help/config.txt')
False

The behavior has been present since the functionality was added (https://github.com/python/cpython/pull/5168).
msg336723 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2019-02-26 23:45
On Feb 26, 2019, at 13:05, Gregory Szorc <report@bugs.python.org> wrote:
> 
> I would think the two would agree about whether a path with separators is a resource or not. The documentation at https://docs.python.org/3.7/library/importlib.html#importlib.abc.ResourceReader implies that resources in subdirectories should not be allowed.

Historical context is in the standalone tracker:

https://gitlab.com/python-devs/importlib_resources/issues/58

Clearly, with the current definition of the API, open_resource() should not allow slashes.  Please follow up on the above link if you have opinions about changing the behavior.  We would have an opportunity to relax that constraint in Python 3.8, if it’s something we want to do.
msg356361 - (view) Author: Marcin Kasperski (Mekk) Date: 2019-11-11 14:13
Hmm, I noticed this but accidentally and tried to port https://github.com/pypa/setuptools/issues/1635 to new api. Well:

>>> import multiprocessing
>>> import sys
>>> reader = sys.modules['multiprocessing'].__spec__.loader.get_resource_reader('multiprocessing')
>>> reader.open_resource('../../../../etc/passwd')
<_io.FileIO name='/usr/lib/python3.7/multiprocessing/../../../../etc/passwd' mode='rb' closefd=True>

I suppose this is the case which deserve some thought (originally I faced it when some webapp used pkg_resources to provide static files and used resource api as a way to validate urls impacted by external input).

Tested on python 3.7.3, on Ubuntu 19.04.
msg356378 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2019-11-11 18:24
One simple restriction would be to disallow relative paths outside of the resource anchor location.
msg356408 - (view) Author: Gregory Szorc (indygreg) * Date: 2019-11-12 03:12
I think disallowing relative paths that are parents of the current anchor point is a reasonable restriction and acceptable backwards incompatible behavior.

Disallowing all relative paths with slashes is a larger issue. I would support that if designing/implementing things today as it simplifies things greatly. But since an implementation allowing slashes has shipped in 3.7 and 3.8, I'm not sure if the backwards incompatibility could be stomached, so I'm not going to advocate for it.
msg356409 - (view) Author: Gregory Szorc (indygreg) * Date: 2019-11-12 03:18
I just noticed that there is a parallel discussion ongoing in https://gitlab.com/python-devs/importlib_resources/issues/58.
History
Date User Action Args
2019-11-12 03:18:08indygregsetmessages: + msg356409
2019-11-12 03:12:45indygregsetmessages: + msg356408
2019-11-11 18:24:24barrysetmessages: + msg356378
2019-11-11 14:13:26Mekksetnosy: + Mekk
messages: + msg356361
2019-02-27 22:58:33brett.cannonsetnosy: + brett.cannon
2019-02-26 23:45:24barrysetmessages: + msg336723
2019-02-26 21:05:00indygregcreate