classification
Title: Can not import modules if sys.prefix contains DELIM
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, ced, eric.snow, eryksun, ncoghlan, paul.moore, r.david.murray, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords:

Created on 2017-08-15 12:29 by ced, last changed 2019-09-26 00:55 by vstinner. This issue is now closed.

Messages (17)
msg300293 - (view) Author: Cédric Krier (ced) * Date: 2017-08-15 12:29
If the path on which Python is installed contains the DELIM, the resulted sys.path is split.
I think maybe there should be a escape mechanism for the PYTHONPATH.
msg300294 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-15 13:00
By DELIM, you mean the shell ':'?  As far as I've been able to determine there is no way defined in posix to escape that character.  If you can find one, please let us know.  (I think the same is true for the Windows semicolon but I'm not sure.)
msg300295 - (view) Author: Cédric Krier (ced) * Date: 2017-08-15 13:15
Yes I mean ':' for posix.
Indeed I do not known a posix way to escape the colon (it was discussed here: https://stackoverflow.com/questions/14661373/how-to-escape-colon-in-path-on-unix).
But it does not mean that the method makepathobject could not support one. I'm thinking about path single-quoted.
msg300296 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-15 13:44
I'm not sure there is anything we should do here, then, because we are conforming to the posix parsing for PATH in our PYTHONPATH implementation.

I think if you want to pursue this further you should take it to the python-ideas mailing list.  I'm going to close the issue, but you can reopen it if python-ideas thinks it is worth doing something (and there is something we can reasonably do :)
msg300302 - (view) Author: Cédric Krier (ced) * Date: 2017-08-15 17:12
A last comment, I do not think it is an issue to follow posix way to parse PATH. But for me, the problem is that Python adds without sanitation the sys.prefix to the PYTHONPATH. So I think internally Python should not use PATH notation to extend the PYTHONPATH because it is not a robust notation.
msg300304 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-15 17:32
You mean to create the entries on sys.path that do not come from the PYTHONPATH?
msg300307 - (view) Author: Cédric Krier (ced) * Date: 2017-08-15 18:20
On 2017-08-15 17:32, R. David Murray wrote:
> You mean to create the entries on sys.path that do not come from the PYTHONPATH?

Yes because such path could contain ':'.
msg300308 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-15 18:30
OK, that seems reasonable to me.  I'll reopen the issue.  Assuming other developers agree that this should be changed, I'm not sure if it will qualify as a bug or an enhancement, so I'm leaving versions unselected for now :)
msg300329 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-16 07:13
I'm marking this as Python 3.7 only, not because I don't think it's a bug, but because getpath.c is a maintainability nightmare and I strongly prefer we avoid going anywhere near it in maintenance releases :)

Targeting Python 3.7+ also means we may be able to take advantage of the initial phase of PEP 432 changes and look at switching getpath.c to use higher level CPython data structures (specifically list objects) internally.

I've also added the Windows folks to the nosy list to check if Windows might have a similar problem with semi-colons appearing in sys.prefix.
msg300360 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-08-16 13:13
We don't. Semicolons are not valid path characters, so you can't have them in a single path (unless you're deliberately trying to break your entire machine...)
msg300361 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-16 13:24
Is this post wrong then?:

   https://superuser.com/questions/584870/how-can-i-add-a-folder-containing-a-semicolon-to-the-windows-path

("I noticed that the semicolon ; is a valid character for Windows (NTFS) file and directory names.")
msg300370 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-08-16 15:31
> Is this post wrong then?

No, it's not wrong that semicolon is a valid character in Windows NTFS and FAT32 filesystems.

However, the answer by Kevin Edwards that recommends using double quotes needs qualification. It only works in CMD. It doesn't work in PowerShell nor the runtime library functions used by CreateProcess or SearchPath (i.e. RtlDosSearchPath_U[str]) or the loader's LdrpSearchPath function. These all split PATH on semicolon, with no supported escape character or quoting. In fact they retain double quotes in the computed filename when testing for existence, so quoting paths in PATH is wrong in general.
msg300374 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-08-16 16:24
All right.  So the challenge here for windows is: if python is installed on a path that has a semicolon in one of the directory names, is it even possible to populate sys.path such that modules can be imported from the lib directory that is under that path?  (IMO we shouldn't jump through big hoops to support this, but if it can be done and the refactoring Nick is contemplating makes it *relatively* easily, it would probably be good to do, to be consistent).
msg300375 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-08-16 16:41
It's simple to work around using a junction or symbolic link, so is this worth pursuing?
msg300388 - (view) Author: Cédric Krier (ced) * Date: 2017-08-16 21:08
I'm wondering if it could have security implications and be used to fool user by changing the PYTHONPATH.
msg300403 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-17 08:29
If you have access to modify PYTHONPATH at all, you can already shadow almost all standard library modules:

$ PYTHONPATH=/MY_CHOSEN_DIRECTORY python3 -m site
sys.path = [
    '/home/ncoghlan',
    '/MY_CHOSEN_DIRECTORY',
    '/usr/lib64/python36.zip',
    '/usr/lib64/python3.6',
    '/usr/lib64/python3.6/lib-dynload',
    '/home/ncoghlan/.local/lib/python3.6/site-packages',
    '/usr/lib64/python3.6/site-packages',
    '/usr/lib/python3.6/site-packages',
]

The only ones you can't shadow that way are builtin and frozen modules, and any modules that get imported even before PYTHONPATH is processed. So no, this doesn't open up any new attack vectors that weren't already present by design.

As far as whether or not it's worth fixing goes, yes, I think so - one of my original motivations for writing PEP 432 was to allow the use of CPython data structures when calculating the initial value of sys.path, and this is a nice concrete example of a bug arising from the current implementation.
msg353247 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-26 00:55
I consider that the current behavior is correct. They are other ways to put a path which contains ":" into sys.path. I close the issue.
History
Date User Action Args
2019-09-26 00:55:34vstinnersetstatus: open -> closed

nosy: + vstinner
messages: + msg353247

resolution: not a bug
stage: resolved
2017-08-17 08:29:49ncoghlansetmessages: + msg300403
2017-08-16 21:08:22cedsetmessages: + msg300388
2017-08-16 16:41:32eryksunsetmessages: + msg300375
2017-08-16 16:31:27r.david.murraysettitle: Can not import site from sys.prefix containing DELIM -> Can not import modules if sys.prefix contains DELIM
2017-08-16 16:24:30r.david.murraysetmessages: + msg300374
2017-08-16 15:31:15eryksunsetnosy: + eryksun
messages: + msg300370
2017-08-16 13:24:47r.david.murraysetmessages: + msg300361
2017-08-16 13:13:23steve.dowersetmessages: + msg300360
2017-08-16 07:13:39ncoghlansetnosy: + paul.moore, tim.golden, zach.ware, steve.dower

messages: + msg300329
versions: + Python 3.7
2017-08-15 18:30:55r.david.murraysetnosy: + brett.cannon, ncoghlan, eric.snow
2017-08-15 18:30:09r.david.murraysetstatus: closed -> open
resolution: not a bug -> (no value)
messages: + msg300308

stage: resolved -> (no value)
2017-08-15 18:20:06cedsetmessages: + msg300307
2017-08-15 17:32:15r.david.murraysetmessages: + msg300304
2017-08-15 17:12:16cedsetmessages: + msg300302
2017-08-15 13:44:15r.david.murraysetstatus: open -> closed
type: crash -> behavior
messages: + msg300296

resolution: not a bug
stage: resolved
2017-08-15 13:15:45cedsetmessages: + msg300295
2017-08-15 13:00:07r.david.murraysetnosy: + r.david.murray
messages: + msg300294
2017-08-15 12:29:44cedcreate