classification
Title: Avoid adding an empty directory to sys.path when running a module with `-m`
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Viv, brett.cannon, eric.snow, eryksun, jwilk, ncoghlan, ned.deily, njs, vstinner, ztane
Priority: normal Keywords: patch

Created on 2018-03-12 11:57 by ztane, last changed 2018-09-26 16:09 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 6231 merged ncoghlan, 2018-03-25 10:39
PR 6237 merged ncoghlan, 2018-03-25 13:16
PR 6236 merged ncoghlan, 2018-03-25 13:43
PR 9589 merged vstinner, 2018-09-26 15:05
Messages (30)
msg313641 - (view) Author: Antti Haapala (ztane) * Date: 2018-03-12 11:57
I think this is a really stupid security bug. Running a module with `-mmodule` seems to add '' as a path in sys.path, and in front. This is doubly wrong, because '' will stand for whatever the current working directory might happen to be at the time of the *subsequent import statements*, i.e. it is far worse than https://bugs.python.org/issue16202

I.e. whereas python3 /usr/lib/module.py wouldn't do that, python3 -mmodule would make it so that following a chdirs in code, imports would be executed from arbitrary locations. Verified on MacOS X, Ubuntu 17.10, using variety of Python versions up to 3.7.
msg313966 - (view) Author: Jakub Wilk (jwilk) Date: 2018-03-16 18:50
FWIW, this behavior is documented:

https://docs.python.org/3/using/cmdline.html#cmdoption-m
"As with the -c option, the current directory will be added to the start of sys.path."

With the -c option, at least you could easily remove the sys.path element yourself:

   python -c 'import sys; sys.path.remove(""); ...'

(This works, because sys is always a builtin module, so it won't be imported from cwd.)

I don't see any obvious way to make "python -m foo" secure in untrusted cwd, though.
The best I could come up with is:

   python -c 'import sys; sys.path.remove(""); import runpy; runpy._run_module_as_main("foo")'

which is quite insane.
msg314020 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-18 05:31
This isn't considered a security issue, as running "python3" interactively behaves in exactly the same way (i.e. tracking changes to the current working directory for the duration of the session), and running "python3 script.py" adds the full path to the current directory.

In all cases, the expectation is that end users will at least enable isolated mode if they don't want to risk importing arbitrary code from user controlled directories.

    $ echo "print('Hello')" > foo.py
    $ python3 -m foo
    Hello
    $ python3 -Im foo
    /usr/bin/python3: No module named foo

However, I'm flagging this as an enhancement request for 3.8+ (with a reworded issue title), as the non-isolated `-m` switch algorithm for sys.path[0] calculation could be made more robust as follows:

1. Start out with "os.getcwd()" rather than the empty string
2. Once `__main__.__file__` has been calculated, delete sys.path[0] if __main__ was found somewhere else

A potentially related enhancement would be to modify directory & zipfile execution to only look for `__main__.py` in `sys.path[0]` rather than searching the whole of `sys.path` (which is what currently happens).
msg314021 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-18 05:44
Also, a small upstream community interaction tip: if you want people to seriously consider your requests for changes in default behaviour (which inevitably risk backwards compatibility breaks), don't start out by insulting them.

Python's defaults are currently set up for a *trusted personal automation tool*, where the person writing the code is also the person running it.

By starting out with an insult like "I think this is a really stupid security bug", you're actually saying "I know very little about Python's history, or the audiences it was originally written to serve, and instead of politely suggesting an alternative behaviour that would be more robust in the face of system configuration errors, I'm going to try to use shame, guilt, and embarrassment to get people to do work for me". That kind of behaviour *isn't* a good way to get your issues addressed, but it *is* a good way to encourage people to decide that volunteering as an open source maintainer isn't worth the associated hassles.

The opening insult added nothing to your issue report, and could more productively have been replaced with an explanation of the expectations you had of the default behaviour, how you came by those expectations, and how the current behaviour failed to meet them.
msg314023 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-18 05:58
I've also separated out https://bugs.python.org/issue33095 (a docs issue about making isolated mode more discoverable) based on the jwilk's comment that it wasn't clear how to disable the default "add the current directory to sys.path" behaviour.
msg314025 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-03-18 06:19
Whoa, wait, what?

I agree that the original post is not as diplomatic as it could be, but my reaction to learning about this just now is also shock and confusion, so I guess I can sympathize with the OP a bit...

The reason I'm surprised is that -- while this probably wasn't fully anticipated when -m was designed -- it's turned out to be a bit of a meme to replace calls like 'pip ...' with 'python -m pip ...', or 'virtualenv ...' with 'python -m virtualenv ...', etc. I thought these were generally pretty much equivalent. I definitely did *not* know that running 'python -m pip' could lead to executing arbitrary code from the cwd, and I'm sure I've run it inside e.g. random git checkouts. If someone had tried to spearphish me with this they would totally have succeeded. (I hope they haven't?)

If you want to run a file in the current directory, is there any advantage to doing 'python -m myscript' instead of 'python myscript.py'? Could we declare that the latter is the One Obvious Way and remove support for the former entirely?
msg314028 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-18 07:04
"python -m mypkg.myscript" does the right thing as far as local packages are concerned, whereas "python -m mypkg/myscript.py" will set you up for double-import bugs.

Note that you can almost always trigger arbitrary non-obvious code execution just by writing sitecustomize.py to the current directory, and any package you install can add a "<installation-site-packages>/arbitrary-code.pth" or "<user-site-packages>/arbitrary-code.pth" file that gets run at startup (setuptools has long relied on this to implement various features).

Opting in to isolated mode turns *all* of those features off by saying "I'm expecting to run system code only here, not custom user code".
msg314036 - (view) Author: Jakub Wilk (jwilk) Date: 2018-03-18 10:46
-I implies -s, which is not something I want.
msg314040 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-18 13:21
https://bugs.python.org/issue13475 is the existing enhancement request to expose sys.path[0] management independently of the other execution isolation features.
msg314081 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-03-19 08:16
@ncoghlan: The comparison I'm worried about is specifically this one: IIUC, right now it's safe to run 'pip --version' in an arbitrary directory, but it's not safe to run 'python -m pip --version' in an arbitrary directory. Am I wrong? (I actually couldn't convince either version to execute arbitrary code in 2 minutes of trying, but that's my understanding of the discussion so far.)

If that's correct, then I don't think this is like... the hugest security bug ever, but... I also think that it's irresponsible for e.g. packaging.python.org to be recommending people run 'python -m pip' the way it does now, and we need to find some way to change things so our beginner docs aren't triggering arbitrary code execution in a rare and subtle case.

We could add a bunch of warnings to packaging.python.org, explaining about how the code execution can be triggered, but that seems unsatisfactory given how those docs are targeted at beginners, plus there are so many places around the internet that recommend 'python -m pip' we'd never find them all.

We could update all those docs to instead recommend 'python -Im pip', but that has the same problems: no-one will understand, and people won't do it.

We could stop recommending 'python -m pip' entirely, but that runs into all the problems that have motivated this in the first place.

So I think we should find a way to make it so 'python -m pip' *never* executes code from the current directory (modulo the usual caveats, like the user explicitly setting PYTHONPATH to an insecure value etc.).

If 'python -m mypkg.myscript' is important, maybe we can make it 'PYTHONPATH=. python -m mypkg.myscript', or 'python -M mypkg.myscript', or making 'python mypkg/myscript.py' DTRT, or... something?
msg314088 - (view) Author: Antti Haapala (ztane) * Date: 2018-03-19 09:06
Took 2 seconds.

% sudo python3 -mpip --version
hello world
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
    import locale
  File "/usr/lib/python3.6/locale.py", line 180, in <module>
    _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
AttributeError: module 're' has no attribute 'compile'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 53, in apport_excepthook
    if not enabled():
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 28, in enabled
    return re.search(r'^\s*enabled\s*=\s*0\s*$', conf, re.M) is None
AttributeError: module 're' has no attribute 'search'

Original exception was:
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
    import locale
  File "/usr/lib/python3.6/locale.py", line 180, in <module>
    _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
AttributeError: module 're' has no attribute 'compile'

Same for `python -mhttp.server`, say. 

----

I'd prefer there be a change that the default be always safe from some version on, so that the REPL can do whatever it does, but `-m` etc probably shouldn't even have neither the *initial* current directory *nor* the current current directory in the path unless the interactive session is requested. I am not worried about the garbage that the user would have installed in their own directories breaking things.
msg314089 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-03-19 09:14
Ah, yeah, I see:

~/t$ echo 'print("hi")' > re.py
~/t$ pip --version
pip 9.0.1 from /home/njs/.user-python3.5-64bit/local/lib/python3.5/site-packages (python 3.5)
~/t$ python -m pip --version
hi
Traceback (most recent call last):
[...]

But if I create a sitecustomize.py or an io.py in the current directory (knowing that 'import io' happens implicitly startup), then those *don't* seem to get picked up by 'python -m pip' or 'python -c ...' or plain 'python'. I guess the cwd doesn't get added to sys.path until after initial bootstrapping is finished.
msg314092 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-19 12:54
It occurs to me that there may be some additional unshared context here: the way `python -m pip` searches for the module to execute is much closer to the way Windows searches for a command like `pip` (i.e. current directory first) than it is to the way *nix systems search for it (i.e. if the current directory isn't on PATH it must be specified explicitly as "./pip"). If you spend a lot of time on Windows systems, or working with Windows users, it becomes a habit to assume that folks aren't going to expect it to be safe to run arbitrary commands from arbitrary directories.

This behaviour means that if you want to intercept "python -m pip", the current easiest filename to use to intercept it is just "pip.py", similar to the way you can use "pip.exe"  or "python.exe" to intercept those commands on Windows.

I do think switching from a Windows-style default search behaviour to a *nix style default search behaviour is potentially reasonable, but the related backwards compatibility considerations would likely push it into being a PEP level change.

I'll also note that Nathaniel's right that I was wrong about `sitecustomize` always being easy to intercept from the current directory, though, as sys.path[0] gets set *after* "import site" has already executed.

I was just confusing myself, as my default approach to double-checking the behaviour of the "-m" switch is to run "python -m site", but that's misleading in this case since doing that *also* re-runs the site and user customisation modules (and will pick them up from the current working directory) - it's closely akin to testing `python3 -c "import runpy; runpy.run_module('site', run_name='__main__')"`
msg314130 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2018-03-20 03:08
> the way `python -m pip` searches for the module to execute is much 
> closer to the way Windows searches for a command like `pip` (i.e. 
> current directory first)

That's classic Windows behavior. However, search paths for CreateProcess and the loader are composed on demand, which allows different behavior to be selected easily enough. The current behavior depends on a mix of environment variables, registry settings, reserved names (known DLLs, API sets), application and DLL manifests, .local redirection, and in-process configuration. For example, to skip the working directory when searching for DLLs, there's the CWDIllegalInDllSearch registry setting, SetDllDirectoryW(L""), or SetDefaultDllDirectories (the latter removes PATH as well, so it's not suited for loosely-couple systems). To skip  the working directory when searching for executables, define the environment variable "NoDefaultCurrentDirectoryInExePath".
msg314192 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-21 12:02
(Adding the other import system maintainers to the nosy list, along with Ned as the release manager for 3.6 and 3.7)

Summarizing my current thoughts on this:

I think the fact that "-m" currently adds the empty directory as sys.path[0] instead of the fully resolved cwd is definitely worth changing for 3.8, and is at least arguably a bugfix that should be applied to 3.7 prior to release. (It's probably too late in 3.6's lifecycle to be worth backporting that far, even if we do reclassify this part as a bugfix instead of an enhancement)

The above is what I consider to be in scope for *this* particular issue.

(The details of the current behaviour probably arose from my borrowing an existing code path originally designed for "-c" rather than from a deliberate design decision)



As a separate question, I'd personally be open to the idea of:

1. Deprecating the implicit addition of the current working directory to sys.path (and have that emit a deprecation warning in 3.8)

2. Add support for an explicit leading dot on "-m" arguments to say "Run this from the current directory" (the -m switch was added in 2.4, while the explicit relative import syntax wasn't chosen until 2.5, so this wasn't an available option for the original feature design). Note: given implicit namespace packages, it may make sense to allow additional leading dots in order to add a parent directory to sys.path instead of the current directory.

3. Add a "--basepath <dir>" argument to have "-m" resolve explicit relative imports against a directory other than the current one (this would be a different potential spelling for https://bugs.python.org/issue13475 )

4. Once the current directory is no longer added implicitly (in 3.9 at the earliest), if runpy fails to find the main module, then check the current directory and ask "Did you mean '.name'?" in the error message.

However, that second part is the part I consider out of scope for this issue, and think would require a PEP due to the backwards compatibility challenges.
msg314214 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2018-03-21 17:28
The proposed change for 3.8 and 3.7 seems reasonable to me.
msg314233 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-03-22 02:02
I am OK with changing '' to the initial resolved working directory for 3.7 as long as we get that change in now, i.e. before the 3.7.0b3 ABI freeze next Monday.  And I agree that we should not attempt to change this behavior for 3.6.x at this stage of its life cycle.
msg314248 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-22 12:16
Brett or Eric, any chance one of you could run with this for 3.7b3? I already have a startup refactoring related regression that I'm aiming to have fixed before then.

Thanks to Victor's refactoring work, there's at least a clear interception point now where we can treat the empty string differently depending on whether the command line option was `-c`, `-m`, or not specified at all: https://github.com/python/cpython/blob/master/Python/pathconfig.c#L259

As an initial attempt, I think the necessary fix will be along the lines of checking for 'n == 0 && argc > 1 && wcscmp(argv0, L"-m") == 0', and resolving the current working directory in that case.
msg314276 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2018-03-22 17:55
I can't make any promises unfortunately.
msg314403 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 10:54
PR posted with the change to use an absolute path for the starting working directory in the "-m" case.

That PR also includes a change to improve the fidelity of the test suite: back when I first wrote test_cmd_line_script, I was mainly focused on testing the runpy aspects, and not the sys.path initialisation aspects, so the way the tests worked didn't really check the latter properly.

The test updates get rid of the launch script that was previously confusing matters, and instead test sys.path[0] initialisation properly (relying on `PYTHONPATH` for the zipimport related cases where just changing the working directory isn't sufficient).
msg314404 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 11:22
It turned out some tests in CPython's own test suite were implicitly relying on the old behaviour where the current working directory automatically ended up on sys.path (see the changes to test_bdb and test_doctest in the updated PR).
msg314411 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 13:20
Initial fix has been merged to master, CI runs pending for the backport to 3.7 and a follow-up master branch PR to remove a debugging print I noticed when resolving a test_import conflict in the backport.

I won't get to merging those until some time after work tomorrow (probably 8 pm'ish in UTC+10), so if anyone wanted to merge them before that, it would likely be a good idea :)
msg314412 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 13:45
3.7 CI finished before I logged off for the night, so this is good to go for 3.7.0b3 now :)
msg314414 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 13:47
New changeset a9e5d0e9ef27b14e34631d415e727a07d0f63bef by Nick Coghlan in branch 'master':
bpo-33053: Remove test_cmd_line_script debugging print (GH-6237)
https://github.com/python/cpython/commit/a9e5d0e9ef27b14e34631d415e727a07d0f63bef
msg314415 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 13:50
Marking as fixed, since this is now the version likely to go out in 3.7.0b3 - if we find further problems with it (beyond the potential enhancement discussed above to make local directory usage opt-in), then those can go in a new issue.
msg314431 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 20:49
New changeset d5d9e02dd3c6df06a8dd9ce75ee9b52976420a8b by Nick Coghlan in branch 'master':
bpo-33053: -m now adds *starting* directory to sys.path (GH-6231)
https://github.com/python/cpython/commit/d5d9e02dd3c6df06a8dd9ce75ee9b52976420a8b
msg314432 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-25 20:52
New changeset ee3784594b33c72c3fdca6a71892d22f14045ab6 by Nick Coghlan in branch '3.7':
bpo-33053: -m now adds *starting* directory to sys.path (GH-6231) (#6236)
https://github.com/python/cpython/commit/ee3784594b33c72c3fdca6a71892d22f14045ab6
msg315043 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-04-06 23:36
(See Issue33185 for regression.)
msg315079 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-04-08 04:48
Some notes from my investigation of bpo-33185 that seem more appropriate here, rather than on that issue:

* several of the developer-centric utilities in the standard library have a shared need to be friendly to imports from the current working directory.
* timeit uses os.curdir, but could be switched to os.getcwd()
* pydoc uses a literal '.', but could be switched to os.getcwd()
* trace, profile, cProfile, pdb, doctest, and IDLE's pyshell all add the directory containing the file under test

Aside from switching pydoc from a literal '.' to os.curdir, I'm not going to change any of those (hence why I'm putting these notes here), but I wanted to capture this info in case does decide to follow through on a "less isolated than isolated mode, but still omits the current directory from sys.path" execution mode.
msg326481 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-26 16:09
New changeset 43500a5907eb9ae2e470dcbffe73012cd456f5a1 by Victor Stinner in branch '3.6':
bpo-28655: Fix test_import.test_missing_source_legacy() (GH-9589)
https://github.com/python/cpython/commit/43500a5907eb9ae2e470dcbffe73012cd456f5a1
History
Date User Action Args
2018-09-26 16:09:42vstinnersetnosy: + vstinner
messages: + msg326481
2018-09-26 15:05:53vstinnersetpull_requests: + pull_request8987
2018-04-08 04:48:45ncoghlansetmessages: + msg315079
2018-04-06 23:36:42ned.deilysetmessages: + msg315043
2018-03-25 20:52:43ncoghlansetmessages: + msg314432
2018-03-25 20:49:15ncoghlansetmessages: + msg314431
2018-03-25 13:50:03ncoghlansetstatus: open -> closed
resolution: fixed
messages: + msg314415

stage: patch review -> resolved
2018-03-25 13:47:56ncoghlansetmessages: + msg314414
2018-03-25 13:45:30ncoghlansetpriority: release blocker -> normal

messages: + msg314412
2018-03-25 13:43:52ncoghlansetpull_requests: + pull_request5974
2018-03-25 13:20:06ncoghlansetmessages: + msg314411
2018-03-25 13:16:14ncoghlansetpull_requests: + pull_request5973
2018-03-25 11:22:54ncoghlansetmessages: + msg314404
2018-03-25 10:54:03ncoghlansetmessages: + msg314403
2018-03-25 10:39:18ncoghlansetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request5970
2018-03-25 08:17:48ncoghlansetassignee: ncoghlan
2018-03-22 17:55:35brett.cannonsetmessages: + msg314276
2018-03-22 14:43:55ned.deilysetmessages: - msg314252
2018-03-22 13:04:23Vivsetnosy: + Viv
messages: + msg314252
2018-03-22 12:16:39ncoghlansetmessages: + msg314248
2018-03-22 02:02:38ned.deilysetpriority: normal -> release blocker
versions: + Python 3.7
nosy: + ned.deily

messages: + msg314233
2018-03-21 17:28:15brett.cannonsetmessages: + msg314214
2018-03-21 12:02:27ncoghlansetnosy: + brett.cannon, eric.snow
messages: + msg314192
2018-03-20 03:08:12eryksunsetnosy: + eryksun
messages: + msg314130
2018-03-19 12:54:24ncoghlansetmessages: + msg314092
stage: needs patch
2018-03-19 09:14:19njssetmessages: + msg314089
2018-03-19 09:06:39ztanesetmessages: + msg314088
2018-03-19 08:17:00njssetmessages: + msg314081
2018-03-18 13:21:06ncoghlansetmessages: + msg314040
2018-03-18 10:46:07jwilksetmessages: + msg314036
2018-03-18 07:04:27ncoghlansetmessages: + msg314028
2018-03-18 06:19:42njssetnosy: + njs
messages: + msg314025
2018-03-18 05:58:43ncoghlansetmessages: + msg314023
2018-03-18 05:44:11ncoghlansetmessages: + msg314021
2018-03-18 05:31:20ncoghlansettype: security -> enhancement
title: Running a module with `-m` will add empty directory to sys.path -> Avoid adding an empty directory to sys.path when running a module with `-m`
messages: + msg314020
versions: + Python 3.8
2018-03-16 20:07:07ned.deilysetnosy: + ncoghlan
2018-03-16 18:50:11jwilksetnosy: + jwilk
messages: + msg313966
2018-03-12 11:57:15ztanecreate