Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -P command line option and PYTHONSAFEENV environment variable #57684

Closed
ncoghlan opened this issue Nov 25, 2011 · 51 comments
Closed

Add -P command line option and PYTHONSAFEENV environment variable #57684

ncoghlan opened this issue Nov 25, 2011 · 51 comments
Labels
3.9 only security fixes type-feature A feature request or enhancement

Comments

@ncoghlan
Copy link
Contributor

BPO 13475
Nosy @gpshead, @ncoghlan, @pitrou, @kristjanvalur, @vstinner, @mattheww, @jwilk, @merwok, @ericsnowcurrently
PRs
  • gh-57684: Add -P cmdline option and PYTHONSAFEPATH env var #31542
  • Files
  • issue13475_1.diff
  • issue13475_2.diff: updated patch to address Antoine's comments
  • issue13475_long_only.diff: switch to long options
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-11-25.00:46:33.649>
    labels = ['type-feature', '3.9']
    title = "Add '--mainpath'/'--nomainpath' command line options to override sys.path[0] initialisation"
    updated_at = <Date 2022-02-24.02:55:12.913>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2022-02-24.02:55:12.913>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2011-11-25.00:46:33.649>
    creator = 'ncoghlan'
    dependencies = []
    files = ['25723', '25730', '25759']
    hgrepos = []
    issue_num = 13475
    keywords = ['patch']
    message_count = 37.0
    messages = ['148295', '148305', '148310', '148336', '148350', '148378', '148379', '148394', '148433', '148458', '148552', '161676', '161677', '161685', '161686', '161689', '161691', '161695', '161696', '161749', '161926', '162677', '162708', '165164', '168428', '176160', '176522', '176526', '176527', '176529', '216789', '314041', '314193', '339624', '413868', '413871', '413877']
    nosy_count = 11.0
    nosy_names = ['gregory.p.smith', 'ncoghlan', 'pitrou', 'kristjan.jonsson', 'vstinner', 'mattheww', 'jwilk', 'eric.araujo', 'Arfrever', 'eric.snow', 'nisaac']
    pr_nums = ['31542']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue13475'
    versions = ['Python 3.9']

    @ncoghlan
    Copy link
    Contributor Author

    PEP-395 spends a lot of time discussing ways that the current automatic initialisation of sys.path[0] can go wrong, and even the proposed improvements in that PEP don't claim to fix the default behaviour for every possible scenario (just many of the most common ones).

    The unittest module gets around similar problems with test autodiscovery by providing an explicit "-t" (for 'toplevel') command line option.

    While '-t' is not available for the main interpreter executable (and nor is '-d' for directory), '-p' for "path0" is a possibility.

    Directory execution (for example) would then be equivalent to:

    python -p dirname dirname/main.py

    A separate '--nopath0' option could also be provided to explicitly disable path initialisation based on the script being executed (see http://lists.debian.org/debian-python/2011/11/msg00058.html)

    @ericsnowcurrently
    Copy link
    Member

    +1 Both the -p and --nopath0 would be great additions and a good match for PEP-395.

    So "-p ." would be equivalent to the current implicit sys.path[0] initialization, right? Would any other effects happen with "-p" than sys.path[0] initialization? I'll follow up with my one concern (the implicit "-p .") in a follow-up message.

    @ericsnowcurrently
    Copy link
    Member

    The current behavior is an implicit "-p .", which causes all sorts of hard-to-figure-out problems, most of which PEP-395 is rightly trying to fix. I'm suggesting that the next step would be to make "--nopath0" the default (rendering the flag unnecessary). Since this changes the status quo, I'll try to make the case here that it's worth it.

    ---

    First of all, the current implicit sys.path[0] initialization for scripts has been around since forever. Why would we change it now? The inspiration is three-fold: we _said_ we got rid of implicit relative imports already, PEP-395 already has made a case for changing the status quo (though not to this degree), and a "-p" flag provides a simple mechanism to get the current behavior explicitly.

    Like Nick says in PEP-395, "PEP-328 eliminated implicit relative imports from imported modules. This PEP proposes that the de facto implicit relative imports from main modules that are provided by the current initialisation behaviour for sys.path[0] also be eliminated."

    I'm just saying we should go all the way, and that the "-p" flag would allow that. As far as I can tell, there are two meaningful use cases for the currently implicit sys.path[0]: When you are first _learning_ Python (not actually using it)... When you are working on a new project... Let's look at both, first relative to the currently implicit "-p .", then to the explicit site.

    1. (current behavior) implicit sys.path[0] initialization

    When a beginner is first learning Python, they learn at the interactive prompt or by running a single script. Pretty quickly they learn about writing their own modules and using import to load them.

    Where do the modules live? In the CWD from which they are calling the Python executable. The current implicit "-p ." means they can import their modules right there, and not have to learn yet about sys.path. That's nice since they can concentrate their focus.

    But later they _will_ run into hard-to-figure-out problems when they start relying on what are effectively implicit relative imports. Or they will run into one of the problems that PEP-395 explains, which are likewise head-scratchers. All of this can be pretty frustrating when you still don't know about things like the import machinery.

    Consider how that beginner uses the implicit sys.path[0] behavior on a project, once they have some experience (hopefully I don't misrepresent something Nick explained to me). The src/ directory of the project won't be under sys.path yet. So to quickly test out their script, they change to src/ and run their script from there (something PEP-395 rightfully proposes fine-tuning). They conveniently don't have to manually fiddle with sys.path. That's a pretty neat trick.

    The problem I have with this is that there's no difference between the sys.path workaround for the script and it's expected normal usage. If you aren't aware of what's going on behind-the-scenes, you may end up with some confusing import exceptions later on.

    1. explicit sys.path[0] initialization

    With a -p flag, it's really easy to say "-p ." and get the previous implicit behavior. Consider the impact on our beginner. The try to run their script that imports modules in their CWD, but end up with an ImportError because we no longer have an implicit "-p .".

    How is this an improvement? No more mysterious default behavior. There's only one way it will fail and it will fail the same way every time. They'll learn pretty quickly (if not taught already) you have to use the -p flag if you want to import modules that aren't in the "standard" places. The ImportError message, special-cased for imports in __main__, could help in that regard.

    <aside>
    As an aside, implicit or explicit "--nopath0", the path to the file for the failed import can really help in related situations. Including it as an attribute on the ImportError, and especially in the error message, would rock. See bpo-1559549.
    </aside>

    For the "src/" use-case, the developer won't be able to rely on the implicit behavior anymore. A simple, explicit "-p ." fills the gap and makes it clear that they are doing something unusual. On top of that, they can use -p with the path to the "src/" directory from anywhere and not have to change their directory. All the better if -p takes advantage of the proposals in PEP-395.

    In the end, this boils down to opt-in vs. opt-out. Either you opt-in to the sys.path[0] initialization by using -p, or you opt-out of the default implicit "-p ." by passing "--nopath0". It's a case of "explicit is better than implicit" without a clear reason, to me, that the status quo is worth the exception anymore in the face of the alternative.

    My only concern would be backwards compatibility. How much do scripts out there rely on the implicit sys.path[0] initialization that aren't already broken by the removal of implicit relative imports? I simply don't know.

    However, I do know that having the -p flag that Nick has recommended would be awesome.

    @merwok
    Copy link
    Member

    merwok commented Nov 25, 2011

    I don’t understand all the issues and it’s too late for me to read all the thread, but I hope these comments are helpful:

    • If our goal is to help naïve users, then one or two new options wouldn’t help (people want to run “python path/to/thing.py” and want it to Just Work™)

    • Getting a very helpful error message is second-best to Just Working™

    • This probably needs a wider audience, python-ideas or python-dev, and maybe a PEP to explain all the issues and all the considered solutions.

    @ericsnowcurrently
    Copy link
    Member

    First of all, I don't want Nick's proposal in this issue (the -p and --nopath0 flags) to be misunderstood because of me. His is a great idea that will make a really useful shortcut available and will _not_ change any current behavior.

    My (overly) long message is an attempt to justify going with a more drastic variation, to which your response applies exclusively. I stand by what I said, but I would be quite happy with the original proposal, to which your response does not apply. :)

    Just wanted to make all that clear so I don't get in the way of a good thing. :)

    Your comments are spot on. Your concerns are exactly the ones I would need to assuage before my alternative would be acceptable. When I get a chance I'll take this to python-ideas.

    @ncoghlan
    Copy link
    Contributor Author

    Yeah, sorry Eric (Snow), you're trying to bite off *way* more than is reasonable by proposing to removing sys.path[0] auto-initialisation. While it has its problems, it's also far too entrenched in people's expections of Python's behaviour to consider removing at this late date. The earliest such a radical change could likely be considered is Python 4.

    Éric (Araujo): the "just do the right thing" aspect is covered by PEP-395. This is an orthogonal proposal that allows power users the ability to override the automatic initialisation to handle those cases where it goes wrong.

    That said, I may still make this suggestion part of PEP-395 rather than a distinct proposal, even though there's no direct dependency (in either direction) between this suggestion and the other ideas in that PEP.

    @ericsnowcurrently
    Copy link
    Member

    Yeah, the more I think about it, the more I agree it's Python 4 fodder.

    @merwok
    Copy link
    Member

    merwok commented Nov 26, 2011

    Eric: Actually I posted my comments after reading only Nick’s message, not yours, so no worry :)

    Nick:

    the "just do the right thing" aspect is covered by PEP-395.
    This is great. Thanks for clarifying.

    This is an orthogonal proposal that allows power users the ability to override the
    automatic initialisation to handle those cases where it goes wrong.
    If I’m understanding correctly, bpo-12934 is one of those issues. I agree that an option would be better than nothing, but 1) it would not help with older Pythons 2) it would not help naïve users 3) it would complicate the command-line interface of Python. (Not saying I’m against the idea, just listing cons.)

    @ncoghlan
    Copy link
    Contributor Author

    Zbigniew posted a nice summary of some of the issues sys.path[0] autoinitialisation can cause to python-dev: http://mail.python.org/pipermail/python-dev/2011-November/114668.html

    @mattheww
    Copy link
    Mannequin

    mattheww mannequin commented Nov 27, 2011

    The proposed --nopath0 option is something I've wished I had in the past.

    If this is added, it would be good if it could be given a single-letter form too, because it's an option that would be useful in #! lines (they don't reliably support using more than one command-line argument, and single-letter switches can be combined while long-form ones can't).

    @ncoghlan
    Copy link
    Contributor Author

    I realised "-P" is available for use as the short form of the "suppress sys.path[0] initialisation" option.

    So "-p" would override the calculation of sys.path[0], while "-P" would switch it off completely (similar to the way "-S" and "-E" switch off other aspects of the normal initialisation process).

    Whether or not to provide "--path0" and "--nopath0" as long form alternatives would then be a separate question.

    @ericsnowcurrently
    Copy link
    Member

    Here's a stab at implementing -p/-P. There are a couple warnings that I'm not sure about and I've undoubtedly missed some detail, but it should be pretty close.

    @pitrou
    Copy link
    Member

    pitrou commented May 26, 2012

    Before this is assigned a short option form, I would like to ask whether anybody but experts will be able to make a proper use of this option.
    (I also don't understand what it adds over PYTHONPATH)

    As for the patch, it lacks error checking when calling C API functions.

    @ericsnowcurrently
    Copy link
    Member

    Before this is assigned a short option form, I would like to ask
    whether anybody but experts will be able to make a proper use of this
    option.

    Do you mean relative to a long form? And what would constitute improper use for the option?

    As Nick noted earlier in this issue, the implicit setting of sys.path[0] is a convenience for some cases, and I can appreciate that. My complaint is that it is not an obvious behavior. Furthermore, in an import system that has eliminated implicit relative imports it can lead to confusing behavior. I consider Nick's proposal the best solution given the constraints we have.

    If people do not use it, the status quo is unaffected. If they do use it, they will be explicitly affecting sys.path[0]. If they use it without knowing what it's for, it will at worst cause imports to happen from unexpected sources (or not happen at all). To me this doesn't seem that different from the way things are now when someone doesn't understand about sys.path[0] initialization.

    (I also don't understand what it adds over PYTHONPATH)

    It provides an explicit alternative to the default implicit insertion to sys.path[0]. If the default behavior were no implicit initialization, then I'd agree that PYTHONPATH is sufficient.

    As for the patch, it lacks error checking when calling C API
    functions.

    Ah, yes. Spaced it. Patch updated (and warnings fixed).

    This brings me to ask what the behavior should be when we have errors come back from those C API functions? In the patch I just have it fall back to the default sys.path[0] behavior. However, wouldn't an error indicate a deeper problem? If so, shouldn't Py_FatalError() be called?

    @pitrou
    Copy link
    Member

    pitrou commented May 26, 2012

    > Before this is assigned a short option form, I would like to ask
    > whether anybody but experts will be able to make a proper use of this
    > option.

    Do you mean relative to a long form?

    Yes.

    And what would constitute improper use for the option?

    As a replacement of PYTHONPATH or "setup.py install" or "setup.py
    develop", I guess.

    As Nick noted earlier in this issue, the implicit setting of
    sys.path[0] is a convenience for some cases, and I can appreciate
    that. My complaint is that it is not an obvious behavior.
    Furthermore, in an import system that has eliminated implicit relative
    imports it can lead to confusing behavior.

    Agreed. I'm not arguing against the principle.

    > (I also don't understand what it adds over PYTHONPATH)

    It provides an explicit alternative to the default implicit insertion
    to sys.path[0]. If the default behavior were no implicit
    initialization, then I'd agree that PYTHONPATH is sufficient.

    Ah, indeed, I'd missed that.

    This brings me to ask what the behavior should be when we have errors
    come back from those C API functions? In the patch I just have it
    fall back to the default sys.path[0] behavior.

    Well, no, errors should not pass silently. The error should be
    propagated. Here, it probably means print the error and abort (or
    whatever strategy the rest of the function adopts).

    However, wouldn't an error indicate a deeper problem? If so,
    shouldn't Py_FatalError() be called?

    Py_FatalError() is a low-level exit; it dumps core. PyErr_Print()
    followed by a proper exit may be better.

    @pitrou pitrou changed the title Add '-p'/'--path0' command line option to override sys.path[0] initialisation Add '-p'/'--path0' command line option to override sys.path May 26, 2012
    @ericsnowcurrently
    Copy link
    Member

    did you mean to change the title? this isn't about overriding sys.path, but rather just about explicitly dictating the initialization of sys.path[0].

    @pitrou
    Copy link
    Member

    pitrou commented May 27, 2012

    did you mean to change the title?

    No, I think it's a bug of the roundup e-mail gateway.

    @pitrou pitrou changed the title Add '-p'/'--path0' command line option to override sys.path Add '-p'/'--path0' command line option to override sys.path[0] initialisation May 27, 2012
    @ncoghlan
    Copy link
    Contributor Author

    FWIW, I now think this should *only* be a long option. Short options are precious, and this is an unusual enough use case that I'm not yet sure it deserves one.

    In particular, we may decide to use "-p" later for adding directories to sys.path, rather than specifically overriding sys.path[0]

    1 similar comment
    @ncoghlan
    Copy link
    Contributor Author

    FWIW, I now think this should *only* be a long option. Short options are precious, and this is an unusual enough use case that I'm not yet sure it deserves one.

    In particular, we may decide to use "-p" later for adding directories to sys.path, rather than specifically overriding sys.path[0]

    @ericsnowcurrently
    Copy link
    Member

    Long options only would be fine with me. So "--path0" and "--nopath0"?

    @ericsnowcurrently
    Copy link
    Member

    here's an updated patch with long options only. I'm only 95% confident about my changes to getopt.c...there's probably a better way to do it. However, the patch stands on its own two feet.

    @ericsnowcurrently
    Copy link
    Member

    any chance on this for 3.3?

    @ncoghlan
    Copy link
    Contributor Author

    I was tempted to just add this (perhaps as a -X option) but, on reflection, I'm going to go with "No, not for 3.3".

    I want to take a long hard look at the whole sys.path[0] initialisation process when I update PEP-395 to account for namespace packages, and it doesn't make sense to skip ahead with part of the conclusion. (I may even end up splitting that PEP into two pieces)

    I'll also look into the embedding API, to see if/how embedding applications can fully control sys.path initialisation.

    Python has survived without this feature for the last couple of decades, there's no need to be hasty in adding it now.

    @ncoghlan ncoghlan assigned ncoghlan and unassigned ncoghlan Jun 13, 2012
    @ericsnowcurrently
    Copy link
    Member

    Your plan sounds good, Nick.

    @gpshead
    Copy link
    Member

    gpshead commented Aug 17, 2012

    I'd also like a command line flag to override PYTHONPATH (which could also be used in combination with -E so that you could still set the PYTHONPATH while ignoring everything else). I'll file a separate feature request for that.

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Nov 23, 2012

    Butting in here:
    Early on in the startup, a search is made for "site.py" using automatic sys.path settings. Do the suggestions here propose to override this? I know there is a -s flag, but all these flags start to be confusing.

    I have been looking for ways to completely ignore automatic settings for sys.path, either from internal heuristics (either build environment or installed environment) or PYTHONPATH.
    The reason for this is that as a developer in a large project, one that has many branches, each with their potentially different version of python compiled and with their own libraries, and where a "standard" python distribution may even be installed on the machine as well, we still want to be able to run scripts with each branch' own python with relative simplicity with some sort of guarantee that we are not inadvertently pulling in modules from other, unrelated branches, or god forbid, the "installed" python.

    PEP-405 seemed to go a long way, although it didn't provide a means in the pyvenv.cfg to completely override sys.path from the startup.

    I wonder if there isn't a synergy here with PEP-405, e.g. would it make sense to have a pyvenv.cfg file next to the __main__.py file?

    @ncoghlan ncoghlan added the 3.9 only security fixes label Apr 8, 2019
    @ncoghlan ncoghlan changed the title Add '-p'/'--path0' command line option to override sys.path[0] initialisation Add '--mainpath'/'--nomainpath' command line options to override sys.path[0] initialisation Apr 8, 2019
    @ncoghlan ncoghlan added the type-feature A feature request or enhancement label Apr 8, 2019
    @vstinner
    Copy link
    Member

    See also "Python flag/envvar not to put current directory to sys.path (but don’t ignore PYTHONPATH)" discussion:
    https://discuss.python.org/t/python-flag-envvar-not-to-put-current-directory-to-sys-path-but-dont-ignore-pythonpath/4235/2

    @vstinner
    Copy link
    Member

    The problem of long option names is that they cannot be used in a Unix shebang: "#!/usr/bin/python3 --long-option".

    I propose to add the -P option to not add sys.path[0]: I wrote python/issues-test-cpython#31542.

    @vstinner
    Copy link
    Member

    See also bpo-16202 "sys.path[0] security issues".

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    Discussions about this issue or about changing the default to "not add the unsafe path":

    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    This issue was worked around manually by the dnf program to fix ImportError:

    here = sys.path[0]
    if here == '/usr/bin':
        # we never import Python modules from /usr/bin
        # removing this lowers the risk of accidental imports of weird files
        del sys.path[0]
    

    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    IPython uses a similar workaround:

        # Remove the CWD from sys.path while we load stuff.
        # This is added back by InteractiveShellApp.init_path()
        if sys.path[0] == "":
            del sys.path[0]

    Change added to fix a common mistake: an user got an error when running in IPython a file called math.py which overrides the stdlib math module.

    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    The jsontools plugin of ohmyzsh uses import sys; del sys.path[0] without checking sys.path[0] value: jsontools.plugin.zsh.

    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    See also the related issue: #90991 (previously known as bpo-46835) to enhance the ImportError when importing a broken six.pyc.

    @vstinner
    Copy link
    Member

    vstinner commented May 5, 2022

    If #31542 is merged, it will become possible to deprecate PySys_SetArgv() and PySys_SetArgvEx() in favor of the new PyConfig API: PySys_SetArgvEx(argc, argv, 0) can then be implemented with safe_path=1.

    vstinner added a commit that referenced this issue May 5, 2022
    Add the -P command line option and the PYTHONSAFEPATH environment
    variable to not prepend a potentially unsafe path to sys.path.
    
    * Add sys.flags.safe_path flag.
    * Add PyConfig.safe_path member.
    * Programs/_bootstrap_python.c uses config.safe_path=0.
    * Update subprocess._optim_args_from_interpreter_flags() to handle
      the -P command line option.
    * Modules/getpath.py sets safe_path to 1 if a "._pth" file is
      present.
    vstinner added a commit that referenced this issue May 6, 2022
    Fix tests failing with the PYTHONSAFEPATH=1 env var.
    
    Enhance also -P help in Python usage (python --help).
    vstinner added a commit that referenced this issue May 6, 2022
    Mention also -P and PYTHONSAFEPATH in the Security Considerations
    page.
    @vstinner vstinner changed the title Add '--mainpath'/'--nomainpath' command line options to override sys.path[0] initialisation Add -P command line option and PYTHONSAFEENV environment variable May 6, 2022
    @AA-Turner
    Copy link
    Member

    @vstinner is there anything left to do on this issue?

    A

    @vstinner
    Copy link
    Member

    vstinner commented Jun 7, 2022

    Yes: merge or reject #92361 (-p option).

    @kumaraditya303
    Copy link
    Contributor

    Closing as #92361 was rejected.

    @ericsnowcurrently
    Copy link
    Member

    I don't think the fate of this issue is directly tied to #92361. @vstinner, is there some reason it should be?

    @merwok
    Copy link
    Member

    merwok commented Jul 6, 2022

    This issue was about adding -P, which was done, then AT asked what was left to be done and VS replied with that other PR to add -p, so that’s why KA closed this.

    @vstinner
    Copy link
    Member

    vstinner commented Jul 6, 2022

    Since 2011, the issue title and the proposed solution changed multiple times. I added -P command line option to Python 3.11: https://docs.python.org/dev/using/cmdline.html#cmdoption-P

    It seems like adding -p is not needed only the default sys.path behavior changes. I wanted to add -p for PYTHONSAFEPATH, but I didn't convince anyone that it's required, so for now, there is no -p option.

    IMO the initial issue has been fixed, so I close the issue.

    @vstinner vstinner closed this as completed Jul 6, 2022
    @ericsnowcurrently
    Copy link
    Member

    Thanks for getting this done, @vstinner!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants