Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Readline module loading in interactive mode #56447

Open
NielsHeinen mannequin opened this issue Jun 2, 2011 · 26 comments
Open

Readline module loading in interactive mode #56447

NielsHeinen mannequin opened this issue Jun 2, 2011 · 26 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue

Comments

@NielsHeinen
Copy link
Mannequin

NielsHeinen mannequin commented Jun 2, 2011

BPO 12238
Nosy @loewis, @jcea, @ncoghlan, @pitrou, @vstinner, @tiran, @merwok, @bitdancer, @ericsnowcurrently, @serhiy-storchaka, @zooba
Superseder
  • bpo-28192: Don't import readline in isolated mode
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-06-02.11:57:39.476>
    labels = ['type-security', 'interpreter-core']
    title = 'Readline module loading in interactive mode'
    updated_at = <Date 2020-03-06.20:51:30.573>
    user = 'https://bugs.python.org/NielsHeinen'

    bugs.python.org fields:

    activity = <Date 2020-03-06.20:51:30.573>
    actor = 'brett.cannon'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2011-06-02.11:57:39.476>
    creator = 'Niels.Heinen'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 12238
    keywords = []
    message_count = 17.0
    messages = ['137473', '137475', '137529', '137741', '137752', '137804', '137821', '137824', '137828', '173544', '173551', '173552', '173561', '187597', '187598', '252029', '277330']
    nosy_count = 13.0
    nosy_names = ['loewis', 'jcea', 'ncoghlan', 'pitrou', 'vstinner', 'christian.heimes', 'eric.araujo', 'r.david.murray', 'eric.snow', 'Niels.Heinen', 'serhiy.storchaka', 'steve.dower', 'yaccz']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = '28192'
    type = 'security'
    url = 'https://bugs.python.org/issue12238'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @NielsHeinen
    Copy link
    Mannequin Author

    NielsHeinen mannequin commented Jun 2, 2011

    Running the python binary without a script or using the -i flag will
    start the process in interactive mode. The interactive mode requires an
    external module to be loaded: readline.

    Per default behavior, Python also tries to load this module from the current working directory (see also trace below)

    strcpy(0x7fff17609ed8, ".so") = 0x7fff17609ed8
    fopen64("readline.so", "rb" <unfinished ...>
    SYS_open("readline.so", 0, 0666) = -2
    <... fopen64 resumed> ) = 0
    strcpy(0x7fff17609ed8, "module.so") = 0x7fff17609ed8
    fopen64("readlinemodule.so", "rb" <unfinished ...>
    SYS_open("readlinemodule.so", 0, 0666)

    The module is imported in Modules/main.c line 663:

      if ((Py_InspectFlag || ......
        isatty(fileno(stdin))) {
          PyObject *v;
          v = PyImport_ImportModule("readline");

    Why consider this a security bug: basically because you don't expect a
    program to import a shared library from your current directory _unless_
    you explicitly tell it to (e.g. import blah).

    On a multi user system, someone could plant a malicious shared libraries
    named "readline.so" in an attempt to hack a user that runs python in
    interactive mode.

    The risk obviously _very_ low but nevertheless worth to consider improving by, for example, loading readline with a more strict path? (e.g. python lib directories only?)

    Niels

    AN EXAMPLE:
    -----------
    The code below is compiled to readline.so and stored in /tmp:

      void __attribute__ ((constructor)) _load();
      void _load() {
          printf("DING DONG!\n");

    }

    foo@foo:/tmp$ ls -l /tmp/readline.so
    -rwxr-x--- 1 nnnnn nnn 7952 Mar 29 16:24 /tmp/readline.so
    foo@foo:/tmp$ python
    Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
    [GCC 4.4.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    DING DONG!

    >>

    @NielsHeinen NielsHeinen mannequin added the type-security A security issue label Jun 2, 2011
    @bitdancer
    Copy link
    Member

    This is a general principle of how Python runs in interactive mode and is not confined to loading readline. The same would be true for any module loaded during startup, and there are quite a few that are so loaded. Since loading modules from the current working directory is an important feature of using python in interactive mode, this is not something that is likely to be changed.

    @merwok
    Copy link
    Member

    merwok commented Jun 3, 2011

    +1 to what David said. See also bpo-5753.

    @NielsHeinen
    Copy link
    Mannequin Author

    NielsHeinen mannequin commented Jun 6, 2011

    Hi Eric, David,

    This means that you cannot type "python" and press <enter> in any shared directory without the risk of a malicious readlinemodule.so being imported and executed.

    I think this is different from a scenario where someone explicitly runs a script or imports a module in interactive mode where it is also reasonable that such a person understands the importing mechanism.

    Thanks for the quick responses btw!

    Niels

    @bitdancer
    Copy link
    Member

    I've done a little poking around, and it looks like you are correct and I'm wrong. It appears that readline.so is or should be a special case. I've added some people to nosy to see what they think.

    Specifically, it appears that if I put a file that should shadow a library module that is imported at python startup time (eg: os.py) into my current working directory I still get the os.py from the appropriate lib directory, even though '' is first in my sys.path. This is not how I thought it worked, but it is my observation. I tested this on 2.6.6, 2.7.1 and 3.3 tip.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 7, 2011

    I don't think readline is "special-cased":

    $ echo "1/0" > logging.py
    $ cpython/default/python
    Python 3.3a0 (default:d8502fee4638+, Jun  6 2011, 19:13:58) 
    [GCC 4.4.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import logging
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "logging.py", line 1, in <module>
        1/0
    ZeroDivisionError: division by zero

    @bitdancer
    Copy link
    Member

    Python 3.3a0 (default:7323a865457a+, Jun  5 2011, 19:22:38) 
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.modules['logging']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: 'logging'
    >>> sys.modules['os']
    <module 'os' from '/home/rdmurray/python/p33/Lib/os.py'>

    The difference is that logging is not imported at startup. So, however os (and friends, there are a lot of modules in sys.modules at startup) is imported, it is different from how readline.so is imported.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 7, 2011

    The difference is that logging is not imported at startup. So, however
    os (and friends, there are a lot of modules in sys.modules at startup)
    is imported, it is different from how readline.so is imported.

    For the record, os is imported by the _io module:

    /* put os in the module state */
    state->os_module = PyImport_ImportModule("os");
    if (state->os_module == NULL)
        goto fail;
    

    (in Modules/_io/_iomodule.c)

    This probably happens before sys.path is
    adjusted/tweaked/fixed/garbled/whatever.

    @bitdancer
    Copy link
    Member

    Yeah, that would be my guess. And readline.so is imported in main at a point where it has decided we are going into interactive mode, which is presumably after all other initialization has taken place, including the path munging.

    Thus my suggestion that that particular import of readline.so should be special cased...

    @serhiy-storchaka
    Copy link
    Member

    This issue was fixed in 3.3, but not in 2.7 or 3.2.

    $ strace ./python -i </dev/null 2>&1 | grep readline
    stat64("/home/serhiy/py/cpython3.3/build/lib.linux-i686-3.3/readline.cpython-33m.so", {st_mode=S_IFREG|0755, st_size=52511, ...}) = 0
    open("/home/serhiy/py/cpython3.3/build/lib.linux-i686-3.3/readline.cpython-33m.so", O_RDONLY) = 4
    open("/lib/libreadline.so.6", O_RDONLY) = 4

    @serhiy-storchaka serhiy-storchaka added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Oct 22, 2012
    @pitrou
    Copy link
    Member

    pitrou commented Oct 22, 2012

    Serhiy, I don't think it's fixed in 3.3, or perhaps I'm misunderstanding what you mean by "fixed". If you create readline.cpython-33m.so in your cwd and then run python, the fake readline will still be loaded instead of the real one.

    For example (here with 3.4):

    $ touch readline.cpython-34dm.so
    $ ./python
    Python 3.4.0a0 (default:2a0c9472c89c, Oct 21 2012, 23:24:06) 
    [GCC 4.5.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import readline
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: ./readline.cpython-34dm.so: file too short

    @pitrou
    Copy link
    Member

    pitrou commented Oct 22, 2012

    Regardless, I'm not sure what we should do about this issue. Loading readline is obviously provided as a convenience to make the interpreter prompt easier to use. Several of us would probably like to go a bit further and also add tab-completion (see bpo-5845).

    It stands that while the -S and -E option allow to disable any customization a user might have done (which is necessary for e.g. suid scripts), the automatic insertion of '' into sys.path has no way of being undone.

    @serhiy-storchaka
    Copy link
    Member

    I understand what happens. Python 3.3+ uses getdents(), not stat() for module search. Therefore stat() is not called for non-existed files.

    It stands that while the -S and -E option allow to disable any customization a user might have done (which is necessary for e.g. suid scripts), the automatic insertion of '' into sys.path has no way of being undone.

    Python used not only for suid scripts.

    @vstinner
    Copy link
    Member

    We may add a command line option and/or an environment variable to not add the current directory to sys.path.

    Changing the current behaviour may break many applications / use cases.

    @vstinner
    Copy link
    Member

    We may add a command line option and/or an environment variable to not add the current directory to sys.path.

    Oh, this is exactly what the issue bpo-16499 proposes.

    @bitdancer
    Copy link
    Member

    This issue was reported again in bpo-25288.

    To summarize: the cwd should only be used for imports *after* the command prompt is displayed, and readline is imported *before* the prompt is displayed but currently is imported from the cwd. This should be fixed.

    @tiran
    Copy link
    Member

    tiran commented Sep 24, 2016

    Steve took care of the readline import for isolated mode in bpo-28192. We can't change the default behavior. If you want to prevent Python from important files from either cwd, user packages or env vars, you have to use isolated mode. System scripts should use the isolated mode flag, too.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel
    Copy link
    Member

    We may add a command line option and/or an environment variable to not add the current directory to sys.path.

    That's done now, right? #31542

    @iritkatriel iritkatriel added the pending The issue will be closed if no feedback is provided label May 20, 2022
    @tiran
    Copy link
    Member

    tiran commented May 20, 2022

    GH-92346 also changed the order of imports. readline, rlcompleter, and dependencies are now loaded before sys.path is extended.

    @iritkatriel iritkatriel removed the pending The issue will be closed if no feedback is provided label May 20, 2022
    @disconnect3d
    Copy link
    Contributor

    disconnect3d commented Jul 13, 2022

    Hey, thanks for resolving this. Just a heads up that readline.so (or readline.py) is not the only library that is/was being loaded this way - things like _bz2.so, _ssl.so, _json.so etc. are also loaded in a similar fashion. That's actually something I have shown on a talk on EuroPython 2022 along with testing this in a Python 3.11 beta Docker container where those libs were not loaded. So it seems that it might have been fixed but I don't know for sure.

    I haven't really investigated why they those others (underscore prefixed) libraries were loaded in the first place but it would be great if someone looked into it deeper and actually confirmed this is fully fixed. It seems that maybe there were loaded in the first place because the readline.so import did load them somehow? But then why they were only imported after invoking the first command in the interpreter? I am not sure.

    cc: @tiran @iritkatriel ; I'm happy to talk or look into this during the conference or sprints if any of you will be around ;)

    @ericsnowcurrently
    Copy link
    Member

    Yeah, it's worth double-checking.

    @tiran
    Copy link
    Member

    tiran commented Jul 21, 2022

    By default Python adds the current working directory to sys.path when you run a Python interpreter interactively. This is a fundamental design choice that makes it easier for beginners. My PR GH-92346 is not to be understand as a security fix. It is an attempt to reduce the amount of invalid security bug reports that we receive. :)

    @vstinner
    Copy link
    Member

    vstinner commented Aug 3, 2022

    By default Python adds the current working directory to sys.path when you run a Python interpreter interactively.

    Python 3.11 has a new -P command line option to opt out from this behavior ;-) https://docs.python.org/dev/using/cmdline.html#cmdoption-P

    @amc1804
    Copy link

    amc1804 commented Sep 25, 2022

    I applaud this step in the right direction, and wonder if it should apply not only to readline & rlcompleter but to the entire site module. Even after GH-92346, if you run python interactively and call help(), if there is a pydoc.py in the current directory, that's what you'll get. What other surprises are lurking in site.py (or will be added to it)? Rather than play whack-a-mole, could we delay the extension of sys.path until after all automatic imports (including site)? Surely those automatically-imported modules must intend their own imports to come from installed libraries, not from the current directory or the directory containing the script, right?

    @zooba
    Copy link
    Member

    zooba commented Sep 26, 2022

    You can already pass -S to skip the entire site module.

    @amc1804
    Copy link

    amc1804 commented Sep 30, 2022

    You can already pass -S to skip the entire site module.

    That's good to know, but that was true before GH-92346 too. One could have avoided the readline/rlcompleter breakage by not loading them at all. GH-92346 helpfully unbroke readline/rlcompleter, and I suggest that the same idea could be used to unbreak the rest of the site module. The extension of sys.path is meant to benefit the interactive read-eval-print loop, but it confuses the automatic initialization, so why not delay it until that's done?

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests