classification
Title: Ensure that IDLE's stdlib imports are from the stdlib
Type: behavior Stage: test needed
Components: IDLE Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: louielu, mdcowles, terry.reedy
Priority: normal Keywords: patch

Created on 2016-01-18 04:05 by terry.reedy, last changed 2017-06-23 00:45 by terry.reedy.

Pull Requests
URL Status Linked Edit
PR 1364 open louielu, 2017-04-30 14:46
Messages (13)
msg258496 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-01-18 04:05
In the past few months, there has been an increase "IDLE fails to start" reports that are apparently due to shadowing of stdlib modules by user modules in the same directly as as user code.  For instance, a beginner creates turtle.py and then writes "import turtle" in the same file or another in the same directory.

Note that only Python-coded modules can be shadowed. Imports of builtin modules that are not accelerators for a .py module are handled internally and do not involve sys.path.  (If turtle were builtin, turtle.py would not interfere with 'import turtle.) In particular, sys is builtin and 'import sys' is guaranteed to work and give access to sys.path.

CPython avoids shadowing of the stdlib in its own imports.  Perhaps it does not add '', standing for the current working directory, to the front of sys.path until after its imports are done.  A self-contained application can avoid the problem by properly naming its own modules.  An application like IDLE that runs user code needs to protect itself from user files.  It does not now.

IDLE normally runs with two processes. I believe shadowing is more a problem for the user process that executes user code, with the working directory set to that of the user code.  (I believe that this is one reason why user code must be saved, to a particular directory, before it is run.)  Since, with one exception, all imports in run.py are done before running user code, I believe a sufficient fix would be temporarily delete '' while run.py does its own imports.

By default, the IDLE gui process should not have a problem.  The working directory is that of python.exe, which initially has no conflicting files.  To guard against user additions, '' could be permanently removed.  However, for option -n single-process mode, it would have to be restored (a reason for wanting to remove that mode).  I am not sure of the effect of all the other options.  At present, I believe, all imports are done at startup.  So the same idea used for the user process might work, at least for the present.

Automated tests would be nice.  I might want to some day 
limit initial imports to the minimum needed to show the initial window (or windows), so IDLE might start faster.  But they are not trivial.  And there is the time factor. A single cold start of IDLE takes as long (about 4 seconds on my machine) as the average test file in the test suite.
msg258499 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-01-18 05:22
Looking at https://stackoverflow.com/questions/15888186/cant-run-python-via-idle-from-explorer-2013-idles-subprocess-didnt-make-c and https://stackoverflow.com/questions/874757/python-idle-subprocess-error it seems that people have put things like tkinter.py, random.py, and subprocess.py in the python directory, along side of python.exe.  So the IDLE process definitely also needs a patch.
msg292130 - (view) Author: Matthew Cowles (mdcowles) Date: 2017-04-22 17:55
This bug should be fixed urgently. Over at python-help we just got another Python beginner who was using IDLE and then started to get the "IDLE's subprocess didn't make connection" error. The problem was that the user had named one of their programs random.py. That's a bad introduction to Python and programming.
msg292159 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-04-23 11:15
I will make this one of my priority issues once I get up to speed with the new workflow.  However, it is not a trivial issue, and I am not sure that all my beliefs of a year ago were correct.  What I really do not want to do is break IDLE running when users do not make a mistake.

Do you have access to IDLE on any system other than Windows?
msg292160 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-04-23 11:57
From the viewpoint of IDLE and masking, the stdlib has three parts: idlelib, other modules that IDLE imports, and other modules that IDLE does not import.  As I already noted, this division is different for IDLE and user processes.

IDLE may delay importing some idlelib modules, and I intend to do this more if possible.  But I am not going to worry about someone creating an idlelib directory with duplicate names.  If someone does this, too bad.

Modules that IDLE does not import are not a concern for the operation of IDLE. If a *user* program masks an stdlib module that the program intends to import, it is not IDLE's direct concern.

This issue is about other modules that IDLE *does* import -- directly *or* indirectly.  IDLE does not import 'random' -- the word does not appear in idlelib.  On the other hand, in 3.6.1,
>>> import sys
>>> 'random' in sys.modules
True
For a user random.py to block IDLE startup, it must be that some module that imports it accesses some attribute during import, possibly as part of a 'from' import.

In order to make new code unit-testable, it should be put in a new 'fix_path' function, with a detailed specification.
msg292169 - (view) Author: Matthew Cowles (mdcowles) Date: 2017-04-23 15:31
> Do you have access to IDLE on any system other than Windows?

I don't have a Windows machine at all.

For what it's worth, here's the behavior I'm talking about, albeit with an old version of Python:

$ mkdir testidle
$ cd testidle 
$ /usr/local/bin/python /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/idlelib/idle.py # Works
$ >random.py
$ /usr/local/bin/python /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/idlelib/idle.py
[. . .]
Unhandled server exception!
Thread: SockThread
[. . .]
    import tempfile
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/tempfile.py", line 34, in <module>
    from random import Random as _Random
ImportError: cannot import name Random

*** Unrecoverable, server exiting!

Of course no one starts IDLE from a command line in real life and so a beginner gets send down a blind alley with a misleading error message about firewall software.
msg292546 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-04-28 19:19
I and others constantly tell people to start IDLE from the command line when they have a problem.  Today, someone actually followed the advice and the traceback revealed that the problem was a personal tkinter.py masking the stdlib file!  Renaming it to mytkinter.py fixed the problem.  This tell me that 'fixing' sys.path for the idle process must be done immediately after importing sys and before importing tkinter.  In the user process, the sys import should again be first, followed by the fixup.

I have verified that a user 'sys.py' does not mask sys imports.  Python itself imports sys, so IDLE's 'import sys' just gets the existing sys.modules['sys'].
msg292632 - (view) Author: Louie Lu (louielu) * Date: 2017-04-30 14:51
Because sys module is correctly imported, we can modify sys.path to change the import behave.

Add new PR 1364 for this.

This add a new private function `_fix_import_path` that will remove the local import path `''`, when running IDLE from command line, it will recover the import path at the end of the argparse part. So no need to worry about user to import other local modules.

That also mean, if our user has the same name file, they will not be able to import it anymore. (cause IDLE import it first).
msg293679 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-15 06:31
Thank you for making a start on code for this issue.  After thinking about this a few hours, here is what I currently think.

I don't want a tiny module for one function.  I would like a module with objects common to the idle and user processes, in particular things needed by both pyshell and run modules. This would include things now imported into pyshell from run. (These were initially imported in the other direction, but switching sped up run startup.)  I would like to refactor and include the near duplicate warning functions.  The imports of io, linecache, and maybe warnings in the new module would come after the new function is called in the module.  At least initially, I would not include anything dependent on another IDLE module. 

Possible names: runutil, runcom, runshell, ??? (and test_...).

Call the new function fix_path (nearly everything in idlelib is private already).  Add "def unfix_path(): if [path[0] != '': <insert ''>.  It is possible that we will need to return and pass back what, if anything, is removed.  I am not sure yet.

When one runs 'python <path>/file.py', sys.path[0] is <path>, not ''.  So one can start idlelib with, for instance, "python <path-to_idlelib>/idle.py" and sys.path[0] will be <idledir>. It slows imports slightly and could be removed (and never re-added). I need to check for other instances of sys.path[0] != '' and <idledir> additions.

Here is one. Add "print(sys.path, file=sys.__stdout__)" to run.py and run it and the result in Shell is
['C:\\Programs\\Python36\\lib\\idlelib', 'C:\\Programs\\Python36\\Lib\\idlelib', 'C:\\Programs\\Python36\\python36.zip', ...

If I start IDLE from the console, path (in console) starts with '' instead of idledir x 2.  It appears twice because of the tkinter input.  The original sys.path is the same in both processes. If I then run a file with 'print sys.path', user-process sys.path begins with filedir, consoledir. I am not quite how '' is replaced by consoledir, but the presence of consoledir is wrong (explained elsewhere).

If I start from the icon, output never appears. The result needs to be saved and displayed otherwise, such as in a tkinter messagebox. If I then run the same file, path (displayed in Shell) begins with just the file directory.

More experiments are needed, such as starting from Explorer [Edit with IDLE].  We will need some repeated on Linux and MAC

In run.py, would "unfix()" be needed after imports?  I think not when running a file, because pyshell.ModifiedInterpreter.prepend_syspath and MI.run_command together prepend usercodedir to sys.path.  I believe this is true when running a startup file.  So we only need to be concerned about restoring '' (or anything else deleted) sometime before running user code from the shell.  

Within run, all but one of the delayed non-main imports are either duplicates that can be removed or can be moved to the top, where they will surely run before sys.path is augmented.

The un-movable delayed pydoc import is in the handle method of run.MyHandler(RPCHandler(socketserver.BaseRequestHandler)) and is called in the base .__init__.  This must happen *before* run_command sends the rpc request to modify sys.path.  So it should be safe if we do not previously unfix sys.path.

My conclusion: We may sometimes need to restore something so imports in Shell work the same as in Python REPL. The last possibility is in run.main just before 'while 1:'. 

Importing in the IDLE process and pyshell are messy.  As a temporary partial fix, we could unfix() after the top imports in pyshell, or after the top imports in pyshell.main, or at the bottom of pyshell.main just before entering the loop.  But if there are any further delayed non-idlelib imports, the IDLE process remains vulnerable to shadow files.  But I don't yet know if this is possible.

As with run, do we need to explicitly unfix in pyshell?  In normal mode, 
no user code runs in the idle process. So unfix is not needed.  With -n set, I think unfix is still not needed for user files, as MI.run_command runs the same userdir prepend in self.locals, where user code is exec'ed.  Indeed, in order to protect all IDLE imports (in the IDLE process), sys.path should be left reduced *except while running user code in the IDLE process, in -n mode*.

The irreducible problem for -n is that once user code *is* run, the fact that IDLE simulates python -i and the fact that -n means no Shell restart, the directory cannot be removed.  If there is a conflict, we cannot help it.  The oddity of -n mode is that one can run additional files from other directories and make sys.path grow indefinitely.  Thus files can interfere with each other as well as with IDLE.   In python, one can run dir1/file1.py with -i and 'exec(open(dir2/file2.py).read)', and so on, but exec does not prepend dir2 to sys.path.

What about the pyshell.main section "# process sys.argv and sys.path"?  I think the sys.path parts are wrong, at least for protecting IDLE imports, and should be removed.

    for i in range(len(sys.path)):
        sys.path[i] = os.path.abspath(sys.path[i])

Everything is already absolute except for '' and changing the '' added by python to abspath('') == getcwd() seems wrong.  It makes the IDLE shell unnecessarily different from the Python REPL.  If I run
  path-a> python -i path-b/tem.py
I cannot import files in path-a.  The same should be true if I run
  path-a> python -m idlelib
and then edit and run path-b/tem.py.

    elif args:
        enable_edit = True
        pathx = []
        for filename in args:
            pathx.append(os.path.dirname(filename))
        for dir in pathx:
            dir = os.path.abspath(dir)
            if not dir in sys.path:
                sys.path.insert(0, dir)

These insertions should only happen in -n mode and will happen if and when run, not before being opened to be edited (which could even fail).

    else:
        dir = os.getcwd()
        if dir not in sys.path:
            sys.path.insert(0, dir)

With '' changed (wrongly) to getcwd, this will usually do nothing. If I run 'python', sys.path start with '', not os.getcwd.  If I os.chdir(newdir), then '' refers to newdir, not the startig dir.  However, in -n mode

Conclusion: delete the above and add
    if not use_subprocess (and other conditions?):
        sys.path.insert(0, '')
at the last possible moment, just before
msg293723 - (view) Author: Matthew Cowles (mdcowles) Date: 2017-05-15 16:36
I'm very willing to believe that I'm missing something critical here, but wouldn't it be satisfactory to just move "" from the beginning to the end of sys.path in IDLE?

I can imagine that it might cause a few problems for users, but it is simple and the problems would be easy to explain.

IDLE is commonly used by beginners and over at python-help I would not relish the prospect of saying, "With IDLE, the import rules change on account of various conditions...." I don't think that would be much better than the current state.
msg293728 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-15 18:29
What I want from this issue , from a user perspective, is a) for IDLE to run without stumbling over user files, and b) for user code to see the same sys.path when executing under IDLE as when executed directly with the same cpython binary in the same mode.
msg293741 - (view) Author: Matthew Cowles (mdcowles) Date: 2017-05-15 21:52
> a) for IDLE to run without stumbling over user files
> b) for user code to see the same sys.path when executing under IDLE as when 
> executed directly with the same cpython binary in the same mode.

If you can achieve that, no one will be more impressed than I am. I would take the easy way out.
msg296367 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-06-19 18:31
#25488 and #26143 address related issues of sys.path as seen by IDLE and user code.  #26143 focuses on what IDLE sees, and user modules shadowing IDLE modules, #25488 focuses on what user code sees, and the presence of idlelib in sys.path making buggy code work.  It includes startup information relevant to #26143 also.  Both should be solved by patching IDLE in same place.
History
Date User Action Args
2017-06-23 00:45:44terry.reedysetkeywords: + patch
2017-06-19 18:31:49terry.reedysetmessages: + msg296367
2017-06-19 18:11:19terry.reedysetcomponents: + IDLE
2017-05-15 21:52:12mdcowlessetmessages: + msg293741
2017-05-15 18:29:52terry.reedysetmessages: + msg293728
2017-05-15 16:36:03mdcowlessetmessages: + msg293723
2017-05-15 06:31:56terry.reedysetmessages: + msg293679
2017-04-30 14:51:11louielusetnosy: + louielu
messages: + msg292632
2017-04-30 14:46:21louielusetpull_requests: + pull_request1474
2017-04-28 19:19:20terry.reedysetmessages: + msg292546
2017-04-23 15:31:29mdcowlessetmessages: + msg292169
2017-04-23 11:57:03terry.reedysetmessages: + msg292160
2017-04-23 11:15:19terry.reedysetstage: needs patch -> test needed
messages: + msg292159
versions: + Python 3.6, Python 3.7
2017-04-22 17:55:00mdcowlessetnosy: + mdcowles
messages: + msg292130
2016-01-18 05:22:45terry.reedysetassignee: terry.reedy
messages: + msg258499
2016-01-18 04:05:50terry.reedycreate