classification
Title: Add option to py_compile to compile for syntax checking without writing bytecode
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Pavel Roskin, brett.cannon, eric.araujo, proski, r.david.murray, terry.reedy
Priority: normal Keywords:

Created on 2015-10-03 00:52 by proski, last changed 2017-06-18 18:23 by terry.reedy. This issue is now closed.

Messages (11)
msg252185 - Author: Pavel Roskin (proski) Date: 2015-10-03 00:52
$ echo "'''Simple script'''" >simple-script
$ PYTHONDONTWRITEBYTECODE=1 python3 -B -m py_compile simple-script
$ ls __pycache__
simple-scriptcpython-35.pyc

py_compile should recognize when the user doesn't want the bytecode to be produced. Otherwise, it's not usable in makefiles for a quick code check.

See also http://stackoverflow.com/questions/4284313/
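
The behaviour reported above can be reproduced from Python itself. This is a minimal sketch, assuming that setting sys.dont_write_bytecode is equivalent to passing -B or exporting PYTHONDONTWRITEBYTECODE; the helper name compile_with_flag_set is made up for illustration.

```python
import os
import py_compile
import sys
import tempfile


def compile_with_flag_set(source_text):
    """Run py_compile with sys.dont_write_bytecode set; report whether a .pyc appeared.

    Hypothetical helper, written only to illustrate the report above.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "simple-script")
        cfile = os.path.join(tmp, "simple-script.pyc")
        with open(src, "w") as f:
            f.write(source_text)
        previous = sys.dont_write_bytecode
        sys.dont_write_bytecode = True  # the flag that -B / PYTHONDONTWRITEBYTECODE controls
        try:
            py_compile.compile(src, cfile=cfile, doraise=True)
        finally:
            sys.dont_write_bytecode = previous  # restore the interpreter setting
        return os.path.exists(cfile)


pyc_written = compile_with_flag_set("'''Simple script'''\n")
print(pyc_written)
```

On the CPython versions I am aware of, the flag only affects bytecode caching during import, so the file is still written here.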
msg252216 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-03 15:34
The only reason to call py_compile is to get byte code.  Honoring PYTHONDONTWRITEBYTECODE would, IMO, be a bug, at least according to its documentation (by implication, it isn't explicit about it, and perhaps it should be).

Your use case could be added as a feature by adding a command line option to py_compile, but that would be 3.6 only.

On the other hand, you can achieve your use case via the following:

    python -B -c 'import $MYFILE'

(without the '.py' on the end, obviously).

which is actually shorter, so at first I was inclined to just reject that as unneeded.  (In a Makefile you'd want to CD to the directory or put the directory on the PYTHONPATH, which makes it slightly longer but not much.)

py_compile has an interesting little bug, though: if you pass it a script name, it will happily create an *invalid* .pyc file (eg: python -m py_compile results in a tempcpython-36.pyc file).  compileall on the other hand just ignores files that don't end in .py, which is also a bit odd when the file is named explicitly on the path.  So I suppose that's a bug too.

But absent py_compile's bug, there's no easy way that I know of to "syntax check" a *script* without executing it.  So this is probably a worthwhile enhancement.

Note: if we were designing this from scratch we might decide that honoring -B/PYTHONDONTWRITEBYTECODE was the right way to spell this, but since we are not, I don't think we can change py_compile's behavior in this regard for backward compatibility reasons, so adding an option to py_compile seems the best course.
msg252626 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2015-10-09 16:37
I had the same reasoning as RDM when I worked on byte-compilation in distutils2: https://hg.python.org/distutils2/rev/7c0a88497b5c

Using py_compile or compileall means that you want to create pyc or pyo files.
Defining PYTHONDONTWRITEBYTECODE or -B means that you don’t want the Python interpreter to byte-compile modules as a side-effect of importing them.
These two things seem orthogonal to me.
msg252634 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-10-09 18:36
I agree that the proposal as written should be rejected.  I am inclined to think this issue should be closed.

I do not understand the claim about 'python -m py_compile'.  For me, this does nothing, as I would expect from reading the code.  If args = sys.argv[1:] is empty, the for loop in the else clause exits immediately.

Syntax checking is easily done with compile; that is how IDLE does it.  
  python -c "compile(open(<filename>).read(), '', 'exec')"
should do what Pavel tried to do, except for getting a SyntaxError instead of py_compile.PyCompileError.
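
The same check can be written as a small function. This is a sketch only; the name syntax_error_in and the sample file names are assumptions for illustration, not anything IDLE or the stdlib provides under that name.

```python
import os
import tempfile


def syntax_error_in(path):
    """Return None if the file at path compiles, otherwise the SyntaxError raised.

    Hypothetical helper; nothing is executed and no bytecode file is written.
    """
    with open(path, "rb") as f:  # bytes let compile() honour a coding declaration
        source = f.read()
    try:
        compile(source, path, "exec")  # builds a code object in memory only
    except SyntaxError as err:
        return err
    return None


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good-script")
    bad = os.path.join(tmp, "bad-script")
    with open(good, "w") as f:
        f.write("print('ok')\n")
    with open(bad, "w") as f:
        f.write("def broken(:\n")
    good_result = syntax_error_in(good)
    bad_result = syntax_error_in(bad)
```

Passing the real path as the second argument to compile(), instead of '', makes the error message name the offending file.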
msg252638 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-09 19:23
OK, I'll change the title to reflect the current proposal, and we'll see if anyone is interested in proposing a patch.

The bug with python -m py_compile is when you do:

    python -m py_compile myscript

where myscript is a file containing python code (note there is no .py extension).  In this case you will end up with:

  __pycache__/myscriptcpython-36.pyc

(for example).  This is clearly a bug, but should be reported in a new issue.
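
The odd name appears to come from how the default cache path is derived: the interpreter tag is spliced in at the last dot of the file name, and a name without a dot has nowhere to put the separator. A short sketch, assuming py_compile takes its default target from importlib.util.cache_from_source:

```python
import importlib.util
import os
import sys

# Interpreter tag such as "cpython-36"; it can be None on some implementations.
tag = sys.implementation.cache_tag

# optimization="" pins the unoptimized name regardless of -O.
with_ext = os.path.basename(
    importlib.util.cache_from_source("myscript.py", optimization=""))
without_ext = os.path.basename(
    importlib.util.cache_from_source("myscript", optimization=""))

print(with_ext)     # "myscript." + tag + ".pyc"
print(without_ext)  # "myscript" + tag + ".pyc", with no dot before the tag
```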
msg252641 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2015-10-09 19:52
You can verify a script is syntactically correct by compiling it to an AST or just calling compile() which is another way of doing essentially what `import` does but without having to worry about import-related side-effects in the code being checked.
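
A minimal sketch of the AST route, with a made-up helper name (parses_cleanly). Note that parsing alone may accept code that the later compile stage rejects, so this is a slightly weaker check than calling compile().

```python
import ast


def parses_cleanly(source, filename="<string>"):
    """Return True if source parses to an AST, False on a SyntaxError.

    Hypothetical helper; the code being checked is never imported or run.
    """
    try:
        ast.parse(source, filename=filename)
    except SyntaxError:
        return False
    return True


valid = parses_cleanly("x = 1\n")
invalid = parses_cleanly("x = = 1\n")
print(valid, invalid)
```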

But is this really worth adding to the stdlib? You can run your tests to verify the code is syntactically sound, run pylint, etc. I think this is probably straying a bit too much into the tooling arena to make the maintenance burden worth having in the stdlib.
msg252645 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-09 20:05
Well, the thing is that py_compile *already* has all the needed logic, the flag would just allow us to add an if statement before the two lines that write the compiled bytecode out to the file system.  py_compile also has the advantage that it supports the importlib loader logic.  The goal here (from my point of view) is to have a simple command line way of checking the syntax of a script, so that last may not be important.  
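
To make the idea concrete, here is a rough sketch of the compile step without the write step, using the same importlib loader that py_compile relies on. The function name check_only is hypothetical; no such function or flag exists in py_compile.

```python
import importlib.machinery
import os
import tempfile
import types


def check_only(file):
    """Compile file through the importlib source loader, then stop.

    Hypothetical helper: it mirrors the compile half of py_compile.compile()
    and skips the part that writes the .pyc. Raises SyntaxError on bad source.
    """
    loader = importlib.machinery.SourceFileLoader("<check_only>", file)
    source_bytes = loader.get_data(file)
    return loader.source_to_code(source_bytes, file)


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "myscript")  # no .py extension needed
    with open(script, "w") as f:
        f.write("print('hello')\n")
    code = check_only(script)
    files_after = sorted(os.listdir(tmp))  # expect only the script itself

print(files_after)
```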

The python -c using 'compile(open' *is* reasonably brief, but it is not as elegant as the 'perl -c' mentioned in the linked stackoverflow question.  'python -m py_compile' isn't quite as succinct as 'perl -c', but it is a lot closer than 

  python -c "compile(open(<filename>).read(), '', 'exec')"

and a lot easier to remember.  Now, you can argue that referencing perl in this context is a bit of 'keeping up with the joneses', but I think there is an argument to be made that it is worthwhile in this case.  I won't be heartbroken if the idea gets shot down, though :)
msg252661 - Author: Pavel Roskin (proski) Date: 2015-10-09 21:15
That's what I have now:

check:
        $(PYTHON) -m py_compile $(SOURCES)
        rm -f $(addsuffix c, $(SOURCES))

make check
python -m py_compile redacted-build redacted-git-diff redacted-git-gc redacted-git-status redacted-init redacted-server redactedbuilder.py
rm -f redacted-buildc redacted-git-diffc redacted-git-gcc redacted-git-statusc redacted-initc redacted-serverc redactedbuilder.pyc

That's what David is suggesting:

check:
        for file in $(SOURCES); do \
            python -c "compile(open('$$file').read(), '', 'exec')" || exit 1; \
        done

make check
for file in redacted-build redacted-git-diff redacted-git-gc redacted-git-status redacted-init redacted-server redactedbuilder.py; do \
            python -c "compile(open('$file').read(), '', 'exec')" || exit 1; \
        done

That's what I could have if I live long enough to see Python 3.6 on my development machine.

check:
        $(PYTHON) -m py_compile --no-output $(SOURCES)

make check
python -m py_compile --no-output redacted-build redacted-git-diff redacted-git-gc redacted-git-status redacted-init redacted-server redactedbuilder.py

If that does not seem like an important improvement, then I can live with what I have.
msg252671 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-10-09 22:13
On the side issue: While the example given, which uses the py_compile.compile defaults via the command line interface, is useless, I disagree that writing a .pyc file for a file without .py is a bug.

Python will run python code with any filename as main module (and not write .pyc).  It will only import the *same code* (and normally write .pyc) if the filename ends with .py (or .pyw on windows).  However, 'import script' will import script.pyc (on the search path) without a script.py file existing.  Using py_compile.compile('script', 'script.pyc') makes that possible.  (I just tried it.)
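
The experiment described above can be sketched as follows. The module name extless_demo and its contents are placeholders chosen for illustration.

```python
import importlib
import os
import py_compile
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "extless_demo")  # script with no .py suffix
    with open(src, "w") as f:
        f.write("ANSWER = 42\n")

    # Write the bytecode next to the script under an importable name.
    py_compile.compile(src, cfile=src + ".pyc", doraise=True)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make the finder notice the new file
    try:
        module = importlib.import_module("extless_demo")
        answer = module.ANSWER
        loaded_from = os.path.basename(module.__file__)
    finally:
        # Undo the changes to interpreter state made for the demo.
        sys.path.remove(tmp)
        sys.modules.pop("extless_demo", None)

print(answer, loaded_from)
```

No extless_demo.py exists at any point; the import is satisfied by the .pyc alone.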

msg252762 - Author: Pavel Roskin (Pavel Roskin) Date: 2015-10-11 05:23
Thank you for the comments. I was annoyed by py_compile making files with names very similar to the original scripts, names that could not even be recognized by shell patterns in .gitignore unless scripts ending with "c" are banned. But that problem is addressed in Python 3. I don't really care what files are in __pycache__, I won't have that urge to remove them. And then I simply was annoyed by the fact that py_compile was ignoring my attempts to suppress its output. Now I understand the reason for that. Migrating to Python 3 would address my original problem with strange looking cache files. I'm going to close this issue.
msg296285 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-06-18 18:23
Closing, as suggested above.
History
Date  User  Action  Args
2017-06-18 18:23:13  terry.reedy  set  status: open -> closed; resolution: out of date; messages: + msg296285; stage: needs patch -> resolved
2015-10-11 05:23:06  Pavel Roskin  set  nosy: + Pavel Roskin; messages: + msg252762
2015-10-09 22:13:57  terry.reedy  set  messages: + msg252671
2015-10-09 21:15:04  proski  set  messages: + msg252661
2015-10-09 20:05:41  r.david.murray  set  messages: + msg252645
2015-10-09 19:52:59  brett.cannon  set  nosy: + brett.cannon; messages: + msg252641
2015-10-09 19:23:51  r.david.murray  set  messages: + msg252638; title: py_compile disregards PYTHONDONTWRITEBYTECODE and -B -> Add option to py_compile to compile for syntax checking without writing bytecode
2015-10-09 18:36:56  terry.reedy  set  nosy: + terry.reedy; messages: + msg252634
2015-10-09 16:37:47  eric.araujo  set  nosy: + eric.araujo; messages: + msg252626
2015-10-03 15:34:31  r.david.murray  set  versions: + Python 3.6, - Python 3.5; nosy: + r.david.murray; messages: + msg252216; type: enhancement; stage: needs patch
2015-10-03 00:52:11  proski  create