New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
py_compile.compile() replaces target files, breaking special files and symlinks #61424
Comments
Since d4eb02b6aac9 py_compile.compile() replaces target files, breaking special files and symlinks. Any custom permissions set on target files are also lost. This is a major regression. # cd /tmp |
So py_compile always replaced the file, even in Python 3.3: http://hg.python.org/cpython/file/79ea59b394bf/Lib/py_compile.py#l141 (this is why I changed the title of the bug). What did change, though, is that py_compile now writes bytecode files just as import does, which means copying the file permissions of the source file for the bytecode file. E.g. if you listed the permissions for test.py in your example output I bet it's -rw-r----- just like what /dev/null ended up with. I'm not going to change this as this is very much on purpose for security reasons; it's more of a long-standing bug that py_compile never used the proper permissions than it is changing the semantics to what they are today. Since py_compile is for compiling and writing out source code it should match what import does (if you are doing this to verify the file will parse properly, which is what writing to /dev/null suggests, it would be more accurate to just pass the raw source code through the AST module and make sure no exceptions are raised, else just write into the /tmp directory and let the OS clean up). What I will do, though, it document this fact in the docs with a "Changed in Python 3.4" note and in What's New since people should be aware of it. |
In case of /dev/null, the main problem is that it became a regular file. It was previously a character device. Writing to character devices should not replace them. What about problem with symlinks?: |
py_compile.compile() was always replacing contents of target file, but was not causing that a given link name would refer to a different inode. builtins.open() has correct behavior: # stat /dev/null |
So that all happens because importlib does an atomic write of the file which uses os.replace(): http://hg.python.org/cpython/file/83d70dd58fef/Lib/importlib/_bootstrap.py#l121 . Unless there is some way that I can't think of to have the atomic write still exist but not change the type of file it replaces I am still not willing to revert my change just for this use case. There should be a single implementation of bytecode file generation and py_compile (along with compileall) should be nothing more than convenience modules for forcing the generation of those files under the same semantics as if they were done as a side-effect of importing some source code. |
How about using the |
Use the mode how exactly? I mean isn't the problem the os.replace() call and not os.open() on the source file? |
If you want to reproduce the original file's access rights, you have to |
I meant: we want the replace to happen atomically :-) |
I think Arfever is more frustrated by the os.replace() call than the permissions. |
Ah, right. Well, there would be an argument not to use os.replace() in py_compile, since it's an offline processing step which generally shouldn't race with another (online) processing step. Still, I wonder what the use case is (apart from the /dev/null case for which the answer is simply "don't do it" :-)). |
Exactly, and I don't want to have to slice up the internal API anymore in order to support this edge case which I don't think is important enough to support. |
New changeset 55806d234653 by Brett Cannon in branch 'default': |
Antoine says:
But I think that's not necessarily true. http://mail.python.org/pipermail/python-dev/2013-May/126241.html |
Ha :) Then you know which kind of patch you have to try. |
Can someone confirm the answer to Arfrever's original question: a seemingly innocent use case of py_compile.compile(), which works fine until Python 3.3, when executed as root, can in Python 3.4 fundamentally break down a complete Posix system in a non-obvious way? |
Yes, the switch to renaming can change behaviour in some corner cases (I say "corner cases" because the expected situation is to have your pyc files be regular files, not symlinks or character devices). Python is certainly not the only application where you can bust /dev/null by specifying it as target location. Mutating a file in place is a source of unpredictable issues with concurrent access of the file being written to, which is why it was changed to renaming. I think the class of issues which was solved (presumably) is much more important than the use case of compiling to /dev/null. As for symlinks, I'd naively expect breaking symlinks to be a feature, but YMMV. |
My point is that in five years' time some existing well-tested sysadmin-like program will be run with Python 3.4 for the first time, and suddenly whole systems will break. I don't want to sound negative but this is the worst behavior ever (with the possible exception of programs that do bad things beyond your own system, like viruses). Please at least crash with an exception instead: add one easy check to py_compile.compile() and complain if cfile points to an existing non-regular file. |
I fixed syntax sanity check in Portage (package manager in Gentoo) to avoid breaking system. This sanity check is always run during upgrade of Portage itself. |
Well, we could add a dedicated function for sanity checking, then. |
That's a good point (except that symlinks should probably be allowed too). |
If someone wants to submit a patch to make the change that's fine, but they are going to have the same issue when they simply execute an import that byte-compiles as a side-effect so you will have to decide if you are only worried about py_compile. |
I'm going to toss Armin's request over to core-mentorship since the fix is easy. Is all you are after, Armin, a |
That's not really what I'm after, but that's a workaround I proposed to avoid the worst breakage. I don't have an opinion about what to do with symlinks. |
Armin: I'm going off of http://bugs.python.org/msg189428 where you say you want an exception raised. If there is something else you want to suggest that's fine as long as it's not to revert the semantics since I'm not willing to do that. |
Brett: I don't really have a constructive opinion on this matter. I could suggest putting a bit more logic in py_compile.compile(cfile=...) to revert to the old behavior if we specify an already-existing target, in order to fix the other reported issue that custom permissions set on the target file are lost. But I would understand if you decide that this is not compatible with the gains of atomic file creations. I only have a problem with replacing block devices with regular files. I don't have a particular opinion if the target file is a symlink (as long, of course, as it doesn't lead to whatever it points to being replaced, but it shouldn't). |
Thanks for the clarification, Armin! |
OK, so here was what a patch will need:
|
I'm going to resolve this issue myself, but "blog" about it on core-mentorship so others can follow along with how I resolve the bug. |
New changeset 46ef1d2af352 by Brett Cannon in branch 'default': |
Exception messages are wrong. They contain name of source file instead of target file. |
New changeset e2ccfb096e6a by Brett Cannon in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: