Allow interpreter to execute a zip file #45107
Comments
The motivation for this is that distributing a zip file is a very light and easy way to distribute a Python program with multiple packages/modules. I have done this on Linux, Mac and Windows and it works very nicely, except that you need a few extra files to bootstrap it: set PYTHONPATH to the zip file and run the main function. With this small patch, you get rid of the need for extra files.

At the bottom is a demo on Linux. On Windows, you can do a similar thing by making a file that is both a zip file and a batch file. The batch file will pass %~f0 (itself) to the -z flag of the Python interpreter.

I ran this by Guido and he seemed to think it was a fine idea. At Google, we have numerous platform-specific hacks in a program called "autopar" to solve this problem.

I have included the basic patch, but if you guys agree with this, I will add some tests and documentation. And I think it might be useful to include something in the Tools/ directory to do what update_zip.sh does below (add a __zipmain__ module and a shebang/batch file header to a zip file, to make it executable)?

I think this may also help to fix a bug with eggs: http://peak.telecommunity.com/DevCenter/setuptools#eggsecutable-scripts

IMPORTANT NOTE: Eggs with an "eggsecutable" header cannot be renamed, or invoked via symlinks. They must be invoked using their original filename, in order to ensure that, once running, pkg_resources will know what project and version is in use. The header script will check this and exit with an error if the .egg file has been renamed or is invoked via a symlink that changes its base name.

andychu testdata$ ls
# The main program you're going to run in "development mode"
andychu testdata$ ./foo.py foo bar
# Same program, packaged into a zip file
andychu testdata$ ./foo_exe.zip foo bar
# Contents of the zip file
andychu testdata$ unzip -l foo_exe.zip
# Demo script to build an executable zip file.
andychu testdata$ cat header.txt
andychu testdata$ cat update_zip.sh
# Make a regular zip file.
# Add a shebang line to it.
# Make it executable.
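The three steps named in the update_zip.sh comments (make a regular zip file, add a shebang line, make it executable) could be sketched in Python roughly as follows. This is an illustration only, not the script from the patch: the function name `build_executable_zip`, its parameters, and the default interpreter line are assumptions, and `__zipmain__` is simply the entry-module name used in the proposal above.

```python
import os
import stat
import zipfile


def build_executable_zip(out_path, modules, main_source,
                         main_name="__zipmain__",
                         shebang="#!/usr/bin/env python"):
    """Write a zip archive prefixed with a shebang line and mark it executable.

    Hypothetical helper for illustration.
    modules     -- dict mapping archive names (e.g. "pkg/mod.py") to source text
    main_source -- source text of the entry module
    main_name   -- entry module name; "__zipmain__" follows the proposal above
    shebang     -- header line; the default here is an assumption
    """
    with open(out_path, "wb") as f:
        # Add a shebang line in front of the archive data.
        f.write(shebang.encode("ascii") + b"\n")
        # Make a regular zip file, written directly after the header.
        with zipfile.ZipFile(f, "w") as zf:
            for arcname, source in modules.items():
                zf.writestr(arcname, source)
            zf.writestr(main_name + ".py", main_source)
    # Make it executable for user, group and others.
    mode = os.stat(out_path).st_mode
    os.chmod(out_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return out_path
```

Zip readers locate the archive's directory from the end of the file, which is why a text header in front of the data does not stop the file from being read as a zip.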
I like the general idea, but it should be possible to use runpy.run_module to get __name__ set correctly (as that is what happens when you execute a module from a zipfile with -m).

Another advantage of using run_module is that it would allow runzip() to take a second argument (possibly defaulting to "__zipmain__") which would specify the module to be executed from the zipfile (the remaining 3 run_module arguments could also be passed in, and set appropriately from main.c). Adding the new function as runpy.run_zip() (instead of adding a new module) would also be good.

For Windows, an alternative to making the zip file both a batch and a zip file would be to adopt a .pyz extension convention for these files - the file associations can then be set up to invoke the script appropriately with python -z (similar to the way that .pyw files are associated with pythonw instead of the standard python executable). That way the same file could be executed on both Linux (via an embedded shebang line) and on Windows (via filename association), as is the case with standard .py Python scripts.

My final question is whether the change to sys.path should be reverted once the module execution is complete - my suspicion is that it should, but I need to look into it a bit more before giving a definite answer (for the command line flag case, this behaviour obviously doesn't matter - it is only significant if the Python method is invoked directly in the context of a larger program).
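A minimal sketch of the idea in this comment, assuming a helper named `run_zip` with a second argument defaulting to "__zipmain__". The name and signature are hypothetical and follow the suggestion above, not the API of any released Python version; only `runpy.run_module` and its `run_name` parameter are real.

```python
import runpy
import sys


def run_zip(zip_path, main_module="__zipmain__"):
    """Run `main_module` from the zip file at `zip_path` as if it were __main__.

    Hypothetical helper: not part of the standard runpy module.
    Returns the globals dictionary of the executed module.
    """
    saved_path = list(sys.path)
    # Put the zip file first on sys.path so its modules can be imported.
    sys.path.insert(0, zip_path)
    try:
        # run_name="__main__" sets __name__ the way the -m switch does.
        return runpy.run_module(main_module, run_name="__main__")
    finally:
        # Revert the sys.path change once module execution is complete.
        sys.path[:] = saved_path
```

Restoring `sys.path` in a `finally` block only matters when the function is called from a larger program; for a command line flag the process exits right after the module finishes.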
Nick, you're right, I think it can use run_module and be in the runpy module. Let me make those changes and send you another patch.
Nick, I've updated the code to use a new runpy.run_zip function, which calls run_module. This does make it a bit cleaner. Let me know what you think. If the code is good I'll write some tests and documentation.

Also, I'm not sure if the '-c' is really appropriate in sys.argv, but that seems to be what the -m flag uses. It seems like it might make sense to have sys.argv[0] be the zip file, if it is really a first-class executable.

And I think a script to build one of these files would be appropriate, which I can add. You could pass it the main module and main function, and it would generate a __zipmain__ stub and add it to the zip file. It is also a good idea for the file to be cross-platform, so a .pyz extension would work.

Sorry for the delayed response, I was a bit busy at work this week... but I'll respond sooner this time. : )

Example:

andychu trunk$ testdata/foo_exe.zip foo bar

File Added: runzip7.diff
Here is a script that documents how to make such files. I think the important part is just documenting the format. Then people can write whatever tools they need around it. Many people could get by with this simple tool, but others might want something more elaborate.

Demo:

andychu testprog$ find
andychu testprog$ find -name "*.py" | xargs ../Tools/scripts/makepyz.py -a zip,pyz,unix -z foo.zip -p package1 -m foo -y /usr/local/google/clients/python/trunk/python
andychu testprog$ ./foo.zip

File Added: makepyz.py
I'm going to be off the net for a few days - I'll have a look at the updated patch when I get back late next week.
Nick, have you had a chance to look this over again? I mainly care about the -z flag support. The makepyz.py script is just a demo, though I think it is useful as documentation as well.
The new patch looks much better - the only thing is that run_zip needs to do sys.path.pop(0) to correctly remove the zipfile from the front of the path.

However, I do see your point about whether or not including the current directory on sys.path is the right thing to do for this case - it may be better to set <zipfile_name>/zipmain.py as argv[0] before invoking PySys_SetArgv, and then use __zipmain__ as the module to be executed on the same code path as the -m switch normally uses.

Rather than continuing this discussion here on SF, it may be best to post your proposal to python-dev. I personally like the idea, but a new idiom for running Python scripts will need broader support than just me. Getting input from the py2exe and py2app folks that can be found on python-dev would also be good.
Good point, however I decided to set sys.path[0] and sys.argv[0] a little differently, based on some more testing, as you can see explained in the new patch I just uploaded. Those are details; I'll post to python-dev and see what people think of the general idea. If it's accepted then we can figure out the details.

For now I made the function very specific to the -z flag. I'm not sure I have a use case for invoking a zip file from another python module. If we were to put that back in, it might be better to have 2 separate functions anyway, since this one is only 3 lines basically.

File Added: runzip8.diff
I don't see the need for that on Linux: you can do the same thing already with a shell script.

martin@mira:~$ cat runzip.sh
martin@mira:

So unless the patch adds functionality that I'm missing, I'm -1 on it.
I like the -z option - I'm in favour of that as it stands (you need to add documentation). This is what the patch covers, and I'd like to see it implemented as is.

The helper script is useful, but not essential. To include it in the distribution, you'd have to consider how to deploy it: module executable via -m, .py file in the Scripts directory, shell script/.bat file in the Scripts directory. Of these, only a module using -m is really portable. It may be easier just to have it as sample code in the documentation which can be cut and pasted as required. (That's what I'd recommend).

For Windows, if you expect to define a file extension for these files, you need to consider console vs GUI issues. File extensions are more useful in a GUI context, so maybe .pyz files should be executed with "pythonw -z". Or maybe there should be 2 extensions, .pyz (console) and .pwz (GUI)? I don't have an answer to this, and honestly, if there's any controversy, I wouldn't bother, but just leave it to the user to decide and implement a local solution (much as Python doesn't add its directory to %PATH%). If you wanted to define a standard, you'd need patches to the Windows MSI builder to implement it.
Martin, your trick won't work if you remove "foo.py" from the directory you ran "bar". ;)
Patch implementing an alternate approach: support automatically |
I like PJE's approach, and the patch works for me. About the only thing I'd change is to switch the expression in
An optimising compiler is going to produce similar code either way, and Adding a simple test of the functionality to test_cmd_line would also be |
PJE's patch looks OK. I agree with Nick that the chain of &&s in |
PJE's patch looks good to me too. Stylistic nits:
Attached an updated version of PJE's patch with the suggested cleanups The basic tests and the directory tests are currently working, but for I'm posting the patch anyway to see if anyone else can spot where it's |
I worked out what was wrong with my unit tests (I was incorrectly I've updated the patch here, and will be committing the change once the |
Committed as rev 59039 (now to see how the buildbots react for other |
Reverted status to open until I figure out why the tests are failing on |
I can look into this, as I have OSX on my laptop. |
Actually the failures aren't OSX-specific:

======================================================================
Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 117, in test_directory
    self._check_script(script_dir, script_name, script_dir)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError: /usr/local/google/home/guido/python/py3k/python: '/tmp/tmpLGqOxc' is a directory, cannot continue

======================================================================
Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 124, in test_directory_compiled
    self._check_script(script_dir, compiled_name, script_dir)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError: /usr/local/google/home/guido/python/py3k/python: '/tmp/tmprNwPih' is a directory, cannot continue

======================================================================
Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 130, in test_zipfile
    self._check_script(zip_name, None, zip_name)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError:   File "/tmp/tmpInCAJO/test_zip.zip", line 1
    PK# statements being executed
    ^
SyntaxError: invalid syntax
[25429 refs]

======================================================================
Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 137, in test_zipfile_compiled
    self._check_script(zip_name, None, zip_name)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError:   File "/tmp/tmpqh6g1C/test_zip.zip", line 1
SyntaxError: Non-UTF-8 code starting with '\xc8' in file /tmp/tmpqh6g1C/test_zip.zip on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
[25428 refs]
Oops, those are failures under 3.0, probably due to Crys's merge. On |
Fixed the OSX failure in revision 59055; it was due to /tmp being a Keeping this open until the 3.0 version is working. |
3.0 fix committed as revision 59058.
Updated issue title to reflect what was actually implemented |
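For readers arriving from the original -z proposal: in released Python versions (2.6 and later) the interpreter accepts a directory or a zip file as its script argument and runs the `__main__.py` found at the top level, with no extra flag. The sketch below exercises both cases, mirroring the test_directory and test_zipfile names seen in the tracebacks above. The helper `run_with_interpreter`, the `demo` function and the file names are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

# Entry point that echoes its command line arguments.
MAIN_SOURCE = "import sys\nprint('args: ' + ' '.join(sys.argv[1:]))\n"


def run_with_interpreter(path, *args):
    """Run `python <path> <args...>` and return its standard output as text."""
    result = subprocess.run([sys.executable, path, *args],
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode().strip()


def demo():
    """Execute a directory and a zip file that both contain __main__.py."""
    workdir = tempfile.mkdtemp()
    # A directory whose top level contains __main__.py ...
    script_dir = os.path.join(workdir, "app_dir")
    os.mkdir(script_dir)
    main_path = os.path.join(script_dir, "__main__.py")
    with open(main_path, "w") as f:
        f.write(MAIN_SOURCE)
    # ... and a zip file with the same layout.
    zip_name = os.path.join(workdir, "app.zip")
    with zipfile.ZipFile(zip_name, "w") as zf:
        zf.write(main_path, "__main__.py")
    return (run_with_interpreter(script_dir, "foo", "bar"),
            run_with_interpreter(zip_name, "foo", "bar"))


if __name__ == "__main__":
    print(demo())
```

Both calls go through the same interpreter entry point, so the program behaves the same whether it is shipped as a directory or as a single zip file.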