Allow interpreter to execute a zip file #45107

Closed
andy-chu mannequin opened this issue Jun 19, 2007 · 26 comments
@andy-chu
Mannequin

andy-chu mannequin commented Jun 19, 2007

BPO 1739468
Nosy @gvanrossum, @loewis, @pfmoore, @pjeby, @ncoghlan, @avassalotti
Files
  • runzip6.diff: Patch to add the -z interpreter flag
  • runzip7.diff
  • makepyz.py
  • runzip8.diff: Patch to add the -z interpreter flag
  • runmain.patch: Alternate approach that doesn't need a command-line option
  • runmain_with_tests.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2007-11-19.18:39:55.770>
    created_at = <Date 2007-06-19.03:40:37.000>
    labels = ['interpreter-core']
    title = 'Allow interpreter to execute a zip file'
    updated_at = <Date 2008-02-23.05:56:24.830>
    user = 'https://bugs.python.org/andy-chu'

    bugs.python.org fields:

    activity = <Date 2008-02-23.05:56:24.830>
    actor = 'ncoghlan'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2007-11-19.18:39:55.770>
    closer = 'gvanrossum'
    components = ['Interpreter Core']
    creation = <Date 2007-06-19.03:40:37.000>
    creator = 'andy-chu'
    dependencies = []
    files = ['8053', '8054', '8055', '8056', '8335', '8770']
    hgrepos = []
    issue_num = 1739468
    keywords = ['patch']
    message_count = 26.0
    messages = ['52772', '52773', '52774', '52775', '52776', '52777', '52778', '52779', '52780', '52781', '52782', '52783', '55338', '55824', '55826', '55855', '57605', '57613', '57614', '57634', '57637', '57641', '57645', '57647', '57650', '62719']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'loewis', 'paul.moore', 'pje', 'ncoghlan', 'andy-chu', 'alexandre.vassalotti']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1739468'
    versions = ['Python 2.6', 'Python 3.0']

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jun 19, 2007

    The motivation for this is that distributing a zip file is a very light and easy way to distribute a python program with multiple packages/modules. I have done this on Linux, Mac and Windows and it works very nicely -- except that you need a few extra files to bootstrap it: set PYTHONPATH to the zip file and run the main function.

    With this small patch, you get rid of the need for extra files. At the bottom is a demo on Linux.

    On Windows, you can do a similar thing by making a file that is both a zip file and a batch file. The batch file will pass %~f0 (itself) to the -z flag of the Python interpreter.

    I ran this by Guido and he seemed to think it was a fine idea. At Google, we have numerous platform-specific hacks in a program called "autopar" to solve this problem.

    I have included the basic patch, but if you guys agree with this, I will add some tests and documentation. And I think it might be useful to include something in the Tools/ directory to do what update_zip.sh does below (add a __zipmain__ module and a shebang/batch file header to a zip file, to make it executable)?

    I think this may also help to fix a bug with eggs:

    http://peak.telecommunity.com/DevCenter/setuptools#eggsecutable-scripts

    IMPORTANT NOTE: Eggs with an "eggsecutable" header cannot be renamed, or invoked via symlinks. They must be invoked using their original filename, in order to ensure that, once running, pkg_resources will know what project and version is in use. The header script will check this and exit with an error if the .egg file has been renamed or is invoked via a symlink that changes its base name.

    andychu testdata$ ls
    __zipmain__.py foo.py foo.pyc foo.zip foo_exe.zip header.txt update_zip.sh

    # The main program you're going to run in "development mode"

    andychu testdata$ ./foo.py foo bar
    argv: ['./foo.py', 'foo', 'bar']

    # Same program, packaged into a zip file

    andychu testdata$ ./foo_exe.zip foo bar
    argv: ['-c', 'foo', 'bar']

    # Contents of the zip file

    andychu testdata$ unzip -l foo_exe.zip
    Archive: foo_exe.zip
    warning [foo_exe.zip]: 51 extra bytes at beginning or within zipfile
    (attempting to process anyway)
      Length     Date   Time    Name
     --------    ----   ----    ----
          243  06-18-07 20:01   __zipmain__.py
          301  06-18-07 19:58   foo.py
     --------                   -------
          544                   2 files

    # Demo script to build an executable zip file.

    andychu testdata$ cat header.txt
    #!/usr/local/google/clients/python/trunk/python -z

    andychu testdata$ cat update_zip.sh
    #!/bin/bash

    # Make a regular zip file.
    zip foo.zip foo.py __zipmain__.py

    # Add a shebang line to it.
    cat header.txt foo.zip > foo_exe.zip

    # Make it executable.
    chmod 750 foo_exe.zip
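
The update_zip.sh steps above can also be sketched in Python. This is only an illustration, not part of the patch: the function name `build_executable_zip` and the interpreter path used in the demo are assumptions, and the `-z` flag exists only in the patch proposed in this issue.

```python
import io
import os
import tempfile
import zipfile


def build_executable_zip(sources, target, header):
    """Write `header` plus a zip archive of `sources` to `target`.

    `sources` maps archive member names to their text contents.
    """
    # Build a regular zip archive in memory.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for arcname, text in sources.items():
            zf.writestr(arcname, text)
    # Prepend the shebang line, like `cat header.txt foo.zip > foo_exe.zip`.
    with open(target, "wb") as out:
        out.write(header.encode("ascii") + b"\n")
        out.write(buf.getvalue())
    # Make it executable, like `chmod 750 foo_exe.zip`.
    os.chmod(target, 0o750)
    return target


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = build_executable_zip(
        {"foo.py": "def main():\n    return 0\n",
         "__zipmain__.py": "import foo\nfoo.main()\n"},
        os.path.join(workdir, "foo_exe.zip"),
        "#!/usr/bin/python -z",  # hypothetical path; -z is the proposed flag
    )
    print("wrote", path)
```

The zip format locates its directory from the end of the file, which is why tools such as `unzip` (with a warning) and Python's `zipfile` module can still read the archive after the header is prepended.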

    @andy-chu andy-chu mannequin assigned ncoghlan Jun 19, 2007
    @andy-chu andy-chu mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 19, 2007
    @ncoghlan
    Contributor

    I like the general idea, but it should be possible to use runpy.run_module to get __name__ set correctly (as that is what happens when you execute a module from a zipfile with -m). Another advantage of using run_module is that it would allow runzip() to take a second argument (possibly defaulting to "__zipmain__") which would specify the module to be executed from the zipfile (the remaining 3 run_module arguments could also be passed in, and set appropriately from main.c).

    Adding the new function as runpy.run_zip() (instead of adding a new module) would also be good.

    For Windows, an alternative to making the zip file both a batch and a zip file would be to adopt a .pyz extension convention for these files - the file associations can then be set up to invoke the script appropriately with python -z (similar to the way that .pyw files are associated with pythonw instead of the standard python executable). That way the same file could be executed on both Linux (via an embedded shebang line) and on Windows (via filename association), as is the case with standard .py Python scripts.

    My final question is whether the change to sys.path should be reverted once the module execution is complete - my suspicion is that it should, but I need to look into it a bit more before giving a definite answer (for the command line flag case, this behaviour obviously doesn't matter - it is only significant if the Python method is invoked directly in the context of a larger program).
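
A rough sketch of the helper discussed in this comment, assuming it puts the zip file at the front of sys.path, delegates to `runpy.run_module`, and then restores sys.path. The name `run_zip` and the `__zipmain__` default come from this thread; the body is an assumption, not the code from the patch.

```python
import runpy
import sys


def run_zip(zip_path, mod_name="__zipmain__"):
    """Run `mod_name` from `zip_path` as __main__ and return its globals."""
    saved_path = sys.path[:]
    # Put the archive first so its modules win over same-named ones elsewhere.
    sys.path.insert(0, zip_path)
    try:
        # run_name="__main__" makes `if __name__ == "__main__":` blocks fire.
        return runpy.run_module(mod_name, run_name="__main__")
    finally:
        # Revert the sys.path change so a larger calling program is unaffected.
        sys.path[:] = saved_path
```

Restoring a saved copy rather than popping index 0 keeps the cleanup correct even if the executed module modifies sys.path itself.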

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jun 21, 2007

    Nick, you're right, I think it can use run_module and be in the runpy module. Let me make those changes and send you another patch.

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jun 27, 2007

    Nick, I've updated the code to use a new runpy.run_zip function, which calls run_module. This does make it a bit cleaner.

    Let me know what you think. If the code is good I'll write some tests and documentation.

    Also, I'm not sure if the '-c' is really appropriate in sys.argv, but that seems to be what the -m flag uses. It seems like it might make sense to have sys.argv[0] be the zip file, if it is really a first class executable.

    And I think a script to build one of these files would be appropriate, which I can add. You could pass it the main module and main function, and it would generate a __zipmain__ stub and add it to the zip file. And it is a good idea if the file is cross platform, so a .pyz extension would work.

    Sorry for the delayed response, I was a bit busy at work this week... but I'll respond sooner this time. : )

    Example:

    andychu trunk$ testdata/foo_exe.zip foo bar
    __name__: __main__
    argv: ['-c', 'foo', 'bar']
    andychu trunk$ echo $?
    3

    File Added: runzip7.diff
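
The stub-generating script described in this comment might look roughly like the sketch below. The stub layout and the names `make_zipmain_stub` and `add_zipmain` are hypothetical; the thread does not show the stub that the real script writes.

```python
import zipfile

# Hypothetical stub layout: import the main module and exit with the
# return value of its main function.
STUB_TEMPLATE = (
    "import sys\n"
    "import {module}\n"
    "\n"
    "sys.exit({module}.{function}())\n"
)


def make_zipmain_stub(module, function="main"):
    """Return source for a __zipmain__ stub that calls module.function()."""
    for part in module.split(".") + [function]:
        if not part.isidentifier():
            raise ValueError("not a valid Python name: %r" % part)
    return STUB_TEMPLATE.format(module=module, function=function)


def add_zipmain(zip_path, module, function="main"):
    """Append the generated stub to an existing zip file as __zipmain__.py."""
    with zipfile.ZipFile(zip_path, "a") as zf:
        zf.writestr("__zipmain__.py", make_zipmain_stub(module, function))
```

For example, `add_zipmain("foo.zip", "package1.foo")` would add a stub that imports `package1.foo` and calls its `main()`.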

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jun 28, 2007

    Here is a script that documents how to make such files. I think the important part is just documenting the format. Then people can write whatever tools they need around it. Many people could get by with this simple tool, but others might want something more elaborate.

    Demo:

    andychu testprog$ find
    .
    ./__init__.py
    ./package1
    ./package1/__init__.py
    ./package1/foo.py
    ./package1/lib.py
    ./package1/__init__.pyc
    ./package1/lib.pyc
    ./package1/foo.pyc

    andychu testprog$ find -name "*.py" | xargs ../Tools/scripts/makepyz.py -a zip,pyz,unix -z foo.zip -p package1 -m foo -y /usr/local/google/clients/python/trunk/python
    Added ./__init__.py to foo.zip
    Added ./package1/__init__.py to foo.zip
    Added ./package1/foo.py to foo.zip
    Added ./package1/lib.py to foo.zip
    Added __zipmain__.py to foo.zip
    Prepended #!/usr/local/google/clients/python/trunk/python -z to foo.zip.
    chmod foo.zip 0700

    andychu testprog$ ./foo.zip
    lib module here
    argv: ['-c']
    andychu testprog$ echo $?
    3

    File Added: makepyz.py

    @ncoghlan
    Contributor

    I'm going to be off the net for a few days - I'll have a look at the updated patch when I get back late next week.

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jul 7, 2007

    Nick, have you had a chance to look this over again? I mainly care about the -z flag support. The makepyz.py script is just a demo, though I think it is useful as documentation as well.

    @ncoghlan
    Contributor

    ncoghlan commented Jul 8, 2007

    The new patch looks much better - the only thing is that run_zip needs to do sys.path.pop(0) to correctly remove the zipfile from the front of the path.

    However, I do see your point about whether or not including the current directory on sys.path is the right thing to do for this case - it may be better to set <zipfile_name>/__zipmain__.py as argv[0] before invoking PySys_SetArgv, and then use __zipmain__ as the module to be executed on the same code path as the -m switch normally uses.

    Rather than continuing this discussion here on SF, it may be best to post your proposal to python-dev. I personally like the idea, but a new idiom for running Python scripts will need broader support than just me. Getting input from the py2exe and py2app folks that can be found on python-dev would also be good.

    @andy-chu
    Mannequin Author

    andy-chu mannequin commented Jul 11, 2007

    Good point; however, I decided to set sys.path[0] and sys.argv[0] a little differently, based on some more testing, as explained in the new patch I just uploaded.

    Those are details; I'll post to python-dev and see what people think of the general idea. If it's accepted then we can figure out the details. For now I made the function very specific to the -z flag.

    I'm not sure I have a use case for invoking a zip file from another python module. If we were to put that back in, it might be better to have 2 separate functions anyway, since this one is only 3 lines basically.

    File Added: runzip8.diff

    @loewis
    Mannequin

    loewis mannequin commented Jul 12, 2007

    I don't see the need for that on Linux: you can do the same thing already with a shell script.
    In the example below, foo.zip contains foo.py.

    martin@mira:~$ cat runzip.sh
    #!/bin/sh
    export PYTHONPATH=$0
    exec python -c 'import foo;foo.main()'
    # THE END

    martin@mira:~$ cat runzip.sh foo.zip >bar
    martin@mira:~$ chmod +x bar
    martin@mira:~$ ./bar
    hello

    So unless that adds a functionality that I'm missing, I'm -1 on this patch.

    @pfmoore
    Member

    pfmoore commented Jul 12, 2007

    I like the -z option - I'm in favour of that as it stands (you need to add documentation). This is what the patch covers, and I'd like to see it implemented as is.

    The helper script is useful, but not essential. To include in the distribution, you'd have to consider how to deploy it: module executable via -m, .py file in the Scripts directory, shell script/.bat file in the Scripts directory. Of these, only a module using -m is really portable. It may be easier just to have it as sample code in the documentation which can be cut and pasted as required. (That's what I'd recommend).

    For Windows, if you expect to define a file extension for these files, you need to consider console vs GUI issues. File extensions are more useful in a GUI context, so maybe .pyz files should be executed with "pythonw -z". Or maybe there should be 2 extensions, .pyz (console) and .pwz (GUI)? I don't have an answer to this, and honestly, if there's any controversy, I wouldn't bother, but just leave it to the user to decide and implement a local solution (much as Python doesn't add its directory to %PATH%). If you wanted to define a standard, you'd need patches to the Windows MSI builder to implement it.

    @avassalotti
    Member

    Martin, your trick won't work if you remove "foo.py" from the directory you ran "bar" in. ;)

    @pjeby
    Mannequin

    pjeby mannequin commented Aug 27, 2007

    Patch implementing an alternate approach: support automatically
    importing __main__ when sys.argv[0] is an importable path. This allows
    zip files, directories, and any future importable locations (e.g. URLs)
    to be used on the command line. Note that this also means that you
    don't need an option on the #! line in a zip file, which avoids hairy #!
    issues on platforms like Linux (where a #! line can have at most one
    argument). __main__ is used instead of __zipmain__, since it is not
    zipfile specific.
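
A small demonstration of the behaviour described in this comment, assuming an interpreter that includes the change (Python 2.6 or later): both a directory and a zip file that contain a `__main__.py` can be passed directly to the interpreter. The helper names and file names below are made up for the example.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

MAIN_SOURCE = 'print("running", __name__)\n'


def make_demo_targets(workdir):
    """Create a directory and a zip file that each contain __main__.py."""
    app_dir = os.path.join(workdir, "app_dir")
    os.mkdir(app_dir)
    with open(os.path.join(app_dir, "__main__.py"), "w") as f:
        f.write(MAIN_SOURCE)
    app_zip = os.path.join(workdir, "app.zip")
    with zipfile.ZipFile(app_zip, "w") as zf:
        zf.writestr("__main__.py", MAIN_SOURCE)
    return app_dir, app_zip


def run_path_entry(path):
    """Run `python <path>` with the current interpreter; return its stdout."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    for target in make_demo_targets(tempfile.mkdtemp()):
        print(target, "->", run_path_entry(target).strip())
```

Because no interpreter option is involved, a shebang line at the top of such a zip file only needs to name the interpreter.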

    @ncoghlan
    Contributor

    I like PJE's approach, and the patch works for me.

    About the only thing I'd change is to switch the expression in
    PyImport_GetImporter to a simple chain of if-statements in order to:

    • silence the warning from GCC about an unused value
    • make it more obvious to a reader what the function is doing

    An optimising compiler is going to produce similar code either way, and
    it took me a moment to realise that the && operations are being used
    purely for their short-circuiting effect, even though there is no real
    advantage to using an expression instead of a statement at that point in
    the code.

    Adding a simple test of the functionality to test_cmd_line would also be
    good.

    @pfmoore
    Member

    pfmoore commented Sep 11, 2007

    PJE's patch looks OK. I agree with Nick that the chain of &&s in
    PyImport_GetImporter should be expanded into a chain of ifs. As it
    stands, the code is needlessly obfuscated.

    @gvanrossum
    Member

    PJE's patch looks good to me too.

    Stylistic nits:

    • The proper name of the now-public null importer type ought to be
      PyNullImporter_Type, to rhyme with e.g. PyString_Type

    • There's a multi-line if that has the closing parenthesis in an odd
      place at the start of the next line. The preferred style is to place the
      close paren after the last condition, and put the open curly on a line
      by itself.

    @ncoghlan
    Contributor

    Attached an updated version of PJE's patch with the suggested cleanups
    and a new unit test file (test_cmd_line_script.py). Finding the
    roundtuits to finish the latter is actually what has taken me so long.

    The basic tests and the directory tests are currently working, but for
    some reason the zipfile tests are attempting to load __main__ using
    pkgutil.ImpLoader instead of the zipimport module.

    I'm posting the patch anyway to see if anyone else can spot where it's
    going wrong before I find some more time to try and figure it out for
    myself.

    @ncoghlan
    Contributor

    I worked out what was wrong with my unit tests (I was incorrectly
    including the path information when adding the test script to the zipfile).

    I've updated the patch here, and will be committing the change once the
    test suite finishes running.
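
The kind of mistake described here can be illustrated with the `zipfile` module. This is an assumed reconstruction, since the failing test code is not shown in this thread, and the function name `zip_script` is hypothetical.

```python
import os
import zipfile


def zip_script(script_path, zip_path):
    """Store `script_path` at the top level of a new zip file."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        # Without arcname, the member name would keep the directory part of
        # script_path, so __main__.py would not sit at the archive root.
        zf.write(script_path, arcname=os.path.basename(script_path))
    return zip_path
```

The interpreter looks for `__main__.py` at the root of the archive, so a member stored under a directory path is not picked up as the entry point.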

    @ncoghlan
    Contributor

    Committed as rev 59039 (now to see how the buildbots react for other
    platforms...)

    @ncoghlan
    Contributor

    Reverted status to open until I figure out why the tests are failing on
    the Mac OSX buildbot.

    @ncoghlan ncoghlan reopened this Nov 19, 2007
    @gvanrossum
    Member

    I can look into this, as I have OSX on my laptop.

    @gvanrossum
    Member

    Actually the failures aren't OSX-specific:

    ======================================================================
    FAIL: test_directory (__main__.CmdLineTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Lib/test/test_cmd_line_script.py", line 117, in test_directory
        self._check_script(script_dir, script_name, script_dir)
      File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
        self.assertEqual(exit_code, 0, data)
    AssertionError: /usr/local/google/home/guido/python/py3k/python:
    '/tmp/tmpLGqOxc' is a directory, cannot continue

    ======================================================================
    FAIL: test_directory_compiled (__main__.CmdLineTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Lib/test/test_cmd_line_script.py", line 124, in test_directory_compiled
        self._check_script(script_dir, compiled_name, script_dir)
      File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
        self.assertEqual(exit_code, 0, data)
    AssertionError: /usr/local/google/home/guido/python/py3k/python:
    '/tmp/tmprNwPih' is a directory, cannot continue

    ======================================================================
    FAIL: test_zipfile (__main__.CmdLineTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Lib/test/test_cmd_line_script.py", line 130, in test_zipfile
        self._check_script(zip_name, None, zip_name)
      File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
        self.assertEqual(exit_code, 0, data)
    AssertionError:   File "/tmp/tmpInCAJO/test_zip.zip", line 1
        PK# statements being executed
          ^
    SyntaxError: invalid syntax
    [25429 refs]

    ======================================================================
    FAIL: test_zipfile_compiled (__main__.CmdLineTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Lib/test/test_cmd_line_script.py", line 137, in test_zipfile_compiled
        self._check_script(zip_name, None, zip_name)
      File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
        self.assertEqual(exit_code, 0, data)
    AssertionError:   File "/tmp/tmpqh6g1C/test_zip.zip", line 1
    SyntaxError: Non-UTF-8 code starting with '\xc8' in file
    /tmp/tmpqh6g1C/test_zip.zip on line 2, but no encoding declared; see
    http://python.org/dev/peps/pep-0263/ for details
    [25428 refs]

    @gvanrossum
    Member

    Oops, those are failures under 3.0, probably due to Crys's merge. On
    Linux, the 2.6 version of the test doesn't fail. I see 2 failing tests
    on OSX with the 2.6 version, which I will look into.

    @gvanrossum
    Member

    Fixed the OSX failure in revision 59055; it was due to /tmp being a
    symlink, and fixed by application of realpath().

    Keeping this open until the 3.0 version is working.
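
The symlink problem mentioned above can be reproduced in miniature. This sketch is not the code from revision 59055; it only shows, under the assumption that the test compared a path it built against a path reported by the child process, why `os.path.realpath()` makes such a comparison robust. The helper name `same_location` is hypothetical.

```python
import os
import tempfile


def same_location(path_a, path_b):
    """Compare two paths after resolving any symlinks in them."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    real_dir = os.path.join(base, "real")
    alias = os.path.join(base, "alias")
    os.mkdir(real_dir)
    os.symlink(real_dir, alias)  # stands in for /tmp being a symlink
    print("plain comparison:", alias == real_dir)
    print("after realpath:  ", same_location(alias, real_dir))
```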

    @gvanrossum
    Member

    3.0 fix committed as revision 59058.

    @ncoghlan
    Contributor

    Updated issue title to reflect what was actually implemented.

    @ncoghlan ncoghlan changed the title Add a -z interpreter flag to execute a zip file Allow interpreter to execute a zip file Feb 23, 2008
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022