Disabling changing sys.argv[0] with runpy.run_module(...alter_sys=True) #70576
For the purposes of pex (https://github.com/pantsbuild/pex), it would be useful to allow calling run_module without sys.argv[0] changing. In general, this behavior is useful if the script intends to re-exec itself (so it needs to know the original arguments that it was started with). To make run_module more useful in general, I propose adding a
|
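For reference, the existing behaviour this request is about can be observed with a small script. This is only a sketch: the module name `argv_probe_demo` and the helper `argv0_seen_by_module` are made up for the demonstration, and the module is written to a temporary directory so nothing needs to be installed.

```python
import os
import runpy
import sys
import tempfile


def argv0_seen_by_module():
    """Run a throwaway module and report the sys.argv[0] it observes."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module, created only for this demonstration.
        path = os.path.join(tmp, "argv_probe_demo.py")
        with open(path, "w") as f:
            f.write("import sys\nseen_argv0 = sys.argv[0]\n")
        sys.path.insert(0, tmp)
        try:
            # alter_sys=True makes runpy temporarily replace sys.argv[0]
            # with the file name of the module being executed.
            module_globals = runpy.run_module("argv_probe_demo", alter_sys=True)
        finally:
            sys.path.remove(tmp)
        return module_globals["seen_argv0"], path


if __name__ == "__main__":
    seen, path = argv0_seen_by_module()
    print("argv[0] inside the module:", seen)
    print("argv[0] after the call:   ", sys.argv[0])
```

The module sees its own file name in sys.argv[0] rather than the value the process was started with, which is what gets in the way of a re-exec; runpy restores the original value once the call returns.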
python-dev thread: https://mail.python.org/pipermail/python-dev/2016-February/143374.html Notably: |
This seems like a reasonable enhancement to me, but I'd appreciate your thoughts on how it might relate to a couple of other proposed ideas:
My own current assessment is "this RFE neither helps nor hinders those other ideas", so this can also be read as an attempt to get a potential contributor with an interest in runpy to take a look at those long-languishing ideas to see if they pique your interest ;) |
The PSF also has a Contributor Licensing Agreement in place for contributions to CPython, so if you could sign that, that would be great: https://www.python.org/psf/contrib/contrib-form/ Highlights from a contributor point of view:
|
Looks like my signed CLA just made it through, so that should be settled. For the other bugs, overloading run_module & run_path seems to be getting a bit cumbersome, so it might make sense to have some sort of Runner object that has things like Either way, I think that's outside the scope of this change. Unfortunately this is already shaving a yak for me (facebook/buck#651 (comment)) and I'd rather not go deeper. |
So how might I get this patch committed? :) |
ping |
Thanks for the ping. The actual code changes look OK to me for the initially proposed design, which means the main missing piece would be documentation updates (both to the docstrings and to runpy module reference). However, thinking about how those docs might be written, I'm starting to think the specific proposed design would be inherently confusing for folks that aren't already familiar with runpy's internals, and it might be better to use a callback API with a few predefined helper functions. That is, suppose the new parameter was a callback accepting the following parameters:
And then we have 3 predefined callbacks/callback factories in runpy:

def keep_argv(module, argv):
    return argv

def set_argv0(module, argv):
    return [module.__file__] + argv[1:]

def set_argv(argv, *, implicit_argv0=True):
    argv_override = list(argv)
    if implicit_argv0:
        def _set_argv(module, original_argv):
            return [module.__file__] + argv_override
    else:
        def _set_argv(module, original_argv):
            return argv_override
    return _set_argv

Then the three scenarios in your original post would look like:

runpy.run_module(mod_name, argv=set_argv0)
runpy.run_module(mod_name, argv=keep_argv)
runpy.run_module(mod_name, argv=set_argv(iterable))

(and similarly for run_path)

"argv=None" would be the default, and equivalent to specifying "argv=set_argv0".

"argv=set_argv(iterable, implicit_argv0=False)" would allow even more precise control of the argv settings seen by the running module.

Future and/or custom variations would be straightforward, since we're just passing in a callable.

The documentation benefit is that the "argv" parameter docs can just describe the callback signature and the use of "set_argv0" as the default, with the details of the individual behaviours moving to the definitions of the corresponding functions. |
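To make the callback idea concrete, here is a rough sketch of how it could be emulated on top of today's runpy. Everything in it is an assumption for illustration: `run_module_with_argv` is a hypothetical wrapper, not a runpy function, and the `module` argument handed to the callback is a stand-in object carrying only `__file__`, since the real module does not exist before it runs.

```python
import importlib.util
import runpy
import sys
import types


def keep_argv(module, argv):
    # Leave sys.argv exactly as the caller had it.
    return argv


def set_argv0(module, argv):
    # Mirror the current alter_sys=True behaviour: replace only argv[0].
    return [module.__file__] + argv[1:]


def run_module_with_argv(mod_name, argv=None):
    """Hypothetical wrapper (not part of runpy) applying an argv callback."""
    callback = set_argv0 if argv is None else argv
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        raise ImportError(f"cannot locate module {mod_name!r}")
    # Stand-in for the module object: only __file__ is provided here.
    stand_in = types.SimpleNamespace(__file__=spec.origin)
    saved_argv = sys.argv
    sys.argv = list(callback(stand_in, list(saved_argv)))
    try:
        # alter_sys=False so that runpy itself leaves sys.argv alone.
        return runpy.run_module(mod_name, alter_sys=False)
    finally:
        sys.argv = saved_argv
```

With this sketch, `run_module_with_argv(mod_name, argv=keep_argv)` runs the module with the caller's sys.argv untouched, while omitting `argv` reproduces the argv[0] replacement.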
Hey Nick, Sorry for the long delay. Unfortunately Python isn't my main work language anymore so working on this has proved to be quite a context switch. I'm going to try to finish this up now. The attached patch implements a new pattern for wrapping runpy - one that I hope is a bit more general than just setting argv. In particular, using the new load_module/load_path doesn't automatically change argv at all when calling run. The callers can do pretty much whatever they want before calling run(). From a docs perspective this is quite a bit easier to understand. You call runpy.load_* to find the module, change whatever you want: globals/module dict values/names/etc, and call .run(). We would even expose a convenient `with runpy.ModifiedArgv(argv):` to help with the argv swapping. "Simple" use-cases can continue to use runpy.run_* without needing to get into the save/restore we do here. Let me know what you think of this approach and I can flesh out the docs around it. Otherwise, I'm more than happy to implement the callback approach you suggested. Thanks, |
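The argv-swapping helper mentioned above could look roughly like the following. This is a guess at its shape rather than the code from the attached patch: the class below is a standalone stand-in named after the proposed `runpy.ModifiedArgv`, and it simply saves and restores sys.argv around a block.

```python
import sys


class ModifiedArgv:
    """Hypothetical stand-in for the proposed runpy.ModifiedArgv helper."""

    def __init__(self, argv):
        # Copy so later changes to the caller's list do not leak in.
        self._argv = list(argv)
        self._saved = None

    def __enter__(self):
        self._saved = sys.argv
        sys.argv = self._argv
        return self

    def __exit__(self, *exc_info):
        # Always restore the original list object, even on error.
        sys.argv = self._saved
        return False


if __name__ == "__main__":
    with ModifiedArgv(["prog", "--flag"]):
        print(sys.argv)
    print(sys.argv)
```

Code run inside the `with` block sees the replacement argv; the original list is put back on exit whether or not the block raised.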
Ah, very nice. (And no worries on taking an as-you-have-time approach to this - you'll see from the dates on some of the referenced issues below that even I'm in that situation where runpy is concerned) I think you're right that offering a 2-phase load/run API will make a lot of sense to folks already familiar with the find/exec model for imports, and it also aligns with this old design concept for making runpy friendlier to modules that need access to the created globals even if the execution step fails: http://bugs.python.org/issue9325#msg133833 I'd just completely missed that that idea was potentially relevant here as well :) I'll provide a few more detailed comments inline. The scale of the refactoring does make me wonder if there might be a way to account for the "target module" idea in http://bugs.python.org/issue19982 though. |
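The find/exec model referred to here already exists in importlib, and a two-phase flow in that style can be sketched with it. Note that this uses the import system directly and is only an analogy for the proposed runpy.load_*/run() API; the function names `load_module` and `run_loaded` are made up for the example.

```python
import importlib.util


def load_module(mod_name):
    """Phase 1: locate the module and create it without running its code."""
    spec = importlib.util.find_spec(mod_name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot locate module {mod_name!r}")
    module = importlib.util.module_from_spec(spec)
    return spec, module


def run_loaded(spec, module):
    """Phase 2: execute the module body in the prepared namespace."""
    spec.loader.exec_module(module)
    return vars(module)


if __name__ == "__main__":
    spec, module = load_module("colorsys")
    # Between the two phases the caller can adjust the namespace freely.
    module.injected_marker = "set before run"
    namespace = run_loaded(spec, module)
    print("rgb_to_hsv" in namespace, namespace["injected_marker"])
```

Because the module object exists before any code runs, the caller keeps a handle on its namespace even if the execution step fails partway through, which is the property the linked design concept was after.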
Hey Nick, Definitely agree that this refactor is big enough to try adding target modules. There's a somewhat hidden feature in the second patch that does this: Given that goal, I'm a bit worried about how to accurately describe the behavior of
I'm leaning towards the second option for API symmetry. The largest hurdle is defining a behavior w.r.t. globals that is least surprising. Maybe something like - if set to True, the globals in the target will be overwritten (i.e. .update) with the globals in the runner when Separately, what needs this type of behavior, other than for backwards compatibility? Do you know of any specific use-case? It feels like almost everything should be covered by a combination of add_to_sys_modules (i.e. temporary modules in sys.modules) and inspecting runner.globals after execution. What do you think? Mike. |