classification
Title: Provide `sys.executable_argv` for host application's command line arguments
Type: enhancement
Stage:
Components:
Versions:

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, haypo, jwilk, ncoghlan, steven.daprano
Priority: normal
Keywords:

Created on 2017-03-20 06:33 by ncoghlan, last changed 2017-03-22 07:22 by ncoghlan.

Messages (13)
msg289873 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-03-20 06:33
Issue 14208 was ultimately resolved through an import system specific solution, with PEP 451 making the module name passed to `python -m` available as `__main__.__spec__.name`.
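
As a minimal sketch of the PEP 451 mechanism mentioned above, the following reads the module name from `__main__.__spec__`. The helper name `dash_m_module_name` is only illustrative; when the interpreter is started from a script path or interactively, `__main__.__spec__` is expected to be None, so the helper returns None in that case.

```python
import sys


def dash_m_module_name():
    """Return the name recorded in __main__.__spec__, or None if there is no spec."""
    main_module = sys.modules.get("__main__")
    spec = getattr(main_module, "__spec__", None)
    # __spec__ is None when __main__ was not located through the import system.
    return spec.name if spec is not None else None


if __name__ == "__main__":
    print(dash_m_module_name())
```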

However, there are other situations where it may be useful to offer an implementation-dependent attribute in the `sys` module that provides access to a copy of the host application's raw `argv` details, rather than the filtered `sys.argv` details that are left after the host application's command line processing has been completed.

In the case of CPython, where `sys.argv` represents the arguments to the Python level __main__ function, `sys._raw_argv` would be a copy of the argv argument to the C level main() or wmain() function (as appropriate for the platform).
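
To illustrate the gap between the two views, this sketch launches a child interpreter and compares the full command line it was given with the `sys.argv` it reports. The helper name `child_sys_argv` and the choice of the `-B` option are assumptions made for the example; nothing here relies on the proposed attribute existing.

```python
import json
import subprocess
import sys


def child_sys_argv(interpreter_opts, script_args):
    """Run a child interpreter and return (full command line, child's sys.argv)."""
    code = "import json, sys; print(json.dumps(sys.argv))"
    cmd = [sys.executable, *interpreter_opts, "-c", code, *script_args]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return cmd, json.loads(completed.stdout)


if __name__ == "__main__":
    cmd, seen = child_sys_argv(["-B"], ["a", "b"])
    print("host argv:", cmd)   # includes the executable, -B, -c and the code string
    print("sys.argv: ", seen)  # interpreter options have been filtered out
```

The interpreter-level options (`-B` here) and the executable path are visible only in the command line the host received, which is the information the proposed attribute would preserve.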
msg289885 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2017-03-20 13:02
As bytes?
msg289914 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-03-21 05:14
For CPython, I was thinking of having it be "whatever gets passed to Py_Main", and that accepts wchar_t in Py3 [1], so on *Nix systems, the command line has already been decoded with [2] by the time it runs.

[1] https://docs.python.org/3/c-api/veryhigh.html#c.Py_Main
[2] https://docs.python.org/3/c-api/sys.html#c.Py_DecodeLocale

In the case of Windows, the wchar_t array is received straight from the OS as UTF-16-LE.
msg289933 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2017-03-21 11:17
Why is the name flagged as a private implementation detail? I.e. a single leading underscore. I'd be reluctant to rely on this in production code, given how strong the _private convention is.

Suggest just `sys.raw_args` instead.
msg289934 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-03-21 11:47
> As bytes?

No, text please. Text is just more convenient in Python, and it's trivial to retrieve original bytes:

raw_args_bytes = [os.fsencode(arg) for arg in sys._raw_args]
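
The round trip works because undecodable bytes are carried through the text form as lone surrogates. A small sketch of that behaviour, using an explicit UTF-8 codec with the `surrogateescape` error handler so the result does not depend on the local filesystem encoding (on POSIX, `os.fsdecode` and `os.fsencode` apply the same handler with the filesystem encoding); the helper names are illustrative only:

```python
def decode_arg(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode an argument, mapping undecodable bytes to lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def encode_arg(text: str, encoding: str = "utf-8") -> bytes:
    """Reverse decode_arg(), turning escaped surrogates back into the original bytes."""
    return text.encode(encoding, "surrogateescape")


raw = b"caf\xe9"          # Latin-1 encoded text, not valid UTF-8
text = decode_arg(raw)    # the byte 0xE9 becomes the surrogate U+DCE9
assert text == "caf\udce9"
assert encode_arg(text) == raw
```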
msg289935 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2017-03-21 13:43
On Mar 21, 2017, at 11:47 AM, STINNER Victor wrote:

>No, text please. Text is just more convenient in Python, and it's trivial to
>retrieve original bytes:
>
>raw_args_bytes = [os.fsencode(arg) for arg in sys._raw_args]

Well, "raw args" implies minimal or no processing, so bytes would make the
most sense.  It doesn't bother me that it's inconvenient; this won't be an oft
used API and the conversion to strings should be just as easy.
msg289936 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-03-21 13:53
> Well, "raw args" implies minimal or no processing,

Ok, so call it "original", sys.orig_argv, in that case ;-)
msg289937 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-03-21 14:04
There is already an existing public C API to retrieve the original program arguments *as text*:

/* Make the *original* argc/argv available to other modules.
   This is rare, but it is needed by the secureware extension. */

void
Py_GetArgcArgv(int *argc, wchar_t ***argv)
{
    *argc = orig_argc;
    *argv = orig_argv;
}

Are you talking about exposing these arguments at the Python level?
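
For reference, a hedged sketch of how this C function can already be reached from Python through `ctypes`. It assumes a CPython build that exports the `Py_GetArgcArgv` symbol; the helper name `original_argv` is illustrative, and it returns None when the symbol cannot be found.

```python
import ctypes


def original_argv():
    """Return the argv recorded by the interpreter, or None if unavailable."""
    try:
        func = ctypes.pythonapi.Py_GetArgcArgv
    except AttributeError:
        return None  # this build does not export the symbol

    # void Py_GetArgcArgv(int *argc, wchar_t ***argv)
    func.argtypes = [
        ctypes.POINTER(ctypes.c_int),
        ctypes.POINTER(ctypes.POINTER(ctypes.c_wchar_p)),
    ]
    func.restype = None

    argc = ctypes.c_int(0)
    argv = ctypes.POINTER(ctypes.c_wchar_p)()
    func(ctypes.byref(argc), ctypes.byref(argv))
    # Copy the entries into a Python list of str objects.
    return [argv[i] for i in range(argc.value)]


if __name__ == "__main__":
    print(original_argv())
```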
msg289938 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-03-21 14:54
@Steven This is an implementation detail in the same sense that sys._getframe() is: it's not something that's actually going to make sense in all contexts. For example, if Py_Main() is never called (for CPython), it would still be None, and other implementations may not define it at all. And even when it's set for CPython, the exact details of what it contains are going to be at least somewhat platform dependent.

@Barry On Windows we define `wmain` rather than `main`, so it's the UTF-16-LE encoded text that is the "raw" form. That means the "raw" here refers to "before the Python interpreter CLI processing" - the normalization step to get the command line to wchar_t regardless of platform isn't going to be skipped (since the interpreter runtime itself never even sees the raw bytes in Python 3).

One option would be to use a longer name like `sys._executable_argv`, since in the typical case, `sys.executable` and `sys._executable_argv[0]` will be the same.

@Victor It wouldn't be exactly the same as Py_GetArgcArgv, as I'd propose making a pristine copy *before* Py_Main() mutates anything - we do some in-place editing of entries while figuring out what "sys.argv[0]" should look like at the Python level.
msg289939 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-03-21 15:00
> For example, if Py_Main() is never called (for CPython), it would still be None,

What is the content of sys.argv in that case? Can't we use the same value for sys._raw_argv?
msg289940 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-03-21 15:18
If the embedding application doesn't call PySys_SetArgv or PySys_SetArgvEx, then there is no `argv` attribute defined in the sys module (I wasn't actually sure what happened in that case, so I went and checked the code).

For the reference CLI, the relevant call happens in Py_Main() after all the interpreter level arguments have been processed.
msg289941 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-03-21 15:32
> If the embedding application doesn't call PySys_SetArgv or PySys_SetArgvEx, then there is no `argv` attribute defined in the sys module (I wasn't actually sure what happened in that case, so I went and checked the code).

Ok, so just don't define sys._raw_argv in that case. But that doesn't seem enough to me to justify making the attribute private.

https://docs.python.org/dev/library/sys.html#sys.getallocatedblocks can be seen as an implementation detail. The function name has no underscore prefix, but it has a simple fallback: it returns 0 if the feature is not implemented.
msg289975 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-03-22 07:22
OK, I've changed the proposed attribute name to be `sys.executable_argv`, with the rationale being that it's "argv as originally seen by sys.executable, rather than by Python's __main__ module"

As part of documenting this, both it and the `argv` documentation can make it clear that they may be entirely absent if the host application doesn't set them.
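
A minimal sketch of what defensive access could look like for code that must cope with either attribute being absent. `sys.executable_argv` is only the name proposed in this issue, not an existing attribute, and the helper name `host_argv` is hypothetical.

```python
import sys


def host_argv():
    """Return (attribute name, copy of its value), preferring the host-level view.

    Returns (None, None) if the host application set neither attribute.
    """
    # "executable_argv" is the proposed name; fall back to the filtered sys.argv.
    for name in ("executable_argv", "argv"):
        value = getattr(sys, name, None)
        if value is not None:
            return name, list(value)
    return None, None


if __name__ == "__main__":
    print(host_argv())
```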
History
Date | User | Action | Args
2017-03-22 07:22:54 | ncoghlan | set | messages: + msg289975; title: Provide `sys._raw_argv` for host application's command line arguments -> Provide `sys.executable_argv` for host application's command line arguments
2017-03-21 15:32:33 | haypo | set | messages: + msg289941
2017-03-21 15:18:51 | ncoghlan | set | messages: + msg289940
2017-03-21 15:00:18 | haypo | set | messages: + msg289939
2017-03-21 14:54:56 | ncoghlan | set | messages: + msg289938
2017-03-21 14:04:54 | haypo | set | messages: + msg289937
2017-03-21 13:53:00 | haypo | set | messages: + msg289936
2017-03-21 13:43:20 | barry | set | messages: + msg289935
2017-03-21 11:47:17 | haypo | set | nosy: + haypo; messages: + msg289934
2017-03-21 11:17:38 | steven.daprano | set | nosy: + steven.daprano; messages: + msg289933
2017-03-21 05:14:28 | ncoghlan | set | messages: + msg289914
2017-03-20 13:02:32 | barry | set | messages: + msg289885
2017-03-20 13:01:54 | barry | set | nosy: + barry
2017-03-20 11:57:37 | jwilk | set | nosy: + jwilk
2017-03-20 06:33:43 | ncoghlan | create |