Issue39461
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2020-01-27 10:25 by Antony.Lee, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (20) | |||
---|---|---|---|
msg360750 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2020-01-27 10:25 | |
As of Py3.8/Linux: In [1]: os.environ["foo"] = Path("bar") --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-4-2827297496cb> in <module> ----> 1 os.environ["foo"] = Path("bar") ~/miniconda3/envs/default/lib/python3.8/os.py in __setitem__(self, key, value) 676 def __setitem__(self, key, value): 677 key = self.encodekey(key) --> 678 value = self.encodevalue(value) 679 self.putenv(key, value) 680 self._data[key] = value ~/miniconda3/envs/default/lib/python3.8/os.py in encode(value) 746 def encode(value): 747 if not isinstance(value, str): --> 748 raise TypeError("str expected, not %s" % type(value).__name__) 749 return value.encode(encoding, 'surrogateescape') 750 def decode(value): TypeError: str expected, not PosixPath In [2]: subprocess.run('echo "$foo"', env={**os.environ, "foo": Path("bar")}, shell=True) bar Out[2]: CompletedProcess(args='echo "$foo"', returncode=0) I guess it would be nice if it was possible to set os.environ entries to Path-like values, but most importantly, it seems a bit inconsistent that doing so is not possible on os.environ, but works when setting the `env` of a subprocess call. |
|||
msg360756 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-27 12:26 | |
subprocess.Popen encodes the env parameter using os.fsencode(). os.fsencode() has been patched in bpo-27182 to use os.fspath(): commit c1cbeedf0c650c3f7c64f04479070d39e15e1baf Author: Ethan Furman <ethan@stoneleaf.us> Date: Sat Jun 4 10:19:27 2016 -0700 issue27182: update fsencode and fsdecode for os.path(); patch by Dusty Phillips For os.environ, you can use: os.environ["foo"] = str(Path("bar")) I don't know if os.environ is supposed to use the fspath protocol. I add Ethan and Brett to the nosy list. fspath is specified by the PEP 519 "Adding a file system path protocol": https://www.python.org/dev/peps/pep-0519/ |
|||
msg360773 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2020-01-27 15:57 | |
Adding `os.environ` support makes sense to me. |
|||
msg360775 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-27 16:04 | |
Currently, os.environ behaves as a dictionary. When you put value into os.environ[key], os.environ[key] gives you back this value. If we start to convert value to a different type (convert something to str), it can be surprising. |
|||
msg360777 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2020-01-27 16:32 | |
True, but so is having Path objects not seemlessly usable. Also, isn't os.environ a special case where all values should be strings? |
|||
msg360779 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-01-27 16:43 | |
Implicit conversion can also hide some bugs. Semantically os.environ is an str to str mapping and os.environb is a bytes to bytes mapping. |
|||
msg360784 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2020-01-27 17:36 | |
True, but this example of implicit conversion is only for Path objects which are currently implicitly converted throughout the stdlib where appropriate, and this looks like one of those appropriate places. I don't know enough about os.environb to offer an opinion on auto-conversion there. |
|||
msg360788 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-01-27 18:45 | |
Supporting __fspath__ for os.environ[b] makes it consistent with POSIX os.putenv, which already supports it via PyUnicode_FSConverter. For example: os.putenv(Path('spam'), Path('eggs')) getenv = ctypes.CDLL('libc.so.6').getenv getenv.restype = ctypes.c_char_p >>> getenv(b'spam') b'eggs' For symmetry, os.putenv in Windows could be updated to support __fspath__. An old patch for bpo-28188 implements this, along with bytes support, via PyUnicode_FSDecoder. But bytes support would be limited to UTF-8 strings, so it wouldn't be as flexible as os.environb in POSIX, which supports arbitrary bytes except for embedded nulls. |
|||
msg360795 - (view) | Author: Brett Cannon (brett.cannon) * | Date: 2020-01-27 20:12 | |
I don't think this should be done (and I honestly would have disagreed with the Popen change for its 'env' argument or any other place that is dealing with environment variables). Environment variables are strings, period, so they should be specified as such; explicit is better than implicit (and there isn't enough pragmatic benefit to override this in my opinion). |
|||
msg360798 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-27 20:41 | |
> I don't think this should be done (and I honestly would have disagreed with the Popen change for its 'env' argument or any other place that is dealing with environment variables). Environment variables are strings, period, so they should be specified as such; explicit is better than implicit (and there isn't enough pragmatic benefit to override this in my opinion). I don't think that subprocess accepts fspath on purpose. It uses fsencode() because it's convenient to encode Unicode to the filesystem encoding using the right error handler. It's a side effect of the commit c1cbeedf0c650c3f7c64f04479070d39e15e1baf. In fact, I'm opposed to this commit, but now it's too change to change it again :-) os.putenv() is unrelated to os.environ. Calling os.putenv() does not update os.environ, whereas setting a key in os.environ calls internally os.putenv(). I like the idea of keeping os.environ as simple as a dict mapping strings to trings (in term of API, internally it's more complicated). |
|||
msg360801 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-01-27 21:02 | |
Path-like objects are now unintentionally accepted for many non-path things, like hostname in socket.sethostname() and username in os.initgroups(). I also think it was a mistake, and we should not make new mistakes. |
|||
msg360803 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-01-27 21:22 | |
> I honestly would have disagreed with the Popen change for its 'env' > argument or any other place that is dealing with environment > variables os.spawnve (Windows) and os.execve support __fspath__ for the env dict partially due to me (bpo-28114), so sorry if that was the wrong decision. However, at least on the POSIX side, I was restoring previous behavior that aligned with POSIX os.putenv. An example of what's supported currently: Windows: >>> path = os.environ['comspec'] >>> argv = ['cmd', '/c', 'echo %SPAM%'] >>> env = os.environ.copy() >>> env[Path('SPAM')] = Path('eggs') >>> os.spawnve(0, path, argv, env) eggs 0 POSIX: >>> path = '/bin/bash' >>> argv = ['bash', '-c', 'echo $SPAM'] >>> env = os.environ.copy() >>> env[Path('SPAM')] = Path('eggs') >>> os.spawnve(0, path, argv, env) eggs 0 |
|||
msg360804 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2020-01-27 21:26 | |
The idea behind PEP 519 was to alleviate str(path_obj) calls between the os/program interface. We can either make that as consistent as we can as we find places that still require the str(path_obj) idiom, or we can make users remember which ones do, and which ones don't, easily use a Path object. In my opinion, having to remember is a very unpleasant and frustrating user experience. Serhiy Storchaka: ---------------- > Path-like objects are now unintentionally accepted for many non-path > things, like hostname in socket.sethostname() and username in > os.initgroups(). I also think it was a mistake, and we should not > make new mistakes. There will always be ways for users to make mistakes when using API's. Whether they passed a Path path to os.sethostname() or a string path to os.sethostbyname(), it's still their bug, and linters/type-checkers exist to help catch those kinds of bugs. Brett Canon: ----------- > Environment variables are strings, period, so they should be specified > as such; Out of curiosity, any idea how often non-strings are used to set os.environ, and so need explicit conversion? |
|||
msg360890 - (view) | Author: Brett Cannon (brett.cannon) * | Date: 2020-01-28 18:35 | |
> but now it's too change to change it again :-) I actually don't think it is if we want to revert this. We can raise a deprecation warning if the call to os.fsencode() leads to a value different than its argument and say that people are expected to handle conversions themselves. > The idea behind PEP 519 was to alleviate str(path_obj) calls between the os/program interface ... where file paths were explicitly expected. os.environ is not a place where file paths are explicitly expected, just a place where they _might_ end up. Basically I only consider PEP 519 valid in places where file paths are the only thing that are expected (e.g. all the path manipulation functions in os.path). Everywhere else where paths are only a "maybe" I would say it shouldn't be relied upon to do implicit conversions. > Out of curiosity, any idea how often non-strings are used to set os.environ, and so need explicit conversion? I have anecdotal evidence from outside of Python that people will do this when they can and then are surprised when the conversion doesn't work the way they expect (e.g. floats). |
|||
msg360894 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2020-01-28 18:44 | |
FWIW, I'm actually fine with not supporting Path objects in os.environ, as long as the behavior is *consistent* with the env kwarg to subprocess.run() -- note that the original title of the thread only pointed out to the inconsistency, and did not strongly request that os.environ support Paths. |
|||
msg360901 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-01-28 21:19 | |
> as long as the behavior is *consistent* with the env kwarg to > subprocess.run() subprocess isn't consistent with itself across platforms. The env parameter in Windows is strictly an str->str mapping, like os.environ. This is coded in getenvironment in Modules/_winapi.c: if (! PyUnicode_Check(key) || ! PyUnicode_Check(value)) { PyErr_SetString(PyExc_TypeError, "environment can only contain strings"); goto error; } At a lower level, should the env parameter of os.spawnve and os.execve allow path-like objects? Should (Unix) os.putenv allow the name and value to be path-like? Currently these cases use PyUnicode_FSConverter and PyUnicode_FSDecoder, which support __fspath__ via PyOS_FSPath. PyUnicode_FSDecoder also supports the buffer protocol, with a warning. Since we have cases like this where the filesystem encoding is used for string data that's not actually a file path, should alternate ParseTuple converters be added that are limited to just str and bytes? Maybe name them PyUnicode_FSStringEncoder and PyUnicode_FSStringDecoder, where "String" emphasizes that fspath and buffer objects are not allowed. For example: int PyUnicode_FSStringEncoder(PyObject *path, void *addr) { PyObject *output = NULL; if (path == NULL) { Py_DECREF(*(PyObject **)addr); *(PyObject **)addr = NULL; return 1; } if (PyBytes_Check(path)) { output = path; Py_INCREF(output); } else if (PyUnicode_Check(path)) { output = PyUnicode_EncodeFSDefault(path); if (!output) return 0; } else { PyErr_Format(PyExc_TypeError, "path should be str or bytes, not " "%.200s", _PyType_Name(Py_TYPE(path))); return 0; } if ((size_t)PyBytes_GET_SIZE(output) != strlen(PyBytes_AS_STRING(output))) { PyErr_SetString(PyExc_ValueError, "embedded null byte"); Py_DECREF(output); return 0; } *(PyObject **)addr = output; return Py_CLEANUP_SUPPORTED; } int PyUnicode_FSStringDecoder(PyObject *path, void *addr) { PyObject *output = NULL; if (arg == NULL) { Py_DECREF(*(PyObject **)addr); *(PyObject **)addr = NULL; return 1; } if (PyUnicode_Check(path)) { output = path; Py_INCREF(output); } else if (PyBytes_Check(path)) { output = PyUnicode_DecodeFSDefaultAndSize(PyBytes_AS_STRING(path), PyBytes_GET_SIZE(path)); if (!output) return 0; } else { PyErr_Format(PyExc_TypeError, "path should be str or bytes, not " "%.200s", _PyType_Name(Py_TYPE(path))); return 0; } if (PyUnicode_READY(output) == -1) { Py_DECREF(output); return 0; } if (findchar(PyUnicode_DATA(output), PyUnicode_KIND(output), PyUnicode_GET_LENGTH(output), 0, 1) >= 0) { PyErr_SetString(PyExc_ValueError, "embedded null character"); Py_DECREF(output); return 0; } *(PyObject **)addr = output; return Py_CLEANUP_SUPPORTED; } |
|||
msg360905 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2020-01-28 22:07 | |
I don't think proposal this makes sense. os.environ is a dict-like object providing clean access to the environment variables. Use of Paths is an orthogonal problem unrelated to environment variables. |
|||
msg360927 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-29 01:55 | |
I understand that os.fsencode() was modified to support the fspath protocol to be consistent with the C implementation PyUnicode_FSConverter() which calls PyOS_FSPath(). I agree that at least in C, we need two functions: one which accepts (str, bytes), another which also supports the fspath protocol. IMHO PyUnicode_FSConverter() was modified to support fspath because it was convenient to only modify one function. But this change wasn't designed to decide in each function if fspath should be accepted or not. For me, os.putenv() should not accept fspath neither. This issue is not specific to os.environ. It's more general about the PEP 519 implementation. IMHO the implementation should be "adjusted". I like the deprecation idea ;-) We did something similar with functions accepting float by mistake. First a deprecation warning was emited, and then it became a hard error (exception). |
|||
msg360929 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2020-01-29 02:01 | |
Well, I would prefer if Path objects were seamless to use since at one time they were actually strings, but I can live with Brett's rationale that they are only seamless where paths are explicitly expected. |
|||
msg375750 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2020-08-21 11:27 | |
os.environ is initially a copy of the os string-string mapping. That some string values happen to represent file paths is opaque to the mapping. So, to me, looking at os.environ by itself, there is no particular reason to autoconvert Path values but not anything else nor to autoconvert values but not keys. If os.environ['key'] = Path('boo') worked, I would expect os.environ['key2'] = 333 to work. I would consider a new inconsistency here to be worse than the existing one between os.environ and subprocess.Popen(env=...). It would be OK with me if the latter were fixed. It is not unheard of for CPython to accept objects that conceptually should not be. Some instances have be 'grandparented', others not. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:25 | admin | set | github: 83642 |
2020-08-31 21:17:29 | ethan.furman | set | status: open -> closed resolution: rejected stage: resolved |
2020-08-21 11:27:37 | terry.reedy | set | nosy:
+ terry.reedy messages: + msg375750 |
2020-01-29 02:01:48 | ethan.furman | set | messages: + msg360929 |
2020-01-29 01:55:41 | vstinner | set | messages: + msg360927 |
2020-01-28 22:07:57 | rhettinger | set | nosy:
+ rhettinger messages: + msg360905 |
2020-01-28 21:19:03 | eryksun | set | messages: + msg360901 |
2020-01-28 18:44:56 | Antony.Lee | set | messages: + msg360894 |
2020-01-28 18:35:53 | brett.cannon | set | messages: + msg360890 |
2020-01-27 21:26:46 | ethan.furman | set | messages: + msg360804 |
2020-01-27 21:22:53 | eryksun | set | messages: + msg360803 |
2020-01-27 21:02:45 | serhiy.storchaka | set | messages: + msg360801 |
2020-01-27 20:41:44 | vstinner | set | messages: + msg360798 |
2020-01-27 20:12:23 | brett.cannon | set | messages: + msg360795 |
2020-01-27 18:45:46 | eryksun | set | nosy:
+ eryksun messages: + msg360788 |
2020-01-27 17:36:34 | ethan.furman | set | messages: + msg360784 |
2020-01-27 16:43:54 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg360779 |
2020-01-27 16:32:49 | ethan.furman | set | messages: + msg360777 |
2020-01-27 16:04:06 | vstinner | set | messages: + msg360775 |
2020-01-27 15:57:30 | ethan.furman | set | messages: + msg360773 |
2020-01-27 12:26:30 | vstinner | set | type: enhancement title: os.environ does not support Path-like values, but subprocess(..., env=...) does -> [RFE] os.environ should support Path-like values, like subprocess(..., env=...) |
2020-01-27 12:26:04 | vstinner | set | nosy:
+ brett.cannon, vstinner, ethan.furman messages: + msg360756 |
2020-01-27 10:25:58 | Antony.Lee | create |