os.execvpe() doesn't support surrogates in env #52638
Comments
It would be nice to support PEP 383 (surrogateescape) for environment variables in os.execvpe(). The attached patch uses PyUnicode_AsEncodedString(val, Py_FileSystemDefaultEncoding, "surrogateescape") to encode an environment variable value. I'm not sure that PyUnicode_AsEncodedString(val, Py_FileSystemDefaultEncoding, "surrogateescape") always returns a PyBytes object. I have not patched environment keys, but it might be useful. |
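For readers unfamiliar with PEP 383, here is a minimal Python-level sketch of what the surrogateescape error handler does. It is only an illustration, not the patch itself: the byte string and the choice of UTF-8 are assumptions made for the example, whereas the patch uses the filesystem encoding.

```python
# Illustration only: UTF-8 stands in for the filesystem encoding.
raw = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original byte exactly.
restored = text.encode("utf-8", "surrogateescape")

print(ascii(text))      # shows the escaped surrogate
print(restored == raw)  # the round trip is lossless
```

The point of the round trip is that an environment value read as str can be passed back to a child process without losing bytes that were undecodable.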
See also issue bpo-4036. |
See also bpo-8393. |
My patch doesn't work for types bytes and bytearray. I noticed that py3k uses surrogateescape to encode environment variable values ;-) |
Other notes: Environment variable *names* also use the surrogateescape "encoding". os.spawnve() and os.spawnvpe() should also be patched (the code should also be factored out). |
New version of the patch:
Because of the factorization, the error messages don't contain the function name anymore. spawnve() and spawnvpe() omit BEGINLIBPATH and ENDLIBPATH, as execve() does: "that Would Confuse Programs if Passed On". I suppose that if execve() ignores them, spawn*e() should also ignore them. I don't have OS/2, so I'm unable to test my patch on this OS :-/ Note: The patch also fixes subprocess to support bytes and bytearray in the environment dictionary. |
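As a rough Python-level sketch of what accepting str, bytes, and bytearray in an environment dictionary could look like, the helper names below (to_env_bytes, encode_env) are hypothetical and are not taken from the patch. os.fsencode() is a real function that encodes str with the filesystem encoding and the surrogateescape handler on POSIX.

```python
import os

def to_env_bytes(item):
    """Hypothetical helper: coerce str, bytes, or bytearray to bytes."""
    if isinstance(item, (bytes, bytearray)):
        return bytes(item)
    # str is encoded with the filesystem encoding (surrogateescape on POSIX).
    return os.fsencode(item)

def encode_env(env):
    """Hypothetical helper: encode every key and value of an env mapping."""
    return {to_env_bytes(k): to_env_bytes(v) for k, v in env.items()}

env = {"A": "1", b"B": bytearray(b"2")}
encoded = encode_env(env)
print(encoded)
```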
The current code of execve() has a bug: it uses the length of the environment variable value in *characters*, not in *bytes*, to allocate the "p" buffer. I remember that someone wrote a comment somewhere about that... The result is that the environment variable value is truncated by 1 byte. Example (copy of http://dpaste.com/184803/):
$ cat test.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
env = {"VAR": "ćd"}
os.execve("test.sh", [], env)
$ cat test.sh
#!/bin/bash
declare -p VAR |
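The size mismatch behind this truncation can be reproduced in pure Python. The sketch below assumes UTF-8 and models the undersized buffer with a simple slice. It is not the C code of execve(), only an illustration of why counting characters is one byte short for this value.

```python
value = "ćd"
encoded = value.encode("utf-8")

chars = len(value)     # number of characters
nbytes = len(encoded)  # number of bytes; "ć" takes 2 bytes in UTF-8

# A buffer sized from the character count cannot hold the last byte.
truncated = encoded[:chars]

print(chars, nbytes, truncated)
```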
Committed: r80421 (py3k), blocked in 3.1 (r80422). The commit also fixes os.getenv() to support bytes environment names. |
I blocked the fix in Python 3.1 because it's non-trivial and I prefer to avoid complex changes in Python 3.1. But then I realized that Python 3.1 has two bugs in its handling of environment variables. First, it uses sys.getfilesystemencoding()+surrogateescape to decode variables and sys.getdefaultencoding()+strict to encode variables: the encoding is different! Second, it counts the number of *characters* to allocate the *byte* string buffer, so non-ASCII values are truncated. So I decided to backport the fix: r80494. |
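The first of these two bugs can be illustrated with a short sketch. It assumes UTF-8 on both sides for simplicity, whereas in practice the filesystem encoding depends on the locale. A value decoded with surrogateescape cannot be re-encoded with the strict handler.

```python
raw = b"\xff"  # a byte that is not valid UTF-8

# Decoding side: surrogateescape keeps the byte as a lone surrogate.
decoded = raw.decode("utf-8", "surrogateescape")

# Encoding side with "strict": lone surrogates are rejected.
try:
    decoded.encode("utf-8", "strict")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Using the same handler on both sides round-trips correctly.
roundtrip = decoded.encode("utf-8", "surrogateescape")

print(strict_ok, roundtrip == raw)
```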