New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parse_envlist(): os.execve(), os.spawnve(), etc. crash in Python 3.6.0 when env contains byte strings #72301
Comments
Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with The crash is at <https://hg.python.org/cpython/file/v3.6.0b1/Objects/unicodeobject.c#l1607\>. The call stack is attached. Locals: |
end-begin = 0x004550fc90512200-0x00454c49464f5250 = 5168087289776 Seems find_maxchar_surrogates() is called with wrong arguments. |
I ran "python setup.py bdist_wheel" on numpy-1.11.2rc1 on Linux: no issue. I compiled Python 3.6 (the default branch in fact, so the future 3.7) in debug mode/64 bit with SSL, but the setup.py command failed with: Don't know how to compile Fortran code on platform 'nt' I don't want to spend too much time to try to reproduce the issue. Can you please try to get the Python traceback? Try: python -X faulthandler setup.py bdist_wheel It seems like the crash occurs in parse_envlist(), function to parse environment variables in os.spawnve(). Can you try to dump the env parameter of os.spawnve()? For example, you can put the following lines at the beginning of setup.py: --- import os
orig_spawnve = os.spawnve
def log_spawnve(*args):
print("spawnve: %r" % (args,))
return orig_spawnve(*args)
os.spawnve = log_spawnve My test: >>> import os, sys; args=[sys.executable, "-c", "pass"]; os.spawnve(os.P_WAIT, args[0], args, None)
spawnve: (0, '/home/haypo/prog/python/default/python', ['/home/haypo/prog/python/default/python', '-c', 'pass'], None)
127 |
parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes:
parse_envlist used to call PyUnicode_FSConverter on key and val; get the underlying buffers via PyBytes_AsString; allocate a new buffer and copy key and val into "%s=%s". Now it instead calls PyUnicode_FromFormat for "%U=%U" and then fsconvert_strdup, which has been modified to call PyUnicode_AsWideCharString on Windows and otherwise PyUnicode_FSConverter. For Windows it could use PyUnicode_FSDecoder on key and val in combination with PyUnicode_FromFormat and the format string "%U=%U". For Unix it could use PyUnicode_FSConverter for key and val in combination with PyBytes_FromFormat and the format string "%s=%s". |
Ah, it's a regression introduced by the issue bpo-27781 with the change e20c7d8a8187: parse_envlist() was modified to first call PyUnicode_FromFormat("%U=%U") and then convert. |
Here's a snippet to reproduce this bug: import os, sys
environb = {os.fsencode(k):os.fsencode(v) for k,v in os.environ.items()}
os.spawnve(os.P_WAIT, sys.executable, ('python', '--version'), environb) (Now that Windows Python provisionally supports bytes via utf-8:surrogatepass, maybe it should also have an os.environb mapping.) |
Whoops, yeah that's my fault. I'll get to it before b2, but I need to catch up on my day job the next week or two, so if someone else wants to apply a fix (and add a test - test_crashers, presumably?) then go for it! |
Here's a patch for Unix. I will add Windows support when I get my Windows VM.
Unfortunately, test_crashers doesn't run since 2011 (skipped in 481ad9129a0f.) parse_envlist() is only used by os.execve() and os.spawnve() so I'm not sure what's the best way to test the patch (other than adapting Eryk's snippet in msg276241.) |
The issue is not specific to Windows, the following example also crash on Linux: import os, sys
args = [sys.executable, "-c", "pass"]
os.execve(args[0], args, os.environb) |
Berker, this is basically what I had in my initial patch on the Unix side. I also addressed the Windows issues in parse_envlist and fsconvert_strdup. I'm uploading that patch for reference. It needs a test. I also need to verify that there are no additional problems with the high-level os._execvpe function regarding bytes support on Windows. |
New changeset 0ca42273c714 by Victor Stinner in branch '3.6': |
Berker and/or Eryksun: Please write unit tests for your patch. I just added a new SpawnTests to test_os which tests all os.spawn*() functions. Please add at least one unit test with env contains a bytes key/value entry. Maybe add a use_bytes=False parameter to create_args() to use bytes for the variable added by the test? |
Thanks for the spawn test framework, Victor. I've added a use_bytes argument to encode the args and env using os.fsencode. It's encoding args as well because parse_arglist calls fsconvert_strdup, which was assuming Unicode strings on Windows instead of first calling PyUnicode_FSDecoder. test_spawnve_bytes passes for me on Linux and Windows. |
Eryk's patch looks good to me, thanks! I will wait for others to review the patch. |
New changeset 8d80a4291ddc by Berker Peksag in branch '3.6': New changeset 8ad99fdc84a3 by Berker Peksag in branch 'default': |
Thanks for the patch, Eryk! |
Hum, can't we use MS_WINDOWS here? Or maybe pass a parameter in |
We already use HAVE_WEXECV and HAVE_WSPAWNV elsewhere to choose the wide-char versions of these functions. You've got a few changes to make if you want to maintain consistency. Personally I prefer feature flags to platform flags, then let pyconfig.h control them, especially when we're not directly using platform-specific APIs. |
Ok fine ;-) Let's keep #if defined(HAVE_WEXECV) || defined(HAVE_WSPAWNV). |
Misc/NEWS
so that it is managed by towncrier #552Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: