msg276189 - (view) |
Author: Christoph Gohlke (cgohlke) |
Date: 2016-09-13 06:55 |
Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function. Some other packages built OK, e.g. Cython, pyzmq and Pillow.
The crash is at <https://hg.python.org/cpython/file/v3.6.0b1/Objects/unicodeobject.c#l1607>. The call stack is attached.
Locals:
+ begin 0x00454c49464f5250 <Error reading characters of string.> const wchar_t *
ch 1509860640 unsigned int
+ end 0x004550fc90512200 <Error reading characters of string.> const wchar_t *
+ iter 0x00000259a500e7d8 L"\x2" const wchar_t *
+ maxchar 0x0000005a59fea520 {0} unsigned int *
+ num_surrogates 0x0000005a59fea528 {0} __int64 *
|
msg276194 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-09-13 07:18 |
end-begin = 0x004550fc90512200-0x00454c49464f5250 = 5168087289776
Seems find_maxchar_surrogates() is called with wrong arguments.
|
msg276215 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-13 08:57 |
> Trying to build numpy-1.11.2rc1 wheels for Python 3.6.0b1 on Windows 10 with `python.exe setup.py bdist_wheel`, python36.dll (32 and 64 bit) segfaults in the `find_maxchar_surrogates` function.
I ran "python setup.py bdist_wheel" on numpy-1.11.2rc1 on Linux: no issue.
I compiled Python 3.6 (the default branch in fact, so the future 3.7) in debug mode/64 bit with SSL, but the setup.py command failed with:
Don't know how to compile Fortran code on platform 'nt'
I don't want to spend too much time to try to reproduce the issue.
Can you please try to get the Python traceback? Try:
python -X faulthandler setup.py bdist_wheel
It seems like the crash occurs in parse_envlist(), function to parse environment variables in os.spawnve(). Can you try to dump the env parameter of os.spawnve()?
For example, you can put the following lines at the beginning of setup.py:
---
import os
orig_spawnve = os.spawnve
def log_spawnve(*args):
print("spawnve: %r" % (args,))
return orig_spawnve(*args)
os.spawnve = log_spawnve
---
My test:
---
>>> import os, sys; args=[sys.executable, "-c", "pass"]; os.spawnve(os.P_WAIT, args[0], args, None)
spawnve: (0, '/home/haypo/prog/python/default/python', ['/home/haypo/prog/python/default/python', '-c', 'pass'], None)
127
---
|
msg276234 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-13 09:57 |
parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes:
|Debug Error!
Program: C:\Program Files\Python36\python_d.exe
abort() has been called
(Press Retry to debug the application)
(1e8.374): Break instruction exception - code 80000003 (first chance)
ucrtbased!issue_debug_notification+0x45:
00007ffc`aa1f1985 cc int 3
1:002> k 0n12
Child-SP RetAddr Call Site
00000030`a5fe9bb0 00007ffc`aa1f1b23 ucrtbased!issue_debug_notification+0x45
00000030`a5fe9c00 00007ffc`aa212d7d ucrtbased!__acrt_report_runtime_error+0x13
00000030`a5fe9c60 00007ffc`aa2184cf ucrtbased!abort+0x1d
00000030`a5fe9ca0 00007ffc`aa2164c8 ucrtbased!common_assert_to_stderr<wchar_t>+0xbf
00000030`a5fe9d00 00007ffc`aa218e7f ucrtbased!common_assert<wchar_t>+0x68
00000030`a5fe9d40 00000000`60e59045 ucrtbased!_wassert+0x2f
00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45
00000030`a5fe9e00 00000000`60e4cd07 python36_d!unicode_fromformat_arg+0xc81
00000030`a5fe9fa0 00000000`60e4cc31 python36_d!PyUnicode_FromFormatV+0xa7
00000030`a5fea040 00000000`60c69853 python36_d!PyUnicode_FromFormat+0x31
00000030`a5fea080 00000000`60c68286 python36_d!parse_envlist+0x1c3
00000030`a5fea180 00000000`60c5d4ee python36_d!os_spawnve_impl+0x1d6
1:002> .frame 6
06 00000030`a5fe9d70 00000000`60eb9f71 python36_d!_PyUnicode_CheckConsistency+0x45
1:002> ?? op->ob_type->tp_name
char * 0x00000000`60fe1248
"bytes"
parse_envlist used to call PyUnicode_FSConverter on key and val; get the underlying buffers via PyBytes_AsString; allocate a new buffer and copy key and val into "%s=%s". Now it instead calls PyUnicode_FromFormat for "%U=%U" and then fsconvert_strdup, which has been modified to call PyUnicode_AsWideCharString on Windows and otherwise PyUnicode_FSConverter.
For Windows it could use PyUnicode_FSDecoder on key and val in combination with PyUnicode_FromFormat and the format string "%U=%U". For Unix it could use PyUnicode_FSConverter for key and val in combination with PyBytes_FromFormat and the format string "%s=%s".
|
msg276235 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-13 10:04 |
> parse_envlist is calling PyUnicode_FromFormat with the format "%U=%U", but passing it bytes:
Ah, it's a regression introduced by the issue #27781 with the change e20c7d8a8187: parse_envlist() was modified to first call PyUnicode_FromFormat("%U=%U") and then convert.
|
msg276241 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-13 10:43 |
Here's a snippet to reproduce this bug:
import os, sys
environb = {os.fsencode(k):os.fsencode(v) for k,v in os.environ.items()}
os.spawnve(os.P_WAIT, sys.executable, ('python', '--version'), environb)
(Now that Windows Python provisionally supports bytes via utf-8:surrogatepass, maybe it should also have an os.environb mapping.)
|
msg276304 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-13 17:06 |
Whoops, yeah that's my fault. I'll get to it before b2, but I need to catch up on my day job the next week or two, so if someone else wants to apply a fix (and add a test - test_crashers, presumably?) then go for it!
|
msg276388 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-14 07:28 |
Here's a patch for Unix. I will add Windows support when I get my Windows VM.
> [...] (and add a test - test_crashers, presumably?) [...]
Unfortunately, test_crashers doesn't run since 2011 (skipped in 481ad9129a0f.) parse_envlist() is only used by os.execve() and os.spawnve() so I'm not sure what's the best way to test the patch (other than adapting Eryk's snippet in msg276241.)
|
msg276391 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-14 07:55 |
The issue is not specific to Windows, the following example also crash on Linux:
import os, sys
args = [sys.executable, "-c", "pass"]
os.execve(args[0], args, os.environb)
|
msg276400 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-14 08:48 |
Berker, this is basically what I had in my initial patch on the Unix side. I also addressed the Windows issues in parse_envlist and fsconvert_strdup. I'm uploading that patch for reference. It needs a test. I also need to verify that there are no additional problems with the high-level os._execvpe function regarding bytes support on Windows.
|
msg276429 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-14 12:57 |
New changeset 0ca42273c714 by Victor Stinner in branch '3.6':
Issue #28114: Add unit tests on os.spawn*()
https://hg.python.org/cpython/rev/0ca42273c714
|
msg276430 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-14 12:59 |
Berker and/or Eryksun: Please write unit tests for your patch. I just added a new SpawnTests to test_os which tests all os.spawn*() functions.
Please add at least one unit test with env contains a bytes key/value entry. Maybe add a use_bytes=False parameter to create_args() to use bytes for the variable added by the test?
|
msg276457 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2016-09-14 14:19 |
Thanks for the spawn test framework, Victor. I've added a use_bytes argument to encode the args and env using os.fsencode. It's encoding args as well because parse_arglist calls fsconvert_strdup, which was assuming Unicode strings on Windows instead of first calling PyUnicode_FSDecoder. test_spawnve_bytes passes for me on Linux and Windows.
|
msg276465 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-14 15:23 |
Eryk's patch looks good to me, thanks! I will wait for others to review the patch.
|
msg276578 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-15 17:19 |
New changeset 8d80a4291ddc by Berker Peksag in branch '3.6':
Issue #28114: Fix a crash in parse_envlist() when env contains byte strings
https://hg.python.org/cpython/rev/8d80a4291ddc
New changeset 8ad99fdc84a3 by Berker Peksag in branch 'default':
Issue #28114: Merge from 3.6
https://hg.python.org/cpython/rev/8ad99fdc84a3
|
msg276581 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-09-15 17:48 |
Thanks for the patch, Eryk!
|
msg276593 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-15 19:06 |
> #if defined(HAVE_WEXECV) || defined(HAVE_WSPAWNV)
Hum, can't we use MS_WINDOWS here? Or maybe pass a parameter in
parse_envlist() for Windows?
|
msg276599 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-15 19:17 |
We already use HAVE_WEXECV and HAVE_WSPAWNV elsewhere to choose the wide-char versions of these functions. You've got a few changes to make if you want to maintain consistency.
Personally I prefer feature flags to platform flags, then let pyconfig.h control them, especially when we're not directly using platform-specific APIs.
|
msg276600 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-15 19:17 |
Ok fine ;-) Let's keep #if defined(HAVE_WEXECV) || defined(HAVE_WSPAWNV).
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:36 | admin | set | github: 72301 |
2017-03-31 16:36:10 | dstufft | set | pull_requests:
+ pull_request861 |
2016-09-15 19:17:47 | vstinner | set | messages:
+ msg276600 |
2016-09-15 19:17:06 | steve.dower | set | messages:
+ msg276599 |
2016-09-15 19:06:12 | vstinner | set | messages:
+ msg276593 |
2016-09-15 17:48:30 | berker.peksag | set | status: open -> closed resolution: fixed messages:
+ msg276581
stage: patch review -> resolved |
2016-09-15 17:19:44 | python-dev | set | messages:
+ msg276578 |
2016-09-14 15:23:02 | berker.peksag | set | messages:
+ msg276465 |
2016-09-14 15:12:45 | eryksun | set | files:
+ issue_28114_03.patch |
2016-09-14 14:19:27 | eryksun | set | files:
+ issue_28114_02.patch
messages:
+ msg276457 |
2016-09-14 12:59:57 | vstinner | set | messages:
+ msg276430 |
2016-09-14 12:57:56 | python-dev | set | nosy:
+ python-dev messages:
+ msg276429
|
2016-09-14 08:48:10 | eryksun | set | files:
+ issue_28114_01.patch
messages:
+ msg276400 |
2016-09-14 07:55:03 | vstinner | set | messages:
+ msg276391 |
2016-09-14 07:53:14 | vstinner | set | title: Crash in unicodeobject.c find_maxchar_surrogates on python-3.6.0b1 for Windows -> parse_envlist(): os.execve(), os.spawnve(), etc. crash in Python 3.6.0 when env contains byte strings |
2016-09-14 07:28:11 | berker.peksag | set | files:
+ issue28114_unix.diff
nosy:
+ berker.peksag messages:
+ msg276388
keywords:
+ patch stage: patch review |
2016-09-13 17:06:04 | steve.dower | set | messages:
+ msg276304 |
2016-09-13 10:43:16 | eryksun | set | messages:
+ msg276241 |
2016-09-13 10:04:15 | vstinner | set | messages:
+ msg276235 |
2016-09-13 09:57:58 | eryksun | set | nosy:
+ eryksun messages:
+ msg276234
|
2016-09-13 08:57:24 | vstinner | set | messages:
+ msg276215 |
2016-09-13 07:18:07 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka
messages:
+ msg276194 versions:
+ Python 3.7 |
2016-09-13 06:55:35 | cgohlke | create | |