msg318323 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-31 15:13 |
Follow-up of bpo-33719.
C:\Users\vstinner\AppData\Local\Programs\Python\Python37>python.exe -m test test_marshal -v
== CPython 3.7.0b5 (v3.7.0b5:abb8802389, May 31 2018, 01:54:01) [MSC v.1913 64 bit (AMD64)]
== Windows-10-10.0.16299-SP0 little-endian
== cwd: C:\Users\vstinner\AppData\Local\Temp\test_python_3836
== CPU count: 2
== encodings: locale=cp1252, FS=utf-8
Run tests sequentially
0:00:00 [1/1] test_marshal
(...)
test_loads_2x_code (test.test_marshal.BugsTestCase) ... Windows fatal exception: stack overflow
Current thread 0x000003a0 (most recent call first):
File "C:\Users\vstinner\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 178 in handle
File "C:\Users\vstinner\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 743 in assertRaises
File "C:\Users\vstinner\AppData\Local\Programs\Python\Python37\lib\test\test_marshal.py", line 215 in test_loads_2x_code
(...)
Crashes in test_marshal is on old topic:
* bpo-1050
* bpo-2286
* bpo-25264
* bpo-22734
* bpo-27019
Current stack size: 2 million bytes (1.9 MiB)
PCbuild/python.vcxproj: <StackReserveSize>2000000</StackReserveSize>
PCbuild/pythonw.vcxproj: <StackReserveSize>2000000</StackReserveSize>
|
msg318325 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-31 15:17 |
I compiled the master branch of Python in release mode using VS2015 (MSC v.1912 64 bit) and I failed to reproduce the crash:
* PCbuild/build.bat -e -p x64
* python -m test -v test_marshal
* no crash
|
msg318326 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-31 15:26 |
> I compiled the master branch of Python in release mode using VS2015 (MSC v.1912 64 bit) and I failed to reproduce the crash
I also failed to reproduce the crash in the 3.7 branch.
I guess that the python.org binary has been compiled differently.
|
msg318327 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-05-31 15:36 |
The uploaded binary is compiled with PGO enabled (and trained on most of the test suite). I'll check it out - hopefully we don't need to do anything drastic and can get away with either a compiler update or disabling optimizations on a single function.
|
msg318328 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-05-31 15:36 |
Ned, FYI
|
msg318329 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-31 15:39 |
> priority: normal -> release blocker
I don't think that it's a release blocker. test_marshal does only crash on corner cases which should not occur on usual "valid" data.
|
msg318330 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-05-31 15:48 |
A crash in the test suite should be fixed, especially since we have protection against this crash (and a test that validates it).
In this case, apparently the stack allocation for each frame of r_object grew and now there isn't room for 2000 calls (the value of MAX_MARSHAL_STACK_DEPTH). This isn't really a robust way of handling it anyway, so I'll check out whether there's an easy way to safely probe the stack before recursing, otherwise we'll just have to cut the number a bit further.
|
msg318331 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-05-31 16:13 |
I need to stop working on this right now, but here's the locals layout in a normal release build in r_object:
@rdi @rdi p = 0x00000034`655ea3d0
00000034`65403f60 @rsp+0x0080 v = 0x00000000`00000000
00000034`65403fc0 @rsp+0x00e0 buf = char [256] ""
00000034`654040c0 @rsp+0x01e0 buf = char [256] ""
In the PGO build, it looks like this:
000000be`1e003b50 @rsp+0x0080 v = 0x00000000`00000000
000000be`1e003b58 @rsp+0x0088 is_interned = 0n0
000000be`1e003ef0 @rsp+0x0420 buf = char [256] ""
000000be`1e003ff0 @rsp+0x0520 buf = char [256] ""
I need to ping the team and figure out why the buffers are so far removed from the rest of the stack, and figure out what's in the gap. That seems to be the core of the problem.
|
msg318442 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-06-01 17:46 |
FYI, I posted the problem at https://developercommunity.visualstudio.com/content/problem/265778/pgo-generates-large-stack-frame.html (I _think_ it's public).
Until then, it's probably best to reduce the recursion limit. Going to a depth of 1000 seems okay, and it certainly seems less bad than disabling optimisations.
|
msg318497 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-06-02 10:52 |
As for the test itself, the original test was added in dc78cc6f7cea2040114a350aa9db835cc433ae05. It tested that the stack overflow is not happen when unmarshal a fake code object with a deeply nested dict instead of co_code. It was renamed to test_loads_recursion in cf0fab2686799b562c9c6f6255557e36f9af095e. In issue16475 (d7009c69136a3809282804f460902ab42e9972f6) it was updated to mirror the changed structure of the code object in Python 3, but the original test was kept with the name test_loads_2x_code. Actually both tests work the same, because the value of few first bytes doesn't affect the result of these tests.
Yet this test is fragile, it depends on the unstable structure of the code object, and it serves its purpose only because input data checking is not very strong in the current implementation. Future changes in the marshal module can make this test invalid, and this will not be noticed.
In PR 7336 this test is rewritten in more reliable way.
This likely is not directly related to the crash.
|
msg318689 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-06-04 19:24 |
Ned - PR 7401 fixes the crash at the cost of reducing the permitted recursion depth in marshal.
Your call whether we take this fix or disable PGO, but I'd much rather take this fix (PGO has full-interpreter effects whereas this is very narrow). When I hear back about the compiler bug - if it is one - we should be able to raise the limit again without causing any compatibility issues.
|
msg318690 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-06-04 19:28 |
Steve, I'm OK with the proposed reduction as a workaround. We have other platform- and build-specific limits based on stack sizes etc. It's hard to have one-size-fits-all.
|
msg318693 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-06-04 20:25 |
New changeset 2a4a62ba4ae770bbc7b7fdec0760031c83fe1f7b by Steve Dower in branch 'master':
bpo-33720: Reduces maximum marshal recursion depth on release builds. (GH-7401)
https://github.com/python/cpython/commit/2a4a62ba4ae770bbc7b7fdec0760031c83fe1f7b
|
msg318695 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-06-04 20:26 |
Thanks. I merged, and miss-islington is backporting it. I guess at this stage you're manually cherry-picking into your own 3.7 branch for release? I'll merge for 3.7.1 though.
|
msg318696 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-06-04 20:28 |
> I guess at this stage you're manually cherry-picking into your own 3.7 branch for release?
No, not until 3.7.0rc1 next week. No need to cherry pick. Thanks!
|
msg318698 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-06-04 20:41 |
New changeset 103058e19bea48f4a80afde51df04338a753e952 by Miss Islington (bot) in branch '3.7':
bpo-33720: Reduces maximum marshal recursion depth on release builds. (GH-7401)
https://github.com/python/cpython/commit/103058e19bea48f4a80afde51df04338a753e952
|
msg318714 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-06-04 22:04 |
With the workaround merged for 3.7.0rc1, I'm downgrading from "release blocker" to "high" since I believe Steve wants to keep this issue open for a possible compiler resolution.
|
msg318771 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-06-05 16:59 |
Thanks. Agreed this should stay open, but it is blocked on a compiler fix.
|
msg320053 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-06-20 10:18 |
> Thanks. Agreed this should stay open, but it is blocked on a compiler fix.
Do you want to have a different limit depending on the compiler and the compiler version? I'm not sure that it's worth it. I'm happy with the heuristic "use lower limit on Windows". By design, the test is fragile.
I would suggest to close the issue, but you can keep it open if you prefer to increase the limit on a fixed compiler.
|
msg320221 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-06-22 10:29 |
What are your thoughts about rewriting the test (PR 7336)?
|
msg321012 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-07-04 06:47 |
The alternate solution is provided by PR 8071. The culprit of the main stack consumption is the code for unmarshalling float and complex in protocols 0 and 1 which uses 256-bytes buffers on the stack. Moving it to the separate function allows to decrease the stack consumption and increase the maximum marshal recursion depth on Windows.
|
msg321054 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-07-04 15:45 |
Thanks for writing that second PR, Serhiy! I'd already spotted that as the likely fix on our side, though I do want to complete the investigation into why MSVC generated unnecessarily sparse frames. But that is basically an internal discussion for me to follow up on at work.
|
msg321055 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2018-07-04 15:45 |
I added the backport to 3.7 label to your PR, as I want to restore the marshal stack depth limit in 3.7.1.
|
msg321059 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-07-04 17:17 |
Unfortunately PR 8071 doesn't fix the crash on PGO builds. :-( Initially I tested it only on the debug build, and now have tested on the PGO build.
|
msg321084 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-07-05 08:17 |
New changeset fc05e68d8fac70349b7ea17ec14e7e0cfa956121 by Serhiy Storchaka in branch 'master':
bpo-33720: Improve tests for the stack overflow in marshal.loads(). (GH-7336)
https://github.com/python/cpython/commit/fc05e68d8fac70349b7ea17ec14e7e0cfa956121
|
msg321087 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-07-05 08:45 |
New changeset 3f121a40c7526aedb2a40a1be8f0ae96b267bf26 by Miss Islington (bot) in branch '3.7':
bpo-33720: Improve tests for the stack overflow in marshal.loads(). (GH-7336)
https://github.com/python/cpython/commit/3f121a40c7526aedb2a40a1be8f0ae96b267bf26
|
msg321089 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-07-05 08:48 |
New changeset 878c4fe40e8954372e9bf80ae555045ecae68e73 by Miss Islington (bot) in branch '3.6':
bpo-33720: Improve tests for the stack overflow in marshal.loads(). (GH-7336)
https://github.com/python/cpython/commit/878c4fe40e8954372e9bf80ae555045ecae68e73
|
msg321092 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-07-05 09:20 |
New changeset 9720f60f2aba457121bfe42d09aa3ed91f28b86f by Serhiy Storchaka in branch '2.7':
[2.7] bpo-33720: Improve tests for the stack overflow in marshal.loads(). (GH-7336). (GH-8107)
https://github.com/python/cpython/commit/9720f60f2aba457121bfe42d09aa3ed91f28b86f
|
msg322279 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2018-07-24 07:55 |
New changeset c5734998d91e9953fd179ba6ed7015b6343e8191 by Serhiy Storchaka in branch 'master':
bpo-33720: Refactor marshalling/unmarshalling floats. (GH-8071)
https://github.com/python/cpython/commit/c5734998d91e9953fd179ba6ed7015b6343e8191
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:01 | admin | set | github: 77901 |
2018-07-24 08:07:53 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2018-07-24 07:55:52 | serhiy.storchaka | set | messages:
+ msg322279 |
2018-07-05 09:20:21 | serhiy.storchaka | set | messages:
+ msg321092 |
2018-07-05 08:50:16 | serhiy.storchaka | set | pull_requests:
+ pull_request7701 |
2018-07-05 08:48:49 | miss-islington | set | messages:
+ msg321089 |
2018-07-05 08:45:26 | miss-islington | set | messages:
+ msg321087 |
2018-07-05 08:20:31 | miss-islington | set | pull_requests:
+ pull_request7700 |
2018-07-05 08:18:32 | miss-islington | set | pull_requests:
+ pull_request7699 |
2018-07-05 08:17:29 | serhiy.storchaka | set | messages:
+ msg321084 |
2018-07-04 17:17:17 | serhiy.storchaka | set | messages:
+ msg321059 |
2018-07-04 15:45:58 | steve.dower | set | messages:
+ msg321055 |
2018-07-04 15:45:14 | steve.dower | set | messages:
+ msg321054 |
2018-07-04 06:47:06 | serhiy.storchaka | set | messages:
+ msg321012 |
2018-07-03 20:29:02 | serhiy.storchaka | set | pull_requests:
+ pull_request7681 |
2018-06-22 10:29:24 | serhiy.storchaka | set | messages:
+ msg320221 |
2018-06-20 10:18:25 | vstinner | set | messages:
+ msg320053 |
2018-06-05 16:59:36 | steve.dower | set | messages:
+ msg318771 |
2018-06-04 22:04:14 | ned.deily | set | priority: release blocker -> high
messages:
+ msg318714 |
2018-06-04 20:41:51 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg318698
|
2018-06-04 20:28:17 | ned.deily | set | messages:
+ msg318696 |
2018-06-04 20:26:25 | steve.dower | set | messages:
+ msg318695 |
2018-06-04 20:25:27 | miss-islington | set | pull_requests:
+ pull_request7031 |
2018-06-04 20:25:03 | steve.dower | set | messages:
+ msg318693 |
2018-06-04 19:28:37 | ned.deily | set | messages:
+ msg318690 |
2018-06-04 19:24:19 | steve.dower | set | messages:
+ msg318689 |
2018-06-04 16:22:24 | steve.dower | set | pull_requests:
+ pull_request7027 |
2018-06-02 10:52:15 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg318497
|
2018-06-02 09:54:47 | serhiy.storchaka | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request6967 |
2018-06-01 17:46:37 | steve.dower | set | messages:
+ msg318442 |
2018-05-31 16:13:29 | steve.dower | set | messages:
+ msg318331 |
2018-05-31 15:48:12 | steve.dower | set | messages:
+ msg318330 |
2018-05-31 15:39:24 | vstinner | set | messages:
+ msg318329 |
2018-05-31 15:36:53 | steve.dower | set | priority: normal -> release blocker versions:
+ Python 3.8 nosy:
+ ned.deily
messages:
+ msg318328
|
2018-05-31 15:36:27 | steve.dower | set | assignee: steve.dower messages:
+ msg318327 |
2018-05-31 15:26:26 | vstinner | set | messages:
+ msg318326 |
2018-05-31 15:17:52 | vstinner | set | versions:
+ Python 3.7, - Python 3.8 |
2018-05-31 15:17:44 | vstinner | set | messages:
+ msg318325 |
2018-05-31 15:14:15 | vstinner | set | type: crash |
2018-05-31 15:13:35 | vstinner | create | |