classification
Title: regression in Cython when pickling objects
Type: behavior Stage: patch review
Components: Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: pablogsal, scoder, serhiy.storchaka, tcaswell, vstinner
Priority: normal Keywords: patch

Created on 2019-06-15 03:43 by tcaswell, last changed 2019-06-20 21:17 by pablogsal.

Files
File name Uploaded Description Edit
np_fail.txt tcaswell, 2019-06-15 03:43
cython_test_log.txt tcaswell, 2019-06-15 04:47
Pull Requests
URL Status Linked Edit
PR 14099 merged pablogsal, 2019-06-15 10:37
PR 14111 merged miss-islington, 2019-06-15 14:58
PR 14112 merged miss-islington, 2019-06-15 14:58
PR 14127 merged pablogsal, 2019-06-16 11:43
Messages (20)
msg345655 - (view) Author: Thomas Caswell (tcaswell) * Date: 2019-06-15 03:43
The fix for 
https://bugs.python.org/issue37213 in 3498c642f4e83f3d8e2214654c0fa8e0d51cebe5 (https://github.com/python/cpython/pull/13969) seems to break building numpy (master) with cython (master) due to a pickle failure.

The traceback (which is mostly in the cython source) is in an attachment.

I bisected it back to this commit and tested that reverting this commit fixes the numpy build.
msg345658 - (view) Author: Thomas Caswell (tcaswell) * Date: 2019-06-15 04:47
This change also causes failures in the cython test suite (see attached).
msg345660 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 07:17
Thanks for the report. I will investigate, but given that the fix just optimizes bytecode as it was done in 3.7, and this is a picke failure, it seems that is likely that some data may need to be regenerated in Cython?

Stephan probably can confirm this meanwhile we investigate.
msg345664 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 09:26
Did you open a bug in the Cython tracker?
msg345669 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 10:11
[I have deleted some comments claiming that I was not able to reproduce the issue because I managed to reproduce the issue correctly at the end]
msg345670 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 10:37
I managed to find the original cause, although I don't understand exactly why it happens. The error is caused by the 'if False' handling of https://github.com/python/cpython/pull/13332. There is some failure in that logic, so for now I made a PR reverting that change.
msg345673 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-06-15 11:27
Could you please provide an example of the improperly optimized code?
msg345676 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 11:43
I was not able to find it in Cython but PR14099 fixes it so I prefer to revert the optimization for now and then try to add it back once we understand what's failing exactly.
msg345682 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 14:58
New changeset 7a68f8c28bb78d957555a5001dac4df6345434a0 by Pablo Galindo in branch 'master':
bpo-37289: Remove 'if False' handling in the peephole optimizer (GH-14099)
https://github.com/python/cpython/commit/7a68f8c28bb78d957555a5001dac4df6345434a0
msg345685 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:20
Serhiy, this is the code that was incorrectly optimized:

  3           0 LOAD_GLOBAL              0 (g)
              2 LOAD_ATTR                1 (__code__)
              4 POP_JUMP_IF_FALSE        8
              6 JUMP_FORWARD             4 (to 12)
        >>    8 LOAD_CONST               0 (None)
             10 POP_JUMP_IF_FALSE       20

  4     >>   12 LOAD_GLOBAL              2 (print)
             14 LOAD_CONST               2 ('Called!')
             16 CALL_FUNCTION            1
             18 POP_TOP
        >>   20 LOAD_CONST               0 (None)
             22 RETURN_VALUE

def g():
    if True if g.__code__ else None:
        print("Called!")

g()

The problem is that [LOAD_CONST               0 (None)] made the peephole to delete the block because it does not detect that previous conditions in the same conditional could be True. The general case was made by looking back for POP_JUMP_IF_{FALSE,TRUE} but with this construct JUMP_FORWARD appears.

I think handling this condition in the peephole optimizer can be very dangerous error prone to get right because all the information is lost, but doing it at the ast level makes us lose syntax errors in optimized blocks, which is also wrong. Any ideas?

It happens here in Cython:

https://github.com/cython/cython/blob/master/Cython/Compiler/Nodes.py#L5131

That's why the errors manifest as pickling errors
msg345686 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:21
Just to be clear, before PR14099, the code was wrongly optimized to:

  3           0 LOAD_GLOBAL              0 (g)
              2 LOAD_ATTR                1 (__code__)
              4 POP_JUMP_IF_FALSE        8
              6 JUMP_FORWARD             0 (to 8)

  4     >>    8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
msg345687 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:22
New changeset 284daeade210d3aac049f4278a1fb76d19e6d78a by Pablo Galindo (Miss Islington (bot)) in branch '3.8':
bpo-37289: Remove 'if False' handling in the peephole optimizer (GH-14099) (GH-14112)
https://github.com/python/cpython/commit/284daeade210d3aac049f4278a1fb76d19e6d78a
msg345688 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:22
New changeset 81fecf7b7a6e766a213c6e670219c1da52461589 by Pablo Galindo (Miss Islington (bot)) in branch '3.7':
bpo-37289: Remove 'if False' handling in the peephole optimizer (GH-14099) (GH-14111)
https://github.com/python/cpython/commit/81fecf7b7a6e766a213c6e670219c1da52461589
msg345690 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-06-15 15:41
Do you mind to add a test?
msg345691 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:51
Sure, will do.

How you think we should handle this particular optimization? Do you see a more resilient way of doing this on the peephole optimizer?
msg345693 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-15 15:54
Would it make sense to put this on the compiler? When we run compile_if we could check if there is one constant in the test and of the constant is true we skip the block.
msg345698 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-06-15 16:36
Yes, it could be moved to the compiler. See issue32477 which moves at that direction.
msg345702 - (view) Author: Thomas Caswell (tcaswell) * Date: 2019-06-15 16:44
I can confirm that master branches of cpython, cython, and numpy all build together again, thank you for the quick investigation and fix!
msg345718 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-15 23:22
I don't understand if this issue is a recent regression or not. Python 3.7 has been modified, but the reporter only mentions Python 3.8.

I concur with Serhiy, a test is needed to prevent future regressions on this fix.
msg346164 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-06-20 21:17
New changeset 1e61504b8b941494f3ac03e0449ddbbba8960175 by Pablo Galindo in branch 'master':
bpo-37289: Add a test for if with ifexpr in the peephole optimiser to detect regressions (GH-14127)
https://github.com/python/cpython/commit/1e61504b8b941494f3ac03e0449ddbbba8960175
History
Date User Action Args
2019-06-20 21:17:06pablogsalsetmessages: + msg346164
2019-06-16 11:43:42pablogsalsetpull_requests: + pull_request13974
2019-06-15 23:22:49vstinnersetmessages: + msg345718
2019-06-15 16:44:46tcaswellsetmessages: + msg345702
2019-06-15 16:36:55serhiy.storchakasetmessages: + msg345698
2019-06-15 15:54:22pablogsalsetmessages: + msg345693
2019-06-15 15:51:01pablogsalsetmessages: + msg345691
2019-06-15 15:41:25serhiy.storchakasetmessages: + msg345690
2019-06-15 15:22:37pablogsalsetmessages: + msg345688
2019-06-15 15:22:11pablogsalsetmessages: + msg345687
2019-06-15 15:21:48pablogsalsetmessages: + msg345686
2019-06-15 15:20:40pablogsalsetmessages: + msg345685
2019-06-15 14:58:40miss-islingtonsetpull_requests: + pull_request13961
2019-06-15 14:58:13miss-islingtonsetpull_requests: + pull_request13960
2019-06-15 14:58:03pablogsalsetmessages: + msg345682
2019-06-15 11:44:29pablogsalsetversions: + Python 3.7
2019-06-15 11:43:59pablogsalsetmessages: + msg345676
2019-06-15 11:27:00serhiy.storchakasetmessages: + msg345673
2019-06-15 10:37:26pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request13954
2019-06-15 10:37:03pablogsalsetmessages: + msg345670
2019-06-15 10:11:50pablogsalsetmessages: + msg345669
2019-06-15 10:10:56pablogsalsetmessages: - msg345663
2019-06-15 10:10:46pablogsalsetmessages: - msg345665
2019-06-15 10:04:16pablogsalsetmessages: - msg345668
2019-06-15 09:46:28pablogsalsetmessages: + msg345668
2019-06-15 09:28:49pablogsalsetmessages: + msg345665
2019-06-15 09:26:30pablogsalsetmessages: + msg345664
2019-06-15 09:24:25pablogsalsettitle: regression in cython due to peephole optomization -> regression in Cython when pickling objects
2019-06-15 09:22:15pablogsalsetmessages: + msg345663
2019-06-15 07:17:54pablogsalsetmessages: + msg345660
2019-06-15 04:47:44tcaswellsetfiles: + cython_test_log.txt

messages: + msg345658
2019-06-15 03:43:32tcaswellcreate