Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Odd Bytecode Generation in 3.10 #90880

Closed
saulshanabrook mannequin opened this issue Feb 11, 2022 · 8 comments
Closed

Odd Bytecode Generation in 3.10 #90880

saulshanabrook mannequin opened this issue Feb 11, 2022 · 8 comments
Labels
3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) pending The issue will be closed if no feedback is provided

Comments

@saulshanabrook
Copy link
Mannequin

saulshanabrook mannequin commented Feb 11, 2022

BPO 46724
Nosy @markshannon, @saulshanabrook, @JelleZijlstra
PRs
  • bpo-46724: Fix dis support for overflow args #31285
  • bpo-46724: Use JUMP_ABSOLUTE for all backward jumps. #31326
  • [3.10] bpo-46724: Use JUMP_ABSOLUTE for all backward jumps. #31352
  • [3.10] bpo-46724: Use JUMP_ABSOLUTE for all backward jumps. (GH-31326) #31354
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-02-11.17:22:26.225>
    labels = ['interpreter-core', '3.10', '3.11']
    title = 'Odd Bytecode Generation in 3.10'
    updated_at = <Date 2022-03-22.14:04:22.463>
    user = 'https://github.com/saulshanabrook'

    bugs.python.org fields:

    activity = <Date 2022-03-22.14:04:22.463>
    actor = 'Mark.Shannon'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2022-02-11.17:22:26.225>
    creator = 'saulshanabrook'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46724
    keywords = ['patch']
    message_count = 7.0
    messages = ['413088', '413130', '413131', '413279', '413329', '413472', '415773']
    nosy_count = 3.0
    nosy_names = ['Mark.Shannon', 'saulshanabrook', 'JelleZijlstra']
    pr_nums = ['31285', '31326', '31352', '31354']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46724'
    versions = ['Python 3.10', 'Python 3.11']

    @saulshanabrook
    Copy link
    Mannequin Author

    saulshanabrook mannequin commented Feb 11, 2022

    I noticed that in Python 3.10, and also in main, a certain control flow construct produces some very odd bytecode (showing on main but same on python 3.10 tags):

    ./python.exe -c 'import dis; dis.dis("while not (a < b < c): pass")'
                  0 RESUME                   0
    
      1           2 LOAD_NAME                0 (a)
                  4 LOAD_NAME                1 (b)
                  6 SWAP                     2
                  8 COPY                     2
                 10 COMPARE_OP               0 (<)
                 12 POP_JUMP_IF_FALSE       11 (to 22)
                 14 LOAD_NAME                2 (c)
                 16 COMPARE_OP               0 (<)
                 18 POP_JUMP_IF_TRUE        28 (to 56)
                 20 JUMP_FORWARD             1 (to 24)
            >>   22 POP_TOP
            >>   24 LOAD_NAME                0 (a)
                 26 LOAD_NAME                1 (b)
                 28 SWAP                     2
                 30 COPY                     2
                 32 COMPARE_OP               0 (<)
                 34 POP_JUMP_IF_FALSE       23 (to 46)
                 36 LOAD_NAME                2 (c)
                 38 COMPARE_OP               0 (<)
                 40 POP_JUMP_IF_FALSE       12 (to 24)
                 42 LOAD_CONST               0 (None)
                 44 RETURN_VALUE
            >>   46 POP_TOP
                 48 EXTENDED_ARG           255
                 50 EXTENDED_ARG         65535
                 52 EXTENDED_ARG         16777215
                 54 JUMP_FORWARD         4294967280 (to 8589934616)
            >>   56 LOAD_CONST               0 (None)
                 58 RETURN_VALUE
    

    The last JUMP_FORWARD has a rather larger argument! This was the minimal example I could find to replicate this.

    However, this is an example of some runnable code that also encounters it:

    a = b = c = 1
    while not (a < b < c):
        if c == 1:
            c = 3
        else:
            b = 2
        print(a, b, c)
    

    This actually executes fine, but I notice that when it's executing it does execute that very large arg, but that the oparg to JUMP_FORWARD ends up being negative! By adding some tracing, I was able to see that the oparg variable in the TARGET(JUMP_FORWARD) case is -32.

    I am not sure if this is a bug or intended behavior. It does seem a bit odd to have this unnecessarily large argument that ends up turning into a negative jump! But the behavior seems fine.

    At the least, maybe dis should be modified so that it properly sees this as a negative jump and debugs it properly? I am happy to submit a PR to modify dis to handle this case, but I also wanted to flag that maybe it's a bug to being with.

    @saulshanabrook saulshanabrook mannequin added 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Feb 11, 2022
    @saulshanabrook
    Copy link
    Mannequin Author

    saulshanabrook mannequin commented Feb 12, 2022

    I added a patch to make dis work with the overflow format.

    I am looking into why the bytecode is emitted like this in the first place now.

    I see that currently, Python is trying to write a negative bytecode arg for the JUMP_ABSOLUTE and instrsize is saying this requires 4 args (which is not true, it only requires one). Is it intended behavior that the bytecode args will be negative? If so, then instrsize might have to be updated to handle this. It takes in an unsigned int but the oparg is actually a signed int.

    @saulshanabrook
    Copy link
    Mannequin Author

    saulshanabrook mannequin commented Feb 12, 2022

    Ah, I see that to represent a negative 4 byte signed integer as an arg, we actually do need the 4 instructions, since each is a byte, to get the most significant bit to a 1?

    I wasn't able to find anything anywhere which says if the bytecode should or should not contain negative opargs, so I am not sure if this is a bug or intended behavior, to emit the three EXTENDED_ARGs in order to represent

    @markshannon
    Copy link
    Member

    New changeset 3be1a44 by Mark Shannon in branch 'main':
    bpo-46724: Use JUMP_ABSOLUTE for all backward jumps. (GH-31326)
    3be1a44

    @markshannon
    Copy link
    Member

    New changeset d4e4ef1 by Mark Shannon in branch '3.10':
    [3.10] bpo-46724: Use JUMP_ABSOLUTE for all backward jumps. (GH-31326) (GH-31354)
    d4e4ef1

    @markshannon
    Copy link
    Member

    New changeset c3ce778 by Saul Shanabrook in branch 'main':
    bpo-46724: Fix dis support for overflow args (GH-31285)
    c3ce778

    @markshannon
    Copy link
    Member

    I think this is fixed (for 3.11 at least) by #31888

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel
    Copy link
    Member

    I'm not seeing it on 3.10 anymore either:

    % python3.10 -c 'import dis; dis.dis("while not (a < b < c): pass")'
      1           0 LOAD_NAME                0 (a)
                  2 LOAD_NAME                1 (b)
                  4 DUP_TOP
                  6 ROT_THREE
                  8 COMPARE_OP               0 (<)
                 10 POP_JUMP_IF_FALSE       10 (to 20)
                 12 LOAD_NAME                2 (c)
                 14 COMPARE_OP               0 (<)
                 16 POP_JUMP_IF_TRUE        24 (to 48)
                 18 JUMP_FORWARD             1 (to 22)
            >>   20 POP_TOP
            >>   22 LOAD_NAME                0 (a)
                 24 LOAD_NAME                1 (b)
                 26 DUP_TOP
                 28 ROT_THREE
                 30 COMPARE_OP               0 (<)
                 32 POP_JUMP_IF_FALSE       22 (to 44)
                 34 LOAD_NAME                2 (c)
                 36 COMPARE_OP               0 (<)
                 38 POP_JUMP_IF_FALSE       11 (to 22)
                 40 LOAD_CONST               0 (None)
                 42 RETURN_VALUE
            >>   44 POP_TOP
                 46 JUMP_ABSOLUTE           11 (to 22)
            >>   48 LOAD_CONST               0 (None)
                 50 RETURN_VALUE
    

    @iritkatriel iritkatriel added the pending The issue will be closed if no feedback is provided label Jul 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) pending The issue will be closed if no feedback is provided
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants