Py_CompileString does not respect the coding cookie with the new parser if flags are empty #89980

Closed
pablogsal opened this issue Nov 16, 2021 · 8 comments
Labels
3.9 (only security fixes), 3.10 (only security fixes), 3.11 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@pablogsal
Member

BPO 45822
Nosy @Yhg1s, @gpshead, @ambv, @lysnikolaou, @pablogsal, @miss-islington
PRs
  • bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided #29582
  • [3.9] bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582) #29585
  • [3.10] bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582). #29586
  • bpo-45822: Minor cleanups to the test_Py_CompileString test #29750
  • [3.10] bpo-45822: Minor cleanups to the test_Py_CompileString test (GH-29750) #29758
  • [3.9] bpo-45822: Minor cleanups to the test_Py_CompileString test (GH-29750) #29759
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-11-16.22:36:10.358>
    created_at = <Date 2021-11-16.19:39:26.715>
    labels = ['interpreter-core', '3.9', '3.10', '3.11']
    title = 'Py_CompileString does not respect the coding cookie with the new parser if flags are empty'
    updated_at = <Date 2021-12-11.00:03:19.977>
    user = 'https://github.com/pablogsal'

    bugs.python.org fields:

    activity = <Date 2021-12-11.00:03:19.977>
    actor = 'lukasz.langa'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-11-16.22:36:10.358>
    closer = 'pablogsal'
    components = ['Parser']
    creation = <Date 2021-11-16.19:39:26.715>
    creator = 'pablogsal'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45822
    keywords = ['patch']
    message_count = 8.0
    messages = ['406425', '406427', '406433', '406507', '406508', '406944', '408275', '408276']
    nosy_count = 6.0
    nosy_names = ['twouters', 'gregory.p.smith', 'lukasz.langa', 'lys.nikolaou', 'pablogsal', 'miss-islington']
    pr_nums = ['29582', '29585', '29586', '29750', '29758', '29759']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue45822'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @pablogsal
    Member Author

    When Py_CompileString is called on a source string that contains a coding cookie, the new parser does not respect the cookie, whereas the old parser did.

    @pablogsal pablogsal added the labels 3.9 (only security fixes), 3.10 (only security fixes), 3.11 (only security fixes), and interpreter-core (Objects, Python, Grammar, and Parser dirs) on Nov 16, 2021
    @Yhg1s
    Member

    Yhg1s commented Nov 16, 2021

    Py_CompileString() in Python 3.9 and later, using the PEG parser, appears to no longer honour source encoding cookies. A reduced test case:

        #include "Python.h"
        #include <stdio.h>
    
        const char *src = (
        "# -*- coding: Latin-1 -*-\n"
        "'''\xc3'''\n");
    
        int main(int argc, char **argv)
        {
            Py_Initialize();
            PyObject *res = Py_CompileString(src, "some_path", Py_file_input);
            if (res) {
                fprintf(stderr, "Compile succeeded.\n");
                return 0;
            } else {
                fprintf(stderr, "Compile failed.\n");
                PyErr_Print();
                return 1;
            }
        }

    Compiling and running the resulting binary with Python 3.8 (or earlier):

    % ./encoding_bug
    Compile succeeded.
    

    With 3.9 and PYTHONOLDPARSER=1:

    % PYTHONOLDPARSER=1 ./encoding_bug
    Compile succeeded.
    

    With 3.9 (without the env var) or 3.10:
    % ./encoding_bug
    Compile failed.
    File "some_path", line 2
    '''�'''
    ^
    SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xc3 in position 0: unexpected end of data
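
    The message is consistent with the lone byte 0xc3 being decoded as UTF-8 instead of Latin-1. A minimal Python sketch of that decoding step follows. It is an illustration rather than code from the report, and the variable names are hypothetical:

```python
# The single byte inside the triple-quoted string from the test case.
raw = b"\xc3"

# Under Latin-1 (what the coding cookie asks for) every byte maps to one code point.
latin1_text = raw.decode("latin-1")

# Under UTF-8, 0xc3 starts a two-byte sequence, so a lone 0xc3 is incomplete.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = str(exc)

print(repr(latin1_text))
print(utf8_error)
```

    The UTF-8 failure carries the same text as the SyntaxError above, which suggests the cookie was ignored and the default encoding was used.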

    Writing the same bytes to a file and making python3.9 or python3.10 import them works fine, as does passing the bytes to compile():

        Python 3.10.0+ (heads/3.10-dirty:7bac598819, Nov 16 2021, 20:35:12) [GCC 8.3.0] on linux
        Type "help", "copyright", "credits" or "license" for more information.
        >>> b = open('encoding_bug.py', 'rb').read()
        >>> b
        b"# -*- coding: Latin-1 -*-\n'''\xc3'''\n"
        >>> import encoding_bug
        >>> encoding_bug.__doc__
        'Ã'
        >>> co = compile(b, 'some_path', 'exec')
        >>> co
        <code object <module> at 0x7f447e1b0c90, file "some_path", line 1>
        >>> co.co_consts[0]
        'Ã'

    It's just Py_CompileString() that fails. I don't understand why, and I do believe it's a regression.
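
    For comparison, a small Python sketch of the path that still works: the standard library detects the cookie, and compile() decodes the bytes with it. This only illustrates the Python-level behaviour described above, not the C API. The names src, encoding, and namespace are chosen for the example:

```python
import io
import tokenize

# Same bytes as the C test case: a Latin-1 coding cookie, then a docstring holding byte 0xc3.
src = b"# -*- coding: Latin-1 -*-\n'''\xc3'''\n"

# tokenize.detect_encoding reads the PEP 263 cookie from the first two lines.
encoding, _ = tokenize.detect_encoding(io.BytesIO(src).readline)

# compile() on bytes honours the cookie, so the docstring is decoded as Latin-1.
code = compile(src, "some_path", "exec")
namespace = {}
exec(code, namespace)

print(encoding)
print(repr(namespace["__doc__"]))
```

    Running this on an affected interpreter should still succeed, which matches the observation that only Py_CompileString() misbehaves.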

    @miss-islington
    Contributor

    New changeset da20d74 by Pablo Galindo Salgado in branch 'main':
    bpo-45822: Respect PEP-263's coding cookies in the parser even if flags are not provided (GH-29582)
    da20d74

    @ambv
    Contributor

    ambv commented Nov 17, 2021

    New changeset e3aa9fd by Pablo Galindo Salgado in branch '3.10':
    [3.10] bpo-45822: Respect PEP-263's coding cookies in the parser even if flags are not provided (GH-29582) (GH-29586)
    e3aa9fd

    @ambv
    Contributor

    ambv commented Nov 17, 2021

    New changeset 0ef308a by Pablo Galindo Salgado in branch '3.9':
    bpo-45822: Respect PEP-263's coding cookies in the parser even if flags are not provided (GH-29582) (GH-29585)
    0ef308a

    @pablogsal
    Member Author

    New changeset abfc794 by Pablo Galindo Salgado in branch 'main':
    bpo-45822: Minor cleanups to the test_Py_CompileString test (GH-29750)
    abfc794

    @ambv
    Contributor

    ambv commented Dec 11, 2021

    New changeset e1e3f64 by Miss Islington (bot) in branch '3.10':
    bpo-45822: Minor cleanups to the test_Py_CompileString test (GH-29750) (GH-29758)
    e1e3f64

    @ambv
    Contributor

    ambv commented Dec 11, 2021

    New changeset 5f622f1 by Miss Islington (bot) in branch '3.9':
    bpo-45822: Minor cleanups to the test_Py_CompileString test (GH-29750) (GH-29759)
    5f622f1

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022