Double coding cookie #70768

Closed
serhiy-storchaka opened this issue Mar 17, 2016 · 8 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs); stdlib (Python modules in the Lib dir); type-bug (An unexpected behavior, bug, or error)


@serhiy-storchaka
Member

BPO 26581
Nosy @malemburg, @gvanrossum, @loewis, @vstinner, @serhiy-storchaka
Files
  • tokenize_double_coding.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-03-20.21:52:08.611>
    created_at = <Date 2016-03-17.12:00:36.043>
    labels = ['interpreter-core', 'type-bug', 'library']
    title = 'Double coding cookie'
    updated_at = <Date 2016-03-20.21:52:08.611>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-03-20.21:52:08.611>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-03-20.21:52:08.611>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core', 'Library (Lib)']
    creation = <Date 2016-03-17.12:00:36.043>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['42185']
    hgrepos = []
    issue_num = 26581
    keywords = ['patch']
    message_count = 8.0
    messages = ['261909', '262051', '262052', '262053', '262054', '262089', '262090', '262092']
    nosy_count = 6.0
    nosy_names = ['lemburg', 'gvanrossum', 'loewis', 'vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26581'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @serhiy-storchaka
    Member Author

    When a Python source file contains two coding cookies on different lines, the first one wins. When it contains two coding cookies on the same line, the last one wins.

    PEP-263 was rather vague about this. It has now been clarified (22490711c870): the first coding cookie should always win.

    The proposed patch fixes the Python tokenizer, the tokenize module, and other places. The tests are taken from bpo-25643.
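
For readers who want to check what the tokenize module does on their own interpreter, here is a minimal sketch. It uses only `tokenize.detect_encoding` from the standard library; the sample byte strings are invented for illustration and are not taken from the patch or its tests. On an interpreter that includes this fix, both cases should report the first cookie (`latin-1`, which `tokenize` normalizes to `iso-8859-1`).

```python
import io
import tokenize

# Illustrative inputs (not from the patch): two cookies on one line,
# and two cookies on separate lines.
same_line = b"# coding: latin-1 coding: utf-8\nprint('hi')\n"
two_lines = b"# coding: latin-1\n# coding: utf-8\nprint('hi')\n"

# detect_encoding() takes a readline callable and returns
# (encoding_name, list_of_lines_it_consumed).
same_line_encoding, _ = tokenize.detect_encoding(io.BytesIO(same_line).readline)
two_line_encoding, _ = tokenize.detect_encoding(io.BytesIO(two_lines).readline)

print("same line:", same_line_encoding)
print("two lines:", two_line_encoding)
```

On an affected Python 3 version, the same-line case would be expected to report `utf-8` instead.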

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 17, 2016
    @serhiy-storchaka
    Member Author

    I just tested with Emacs, and it looks like when different codings are specified on two different lines, the first coding wins, but when different codings are specified on the same line, the last coding wins.

    Therefore the current CPython behavior may be correct, and the regular expression in PEP-263 should be changed to use greedy repetition.
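
The difference between greedy and non-greedy repetition can be sketched with two patterns. Both are modelled on the PEP-263 expression but are written here for illustration only, so treat the exact patterns and the sample line as assumptions. With `.*?` the match stops at the first `coding:` on the line; with `.*` it backtracks from the end of the line and settles on the last one.

```python
import re

# Illustrative line with two cookies (not from the issue's test suite).
LINE = "# coding: latin-1 coding: utf-8"

# Non-greedy ".*?": the first "coding[:=]" on the line is used.
first_wins = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")
# Greedy ".*": the last "coding[:=]" on the line is used.
last_wins = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-\w.]+)")

first = first_wins.match(LINE).group(1)
last = last_wins.match(LINE).group(1)

print(first)  # latin-1
print(last)   # utf-8
```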

    @gvanrossum
    Member

    Do you have write permission to the PEP? Just update it.

    @serhiy-storchaka
    Member Author

    Yes, I have. But I was not sure which behavior should be considered correct in Python. On the one hand, always choosing the first declaration (whether on the same line or on different lines) looks more consistent. On the other hand, the current behavior has been in CPython since the initial implementation of PEP-263 in bpo-526840, and it matches the Emacs behavior (if I understand it correctly).

    I can update the regular expression, but maybe this obscure corner case also needs a verbal explanation.

    @gvanrossum
    Member

    Right. Please go ahead with both. I am fine with defining the current
    behavior as correct.

    --Guido (mobile)

    @serhiy-storchaka
    Member Author

    Ah, I made a mistake! In 2.7 the first coding on the same line wins, and that has been the behavior from the start. The regression was unintentionally introduced in bpo-18470.

    Thus *there is* a bug in Python 3. PEP-263 doesn't need more changes, but the Python tokenizer and related tools do.

    Sorry for the confusion.
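
One possible way to see which cookie the interpreter's own tokenizer picks (as opposed to the tokenize module) is to compile a bytes source whose meaning depends on the encoding. This is only a sketch: the source bytes and the `<double-cookie>` filename are made up for illustration. The byte 0xE9 is "é" in Latin-1 but is not valid UTF-8 on its own, so an interpreter that picks the last cookie would be expected to reject the source, while one that picks the first cookie decodes it.

```python
# Illustrative source: two cookies on the first line, then a string
# literal containing the single byte 0xE9.
source = b"# coding: latin-1 coding: utf-8\ntext = '\xe9'\n"

namespace = {}
# compile() accepts bytes and applies the coding cookie while decoding.
exec(compile(source, "<double-cookie>", "exec"), namespace)
text = namespace["text"]

print(ascii(text))
```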

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 20, 2016

    New changeset 23a7481eafd4 by Serhiy Storchaka in branch 'default':
    Issues bpo-25643, bpo-26581: Added new tests for detecting Python source code encoding.
    https://hg.python.org/cpython/rev/23a7481eafd4

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 20, 2016

    New changeset 1c44cea2ea8f by Serhiy Storchaka in branch '3.5':
    Issue bpo-26581: Use the first coding cookie on a line, not the last one.
    https://hg.python.org/cpython/rev/1c44cea2ea8f

    New changeset 8506d127d482 by Serhiy Storchaka in branch '2.7':
    Issue bpo-26581: Use the first coding cookie on a line, not the last one.
    https://hg.python.org/cpython/rev/8506d127d482

    New changeset e86cd4a872b8 by Serhiy Storchaka in branch 'default':
    Issue bpo-26581: Use the first coding cookie on a line, not the last one.
    https://hg.python.org/cpython/rev/e86cd4a872b8

    @serhiy-storchaka serhiy-storchaka self-assigned this Mar 20, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022