Remove deprecated re features #71217

Closed
serhiy-storchaka opened this issue May 15, 2016 · 16 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-regex type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Member

BPO 27030
Nosy @warsaw, @ned-deily, @ezio-melotti, @bitdancer, @serhiy-storchaka
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
Files
  • re_remove_deprecated.patch
  • re-sub-allow-unknown-escapes.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-12-06.22:26:24.830>
    created_at = <Date 2016-05-15.18:38:16.972>
    labels = ['expert-regex', 'type-feature', 'library', '3.7']
    title = 'Remove deprecated re features'
    updated_at = <Date 2017-03-31.16:36:29.026>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:29.026>
    actor = 'dstufft'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-12-06.22:26:24.830>
    closer = 'ned.deily'
    components = ['Library (Lib)', 'Regular Expressions']
    creation = <Date 2016-05-15.18:38:16.972>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['42862', '45774']
    hgrepos = []
    issue_num = 27030
    keywords = ['patch']
    message_count = 16.0
    messages = ['265641', '268222', '268223', '268224', '268334', '281485', '281487', '281491', '281496', '281498', '281503', '281511', '282514', '282556', '282557', '282572']
    nosy_count = 7.0
    nosy_names = ['barry', 'ned.deily', 'ezio.melotti', 'mrabarnett', 'r.david.murray', 'python-dev', 'serhiy.storchaka']
    pr_nums = ['552']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue27030'
    versions = ['Python 3.6', 'Python 3.7']

    @serhiy-storchaka
    Member Author

    The proposed patch removes the following deprecated re features:

    • Three unused, undocumented functions: isident(), isdigit() and isname(). They have been deprecated since Python 3.3 (bpo-14462).

    • '\' followed by an ASCII letter is now an error if the combination is not defined. This allows new control combinations to be added without breaking compatibility. Such combinations have been deprecated since Python 3.5 (bpo-23622).

    • Support for re.LOCALE with string patterns or with re.ASCII. This has been deprecated since Python 3.5 (bpo-22407).
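As a rough illustration of the second and third items, the sketch below probes how re.compile() reacts on an interpreter where these removals are in effect (assumed here to be Python 3.6 or later). The helper name compile_error and the sample patterns are made up for this example and are not part of the patch.

```python
import re

def compile_error(pattern, flags=0):
    """Return the exception raised by re.compile(), or None if it succeeds."""
    try:
        re.compile(pattern, flags)
    except (re.error, ValueError) as exc:
        return exc
    return None

# '\q' is a backslash plus an ASCII letter with no defined meaning.
unknown_escape = compile_error(r'\q')

# re.LOCALE is rejected for str patterns and when combined with re.ASCII.
locale_on_str = compile_error('a', re.LOCALE)
locale_and_ascii = compile_error(b'a', re.LOCALE | re.ASCII)

# re.LOCALE with a bytes pattern is still accepted.
locale_on_bytes = compile_error(b'a', re.LOCALE)
```

The unknown escape is reported as re.error, while the two flag misuses are reported as ValueError.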

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir topic-regex type-feature A feature request or enhancement labels May 15, 2016
    @python-dev
    Mannequin

    python-dev mannequin commented Jun 11, 2016

    New changeset 09d1af3fe332 by Serhiy Storchaka in branch 'default':
    Issue bpo-27030: Unknown escapes consisting of '\' and ASCII letter in
    https://hg.python.org/cpython/rev/09d1af3fe332

    @serhiy-storchaka
    Member Author

    Thanks Jim for the review.

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 11, 2016

    New changeset 8ed3880e94e5 by Serhiy Storchaka in branch 'default':
    Issue bpo-27030: The re.LOCALE flag now can be used only with bytes patterns.
    https://hg.python.org/cpython/rev/8ed3880e94e5

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 12, 2016

    New changeset a2482e805dff by Martin Panter in branch '3.5':
    Fix buggy RE “\parrot_example.py”, uncovered by Issue bpo-27030
    https://hg.python.org/cpython/rev/a2482e805dff

    New changeset be193f8dbe4c by Martin Panter in branch 'default':
    Issue bpo-27030: Merge RE fix from 3.5
    https://hg.python.org/cpython/rev/be193f8dbe4c

    New changeset c5411cfa0bd3 by Martin Panter in branch '2.7':
    Fix buggy RE “\parrot_example.py”, uncovered by Issue bpo-27030
    https://hg.python.org/cpython/rev/c5411cfa0bd3

    @warsaw
    Member

    warsaw commented Nov 22, 2016

    FWIW, this breaks Mailman 3.1 (and probably 2.1)

    @warsaw
    Member

    warsaw commented Nov 22, 2016

    Specifically, point #2; undefined combinations of \ + ASCII becoming an error.

    @serhiy-storchaka
    Member Author

    Could Mailman be fixed? Undefined combinations of \ + ASCII emitted warnings in 3.5. And now they emit warnings even just in string literals (bpo-27364). If Mailman uses undefined escape combinations, it could suffer from bpo-27364 too.
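The string-literal warning mentioned here (bpo-27364) can be observed without touching the re module at all. The following is a minimal sketch, assuming Python 3.6 or later; the helper name literal_warnings is hypothetical. The warning category is DeprecationWarning on 3.6 through 3.11 and SyntaxWarning from 3.12 on.

```python
import warnings

def literal_warnings(source):
    """Compile a piece of source text and return the names of the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return [w.category.__name__ for w in caught]

# A normal (non-raw) literal containing the undefined escape '\d'.
plain = literal_warnings(r"'%\d*d'")

# The same literal written as a raw string compiles silently.
raw = literal_warnings(r"r'%\d*d'")
```

Writing regular expressions as raw strings avoids this warning, independently of how the re module treats the escape.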

    @warsaw
    Member

    warsaw commented Nov 22, 2016

    On Nov 22, 2016, at 04:13 PM, Serhiy Storchaka wrote:

    Could Mailman be fixed? Undefined combinations of \ + ASCII emitted warnings
    in 3.5. And now they emit warnings even just in string literals
    (bpo-27364). If Mailman uses undefined escape combinations, it could suffer
    from bpo-27364 too.

    No, I think this is a valid bug/regression.

    The Mailman code is basically trying to do this:

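        import re
        p = re.compile(r'%\d*d')  # raw string, so '\d' reaches the regex engine unchanged
        p.sub(r'\s*\d+\s*', some_string)  # some_string: any text containing %d-style fields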

    And so we get the error:

    sre_constants.error: bad escape \s at position 0

    But this directly contradicts the documentation for re.sub():

    "... if it is a string, any backslash escapes in it are processed. That is, \n
    is converted to a single newline character, \r is converted to a carriage
    return, and so forth. Unknown escapes such as \& are left alone."

    Clearly \s is not being left alone, so this is a real regression.
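The behaviour under discussion can be checked with a small sketch. The helper name substitute and the sample strings are invented for illustration. On Python 3.7 and later, a backslash followed by an unknown ASCII letter in the replacement template raises re.error; on 3.6 it is left alone with a DeprecationWarning. The last two calls show two possible ways to produce a literal \s*\d+\s* in the output on any version; they are not necessarily how Mailman addressed it.

```python
import re

def substitute(template, text='axb'):
    """Replace 'x' in text using the template; return the result or the re.error raised."""
    try:
        return re.sub('x', template, text)
    except re.error as exc:
        return exc

known = substitute(r'\n')    # known escape: converted to a real newline
punct = substitute(r'\&')    # backslash + non-letter: left alone
letter = substitute(r'\s')   # backslash + unknown ASCII letter: version dependent

pattern = re.compile(r'%\d*d')

# Option 1: double the backslashes so the template yields literal backslashes.
doubled = pattern.sub(r'\\s*\\d+\\s*', 'Found %d items')

# Option 2: pass a callable; its return value is inserted without escape processing.
via_callable = pattern.sub(lambda m: r'\s*\d+\s*', 'Found %d items')
```

Both options produce the same string, and neither depends on how unknown escapes in templates are treated.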

    @serhiy-storchaka
    Member Author

    This part of the documentation was simply overlooked. bpo-28450 has been opened for this.

    @serhiy-storchaka
    Member Author

    If you insist, I could revert the conversion of warnings to errors in 3.6 (only in the replacement string, or everywhere?). But I think they should remain errors in 3.7. The earlier we make undefined escapes errors, the earlier we can define new special escape sequences without confusing users. It is bad if an escape sequence is valid in two Python versions but has a different meaning in each.

    @bitdancer
    Member

    There is still the argument that we shouldn't break 2.7 compatibility unnecessarily until 2.7 is out of maintenance. That is: warnings are good, removals are bad. (I haven't read through this issue, so I may be off base.)

    @serhiy-storchaka
    Member Author

    Here is a patch that partially reverts the changes of bpo-27030. It allows unknown escapes in the re.sub() replacement template, but they are deprecated and will be errors in 3.7.

    If this works for you, Barry, and Ned allows it, it can be committed to 3.6 only.

    But this could delay adding support for new escapes (like \xXX, \uXXXX, etc.) in replacement templates.

    @ned-deily
    Member

    It is unfortunate that the deprecation note did not explicitly mention replacement templates as well as regexp patterns. While there are good arguments to be made for either case, I think it makes more sense to treat the replacement template as following the rules for strings rather than those for regexps. That argues for making the proposed change, so that templates cause only a deprecation warning in 3.6, although I'm not happy about making such a change at the last minute. Serhiy, please push to 3.6 for rc1. There should also be a note in the What's New for 3.6 document.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 6, 2016

    New changeset 1b162d6e3d01 by Serhiy Storchaka in branch '3.6':
    Issue bpo-27030: Unknown escapes in re.sub() replacement template are allowed
    https://hg.python.org/cpython/rev/1b162d6e3d01

    New changeset 5904d2ced3d8 by Serhiy Storchaka in branch 'default':
    Merge documentation for issue bpo-27030 from 3.6.
    https://hg.python.org/cpython/rev/5904d2ced3d8

    @ned-deily
    Member

    Thanks, Serhiy, for reverting the error handling for 3.6.0 (in 3.6.0rc1). I'm going to mark this as closed. bpo-28450 is still open at the moment regarding the documentation and possible 3.7 changes.

    @ned-deily ned-deily added the 3.7 (EOL) end of life label Dec 6, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022