re: DeprecationWarning for flag not at the start of expression is cutoff too early #83575

Closed
jugmac00 mannequin opened this issue Jan 20, 2020 · 8 comments
Labels
3.9 (only security fixes), 3.10 (only security fixes), 3.11 (only security fixes), topic-regex, type-feature (a feature request or enhancement)

Comments

@jugmac00
Mannequin

jugmac00 mannequin commented Jan 20, 2020

BPO 39394
Nosy @ezio-melotti, @markshannon, @serhiy-storchaka, @jugmac00, @miss-islington
PRs
  • bpo-39394: Improve warning message in the re module #31988
  • [3.10] bpo-39394: Improve warning message in the re module (GH-31988) #31989
  • [3.9] bpo-39394: Improve warning message in the re module (GH-31988) #31990
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2022-03-19.14:11:50.447>
    created_at = <Date 2020-01-20.10:44:04.924>
    labels = ['expert-regex', 'type-feature', '3.9', '3.10', '3.11']
    title = 're: DeprecationWarning for `flag not at the start of expression` is cutoff too early'
    updated_at = <Date 2022-03-19.14:11:50.447>
    user = 'https://github.com/jugmac00'

    bugs.python.org fields:

    activity = <Date 2022-03-19.14:11:50.447>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-03-19.14:11:50.447>
    closer = 'serhiy.storchaka'
    components = ['Regular Expressions']
    creation = <Date 2020-01-20.10:44:04.924>
    creator = 'jugmac00'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39394
    keywords = ['patch']
    message_count = 8.0
    messages = ['360306', '360308', '360316', '393802', '415541', '415545', '415550', '415551']
    nosy_count = 6.0
    nosy_names = ['ezio.melotti', 'mrabarnett', 'Mark.Shannon', 'serhiy.storchaka', 'jugmac00', 'miss-islington']
    pr_nums = ['31988', '31989', '31990']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue39394'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @jugmac00
    Mannequin Author

    jugmac00 mannequin commented Jan 20, 2020

    The usage of flags not at the start of an expression is deprecated.

    Also see "Deprecate the use of flags not at the start of regular expression" / https://bugs.python.org/issue22493

    A deprecation warning is issued, but the pattern shown in it is cut off at 20 characters.

    For complex expressions this is way too small.

    Example ( jedie/python-creole#31 ):

    current output

    /home/jugmac00/Projects/bliss_deployment/work//home/jugmac00/.batou-shared-eggs/python_creole-1.3.2-py3.7.egg/creole/parser/creol2html_parser.py:48
    /home/jugmac00/Projects/bliss_deployment/work/
    /home/jugmac00/.batou-shared-eggs/python_creole-1.3.2-py3.7.egg/creole/parser/creol2html_parser.py:48: DeprecationWarning: Flags not at the start of the expression '(?P<image>\n ' (truncated)
    re.VERBOSE | re.UNICODE

    output with patched sre_parse.py

    creole/parser/creol2html_parser.py:51
    /home/jugmac00/Projects/python-creole/creole/parser/creol2html_parser.py:51: DeprecationWarning: Flags not at the start of the expression '\n \\| \\s*\n (\n (?P<head> [=][^|]+ ) |\n (?P<cell> ( (?P<link>\n \\[\\[\n (?P<link_target>.+?) \\s*\n ([|] \\s* (?P<link_text>.+?) \\s*)?\n ]]\n )|\n (?P<macro_inline>\n << \\s* (?P<macro_inline_start>\\w+) \\s* (?P<macro_inline_args>.*?) \\s* >>\n (?P<macro_inline_text>(.|\\n)?)\n <</ \\s (?P=macro_inline_start) \\s* >>\n )\n |(?P<macro_tag>\n <<(?P<macro_tag_name> \\w+) (?P<macro_tag_args>.*?) \\s* />>\n )|(?i)(?P<image>\n {{\n (?P<image_target>.+?) \\s\n (\\| \\s* (?P<image_text>.+?) \\s*)?\n }}\n )|(?P<pre_inline> {{{ (?P<pre_inline_text>.*?) }}} ) | [^|])+ )\n ) \\s*\n '
    cell_re = re.compile(x, re.VERBOSE | re.UNICODE)

    (Line number differs because there was a change in the source between these two test runs).

    I would like to create a PR and remove the 20-character limit completely, but wanted to get feedback before I do so.

    The deprecation warning was created by Tim Graham - maybe he could elaborate why it was cut at 20 chars at first?
    abf275a
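A minimal sketch for reproducing the behaviour described above, outside of python-creole. The helper name `describe_misplaced_flag` and the sample pattern are made up for illustration. On Python 3.6 to 3.10 the misplaced `(?i)` should produce the DeprecationWarning discussed here; on Python 3.11 and later the same pattern is rejected with `re.error`, so the sketch handles both cases.

```python
import re
import warnings


def describe_misplaced_flag(pattern):
    """Compile `pattern` and report how this interpreter reacts.

    Returns a (kind, message) tuple where kind is "warning", "error" or "ok".
    """
    # Clear the compiled-pattern cache so the pattern is really parsed again;
    # a cached pattern would not trigger the warning a second time.
    re.purge()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            re.compile(pattern)
        except re.error as exc:
            # Newer interpreters treat the misplaced flag as an error.
            return "error", str(exc)
    for entry in caught:
        if issubclass(entry.category, DeprecationWarning):
            return "warning", str(entry.message)
    return "ok", ""


# Hypothetical pattern: the global flag (?i) sits in the third branch,
# well past the first 20 characters of the expression.
sample = r"(?P<head>=+)|(?P<cell>[^|]+)|(?i)(?P<image>img)"
kind, message = describe_misplaced_flag(sample)
print(kind, message)
```

With the 20-character cutoff, the warning text only shows the beginning of `sample`, which does not include the offending `(?i)`.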

    @jugmac00 jugmac00 mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes topic-regex type-feature A feature request or enhancement labels Jan 20, 2020
    @serhiy-storchaka
    Member

    Why do you want to output the full regular expression? Are the source file path, line number, and first 20 characters not enough to identify the affected regular expression?

    @vstinner vstinner changed the title DeprecationWarning for flag not at the start of expression is cutoff too early re: DeprecationWarning for flag not at the start of expression is cutoff too early Jan 20, 2020
    @jugmac00
    Mannequin Author

    jugmac00 mannequin commented Jan 20, 2020

    Why do you want to output the full regular expression?

    The current output gives no clue about which flag is problematic, nor does it show the complete output (which at least would include the problematic flag), nor does it show the exact line, as it refers only to the line where compile gets called.

    The warning points to following line ( https://github.com/jedie/python-creole/blob/4e74f29daaf5026a3d4d6dae9f2e74f5f3655439/creole/parser/creol2html_parser.py#L49-L50 ):

    cell_re = re.compile(SpecialRules.cell, re.VERBOSE | re.UNICODE)

    And SpecialRules.cell belongs to quite a big class ( https://github.com/jedie/python-creole/blob/4e74f29daaf5026a3d4d6dae9f2e74f5f3655439/creole/parser/creol2html_rules.py#L16-L97 ) defining lots of partial expressions.

    Even once you spot this line ( https://github.com/jedie/python-creole/blob/4e74f29daaf5026a3d4d6dae9f2e74f5f3655439/creole/parser/creol2html_rules.py#L54 ), at first glance it looks like it starts with the flag and should be correct (but it is not, as it turned out later).

    Are the source file path, line number, and first 20 characters not enough to identify the affected regular expression?

    It definitely was not enough for me (I am new to this code base and only tried to report deprecation warnings in my application), and when you have a look at the comment ( jedie/python-creole#31 (comment) ) it was not even enough for the author/maintainer of this package.

    Do you expect any downside of printing the complete warning?
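Until the warning itself is more informative, one workaround is to scan the assembled pattern for global inline flag groups that do not sit at the very beginning. The following is only a rough heuristic sketch, not how the `re` module parses patterns: `find_misplaced_flags` is a made-up helper, and it does not account for escaped parentheses, character classes, or whitespace and comments in verbose mode.

```python
import re

# Matches global inline flag groups such as (?i) or (?mx).
# Scoped forms like (?i:...) or (?-i:...) are deliberately not matched.
_GLOBAL_FLAG_GROUP = re.compile(r"\(\?[aiLmsux]+\)")


def find_misplaced_flags(pattern):
    """Return (offset, text) for each global flag group not at the start."""
    misplaced = []
    prefix_end = 0  # end of the run of flag groups that opens the pattern
    for match in _GLOBAL_FLAG_GROUP.finditer(pattern):
        if match.start() == prefix_end:
            # Still inside the leading run of flag groups: this one is fine.
            prefix_end = match.end()
            continue
        misplaced.append((match.start(), match.group()))
    return misplaced


# Hypothetical pattern joined from several partial expressions.
parts = [r"(?P<macro_tag><<\w+>>)", r"(?i)(?P<image>\{\{.+?\}\})"]
print(find_misplaced_flags("|".join(parts)))
```

The reported offset refers to the joined string that is passed to `re.compile`, which is what makes it useful when the pattern is built from many partial expressions.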

    @markshannon
    Member

    I have to admit that I find the truncated version more readable.

    Some sort of truncation is useful, as a regex could be thousands of characters long.

    Adding the offset to the warning message seems like a useful addition.
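As a sketch of what that suggestion could look like, the hypothetical helper below keeps the truncation but appends the offset of the flag. The function name, the `limit` parameter, and the exact wording of the message are assumptions for illustration and are not taken from the `re` module.

```python
def format_flag_warning(pattern, position, limit=20):
    """Build a warning message with a truncated pattern and the flag offset."""
    shown = pattern[:limit]
    # Only mention truncation when something was actually cut off.
    suffix = " (truncated)" if len(pattern) > limit else ""
    return (
        f"Flags not at the start of the expression {shown!r}{suffix}"
        f" at position {position}"
    )


print(format_flag_warning("(?P<head>=+)|(?i)(?P<image>img)", 13))
```

The message stays short even for very long expressions, while the position tells the reader where in the full pattern to look.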

    @serhiy-storchaka
    Member

    This warning was introduced in 3.6. It is time to convert it into an error. RE error messages contain the position.

    But I understand that very few users will use 3.11 in the near future, so I am going to add the position to the warning message and backport this change. It is not a bug fix in the strict sense, but I think it is safe to backport it.

    @serhiy-storchaka serhiy-storchaka added 3.10 only security fixes 3.11 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes labels Mar 19, 2022
    @serhiy-storchaka
    Member

    New changeset 4142961 by Serhiy Storchaka in branch 'main':
    bpo-39394: Improve warning message in the re module (GH-31988)
    4142961

    @miss-islington
    Contributor

    New changeset 906f1a4 by Miss Islington (bot) in branch '3.10':
    bpo-39394: Improve warning message in the re module (GH-31988)
    906f1a4

    @miss-islington
    Contributor

    New changeset cbcd2e3 by Miss Islington (bot) in branch '3.9':
    bpo-39394: Improve warning message in the re module (GH-31988)
    cbcd2e3
