support named Unicode escapes (\N{name}) in re #74873

Closed
jonathaneunice mannequin opened this issue Jun 17, 2017 · 7 comments
Assignees
Labels: 3.8 (only security fixes), stdlib (Python modules in the Lib dir), topic-regex, type-feature (a feature request or enhancement)

Comments

@jonathaneunice
Mannequin

jonathaneunice mannequin commented Jun 17, 2017

BPO 30688
Nosy @ned-deily, @ezio-melotti, @serhiy-storchaka, @jonathaneunice, @fangyi-zhou
PRs
  • bpo-30688: support \N{name} escapes in re patterns #2261
  • bpo-30688: Support \N{name} escapes in re patterns. #5588
  • bpo-30688: Import unicodedata only when needed #5606
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2018-02-09.22:09:13.904>
    created_at = <Date 2017-06-17.07:38:56.908>
    labels = ['expert-regex', '3.8', 'type-feature', 'library']
    title = 'support named Unicode escapes (\\N{name}) in re'
    updated_at = <Date 2018-02-10.07:03:38.402>
    user = 'https://github.com/jonathaneunice'

    bugs.python.org fields:

    activity = <Date 2018-02-10.07:03:38.402>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2018-02-09.22:09:13.904>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)', 'Regular Expressions']
    creation = <Date 2017-06-17.07:38:56.908>
    creator = 'jonathaneunice'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30688
    keywords = ['patch']
    message_count = 7.0
    messages = ['296234', '311912', '311913', '311923', '311926', '311938', '311939']
    nosy_count = 6.0
    nosy_names = ['ned.deily', 'ezio.melotti', 'mrabarnett', 'serhiy.storchaka', 'jonathaneunice', 'fangyizhou']
    pr_nums = ['2261', '5588', '5606']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30688'
    versions = ['Python 3.8']

    @jonathaneunice
    Mannequin Author

    jonathaneunice mannequin commented Jun 17, 2017

    The re module handles the Unicode escapes \uXXXX and \UXXXXXXXX itself, so that even raw strings (r'...') can refer to Unicode characters symbolically. It does not, however, support named Unicode escapes such as r'\N{EM DASH}', which leaves the escapes available in string literals and those available in regular expressions asymmetric.
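
For illustration, here is a minimal sketch of what the request asks for. It assumes Python 3.8 or later, where the change referenced in this issue is available; the sample text and variable names are made up for the example.

```python
import re
import unicodedata

# Named escape inside a raw-string pattern (assumes Python 3.8 or later).
named = re.compile(r'\N{EM DASH}')
# The numeric escape that re already understood before this change.
numeric = re.compile(r'\u2014')

# Sample text (hypothetical) containing a single EM DASH.
text = 'before\u2014after'
named_match = named.search(text)
numeric_match = numeric.search(text)

# Both patterns should locate the same character.
print(named_match.span(), numeric_match.span())
print(unicodedata.name(named_match.group()))
```

On older interpreters the first pattern is expected to fail to compile, which is the asymmetry described above.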

    @jonathaneunice jonathaneunice mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jun 17, 2017
    @serhiy-storchaka serhiy-storchaka self-assigned this Jun 17, 2017
    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes and removed 3.7 (EOL) end of life labels Feb 4, 2018
    @serhiy-storchaka
    Member

    New changeset a445feb by Serhiy Storchaka in branch 'master':
    bpo-30688: Support \N{name} escapes in re patterns. (GH-5588)
    a445feb

    @serhiy-storchaka
    Member

    Thank you for your contribution Jonathan!

    @fangyi-zhou
    Mannequin

    fangyi-zhou mannequin commented Feb 10, 2018

    Hello

    This leads to build failures due to a circular dependency.

    At the generate-posix-vars stage, unicodedata is imported (via import pprint), but it has not been built yet because it is a C module. However, C modules are built after generate-posix-vars.

    See full message below.

    ./python.exe -E -S -m sysconfig --generate-posix-vars ;\
    	if test $? -ne 0 ; then \
    		echo "generate-posix-vars failed" ; \
    		rm -f ./pybuilddir.txt ; \
    		exit 1 ; \
    	fi
    Traceback (most recent call last):
      File "/Users/fangyi/cpython/Lib/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/Users/fangyi/cpython/Lib/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/Users/fangyi/cpython/Lib/sysconfig.py", line 700, in <module>
        _main()
      File "/Users/fangyi/cpython/Lib/sysconfig.py", line 688, in _main
        _generate_posix_vars()
      File "/Users/fangyi/cpython/Lib/sysconfig.py", line 350, in _generate_posix_vars
        import pprint
      File "/Users/fangyi/cpython/Lib/pprint.py", line 38, in <module>
        import re
      File "/Users/fangyi/cpython/Lib/re.py", line 123, in <module>
        import sre_compile
      File "/Users/fangyi/cpython/Lib/sre_compile.py", line 14, in <module>
        import sre_parse
      File "/Users/fangyi/cpython/Lib/sre_parse.py", line 16, in <module>
        import unicodedata
    ModuleNotFoundError: No module named 'unicodedata'
    generate-posix-vars failed

    @ned-deily
    Member

    The buildbots are broken by this. Please fix or revert.

    @serhiy-storchaka
    Member

    New changeset 5df5286 by Serhiy Storchaka (Zhou Fangyi) in branch 'master':
    bpo-30688: Import unicodedata only when needed. (GH-5606)
    5df5286
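
The idea behind "import only when needed" can be sketched as follows. This is not the actual sre_parse code: the function name `resolve_named_escape` and the error message are hypothetical, and the sketch only shows the general deferred-import pattern that avoids loading unicodedata when the module itself is imported.

```python
def resolve_named_escape(name):
    """Return the character for a Unicode name (hypothetical helper)."""
    # Deferred import: unicodedata is loaded only when a named escape is
    # actually resolved, so importing the enclosing module does not
    # require the C extension to have been built already.
    import unicodedata
    try:
        return unicodedata.lookup(name)
    except KeyError:
        raise ValueError(f"undefined character name {name!r}") from None


# Usage: the import cost is paid here, on first use, not at module load.
em_dash = resolve_named_escape("EM DASH")
print(repr(em_dash))
```

With this arrangement, a build step that merely imports the module (as in the traceback above) would no longer touch unicodedata unless a pattern really contains a \N{...} escape.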

    @serhiy-storchaka
    Member

    Thank you, Fangyi Zhou, for your report and fix. The changes are trivial and did not require signing the CLA.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022