classification
Title: support named Unicode escapes (\N{name}) in re
Type: enhancement Stage: resolved
Components: Library (Lib), Regular Expressions Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: ezio.melotti, fangyizhou, jonathaneunice, mrabarnett, ned.deily, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-06-17 07:38 by jonathaneunice, last changed 2018-02-10 07:03 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked
PR 2261 closed jonathaneunice, 2017-06-17 08:44
PR 5588 merged serhiy.storchaka, 2018-02-08 17:09
PR 5606 merged fangyizhou, 2018-02-10 01:13
Messages (7)
msg296234 - Author: Jonathan Eunice (jonathaneunice) * Date: 2017-06-17 07:38
The re module specially handles Unicode escapes (\uXXXX and \UXXXXXXXX) so that even raw strings (r'...') can express Unicode characters symbolically. But it has not supported named Unicode escapes such as r'\N{EM DASH}', which makes the escapes for string literals and the escapes for regular expressions asymmetric.
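A minimal sketch of the asymmetry described above, assuming Python 3.8 or later (the version this issue targets). The sample text and variable names are illustrative only. Before the change, the raw-string patterns below would be rejected by re as a bad escape.

```python
import re

# In an ordinary string literal, the Python compiler resolves \N{...}
# before the re module ever sees the pattern.
literal = re.compile('\N{EM DASH}')

# In a raw string, re itself has to resolve the character name.
raw = re.compile(r'\N{EM DASH}')

text = 'before \u2014 after'
literal_match = literal.search(text)
raw_match = raw.search(text)

# Named escapes can also appear inside a character class.
normalized = re.sub(r'[\N{EN DASH}\N{EM DASH}]', '-', 'a\u2013b\u2014c')
```

Both compiled patterns match the same single character, so the raw and non-raw spellings behave alike.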
msg311912 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-09 22:08
New changeset a445feb72902e4a3c5ae712f0c289309e1580d52 by Serhiy Storchaka in branch 'master':
bpo-30688: Support \N{name} escapes in re patterns. (GH-5588)
https://github.com/python/cpython/commit/a445feb72902e4a3c5ae712f0c289309e1580d52
msg311913 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-09 22:09
Thank you for your contribution Jonathan!
msg311923 - Author: Fangyi Zhou (fangyizhou) * Date: 2018-02-10 00:31
Hello,

This leads to build failures due to a circular dependency.

At the generate-posix-vars stage, unicodedata is imported (via import pprint, which imports re), but it has not been built yet because it is a C module. Building the C modules happens only after generate-posix-vars.

See the full message below.

./python.exe -E -S -m sysconfig --generate-posix-vars ;\
	if test $? -ne 0 ; then \
		echo "generate-posix-vars failed" ; \
		rm -f ./pybuilddir.txt ; \
		exit 1 ; \
	fi
Traceback (most recent call last):
  File "/Users/fangyi/cpython/Lib/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/fangyi/cpython/Lib/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/fangyi/cpython/Lib/sysconfig.py", line 700, in <module>
    _main()
  File "/Users/fangyi/cpython/Lib/sysconfig.py", line 688, in _main
    _generate_posix_vars()
  File "/Users/fangyi/cpython/Lib/sysconfig.py", line 350, in _generate_posix_vars
    import pprint
  File "/Users/fangyi/cpython/Lib/pprint.py", line 38, in <module>
    import re
  File "/Users/fangyi/cpython/Lib/re.py", line 123, in <module>
    import sre_compile
  File "/Users/fangyi/cpython/Lib/sre_compile.py", line 14, in <module>
    import sre_parse
  File "/Users/fangyi/cpython/Lib/sre_parse.py", line 16, in <module>
    import unicodedata
ModuleNotFoundError: No module named 'unicodedata'
generate-posix-vars failed
msg311926 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-02-10 02:41
The buildbots are broken by this. Please fix or revert.
msg311938 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-10 06:59
New changeset 5df5286abda57a0b3865d4fc3e25aaf1a820ef49 by Serhiy Storchaka (Zhou Fangyi) in branch 'master':
bpo-30688: Import unicodedata only when needed. (GH-5606)
https://github.com/python/cpython/commit/5df5286abda57a0b3865d4fc3e25aaf1a820ef49
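The "import only when needed" approach named in the changeset title can be sketched roughly as follows. This is an illustration of the deferred-import pattern, not the actual sre_parse code; the helper name resolve_named_escape and its error type are hypothetical.

```python
def resolve_named_escape(charname):
    """Hypothetical helper: map a Unicode character name to its code point."""
    # Deferred import: unicodedata is a C extension module, so importing it
    # at module level would fail during early build steps that run before
    # extension modules are compiled. Importing here delays the dependency
    # until a \N{...} escape is actually parsed.
    import unicodedata
    try:
        return ord(unicodedata.lookup(charname))
    except KeyError:
        raise ValueError(f"undefined character name {charname!r}") from None


code_point = resolve_named_escape('EM DASH')
```

With this arrangement, merely importing the enclosing module no longer requires unicodedata, so a step like generate-posix-vars can import re without the C module being present.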
msg311939 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-10 07:03
Thank you, Fangyi Zhou, for your report and fix. The changes are trivial and did not require signing the CLA.
History
Date User Action Args
2018-02-10 07:03:38 serhiy.storchaka set priority: critical -> normal
2018-02-10 07:03:26 serhiy.storchaka set messages: + msg311939
2018-02-10 06:59:32 serhiy.storchaka set messages: + msg311938
2018-02-10 02:41:04 ned.deily set priority: normal -> critical; nosy: + ned.deily; messages: + msg311926
2018-02-10 01:13:18 fangyizhou set pull_requests: + pull_request5417
2018-02-10 00:31:20 fangyizhou set nosy: + fangyizhou; messages: + msg311923
2018-02-09 22:09:13 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg311913; stage: patch review -> resolved
2018-02-09 22:08:19 serhiy.storchaka set messages: + msg311912
2018-02-08 17:09:14 serhiy.storchaka set keywords: + patch; pull_requests: + pull_request5405
2018-02-04 12:53:57 serhiy.storchaka set versions: + Python 3.8, - Python 3.7
2017-06-17 10:17:15 serhiy.storchaka set nosy: + ezio.melotti; components: + Regular Expressions
2017-06-17 10:16:55 serhiy.storchaka set assignee: serhiy.storchaka; nosy: + serhiy.storchaka, mrabarnett; stage: patch review
2017-06-17 08:44:51 jonathaneunice set pull_requests: + pull_request2311
2017-06-17 07:38:56 jonathaneunice create