Message 410770 - Python tracker


Author	jirkamarsik
Recipients	ezio.melotti, jirkamarsik, mrabarnett
Date	2022-01-17.12:31:30
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1642422690.85.0.963373421806.issue46410@roundup.psfhosted.org>
In-reply-to

Content
re.compile(r"\N{name of Unicode Named Character Sequence}"), e.g. re.compile(r"\N{KEYCAP NUMBER SIGN}"), raises a TypeError. The regular expression parser relies on 'unicodedata' to look up character names. The 'unicodedata' module supports Unicode Named Character Sequences (https://www.unicode.org/Public/13.0.0/ucd/NamedSequences.txt), so a lookup of such a name returns a string of more than one code point. Using one of these names in a regular expression therefore leads to a 'TypeError', because the regexp parser calls 'ord' on a string with length > 1.
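
A minimal sketch of the behaviour described above, not a proposed fix. The helper names compile_named and try_escape are hypothetical and exist only for illustration. The sketch assumes the interpreter's Unicode database contains the KEYCAP NUMBER SIGN named sequence, and it catches both TypeError and re.error because the exception raised for the \N{...} escape may depend on the Python version.

```python
import re
import unicodedata

NAME = "KEYCAP NUMBER SIGN"  # the named sequence from the report

# For a named sequence, lookup() returns a string of several code points,
# which is why calling ord() on the result fails.
sequence = unicodedata.lookup(NAME)
print(len(sequence), [f"U+{ord(ch):04X}" for ch in sequence])


def compile_named(name):
    """Hypothetical workaround: match the named character or sequence literally."""
    return re.compile(re.escape(unicodedata.lookup(name)))


def try_escape(name):
    """Compile a pattern using the named escape; return the exception type name, or None."""
    try:
        re.compile(r"\N{" + name + "}")
    except (TypeError, re.error) as exc:
        return type(exc).__name__
    return None


pattern = compile_named(NAME)
print(pattern.fullmatch(sequence) is not None)
print(try_escape(NAME))
```

Building the pattern from the looked-up string avoids the \N{...} escape entirely, so it works whether the name denotes a single character or a sequence.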


History
Date	User	Action	Args
2022-01-17 12:31:30	jirkamarsik	set	recipients: + jirkamarsik, ezio.melotti, mrabarnett
2022-01-17 12:31:30	jirkamarsik	set	messageid: <1642422690.85.0.963373421806.issue46410@roundup.psfhosted.org>
2022-01-17 12:31:30	jirkamarsik	link	issue46410 messages
2022-01-17 12:31:30	jirkamarsik	create