Message 349014 - Python tracker

Author	Greg Price
Recipients	Greg Price
Date	2019-08-05.01:06:02
Message-id	<1564967162.48.0.525882022653.issue37758@roundup.psfhosted.org>
In-reply-to

Content
The unicodedata module has two test cases which run through the database and make a hash of its visible outputs for all codepoints, comparing the hash against a checksum.  These are helpful regression tests for making sure the behavior isn't changed by patches that didn't intend to change it.

But Unicode has grown since Python first gained support for it, when Unicode itself was still rather new.  These test cases were added in commit 6a20ee7de back in 2000, and they haven't needed to change much since then.  They should now be changed to look beyond the Basic Multilingual Plane (`range(0x10000)`) and cover all 17 planes of Unicode's full, now-fixed codespace (`range(0x110000)`).
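
To illustrate the idea, here is a minimal sketch of such a checksum computed over a configurable range of code points.  It is not the actual test code: the function name `database_digest`, the choice of SHA-1, and the particular set of properties hashed are all assumptions made for illustration.

```python
import hashlib
import sys
import unicodedata

BMP_END = 0x10000                    # one past the last BMP code point
ALL_PLANES_END = sys.maxunicode + 1  # 0x110000, i.e. 17 planes of 0x10000 each


def database_digest(stop):
    """Hash some visible unicodedata outputs for every code point in range(stop).

    Hypothetical helper: the fields and hash function are illustrative choices,
    not necessarily the ones the real test cases use.
    """
    h = hashlib.sha1()
    for cp in range(stop):
        ch = chr(cp)
        fields = [
            unicodedata.category(ch),
            unicodedata.bidirectional(ch),
            str(unicodedata.combining(ch)),
            unicodedata.decomposition(ch),
            str(unicodedata.mirrored(ch)),
            unicodedata.name(ch, ""),  # unnamed code points contribute ""
        ]
        # All of these outputs are ASCII, so encoding cannot fail.
        h.update("|".join(fields).encode("ascii"))
    return h.hexdigest()


if __name__ == "__main__":
    # A digest limited to the BMP never sees the other 16 planes.
    print("BMP only:  ", database_digest(BMP_END))
    print("all planes:", database_digest(ALL_PLANES_END))
```

A regression test along these lines would compare `database_digest(ALL_PLANES_END)` against a stored checksum, and that checksum would be expected to need regenerating whenever the bundled Unicode database is upgraded.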

Spotted in discussion on GH-15019 (https://github.com/python/cpython/pull/15019#discussion_r308947884 ).  I have a patch for this which I'll send shortly.

History
Date	User	Action	Args
2019-08-05 01:06:02	Greg Price	set	recipients: + Greg Price
2019-08-05 01:06:02	Greg Price	set	messageid: <1564967162.48.0.525882022653.issue37758@roundup.psfhosted.org>
2019-08-05 01:06:02	Greg Price	link	issue37758 messages
2019-08-05 01:06:02	Greg Price	create