Author taleinat
Recipients ezio.melotti, lemburg, loewis, taleinat, terry.reedy
Date 2014-06-22.16:07:28
Message-id <1403453248.76.0.109053443584.issue21765@psf.upfronthosting.co.za>
In-reply-to
Content
> Note that the proposed patch only manages to replicate the
> ID_Start and ID_Continue properties.

Is this just because of the mishandling of the Other_ID_Start and Other_ID_Continue properties, or something else as well? I based my code on the definitions in:

https://docs.python.org/3/reference/lexical_analysis.html#identifiers

Are those actually wrong?


Note that my code uses category(normalize(char)[0]), so it only verifies that the first character of the normalized form is valid. I now realize that it should check every character returned by normalize().
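A minimal sketch of what checking every normalized character could look like. It assumes NFKC normalization and the general categories listed in the reference linked above, and it deliberately ignores Other_ID_Start and Other_ID_Continue. The helper name and constants are hypothetical, not the code from the patch.

```python
import unicodedata

# General categories allowed by the reference's id_start and id_continue
# definitions. Other_ID_Start / Other_ID_Continue are NOT covered here.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_CATEGORIES = ID_START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_id_continue_by_category(char):
    """Hypothetical helper: check ALL characters of the NFKC form.

    This differs from category(normalize(char)[0]), which inspects
    only the first character of the normalized result.
    """
    normalized = unicodedata.normalize("NFKC", char)
    return bool(normalized) and all(
        c == "_" or unicodedata.category(c) in ID_CONTINUE_CATEGORIES
        for c in normalized
    )
```

Because the Other_ID_* properties are left out, this can disagree with the interpreter for the few characters that rely on them, such as U+00B7 (category Po).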

Regarding testing ('a'+something).isidentifier(), Terry already suggested something along those lines. I think I'll end up using something of the sort, to avoid adding additional complex Unicode-related code to maintain in the future.
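A rough sketch of that approach, delegating to str.isidentifier() rather than reimplementing the Unicode rules. The function names are hypothetical; only str.isidentifier() is an existing API.

```python
def is_id_start(char):
    """Hypothetical helper: can char begin an identifier?"""
    return char.isidentifier()


def is_id_continue(char):
    """Hypothetical helper: can char appear after the first character?

    Prefixing a known-valid start character ('a') means the result
    depends only on whether char is acceptable in a continuation position.
    """
    return ("a" + char).isidentifier()
```

For example, is_id_start("1") is False while is_id_continue("1") is True, which matches how digits behave in identifiers.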