Message 359538 - Python tracker

Author	pombredanne
Recipients	christian.heimes, ezio.melotti, pombredanne, vstinner, zamsalak
Date	2020-01-07.20:34:35
Message-id	<1578429275.84.0.204773944655.issue34723@roundup.psfhosted.org>

Content

Thanks for the (re-)explanation. Unicode is tough!
Basically, this is the issue I really have in the end with the folding: what used to be a proper alpha string is no longer one after a lower(), because the second code point of the result is a combining mark (U+0307 COMBINING DOT ABOVE) rather than a letter. I use a regex split on the \W non-word class, which then behaves differently once the string is lowercased, since there is now an extra non-word character to break on. I will find a way around these (rare) cases, all right!

Sorry for the noise.

```
>>> 'İ'.isalpha()
True
>>> 'İ'.lower().isalpha()
False
```
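
To make the behavior above concrete, here is a minimal sketch of what happens to the code points and to a `\W` split. The sample word and the `lower_for_tokenizing` helper are assumptions for illustration only, not something from this thread, and the helper deliberately drops the dot-above mark, which may not be acceptable for Turkish-aware text.

```python
import re
import unicodedata

s = "\u0130"  # 'İ', LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = s.lower()

# lower() yields two code points: 'i' followed by a combining mark.
for ch in lowered:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
# U+0069 Ll LATIN SMALL LETTER I
# U+0307 Mn COMBINING DOT ABOVE

word = "\u0130stanbul"  # hypothetical sample input
print(re.split(r"\W+", word))          # ['İstanbul']
print(re.split(r"\W+", word.lower()))  # ['i', 'stanbul']


def lower_for_tokenizing(text: str) -> str:
    """Hypothetical workaround: lowercase, then fold 'i' + U+0307 back to 'i'.

    This is lossy (the dot-above distinction disappears) and only targets
    this one special-casing rule.
    """
    return text.lower().replace("i\u0307", "i")


print(re.split(r"\W+", lower_for_tokenizing(word)))  # ['istanbul']
```

The split breaks because Python's `re` treats only alphanumerics and the underscore as `\w`, so the combining mark (category `Mn`) matches `\W`.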

History
Date	User	Action	Args
2020-01-07 20:34:35	pombredanne	set	recipients: + pombredanne, vstinner, christian.heimes, ezio.melotti, zamsalak
2020-01-07 20:34:35	pombredanne	set	messageid: <1578429275.84.0.204773944655.issue34723@roundup.psfhosted.org>
2020-01-07 20:34:35	pombredanne	link	issue34723 messages
2020-01-07 20:34:35	pombredanne	create