Message359538
Thanks for the (re)explanation. Unicode is tough!
Basically, this is the issue I have in the end with the folding: what used to be a proper alpha string is no longer one after a lower(), because the second code point of the result is a combining mark (U+0307 COMBINING DOT ABOVE) rather than a letter. I use a regex split on the \W class, which then behaves differently once the string is lowercased, since there is an extra non-word character to break on. I will find a way around these (rare) cases alright!
Sorry for the noise.
```
>>> 'İ'.isalpha()
True
>>> 'İ'.lower().isalpha()
False
```
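To make the split behaviour described above concrete, here is a minimal sketch. The sample word `"İstanbul"` and the helper `lower_for_split` are hypothetical, chosen only for illustration; the helper is one possible workaround under the assumption that only the `i` + U+0307 pair produced by lowercasing U+0130 needs handling, not a general fix.

```python
import re
import unicodedata


def lower_for_split(text):
    """Hypothetical helper: lowercase, then drop the U+0307 that
    lower() emits after 'i' when the input contains U+0130."""
    return text.lower().replace("i\u0307", "i")


word = "İstanbul"          # starts with U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = word.lower()     # becomes 'i' followed by U+0307

# Show the two code points that replace the single capital letter.
print([unicodedata.name(c) for c in lowered[:2]])

# \W does not match the capital letter, so the word stays whole.
print(re.split(r"\W+", word))

# \W matches the combining mark, so the lowercased word is cut in two.
print(re.split(r"\W+", lowered))

# With the hypothetical helper the word stays whole again.
print(re.split(r"\W+", lower_for_split(word)))
```

Note that this replacement also removes a U+0307 that was already present after an `i` in the input, which may or may not be acceptable depending on the data.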
| Date | User | Action | Args |
|---|---|---|---|
| 2020-01-07 20:34:35 | pombredanne | set | recipients: + pombredanne, vstinner, christian.heimes, ezio.melotti, zamsalak |
| 2020-01-07 20:34:35 | pombredanne | set | messageid: <1578429275.84.0.204773944655.issue34723@roundup.psfhosted.org> |
| 2020-01-07 20:34:35 | pombredanne | link | issue34723 messages |
| 2020-01-07 20:34:35 | pombredanne | create | |