Message371657
> It seems you don't know some knowledge of encoding yet.
I don't have to be ashamed of my knowledge of encoding. Yet you are right that I was missing a subtlety, which is that latin-1 is a strict subset of Unicode rather than a completely arbitrary encoding. Thank you for that.
So what you are saying is that group names in bytes regexes can only be specified directly (without -explicit- encoding), so de facto they are limited to the latin-1 subset.
Very well.
But then, once again:
1) why convert them to string when spitting them out? bytes they were when going in, bytes they should remain... **By converting them you are choosing an arbitrary encoding, even if it is the "natural" one.**
2) this limitation to the latin-1 subset is not compatible with the documentation, which says that valid Python identifiers are valid group names. If this was really the case, then I would expect to be able to use any string for which .isidentifier() is true as a group name, programmatically.
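To make both points concrete, here is a minimal sketch. The group name `word` and the sample identifier (Greek capital omega) are only illustrative choices of mine; the first half uses an ASCII name so that it should behave the same across Python 3 versions.

```python
import re

# Point 1: the pattern and the subject are bytes, yet the group names
# come back as str keys (the captured values stay bytes).
m = re.match(rb"(?P<word>\w+)", b"hello")
groups = m.groupdict()
key_types = {type(k) for k in groups}
value_types = {type(v) for v in groups.values()}
index_key_types = {type(k) for k in m.re.groupindex}

# Point 2: a valid Python identifier that has no latin-1 representation,
# so it cannot be written as a group name in a bytes pattern this way.
name = "\u03a9"  # GREEK CAPITAL LETTER OMEGA
is_identifier = name.isidentifier()
try:
    name.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

print(key_types, value_types, index_key_types)
print(is_identifier, fits_latin1)
```

The names go in as bytes and come out as str, and an identifier outside latin-1 passes `.isidentifier()` but cannot be encoded into the pattern.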

Date | User | Action | Args
2020-06-16 13:51:43 | matpi | set | recipients: + matpi, malin
2020-06-16 13:51:43 | matpi | set | messageid: <1592315503.45.0.107578900502.issue40980@roundup.psfhosted.org>
2020-06-16 13:51:43 | matpi | link | issue40980 messages
2020-06-16 13:51:43 | matpi | create |