Message299706
I have a few criticisms of that proto-PEP:
http://mail.python.org/pipermail/python-dev/2001-July/015938.html
In particular, because all of those functions take and return a bare index, they cannot keep any state between calls.
That's a problem because:
> next_<indextype>(u, index) -> integer
As you've seen, grapheme clustering (like word and line breaking) needs an automaton to decide where the breaking points are. That means starting at an arbitrary index is not possible: the automaton has to have seen the preceding context first.
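To illustrate the point, here is a minimal sketch of a stateful cluster iterator. It is deliberately simplified and is not the UAX #29 algorithm: it only attaches characters with a non-zero canonical combining class to the previous character and pairs up regional indicators (flags). The names `iter_clusters` and `is_regional_indicator` are hypothetical and not part of any proposed API.

```python
import unicodedata


def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def iter_clusters(text):
    """Yield simplified grapheme clusters of text (toy rules only)."""
    start = 0
    ri_run = 0  # automaton state: regional indicators seen in this cluster
    for i, ch in enumerate(text):
        ri = is_regional_indicator(ch)
        if i > 0:
            if unicodedata.combining(ch):
                # A combining mark extends the cluster and ends any RI pairing.
                ri_run = 0
                continue
            if ri and ri_run == 1:
                # Second half of a flag pair: stay in the same cluster.
                ri_run = 2
                continue
            yield text[start:i]
            start = i
        ri_run = 1 if ri else 0
    if text:
        yield text[start:]


flags = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"  # two flags, four code points
from_start = list(iter_clusters(flags))
from_middle = list(iter_clusters(flags[1:]))
```

Run from the beginning, the iterator yields the two expected pairs. Run from index 1, it pairs the second and third code points and leaves the fourth alone, because the state that says "we are already inside a pair" was lost. That is the problem with an index-only `next_<indextype>()`.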
> prev_<indextype>(u, index) -> integer
Is this really necessary? It means implementing the same logic a second time to go backward. In our current case, we'd need a backward grapheme cluster break automaton too.
> <indextype>_start(u, index) -> integer
> <indextype>_end(u, index) -> integer
Not doable in O(1), for the same reason as next_<indextype>(). We need context: the code point at the given index does not carry enough information on its own to tell whether it is the start or end of a given indextype.
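A small sketch of why the lookup is not O(1), restricted to regional indicators only. The function name `ri_cluster_start` is hypothetical, and the rule used (regional indicators pair up from the start of their run) is a simplification of the real grapheme cluster rules.

```python
def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def ri_cluster_start(u, index):
    """Return the start index of the flag pair containing u[index].

    Toy version: any character that is not a regional indicator is
    treated as starting its own cluster.
    """
    if not is_regional_indicator(u[index]):
        return index
    # Walk back to the beginning of the run of regional indicators.
    # This loop is the non-constant part: its length is unbounded.
    run_start = index
    while run_start > 0 and is_regional_indicator(u[run_start - 1]):
        run_start -= 1
    # Pairs are formed from the start of the run, so parity decides.
    return index - ((index - run_start) % 2)


s = "a\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"
starts = [ri_cluster_start(s, i) for i in range(len(s))]
```

Whether a given regional indicator opens or closes a pair depends on how many of them precede it, so the cost grows with the length of the run rather than staying constant.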
Date | User | Action | Args
2017-08-03 13:05:32 | Guillaume Sanchez | set | recipients: + Guillaume Sanchez, lemburg, loewis, terry.reedy, vstinner, benjamin.peterson, ezio.melotti, mrabarnett, steven.daprano, r.david.murray, serhiy.storchaka, Socob
2017-08-03 13:05:32 | Guillaume Sanchez | set | messageid: <1501765532.54.0.799356866824.issue30717@psf.upfronthosting.co.za>
2017-08-03 13:05:32 | Guillaume Sanchez | link | issue30717 messages
2017-08-03 13:05:32 | Guillaume Sanchez | create |