Message224444
Martin, I think the most intuitive and easiest way to work with strings in C is plain char arrays.
Since the elements of main()'s argv are already char*, most programmers probably just stay with char*, and the encoding just works.
This is because the encoding only has to be handled by the software for user input (Xorg, keyboard input) and user output (your terminal emulator, the GUI, ...).
No matter what data your program receives, the encoding only matters to the software that actually displays it, which has to select the correct visual representation.
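To illustrate the pass-through point, here is a minimal sketch (the helper name greet is made up for this example, it is not from any real project). It builds a new string from two char* inputs with snprintf and never looks at the encoding; the bytes are copied as they are:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<greeting>, <name>!" into out.
 * Both inputs are treated as opaque NUL-terminated byte strings,
 * so UTF-8 input is copied through unchanged.
 * Returns 0 on success, -1 if out is too small. */
int greet(char *out, size_t out_size, const char *greeting, const char *name)
{
    assert(out != NULL && greeting != NULL && name != NULL);
    int n = snprintf(out, out_size, "%s, %s!", greeting, name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling greet(buf, sizeof buf, "Hallo", "J\xc3\xb6rg") leaves the two UTF-8 bytes of the "ö" untouched in buf. Note that strlen() on the result counts bytes, not characters, which is fine as long as the program only stores and forwards the data.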
Requiring a conversion to wide chars just increases the interface complexity and adds unneeded data transformations that UTF-8 makes obsolete.
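As a rough sketch of why no wide-char conversion is needed for typical parsing: in UTF-8 every byte of a multi-byte sequence is 0x80 or higher, so a search for an ASCII delimiter can never match inside a non-ASCII character. The helper key_length below is hypothetical and only meant as an illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length in bytes of the key part of a
 * "key=value" string, or -1 if there is no '='.
 * Works directly on UTF-8 bytes: '=' (0x3D) cannot occur inside a
 * multi-byte UTF-8 sequence, so strchr() finds only the real delimiter. */
long key_length(const char *utf8_pair)
{
    const char *eq = strchr(utf8_pair, '=');
    return eq ? (long)(eq - utf8_pair) : -1;
}
```

For the input "na\xc3\xafve=1" this returns 6, the byte length of the key, without decoding anything to wchar_t first.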
What I'd really like to see in CPython is that the internal storage (and the way it's exposed in the C-API) is just raw bytes (=> char*).
This allows super-easy integration in C projects that probably all just use char as their string type (see the doc example mentioned earlier).
PEP 393 states: "(..) the specification chooses UTF-8 as the recommended way of exposing strings to C code."
And for that, I think using char instead of wchar_t is a better solution for interface developers.
Date | User | Action | Args
2014-07-31 20:03:00 | jj | set | recipients: + jj, loewis, vstinner, ezio.melotti, zach.ware
2014-07-31 20:03:00 | jj | set | messageid: <1406836980.31.0.548665855711.issue22108@psf.upfronthosting.co.za>
2014-07-31 20:03:00 | jj | link | issue22108 messages
2014-07-31 20:03:00 | jj | create |