Issue2799
Created on 2008-05-09 10:31 by lemburg, last changed 2008-05-10 14:11 by lemburg.
| Messages | |||
|---|---|---|---|
| msg66463 (view) | Author: Marc-Andre Lemburg (lemburg) | Date: 2008-05-09 10:31 | |
The API PyUnicode_AsString() is pretty useless by itself - there's no way to access the size information of the returned string without going back to the Unicode object. I'd suggest removing the API altogether rather than only deprecating it. Furthermore, the API PyUnicode_AsStringAndSize() does not follow the signature of PyString_AsStringAndSize(), which passes back the pointer to the string as an output parameter. That should be changed as well. Note that PyString_AsStringAndSize() already does this for both 8-bit strings and Unicode, so the special Unicode API is not really needed at all, or you may want to rename PyString_AsStringAndSize() to PyUnicode_AsStringAndSize(). Finally, since there are many cases where the string buffer contents are copied to a new buffer, it's probably worthwhile to add a new API which does the copying straight away and also deals with the overflow cases in a central place. I'd suggest PyUnicode_AsChar() (with an API like PyUnicode_AsWideChar()). (This was taken from a comment on #1950.) |
|||
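The "copy straight away and handle overflow in one place" idea from the message above can be sketched in plain C. This is only an illustration: `copy_encoded()` is a hypothetical helper, not the proposed PyUnicode_AsChar() and not part of any Python API, and refusing the copy on overflow (instead of truncating) is an assumption made for the sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a CPython API): copy src_len bytes from src
 * into the caller-supplied buffer dst of capacity dst_size and append a
 * terminating NUL byte.
 *
 * Returns the number of bytes copied (excluding the terminator), or -1
 * if an argument is invalid or the data plus terminator does not fit.
 * Refusing the copy instead of truncating is an assumption of this sketch.
 */
long copy_encoded(const char *src, size_t src_len, char *dst, size_t dst_size)
{
    if (src == NULL || dst == NULL || dst_size == 0) {
        return -1;
    }
    /* Need room for src_len bytes plus one terminating NUL. */
    if (src_len >= dst_size) {
        return -1;
    }
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return (long)src_len;
}
```

With a check like this in one central function, callers pass their own buffer and size and only need to test the return value, rather than repeating the length comparison at every call site.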
| msg66498 (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) | Date: 2008-05-09 22:45 | |
Honestly, I am not sure if removing PyUnicode_AsString() is a good idea. There are many cases where the size of the returned string is not needed. Furthermore, this would be a rather major backward-incompatible change to include in a beta release. [copied from duplicate issue #2807] |
|||
| msg66526 (view) | Author: Marc-Andre Lemburg (lemburg) | Date: 2008-05-10 14:11 | |
IMO, it's better to correct API design errors early, rather than going through a deprecation process. Note that PyUnicode_AsString() is also different from its cousin PyString_AsString(). PyString_AsString() is mostly used to access the char* buffer used by the string object in order to change it, e.g. by first constructing a new PyString object and then filling it in by accessing the internal char* buffer directly. Doing the same with PyUnicode_AsString() will not work. What's worse: direct changes would go undetected, since the UTF-8 PyString object is held by the PyUnicode object internally. Even if you just use PyUnicode_AsString() for reading and get the size information from somewhere else, the API doesn't make sure that the PyUnicode object doesn't have embedded 0 code points (which PyString_AsString() does). PyUnicode_AsString() would have to use PyString_AsString() for this instead of the PyString_AS_STRING() macro. |
|||
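The embedded-0 concern in the message above can be illustrated without any Python headers. The helper below, `has_embedded_nul()`, is a hypothetical name used only for this sketch; it shows the kind of check a caller would need when a char* is handed out without its length being validated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a CPython API): report whether the first len
 * bytes of buf contain a NUL byte. A buffer for which this returns 1
 * cannot be treated as a C string without silently losing data.
 */
int has_embedded_nul(const char *buf, size_t len)
{
    return len > 0 && memchr(buf, '\0', len) != NULL;
}
```

For the 5-byte buffer `"ab\0cd"`, `has_embedded_nul()` returns 1 while `strlen()` reports 2, so any code that relies on the terminator alone would silently drop the trailing `cd`.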
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2008-05-10 14:11:13 | lemburg | set | messages: + msg66526 |
| 2008-05-09 22:45:14 | alexandre.vassalotti | set | nosy: + alexandre.vassalotti; messages: + msg66498 |
| 2008-05-09 22:43:17 | alexandre.vassalotti | link | issue2807 superseder |
| 2008-05-09 10:31:51 | lemburg | create | |