Issue 10435: Document unicode C-API in reST


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/54644

classification

Title:	Document unicode C-API in reST
Type:		Stage:	resolved
Components:	Documentation	Versions:	Python 3.2

process

Status:	closed	Resolution:	duplicate
Dependencies:		Superseder:	Document PyUnicode_* API View: 1944
Assigned To:	belopolsky	Nosy List:	BreamoreBoy, belopolsky, berker.peksag, ezio.melotti, hodgestar, lemburg, loewis, vstinner
Priority:	normal	Keywords:	patch

Created on 2010-11-16 16:16 by belopolsky, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
issue10435.diff	belopolsky, 2010-11-16 23:58		review
issue10435a.diff	belopolsky, 2010-11-17 02:58		review

Messages (21)
msg121302 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-16 16:16
The following C-APIs are only documented in comments inside unicode.h:

PyUnicode_GetMax
PyUnicode_Resize
PyUnicode_InternImmortal
PyUnicode_FromOrdinal
PyUnicode_GetDefaultEncoding
PyUnicode_AsDecodedObject
PyUnicode_AsDecodedUnicode
PyUnicode_AsEncodedObject
PyUnicode_AsEncodedUnicode
PyUnicode_BuildEncodingMap
PyUnicode_EncodeDecimal
PyUnicode_Append
PyUnicode_AppendAndDel
PyUnicode_Partition
PyUnicode_RPartition
PyUnicode_RSplit
PyUnicode_IsIdentifier
Py_UNICODE_strlen
Py_UNICODE_strcpy
Py_UNICODE_strcat
Py_UNICODE_strncpy
Py_UNICODE_strcmp
Py_UNICODE_strncmp
Py_UNICODE_strchr
Py_UNICODE_strrchr
msg121321 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-16 22:17
On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg <mal@egenix.com> wrote:
> Alexander Belopolsky wrote:
..
>> I also have a similar question about C API. Here, in absence of
>> __all__, the answer should be clear: all symbols in public header
>> files should start with either _Py_ or Py_ and those that start with
>> Py_ are public. The question is what should be done with names that
>> start with Py_, but are not documented? Can we add an underscore to
>> those names? If so, should a (deprecated) alias be made available?
>> Should they be documented as deprecated?
>>
>> I think these questions can only be answered on a case by case basis
>> with choices being:
>>
>> 1. Document.
>> 2. Document as deprecated.
>> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
>> 4. Add an underscore prefix.
>>
>> The specific set of names that I would like to consider is the
>> following from unicode.h. I am marking with () the names that I
>> think should be documented and with (D) those that should be
>> deprecated:
>>
>> PyUnicode_GetMax
>> PyUnicode_Resize ()
>> PyUnicode_InternImmortal
>> PyUnicode_FromOrdinal ()
>> PyUnicode_GetDefaultEncoding (D)
>> PyUnicode_AsDecodedObject
>> PyUnicode_AsDecodedUnicode
>> PyUnicode_AsEncodedObject
>> PyUnicode_AsEncodedUnicode
>> PyUnicode_BuildEncodingMap
>> PyUnicode_EncodeDecimal ()
>> PyUnicode_Append ()
>> PyUnicode_AppendAndDel ()
>> PyUnicode_Partition ()
>> PyUnicode_RPartition ()
>> PyUnicode_RSplit ()
>> PyUnicode_IsIdentifier ()
>> Py_UNICODE_strlen
>> Py_UNICODE_strcpy
>> Py_UNICODE_strcat
>> Py_UNICODE_strncpy
>> Py_UNICODE_strcmp
>> Py_UNICODE_strncmp
>> Py_UNICODE_strchr
>> Py_UNICODE_strrchr
>
> For Unicode, unicodeobject.h defines which APIs are private or not.
> APIs which don't appear in the header file are either private or
> need to be added to the header file (but I don't think there are
> any in this category).
>
> All APIs in the header that do not appear in the documentation,
> should be added there as well. unicodeobject.h already provides
> documentation for most of the APIs you've listed above (except some
> new ones that were added later on).
>
> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
> obscure and given that we already have PyUnicode_Concat(), I think
> it should be made private and eventually dropped.
>

I would also like to nominate PyUnicode_AsEncodedObject and
PyUnicode_AsEncodedUnicode. The latter is a particularly attractive
candidate for removal because it appears to be broken:

    v = PyCodec_Encode(unicode, encoding, errors);
    if (v == NULL)
        goto onError;
    if (!PyUnicode_Check(v)) {
        PyErr_Format(PyExc_TypeError,
                     "encoder did not return an str object (type=%.400s)",
                     Py_TYPE(v)->tp_name);

Since PyCodec_Encode() returns bytes in 3.x, the code above will always raise an error.
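Note: the failure mode described in this message can be reproduced from Python. The sketch below is not CPython code. `check_str_result` is a hypothetical helper that mirrors the PyUnicode_Check() test in the C snippet, and it assumes that `codecs.encode()` goes through the same codec registry as PyCodec_Encode().

```python
import codecs


def check_str_result(text, encoding, errors="strict"):
    """Hypothetical model of the quoted C code: encode, then insist on str."""
    v = codecs.encode(text, encoding, errors)  # registry lookup plus encode
    if not isinstance(v, str):
        raise TypeError(
            "encoder did not return an str object (type=%.400s)"
            % type(v).__name__
        )
    return v


# A character encoding such as UTF-8 produces bytes, so the str check fails.
try:
    check_str_result("abc", "utf-8")
    failed = False
except TypeError:
    failed = True
```

With a character encoding the helper always ends in TypeError, which is the behaviour the message complains about. It can only succeed with a codec that maps str to str.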
msg121323 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-16 22:45
PyUnicode_AsDecodedObject() and PyUnicode_AsDecodedUnicode() appear to be broken as well: both start with a PyUnicode_Check(unicode) and then pass unicode to PyCodec_Decode() which expects bytes.
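Note: the same mismatch can be observed at the Python level. This is a minimal sketch, assuming a current CPython 3. `try_decode` is a hypothetical helper that only reports whether `codecs.decode()` accepted its input.

```python
import codecs


def try_decode(obj, encoding):
    """Return ("ok", result), or ("error", exception type name) on failure."""
    try:
        return ("ok", codecs.decode(obj, encoding))
    except Exception as exc:  # report the failure instead of propagating it
        return ("error", type(exc).__name__)


bytes_input = try_decode(b"abc", "utf-8")  # bytes in, str out
str_input = try_decode("abc", "utf-8")     # str in: the UTF-8 decoder rejects it
```

A str object handed to a bytes-to-str decoder is rejected, so a function that first requires a Unicode object and then decodes it can only work with codecs that accept str input.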
msg121325 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2010-11-16 22:54
Please note that PyCodec_Encode()/PyCodec_Decode() will return whatever the codec returns for these operations. The codec system is not limited to converting between Unicode and bytes only. A typical example is a same-type codec such as rot13 that only transforms Unicode data.
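Note: a short illustration of a same-type codec, assuming a Python version that ships the rot_13 text transform (it was restored in 3.2). Both directions take a str and return a str.

```python
import codecs

encoded = codecs.encode("Hello, World!", "rot_13")

# rot13 is its own inverse, so decoding restores the original text.
decoded = codecs.decode(encoded, "rot_13")

# Characters outside A-Z and a-z are passed through unchanged.
punctuation_kept = encoded[5] == "," and encoded[-1] == "!"
```

Here `encoded` is the str "Uryyb, Jbeyq!", not a bytes object, which is why a blanket bytes check on the codec result would be too strict.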
msg121326 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-16 23:14
On Tue, Nov 16, 2010 at 5:54 PM, Marc-Andre Lemburg <report@bugs.python.org> wrote:
>
> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>
> Please note that PyCodec_Encode()/PyCodec_Decode() will return whatever the codec returns for these operations.
>
> The codec system is not limited to converting between Unicode and bytes only.

Not according to the latest reST documentation:

"""
* Encoding converts a string object to a bytes object using a
  particular character set encoding (e.g., cp1252 or iso-8859-1).

* Decoding converts a bytes object encoded using a particular
  character set encoding to a string object.
"""
http://docs.python.org/dev/library/codecs.html?highlight=codecs#codecs.Codec.encode

> A typical example is a same-type codec such as rot13 that only transforms Unicode data.

I thought rot13 would only transform English (or Latin) alphabet.
msg121328 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-16 23:58
Attached patch documents all previously undocumented unicode C API functions. Note that for the PyUnicode_As{En,De}codedObject() and PyUnicode_As{En,De}codedUnicode() functions I attempted to capture what they are supposed to do rather than what the current implementation does.
msg121330 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2010-11-17 00:19
Alexander Belopolsky wrote:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Tue, Nov 16, 2010 at 5:54 PM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
>>
>> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>>
>> Please note that PyCodec_Encode()/PyCodec_Decode() will return whatever the codec returns for these operations.
>>
>> The codec system is not limited to converting between Unicode and bytes only.
>
> Not according to the latest reST documentation:
>
> """
> * Encoding converts a string object to a bytes object using a
>   particular character set encoding (e.g., cp1252 or iso-8859-1).
>
> * Decoding converts a bytes object encoded using a particular
>   character set encoding to a string object.
> """ http://docs.python.org/dev/library/codecs.html?highlight=codecs#codecs.Codec.encode

That's another documentation bug, then. The codec system has always
supported other type combinations for encoding/decoding as well.

Only certain methods on str and bytes objects in 3.x limit the possible
types to either str or bytes - which probably results in the
idea that Python codecs don't support anything else.

The text from the 2.7 documentation is correct, also for 3.x:

http://docs.python.org/library/codecs.html#codec-objects

>> A typical example is a same-type codec such as rot13 that only transforms Unicode data.
>
> I thought rot13 would only transform English (or Latin) alphabet.

Right, everything else passes through as-is.

Other examples are codecs that escape certain code points using
e.g. XML entity sequences, backslash notations or other such techniques.

For bytes, you have the zip, base64 and hex codecs which work
in a similar way.
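Note: a minimal sketch of the bytes-to-bytes codecs mentioned in this message, and of the restriction on the str method. It assumes a Python version (3.4 or later) in which the "hex", "base64" and "zlib" codec names are registered and `str.encode()` refuses non-text codecs.

```python
import codecs

data = b"unicode C-API"

hexed = codecs.encode(data, "hex")      # bytes in, bytes out
b64 = codecs.encode(data, "base64")
packed = codecs.encode(data, "zlib")    # the "zip" codec is an alias of zlib

round_trips = all(
    codecs.decode(encoded, name) == data
    for name, encoded in [("hex", hexed), ("base64", b64), ("zlib", packed)]
)

# The str method, by contrast, only accepts text encodings.
try:
    "abc".encode("rot_13")
    method_rejected = False
except LookupError:
    method_rejected = True
```

The codec functions accept whatever types the codec supports, while the methods on str and bytes are limited to str/bytes conversions, which matches the distinction drawn in the message.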
msg121331 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-17 00:25
On Tue, Nov 16, 2010 at 7:19 PM, Marc-Andre Lemburg <report@bugs.python.org> wrote:
..
>> * Decoding converts a bytes object encoded using a particular
>>   character set encoding to a string object.
>> """ http://docs.python.org/dev/library/codecs.html?highlight=codecs#codecs.Codec.encode
>
> That's another documentation bug, then. The codec system has always
> supported other type combinations for encoding/decoding as well.
>
> Only certain methods on str and bytes objects in 3.x limit the possible
> types to either str or bytes - which probably results in the
> idea that Python codecs don't support anything else.
>
> The text from the 2.7 documentation is correct, also for 3.x:
>
> http://docs.python.org/library/codecs.html#codec-objects
>

I agree and will handle this in #10435 because codecs.h
(unsurprisingly) supports your POV and we don't want C-API docs to be
in conflict with Py-API docs.

If you have time, please take a look at
PyUnicode_As{En,De}codedObject() and
PyUnicode_As{En,De}codedUnicode() documentation in the attached
patch.
msg121332 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-17 01:11
> I agree and will handle this in #10435 because codecs.h

s/#10435/#10439/
msg121335 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-17 02:58
It looks like I misunderstood what PyUnicode_As{En,De}codedObject() and PyUnicode_As{En,De}codedUnicode() functions are designed to do. Attaching a corrected patch, issue10435a.diff.
msg121371 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2010-11-17 18:04
Alexander Belopolsky wrote:
>
> If you have time, please take a look at
> PyUnicode_As{En,De}codedObject() and
> PyUnicode_As{En,De}codedUnicode() documentation in the attached
> patch.

Thanks. I'll try to have a look later tonight.
msg121386 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2010-11-17 22:20
Thanks for your work on this. Please see my comments below:

--- Include/unicodeobject.h (revision 86478)
+++ Include/unicodeobject.h (working copy)
@@ -737,7 +737,7 @@
     const char *errors /* error handling */
     );

-/* Encodes a Unicode object and returns the result as Python string
+/* Encodes a Unicode object and returns the result as Python bytes
    object. */

PyUnicode_AsEncodedObject() encodes the Unicode object to
whatever the codec returns, so the "bytes" is wrong in the
above line.

--- Doc/c-api/unicode.rst (revision 86477)
+++ Doc/c-api/unicode.rst (working copy)
@@ -528,7 +567,22 @@
    using the Python codec registry. Return *NULL* if an exception was raised by
    the codec.

+.. c:function:: PyObject* PyUnicode_AsDecodedObject(PyObject *unicode, const char *encoding, const char *errors)

+   Create a Unicode object by decoding the encoded Unicode object
+   *unicode*.

The function does not guarantee that a Unicode object will be
returned. It merely passes a Unicode object to a codec's
decode function and returns whatever the codec returns.

+   *encoding* and *errors* have the same meaning as the
+   parameters of the same name in the :func:`unicode` built-in
+   function. The codec to be used is looked up using the Python codec
+   registry. Return *NULL* if an exception was raised by the codec.

+   Note that Python codecs do not accept Unicode objects for decoding,
+   so this method is only useful with user or 3rd party codecs.

Please strike the last sentence. The codecs that were wrongly removed
from Python3 will get added back and provide such functionality.

+.. c:function:: PyObject* PyUnicode_AsEncodedObject(PyObject *unicode, const char *encoding, const char *errors)
+
+   Use c:func:`PyUnicode_AsEncodedString` instead.

That's not a useful hint as PyUnicode_AsEncodedString() does something
different than PyUnicode_AsEncodedObject().

+   Same as c:func:`PyUnicode_AsEncodedString`, but without shortcuts
+   for common built-in encodings and without checking the type of the
+   object returned by encoding via the codec registry. This method is
+   only useful with user or 3rd party codec that encodes string into
+   something other than bytes.

This should read:

   Decodes a Unicode object by passing the given Unicode object
   *unicode* to the codec for encoding.
   *encoding* and *errors* have the same meaning as the
   parameters of the same name in the :func:`unicode` built-in
   function. The codec to be used is looked up using the Python codec
   registry. Return *NULL* if an exception was raised by the codec.

+.. c:function:: PyObject* PyUnicode_AsEncodedUnicode(PyObject *unicode, const char *encoding, const char *errors)
+
+   Use c:func:`PyUnicode_AsEncodedString` instead.

Please remove this as well.

+   Same as c:func:`PyUnicode_AsEncodedObject`, but raises
+   :exc:`TypeError` is encoding via the codec registry returns an
+   object other than string. This method is only useful with user or
+   3rd party codec that encodes string into string.

Please remove the last sentence.

+.. c:function: int PyUnicode_EncodeDecimal(Py_UNICODE *s, Py_ssize_t length,
+                 char *output, const char *errors)
+
+   Takes a Unicode string holding a decimal value and writes it into
+   an output buffer using standard ASCII digit codes.
+
+   The output buffer has to provide at least length+1 bytes of storage
+   area. The output string is 0-terminated.
+
+   The encoder converts whitespace to ' ', decimal characters to their
+   corresponding ASCII digit and all other Latin-1 characters except
+   \0 as-is. Characters outside this range (Unicode ordinals 1-256)
+   are treated as errors. This includes embedded NULL bytes.
+
+   Error handling is defined by the errors argument:
+
+      NULL or "strict": raise a ValueError
+      "ignore": ignore the wrong characters (these are not copied to the
+                output buffer)
+      "replace": replaces illegal characters with '?'
+
+   Returns 0 on success, -1 on failure.
+
+.. c:function:: void PyUnicode_Append(PyObject **pleft, PyObject *right)
+
+   Concat two strings and put the result in pleft. Sets pleft to
+   NULL on error.
+
+.. c:function:: void PyUnicode_AppendAndDel(PyObject **pleft, PyObject *right)
+
+   Concat two strings and put the result in pleft and drop the right
+   object. Sets pleft to NULL on error.
+
+

Please don't document these two obscure APIs. Instead we should
make them private functions by prepending them with an underscore.
If you look at the implementations of those two APIs, they
are little more than macros around PyUnicode_Concat().

3rd party extensions should use PyUnicode_Concat() to achieve
the same effect.

+.. c:function:: void PyUnicode_InternImmortal(PyObject **string)
+
+   Use :c:func:`PyUnicode_InternInPlace` instead.
+
+   Same as :c:func:`PyUnicode_InternInPlace`, but the interned string
+   will never be released.
+

I don't think it's a good idea to make this a public API.
3rd party extensions should not need to make use of such
APIs.

Instead, we should make this a private API.
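Note: the PyUnicode_EncodeDecimal() description quoted in the patch above can be modelled in pure Python. This is only a sketch of the documented behaviour, not the C implementation. `encode_decimal` is a hypothetical name, and the use of `str.isspace()` and `unicodedata.decimal()` as stand-ins for the C-level whitespace and decimal tests is an assumption.

```python
import unicodedata


def encode_decimal(s, errors=None):
    """Pure-Python model of the documented behaviour (not the C code)."""
    out = []
    for ch in s:
        if ch.isspace():
            out.append(" ")                 # any whitespace becomes a space
            continue
        digit = unicodedata.decimal(ch, None)
        if digit is not None:
            out.append(str(digit))          # any decimal digit becomes ASCII
        elif 0 < ord(ch) < 256:
            out.append(ch)                  # other Latin-1 characters pass through
        elif errors in (None, "strict"):
            raise ValueError("invalid decimal Unicode string")
        elif errors == "replace":
            out.append("?")                 # illegal character replaced
        elif errors != "ignore":
            raise ValueError("unknown error handler %r" % (errors,))
        # "ignore": the character is simply not copied
    return "".join(out).encode("latin-1")


arabic_indic = encode_decimal("\u0661\u0662\u0663")   # ARABIC-INDIC DIGITS 1, 2, 3
replaced = encode_decimal("1\u20ac2", "replace")      # EURO SIGN is outside Latin-1
ignored = encode_decimal("1\u20ac2", "ignore")
```

The model returns a bytes object instead of filling a caller-supplied buffer and raises ValueError instead of returning -1, so it only illustrates the character mapping and the three error modes.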
msg122155 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2010-11-22 19:00
On Wed, Nov 17, 2010 at 5:20 PM, Marc-Andre Lemburg <report@bugs.python.org> wrote:
..
> -/* Encodes a Unicode object and returns the result as Python string
> +/* Encodes a Unicode object and returns the result as Python bytes
>    object. */
>
>
> PyUnicode_AsEncodedObject() encodes the Unicode object to
> whatever the codec returns, so the "bytes" is wrong in the
> above line.
>

The above line describes PyUnicode_AsEncodedString(), not
PyUnicode_AsEncodedObject(). The former has PyBytes_Check(v) after
calling v = PyCodec_Encode(..). As far as I can tell this is the
only difference that makes PyUnicode_AsEncodedObject() not redundant.

..
> +.. c:function:: PyObject* PyUnicode_AsDecodedObject(PyObject *unicode, const char *encoding, const char *errors)
>
> +   Create a Unicode object by decoding the encoded Unicode object
> +   *unicode*.
>
> The function does not guarantee that a Unicode object will be
> returned. It merely passes a Unicode object to a codec's
> decode function and returns whatever the codec returns.
>

Good point. I am changing "Unicode object" to "Python object".

..
> +   Note that Python codecs do not accept Unicode objects for decoding,
> +   so this method is only useful with user or 3rd party codecs.
>
> Please strike the last sentence. The codecs that were wrongly removed
> from Python3 will get added back and provide such functionality.
>

Would it be acceptable to keep this note, but add "as of version 3.2"
or something like that? I don't think there is a chance that these
codecs will be added in 3.2 given the current schedule.

..
> This should read:
>
>    Decodes a Unicode object by passing the given Unicode object
>    *unicode* to the codec for encoding.
>    *encoding* and *errors* have the same meaning as the
>    parameters of the same name in the :func:`unicode` built-in
>    function. The codec to be used is looked up using the Python codec
>    registry. Return *NULL* if an exception was raised by the codec.
>

Is the following better?

"""
   Decodes a Unicode object by passing the given Unicode object
   *unicode* to the codec for encoding. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the
   :func:`unicode` built-in function. The codec to be used is
   looked up using the Python codec registry. Return *NULL* if an
   exception was raised by the codec.

   As of Python 3.2, this method is only useful with user or 3rd
   party codec that encodes string into something other than bytes.
   For encoding to bytes, use c:func:`PyUnicode_AsEncodedString`
   instead.
"""

..
>
> +.. c:function:: void PyUnicode_Append(PyObject **pleft, PyObject *right)
..
> +
> +.. c:function:: void PyUnicode_AppendAndDel(PyObject **pleft, PyObject *right)
..
>
> Please don't document these two obscure APIs. Instead we should
> make them private functions by prepending them with an underscore.
> If you look at the implementations of those two APIs, they
> are little more than a macros around PyUnicode_Concat().
>

I don't agree that they are obscure. Python uses them in multiple
places and developers seem to know about them. See patches submitted
to issue4113 and issue7584.

> 3rd party extensions should use PyUnicode_Concat() to achieve
> the same effect.
>

Hmm. I would not be surprised if current 3rd party extensions used
PyUnicode_AppendAndDel() more often than PyUnicode_Concat(). (I know
that I learned about PyUnicode_AppendAndDel() before
PyUnicode_Concat().)

Is there anything that makes PyUnicode_AppendAndDel() undesirable? I
don't mind adding a recommendation to use PyUnicode_Concat() if there
is a practical reason for it or even a warning that
PyUnicode_AppendAndDel() may be deprecated in the future, but renaming
it to _PyUnicode_AppendAndDel() seems premature.

..
>
> I don't think it's a good idea to make this a public API.
> 3rd party extensions should not need to make use of such
> APIs.
>
> Instead, we should make this a private API.

I agree, but isn't it prudent to document it as deprecated for 3rd
party use first?
msg122215 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2010-11-23 13:46
Alexander Belopolsky wrote:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Wed, Nov 17, 2010 at 5:20 PM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
> ..
>> -/* Encodes a Unicode object and returns the result as Python string
>> +/* Encodes a Unicode object and returns the result as Python bytes
>>    object. */
>>
>>
>> PyUnicode_AsEncodedObject() encodes the Unicode object to
>> whatever the codec returns, so the "bytes" is wrong in the
>> above line.
>>
>
> The above line describes PyUnicode_AsEncodedString(), not
> PyUnicode_AsEncodedObject(). The former has PyBytes_Check(v) after
> calling v = PyCodec_Encode(..). As far as I can tell this is the
> only difference that makes PyUnicode_AsEncodedObject() not redundant.

In that case, the change is fine.

> ..
>> +.. c:function:: PyObject* PyUnicode_AsDecodedObject(PyObject *unicode, const char *encoding, const char *errors)
>>
>> +   Create a Unicode object by decoding the encoded Unicode object
>> +   *unicode*.
>>
>> The function does not guarantee that a Unicode object will be
>> returned. It merely passes a Unicode object to a codec's
>> decode function and returns whatever the codec returns.
>>
>
> Good point. I am changing "Unicode object" to "Python object".
>
> ..
>> +   Note that Python codecs do not accept Unicode objects for decoding,
>> +   so this method is only useful with user or 3rd party codecs.
>>
>> Please strike the last sentence. The codecs that were wrongly removed
>> from Python3 will get added back and provide such functionality.
>>
>
> Would it be acceptable to keep this note, but add "as of version 3.2"
> or something like that? I don't think there is a chance that these
> codecs will be added in 3.2 given the current schedule.

Please remove the sentence or change it to:

   Note that most Python codecs only accept Unicode objects for decoding.

> ..
>> This should read:
>>
>>    Decodes a Unicode object by passing the given Unicode object
>>    *unicode* to the codec for encoding.
>>    *encoding* and *errors* have the same meaning as the
>>    parameters of the same name in the :func:`unicode` built-in
>>    function. The codec to be used is looked up using the Python codec
>>    registry. Return *NULL* if an exception was raised by the codec.
>>
>
> Is the following better?
>
> """
>    Decodes a Unicode object by passing the given Unicode object
>    *unicode* to the codec for encoding. *encoding* and *errors*
>    have the same meaning as the parameters of the same name in the
>    :func:`unicode` built-in function. The codec to be used is
>    looked up using the Python codec registry. Return *NULL* if an
>    exception was raised by the codec.
>
>    As of Python 3.2, this method is only useful with user or 3rd
>    party codec that encodes string into something other than bytes.

Same as above.

>    For encoding to bytes, use c:func:`PyUnicode_AsEncodedString`
>    instead.
> """
> ..
>>
>> +.. c:function:: void PyUnicode_Append(PyObject **pleft, PyObject *right)
> ..
>> +
>> +.. c:function:: void PyUnicode_AppendAndDel(PyObject **pleft, PyObject *right)
> ..
>>
>> Please don't document these two obscure APIs. Instead we should
>> make them private functions by prepending them with an underscore.
>> If you look at the implementations of those two APIs, they
>> are little more than a macros around PyUnicode_Concat().
>>
>
> I don't agree that they are obscure. Python uses them in multiple
> places and developers seem to know about them. See patches submitted
> to issue4113 and issue7584.

I found these references:

http://osdir.com/ml/python.python-3000.cvs/2007-11/msg00270.html

and

http://riverbankcomputing.co.uk/hg/sip/annotate/91a545605044/siplib/siplib.c

so you're right: they are already in use in the wild. Too bad...

Please add these porting notes to the documentation:

PyUnicode_Append() works like the PyString_Concat(), while
PyUnicode_AppendAndDel() works like PyString_ConcatAndDel().

>> 3rd party extensions should use PyUnicode_Concat() to achieve
>> the same effect.
>>
>
> Hmm. I would not be surprised if current 3rd party extensions used
> PyUnicode_AppendAndDel() more often than PyUnicode_Concat(). (I know
> that I learned about PyUnicode_AppendAndDel() before
> PyUnicode_Concat().)

Certainly not more often. PyUnicode_Concat() has been around much
longer than the other two APIs which are only available in Python3.

> Is there anything that makes PyUnicode_AppendAndDel() undesirable? I
> don't mind adding a recommendation to use PyUnicode_Concat() if there
> is a practical reason for it or even a warning that
> PyUnicode_AppendAndDel() may be deprecated in the future, but renaming
> it to _PyUnicode_AppendAndDel() seems premature.

Both APIs are just slight variants of the PyUnicode_Concat() API.
They change parameters in-place which is rather uncommon for the
Unicode API and don't return their result - in fact the error
reporting is somewhat broken: APIs which do in-place modifications
usually return an integer for error reporting. These APIs set the
pleft to NULL instead.

Finally, the naming of PyUnicode_AppendAndDel() is not ideal.
"Del" would suggest that an object is deleted, but in reality it
is only decrefed. It is also not clear that the second argument is
affected, but not the first one.

> ..
>> [PyUnicode_InternImmortal(PyObject **p)]
>> I don't think it's a good idea to make this a public API.
>> 3rd party extensions should not need to make use of such
>> APIs.
>>
>> Instead, we should make this a private API.
>
> I agree, but isn't it prudent to document it as deprecated for 3rd
> party use first?

I don't think that's needed in this case. The API is not used
outside Python3, it seems.

If people complain in beta phase, we can always add a deprecation
function wrapper instead.
msg241743 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2015-04-21 20:42
I've looked at c-api/unicode.rst and I can't see any correlation between it and the names listed here in msg121302. So either this was never completed or it's been all change in the meantime, so could somebody take a look, please?
msg241746 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2015-04-21 21:38
Mark, Unicode C-APIs have changed a lot since this issue was opened, but I think many of the listed functions are still present but not properly documented. You can help by checking the Include/unicode.h file and compiling a list of functions that are there, don't start with _ and are not documented in the reference manual.
msg241747 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2015-04-21 21:40
Sorry for the broken link, the correct header file is Include/unicodeobject.h
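Note: the check suggested above (names in the header that do not start with an underscore and are missing from the reference manual) could be automated roughly as follows. This is a sketch under assumptions: `undocumented_names` is a hypothetical helper, the regular expressions only recognise PyAPI_FUNC declarations and #define lines, and the sample header and reST text are made up for illustration.

```python
import re

# PyAPI_FUNC(...) wraps the return type of exported functions in CPython headers.
_FUNC_RE = re.compile(r"PyAPI_FUNC\([^)]*\)\s*(\w+)\s*\(")
# Object-like and function-like macros introduced with #define.
_MACRO_RE = re.compile(r"^\s*#\s*define\s+(\w+)", re.MULTILINE)


def undocumented_names(header_text, rst_text):
    """Return public names found in the header that never appear in the docs."""
    names = set(_FUNC_RE.findall(header_text)) | set(_MACRO_RE.findall(header_text))
    public = {name for name in names if not name.startswith("_")}
    documented = set(re.findall(r"\w+", rst_text))
    return sorted(public - documented)


# Made-up inputs; a real run would read the two files from a CPython checkout.
sample_header = """
#define Py_UNICODE_COPY(target, source, length) memcpy(...)
PyAPI_FUNC(PyObject*) PyUnicode_FromOrdinal(int ordinal);
PyAPI_FUNC(PyObject *) PyUnicode_Concat(PyObject *left, PyObject *right);
PyAPI_FUNC(int) _PyUnicode_Resize(PyObject **unicode, Py_ssize_t length);
"""
sample_rst = """
.. c:function:: PyObject* PyUnicode_Concat(PyObject *left, PyObject *right)
"""

missing = undocumented_names(sample_header, sample_rst)
```

For a real run, the contents of Include/unicodeobject.h and Doc/c-api/unicode.rst would be passed in. The result would still need manual review, since include guards and configuration macros are reported alongside genuine API names.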
msg241748 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2015-04-21 21:43
Okay Alexander I'll give it a go, but not tonight :)
msg242295 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2015-04-30 23:45
List of just about everything that's in the header file but not in the rst file as I'm not sure which bits you normally wouldn't bother with.

Py_USING_UNICODE
Py_UNICODE_SIZE
Py_UNICODE_WIDE
Py_UNICODE_COPY
Py_UNICODE_FILL
Py_UNICODE_HIGH_SURROGATE
Py_UNICODE_LOW_SURROGATE
Py_UNICODE_MATCH
PyUnicode_WSTR_LENGTH
PyUnicode_AS_DATA
PyUnicode_IS_ASCII
PyUnicode_IS_COMPACT
PyUnicode_IS_COMPACT_ASCII
PyUnicode_IS_READY
Py_UNICODE_REPLACEMENT_CHARACTER
PyUnicode_FromString
PyUnicode_GetMax
PyUnicode_Resize
PyUnicode_InternImmortal
PyUnicode_CHECK_INTERNED
PyUnicode_FromOrdinal
PyUnicode_GetDefaultEncoding
PyUnicode_AsDecodedObject
PyUnicode_AsDecodedUnicode
PyUnicode_AsEncodedObject
PyUnicode_AsEncodedUnicode
PyUnicode_BuildEncodingMap
PyUnicode_DecodeCodePageStateful
PyUnicode_EncodeDecimal
PyUnicode_Append
PyUnicode_AppendAndDel
PyUnicode_Partition
PyUnicode_RPartition
PyUnicode_RSplit
PyUnicode_IsIdentifier
Py_UNICODE_strlen
Py_UNICODE_strcpy
Py_UNICODE_strcat
Py_UNICODE_strncpy
Py_UNICODE_strcmp
Py_UNICODE_strncmp
Py_UNICODE_strchr
Py_UNICODE_strrchr
msg242498 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2015-05-03 18:31
Py_UNICODE_TOLOWER, Py_UNICODE_TOUPPER and Py_UNICODE_TOTITLE are all labelled deprecated in 3.3 and presumably can be removed completely. Alternatively should these like many others be scheduled for removal in 4.0?
msg264573 - (view)	Author: Berker Peksag (berker.peksag) *	Date: 2016-04-30 18:40
This is a duplicate of issue 1944.

History
Date	User	Action	Args
2022-04-11 14:57:08	admin	set	github: 54644
2016-04-30 18:40:27	berker.peksag	set	status: open -> closed superseder: Document PyUnicode_* API nosy: + berker.peksag messages: + msg264573 resolution: duplicate stage: patch review -> resolved
2015-05-03 18:31:58	BreamoreBoy	set	messages: + msg242498
2015-04-30 23:45:26	BreamoreBoy	set	messages: + msg242295
2015-04-21 21:43:59	BreamoreBoy	set	messages: + msg241748
2015-04-21 21:40:28	belopolsky	set	messages: + msg241747
2015-04-21 21:38:09	belopolsky	set	messages: + msg241746
2015-04-21 20:42:11	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg241743
2010-11-23 13:46:24	lemburg	set	messages: + msg122215
2010-11-22 19:00:55	belopolsky	set	messages: + msg122155
2010-11-20 23:20:47	belopolsky	link	issue8647 superseder
2010-11-20 23:20:12	belopolsky	link	issue8646 superseder
2010-11-20 23:19:42	belopolsky	link	issue8645 superseder
2010-11-20 16:25:41	hodgestar	set	nosy: + hodgestar
2010-11-17 22:20:13	lemburg	set	messages: + msg121386
2010-11-17 18:04:55	lemburg	set	messages: + msg121371
2010-11-17 02:58:13	belopolsky	set	files: + issue10435a.diff messages: + msg121335
2010-11-17 01:11:54	belopolsky	set	messages: + msg121332
2010-11-17 01:04:03	ezio.melotti	set	nosy: + ezio.melotti
2010-11-17 00:25:11	belopolsky	set	messages: + msg121331
2010-11-17 00:19:07	lemburg	set	messages: + msg121330
2010-11-16 23:58:06	belopolsky	set	files: + issue10435.diff keywords: + patch messages: + msg121328 stage: needs patch -> patch review
2010-11-16 23:14:55	belopolsky	set	messages: + msg121326
2010-11-16 22:54:42	lemburg	set	messages: + msg121325
2010-11-16 22:45:32	belopolsky	set	messages: + msg121323
2010-11-16 22:21:43	belopolsky	set	nosy: + lemburg, loewis, vstinner
2010-11-16 22:17:57	belopolsky	set	messages: + msg121321
2010-11-16 16:31:41	georg.brandl	link	issue9076 superseder
2010-11-16 16:23:15	belopolsky	link	issue8649 superseder
2010-11-16 16:16:45	belopolsky	create