diff -r e6553446cfda Doc/c-api/unicode.rst
--- a/Doc/c-api/unicode.rst Sun Dec 18 12:34:34 2011 +0100
+++ b/Doc/c-api/unicode.rst Sun Dec 18 17:56:55 2011 +0100
@@ -651,7 +651,9 @@
    Return a read-only pointer to the Unicode object's internal
    :c:type:`Py_UNICODE` buffer, *NULL* if *unicode* is not a Unicode object.
    This will create the :c:type:`Py_UNICODE` representation of the object if it
-   is not yet available.
+   is not yet available. Note that the resulting :c:type:`Py_UNICODE` string
+   may contain embedded null characters, which would cause the string to be
+   truncated when used in most C functions.
 
    Please migrate to using :c:func:`PyUnicode_AsUCS4`,
    :c:func:`PyUnicode_Substring`, :c:func:`PyUnicode_ReadChar` or similar new
@@ -668,7 +670,11 @@
 .. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
 
    Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:func:`Py_UNICODE`
-   array length in *size*.
+   array length in *size*. Note that the resulting string may contain embedded
+   null characters, and would thus be truncated when used in most C functions.
+   Note that the resulting :c:type:`Py_UNICODE` string may contain embedded
+   null characters, which would cause the string to be truncated when used in
+   most C functions.
 
    .. versionadded:: 3.3
 
@@ -677,8 +683,10 @@
 
    Create a copy of a Unicode string ending with a nul character. Return *NULL*
    and raise a :exc:`MemoryError` exception on memory allocation failure,
-   otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free the
-   buffer).
+   otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free
+   the buffer). Note that the resulting :c:type:`Py_UNICODE` string may contain
+   embedded null characters, which would cause the string to be truncated when
+   used in most C functions.
 
    .. versionadded:: 3.2
 
@@ -817,7 +825,8 @@
 
    Encode a Unicode object to :c:data:`Py_FileSystemDefaultEncoding` with the
    ``'surrogateescape'`` error handler, or ``'strict'`` on Windows, and return
-   :class:`bytes`.
+   :class:`bytes`. Note that the resulting :class:`bytes` object may contain
+   null bytes.
 
    If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the
    locale encoding.
@@ -853,7 +862,10 @@
    copied or -1 in case of an error. Note that the resulting :c:type:`wchar_t`
    string may or may not be 0-terminated. It is the responsibility of the caller
    to make sure that the :c:type:`wchar_t` string is 0-terminated in case this is
-   required by the application.
+   required by the application. Also, note that the :c:type:`wchar_t` string
+   might contain null characters, which would cause the string to be truncated
+   when used with most C functions. This could be detected by comparing the
+   resulting :c:type:`Py_ssize_t` with the result of :c:func:`wcslen()`.
 
 
 .. c:function:: wchar_t* PyUnicode_AsWideCharString(PyObject *unicode, Py_ssize_t *size)
@@ -863,9 +875,13 @@
    of wide characters (excluding the trailing 0-termination character) into
    *\*size*.
 
-   Returns a buffer allocated by :c:func:`PyMem_Alloc` (use :c:func:`PyMem_Free`
-   to free it) on success. On error, returns *NULL*, *\*size* is undefined and
-   raises a :exc:`MemoryError`.
+   Returns a buffer allocated by :c:func:`PyMem_Alloc` (use
+   :c:func:`PyMem_Free` to free it) on success. On error, returns *NULL*,
+   *\*size* is undefined and raises a :exc:`MemoryError`. Note that the
+   resulting :c:type:`wchar_t` string might contain null characters, which
+   would cause the string to be truncated when used with most C functions. This
+   could be detected by comparing *\*size* with the result of
+   :c:func:`wcslen()`.
 
    .. versionadded:: 3.2
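
For reviewers, here is a minimal sketch of the detection described in the last two hunks, written in plain C so it can be compiled without the Python headers. Both helper names (`has_embedded_null` and `has_embedded_null_n`) are hypothetical and not part of the C API. The first assumes a 0-terminated buffer, as returned by `PyUnicode_AsWideCharString`. The second is an assumed alternative for buffers that may not be 0-terminated, as with `PyUnicode_AsWideChar`, where calling `wcslen()` could read past the end of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of the Python C API.
 * buf must be 0-terminated and hold `size` wide characters before the
 * terminator (the count that PyUnicode_AsWideCharString() stores in *size).
 * Returns true if a null character occurs before the terminator. */
static bool
has_embedded_null(const wchar_t *buf, size_t size)
{
    assert(buf != NULL);
    /* wcslen() stops at the first null character, so a length shorter
     * than `size` means C string functions would see a truncated string. */
    return wcslen(buf) != size;
}

/* Hypothetical helper for buffers that may NOT be 0-terminated.
 * Scans exactly `size` wide characters and never reads beyond them. */
static bool
has_embedded_null_n(const wchar_t *buf, size_t size)
{
    assert(buf != NULL);
    return wmemchr(buf, L'\0', size) != NULL;
}
```

In an extension module, one would presumably pass the returned buffer together with the reported size (converted from `Py_ssize_t` after checking for errors) and raise an exception of the caller's choosing when the helper returns true.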