
Author lemburg
Recipients benjamin.peterson, ezio.melotti, lemburg, ncoghlan, serhiy.storchaka, vstinner, xiang.zhang
Date 2016-10-13.08:02:14
Message-id <1476345735.1.0.264775958349.issue28426@psf.upfronthosting.co.za>
Content
PyUnicode_AsDecodedObject() and PyUnicode_AsEncodedObject() were meant as C API implementations of the unicode.decode() and unicode.encode() methods in Python 2. Not having PyUnicode_AsDecodedObject() documented was likely an oversight on my part.

In Python 2, unicode.decode() and unicode.encode() were more or less direct interfaces to the codec registry. In Python 2.7 this was changed to issue a warning to help with porting to Python 3.

In Python 3, the methods were changed to only return unicode objects. To reflect this change without breaking the C API, the new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() were added.

I guess the more recent changes simply no longer paid attention to this difference and put restrictions on the output of PyUnicode_AsDecodedObject() and PyUnicode_AsEncodedObject() that were not originally intended; hence the crash you are seeing, Serhiy.

Going forward, C extensions in Python 3 could indeed use the PyCodec_*() APIs directly.
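
To illustrate the difference between the restricted methods and the codec registry, here is a minimal Python-level sketch. It is only an analogy for the C side: it uses codecs.encode(), codecs.decode() and codecs.lookup() from the standard library, which go through the same codec registry that the PyCodec_*() functions expose to C code. The choice of the "rot13" and "hex" codecs and all variable names are mine, picked for illustration, and the behaviour shown assumes a reasonably recent Python 3 release.

```python
import codecs

# "rot13" is a str -> str codec. The str.encode() method in recent
# Python 3 releases only accepts text encodings (str -> bytes), so it
# refuses this codec with a LookupError.
try:
    "hello".encode("rot13")
    method_rejected = False
except LookupError:
    method_rejected = True

# Going through the codec machinery directly carries no such restriction:
# whatever type the codec produces is returned as-is.
rot13_text = codecs.encode("hello", "rot13")     # str in, str out
round_trip = codecs.decode(rot13_text, "rot13")  # back to the original str

# "hex" is a bytes -> bytes codec and works the same way.
hex_bytes = codecs.encode(b"hello", "hex")
raw_bytes = codecs.decode(hex_bytes, "hex")

# The registry entry itself can also be looked up and called by hand.
# A stateless encoder returns a (result, length_consumed) tuple.
info = codecs.lookup("rot13")
encoded, consumed = info.encode("hello")

print(method_rejected, rot13_text, round_trip, hex_bytes, raw_bytes)
```

A C extension would make the corresponding calls through the PyCodec_*() functions instead and would have to check the type of the returned object itself, since the codec is free to return something other than str or bytes.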