Author sibiryakov
Recipients sibiryakov
Date 2018-01-17.15:27:49
The CPython interpreter crashes with SIGSEGV or SIGABRT during the run. The script attempts to decode a binary file using the UTF-16-LE encoding and a custom error handler. The error handler is poorly written: it does not respect the Unicode standard and miscalculates the new position from which the decoder should continue. This somehow interferes with the internal C code that does memory allocation. The result is invalid writes past the end of the allocated block.
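
For contrast with the faulty handler described above, here is a minimal sketch of a well-behaved decode error handler. It is not the script from this report. The handler name replace_and_resume and the helper decode_utf16le are hypothetical, chosen only for illustration. The key point is that the handler returns exc.end as the resume position, so the decoder continues immediately after the bytes it rejected.

```python
import codecs


def replace_and_resume(exc):
    """Hypothetical handler: emit U+FFFD and resume right after the bad bytes."""
    if not isinstance(exc, UnicodeDecodeError):
        # This sketch only handles decoding errors; re-raise anything else.
        raise exc
    # exc.start and exc.end delimit the offending bytes in exc.object.
    # Returning exc.end tells the decoder to continue just past them.
    return ("\ufffd", exc.end)


# Register the handler under a name usable as the errors= argument.
codecs.register_error("replace_and_resume", replace_and_resume)


def decode_utf16le(data):
    """Decode bytes as UTF-16-LE using the handler registered above."""
    return data.decode("utf-16-le", errors="replace_and_resume")


if __name__ == "__main__":
    # b"\x00\xdc" is a lone low surrogate (U+DC00), invalid on its own.
    sample = b"A\x00" + b"\x00\xdc" + b"B\x00"
    print(ascii(decode_utf16le(sample)))
```

A handler of this shape should produce the same result as the built-in "replace" handler. A handler that returns any other position is what exercises the code path shown in the Valgrind trace below.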

Here is what it looks like with Python 3.7.0a4+ (heads/master:44a70e9, Jan 17 2018, 12:18:45) run under Valgrind 3.11.0. Please see the full Valgrind output in the attached valgrind.log.

==24836== Invalid write of size 4
==24836==    at 0x4C6B17: ucs4lib_utf16_decode (codecs.h:540)
==24836==    by 0x4C6B17: PyUnicode_DecodeUTF16Stateful (unicodeobject.c:5600)
==24836==    by 0x55AAD3: _codecs_utf_16_le_decode_impl (_codecsmodule.c:363)
==24836==    by 0x55AB6C: _codecs_utf_16_le_decode (_codecsmodule.c.h:371)
==24836==    by 0x4315D6: _PyMethodDef_RawFastCallKeywords (call.c:651)
==24836==    by 0x431840: _PyCFunction_FastCallKeywords (call.c:730)
==24836==    by 0x4ED159: call_function (ceval.c:4580)
==24836==    by 0x4ED159: _PyEval_EvalFrameDefault (ceval.c:3134)
==24836==    by 0x4E302D: PyEval_EvalFrameEx (ceval.c:545)
==24836==    by 0x4E3A42: _PyEval_EvalCodeWithName (ceval.c:3971)
==24836==    by 0x430EDD: _PyFunction_FastCallDict (call.c:376)
==24836==    by 0x4336B0: PyObject_Call (call.c:226)
==24836==    by 0x433839: PyEval_CallObjectWithKeywords (call.c:826)
==24836==    by 0x4FEAA6: _PyCodec_DecodeInternal (codecs.c:471)
==24836==  Address 0x6cf4bf8 is 0 bytes after a block of size 339,112 alloc'd
==24836==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/
==24836==    by 0x467635: _PyMem_RawMalloc (obmalloc.c:75)
==24836==    by 0x467B7D: _PyMem_DebugRawAlloc (obmalloc.c:2033)
==24836==    by 0x467C1F: _PyMem_DebugRawMalloc (obmalloc.c:2062)
==24836==    by 0x467C40: _PyMem_DebugMalloc (obmalloc.c:2202)
==24836==    by 0x468BFF: PyObject_Malloc (obmalloc.c:616)
==24836==    by 0x493902: PyUnicode_New (unicodeobject.c:1293)
==24836==    by 0x4BEA4F: _PyUnicodeWriter_PrepareInternal (unicodeobject.c:13456)
==24836==    by 0x4C6D39: _PyUnicodeWriter_WriteCharInline (unicodeobject.c:13494)
==24836==    by 0x4C6D39: PyUnicode_DecodeUTF16Stateful (unicodeobject.c:5637)
==24836==    by 0x55AAD3: _codecs_utf_16_le_decode_impl (_codecsmodule.c:363)
==24836==    by 0x55AB6C: _codecs_utf_16_le_decode (_codecsmodule.c.h:371)
==24836==    by 0x4315D6: _PyMethodDef_RawFastCallKeywords (call.c:651)

Date                 User        Action  Args
2018-01-17 15:27:50  sibiryakov  set     recipients: + sibiryakov
2018-01-17 15:27:50  sibiryakov  set     messageid: <>
2018-01-17 15:27:50  sibiryakov  link    issue32583 messages
2018-01-17 15:27:50  sibiryakov  create