Issue 39113: PyUnicode_AsUTF8AndSize Sometimes Segfaults With Incomplete Surrogate Pair



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/83294

classification

Title:	PyUnicode_AsUTF8AndSize Sometimes Segfaults With Incomplete Surrogate Pair
Type:
Stage:	resolved
Components:
Versions:

process

Status:	closed
Resolution:	not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List:	serhiy.storchaka, william.ayd
Priority:	normal
Keywords:

Created on 2019-12-21 03:32 by william.ayd, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name	Uploaded	Description
testmodule.c	william.ayd, 2019-12-21 03:32	Extension Module For Use in Identifying Segfault

Messages (3)
msg358755	Author: (william.ayd)	Date: 2019-12-21 03:32
With the attached extension module, if I run the following in the REPL:

>>> import libtest
>>>
>>> libtest.error_if_not_utf8("foo")
'foo'
>>> libtest.error_if_not_utf8("\ud83d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83d' in position 0: surrogates not allowed
>>> libtest.error_if_not_utf8("foo")
'foo'

Things seem OK. But the next invocation of

>>> libtest.error_if_not_utf8("\ud83d")

then causes a segfault. Note that the order of the input seems important; simply repeating the call with the invalid surrogate doesn't cause the segfault.
msg358757	Author: Serhiy Storchaka (serhiy.storchaka)	Date: 2019-12-21 05:40
Your function returns a borrowed reference. It could cause a crash even without calling PyUnicode_AsUTF8AndSize. Add Py_INCREF(str).
msg358762	Author: (william.ayd)	Date: 2019-12-21 07:15
Hmm my mistake - thanks!
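
Editorial illustration: the resolution rests on the point that PyUnicode_AsUTF8AndSize itself is not at fault. One way to check that without the attached extension is to call the C function directly through ctypes.pythonapi, using the same input order as the report. This is only a sketch. The helper name try_utf8 is hypothetical and not part of testmodule.c, and the sketch does not reproduce the reference-counting bug, which lives in the extension's C code and is fixed there with Py_INCREF.

```python
import ctypes

# Bind the C API function through ctypes.pythonapi. Because pythonapi is a
# PyDLL, the GIL stays held and a Python exception set by the C function is
# re-raised in Python after the call returns.
as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
as_utf8.restype = ctypes.c_char_p


def try_utf8(text):
    """Return the UTF-8 bytes of text, or None if it cannot be encoded.

    Hypothetical helper for illustration only.
    """
    size = ctypes.c_ssize_t()
    try:
        return as_utf8(text, ctypes.byref(size))
    except UnicodeEncodeError:
        # A lone surrogate such as "\ud83d" cannot be encoded as UTF-8.
        return None


# Same call order as in the report: valid, lone surrogate, valid, lone surrogate.
results = [try_utf8(s) for s in ("foo", "\ud83d", "foo", "\ud83d")]
print(results)
```

If the encoder were responsible for the crash, the fourth call here would be expected to fail in the same way. It should instead report the encoding error cleanly each time, which is consistent with the diagnosis that the segfault came from returning a borrowed reference.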

History
Date	User	Action	Args
2022-04-11 14:59:24	admin	set	github: 83294
2019-12-21 16:48:44	serhiy.storchaka	set	status: open -> closed stage: resolved
2019-12-21 07:15:50	william.ayd	set	messages: + msg358762
2019-12-21 05:40:56	serhiy.storchaka	set	resolution: not a bug messages: + msg358757 nosy: + serhiy.storchaka
2019-12-21 03:32:54	william.ayd	create