Author william.ayd
Recipients william.ayd
Date 2019-12-21.03:32:54
With the attached extension module, if I run the following in the REPL:

>>> import libtest
>>> libtest.error_if_not_utf8("foo")
>>> libtest.error_if_not_utf8("\ud83d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83d' in position 0: surrogates not allowed
>>> libtest.error_if_not_utf8("foo")

Things seem OK so far. But the next invocation of

>>> libtest.error_if_not_utf8("\ud83d")

then causes a segfault. Note that the order of the calls seems to matter; simply repeating the call with the invalid surrogate, with no valid call in between, doesn't cause the segfault.
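
For comparison, the same call order can be replayed without the C extension. The sketch below uses a hypothetical pure-Python stand-in for libtest.error_if_not_utf8 (the attached module is not reproduced here, so this function and the run_sequence helper are assumptions, not the module's actual code). It only shows that the UnicodeEncodeError is the expected result for a lone surrogate and that the exception can be raised repeatedly in pure Python; it says nothing about where the crash originates.

```python
def error_if_not_utf8(value: str) -> None:
    """Hypothetical pure-Python stand-in for libtest.error_if_not_utf8."""
    # str.encode raises UnicodeEncodeError for lone surrogates such as "\ud83d".
    value.encode("utf-8")


def run_sequence() -> list:
    """Replay the call order from the report and record each outcome."""
    outcomes = []
    for arg in ["foo", "\ud83d", "foo", "\ud83d"]:
        try:
            error_if_not_utf8(arg)
            outcomes.append("ok")
        except UnicodeEncodeError as exc:
            # Record only the codec's reason string, not the raw surrogate.
            outcomes.append(exc.reason)
    return outcomes


print(run_sequence())
```

If this stand-in behaves while the extension crashes on the fourth call, that would point at the C side (the extension or the C API call it makes) rather than at the codec's error handling as seen from Python.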
Date                 User         Action  Args
2019-12-21 03:32:54  william.ayd  set     recipients: + william.ayd
2019-12-21 03:32:54  william.ayd  set     messageid: <>
2019-12-21 03:32:54  william.ayd  link    issue39113 messages
2019-12-21 03:32:54  william.ayd  create