
Author huwcbjones
Recipients huwcbjones
Date 2021-05-18.11:12:08
Message-id <1621336329.38.0.49764469303.issue44170@roundup.psfhosted.org>
Content
I get a UnicodeDecodeError when adding Unicode strings that contain multibyte UTF-8 characters to a ShareableList.
My observation is that ShareableList chunks the list of strings before sending it over the process boundary. However, this chunking is not multibyte-aware, so a chunk boundary can fall in the middle of a multibyte character.
On the receiving end, ShareableList then raises a UnicodeDecodeError because it tries to decode an incomplete multibyte UTF-8 sequence.
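
To illustrate what "multibyte-aware" would mean here, the following is a minimal sketch rather than ShareableList's actual code. The helper `truncate_utf8`, the sample text, and the 16-byte limit are all assumptions made for the example. It compares a naive byte slice with one that backs up to the nearest character boundary.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Hypothetical helper: cut data to at most `limit` bytes
    without splitting a UTF-8 encoded character."""
    if len(data) <= limit:
        return data
    end = limit
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    # Back up until data[end] is the first byte of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


# Assumed sample text; the explosion character encodes to 4 bytes.
text = "Boom \U0001F4A5 \U0001F4A5 \U0001F4A5"
encoded = text.encode("utf-8")

naive = encoded[:16]                # may end in the middle of a character
safe = truncate_utf8(encoded, 16)   # always ends on a character boundary

print(len(text), len(encoded))      # character count vs. byte count
print(naive)
print(safe.decode("utf-8"))
```

The naive slice keeps only the lead byte of the third character, while the boundary-aware slice drops that byte and still decodes cleanly.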

Running the attached MWE, I see that the string is sent in two chunks. The first is b'Boom \xf0\x9f\x92\xa5 \xf0\x9f\x92\xa5 \xf0', which splits the 4-byte encoding of the third 💥 character: its first byte ends up in the first chunk and the remaining 3 bytes in the second.
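
The decode failure can be reproduced from that chunk alone, without shared memory. This is a small sketch and not the attached MWE. The helper name `try_decode` is an assumption for the example, and the bytes are copied from the chunk quoted above.

```python
# First chunk as quoted in this report: it ends with only the lead
# byte (0xF0) of a 4-byte UTF-8 sequence.
chunk = b"Boom \xf0\x9f\x92\xa5 \xf0\x9f\x92\xa5 \xf0"


def try_decode(data: bytes) -> str:
    """Return the decoded text, or a short description of the decode error."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError at byte {exc.start}: {exc.reason}"


result_truncated = try_decode(chunk)       # fails on the dangling lead byte
result_complete = try_decode(chunk[:-1])   # succeeds once that byte is dropped

print(result_truncated)
print(result_complete)
```

Dropping the dangling final byte is enough for the chunk to decode, which is consistent with the error coming from the split character and not from the rest of the data.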
History
Date                 User        Action  Args
2021-05-18 11:12:09  huwcbjones  set     recipients: + huwcbjones
2021-05-18 11:12:09  huwcbjones  set     messageid: <1621336329.38.0.49764469303.issue44170@roundup.psfhosted.org>
2021-05-18 11:12:09  huwcbjones  link    issue44170 messages
2021-05-18 11:12:09  huwcbjones  create