Message341363
The Unicode HOWTO (http://docs.python.org/3.3/howto/unicode.html) says the following:
"UTF-8 has several convenient properties:
(...)
2. A Unicode string is turned into a sequence of bytes containing no embedded zero bytes. This avoids byte-ordering issues, and means UTF-8 strings can be processed by C functions such as strcpy() and sent through protocols that can’t handle zero bytes."
This is not right. UTF-8 encodes the Unicode code point U+0000 (the ASCII NUL character) as a single zero byte. U+0000 is a valid character in UTF-8, and Python's UTF-8 encoding and decoding handle it just fine.
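A minimal sketch of the behavior, using only the built-in str.encode() and bytes.decode(); the variable names and sample strings are illustrative and not taken from the HOWTO:

```python
# A string with an embedded U+0000 between two ASCII letters.
text = "a\x00b"

# Encoding to UTF-8 keeps U+0000 as a single zero byte.
encoded = text.encode("utf-8")
has_zero_byte = 0 in encoded

# Decoding round-trips back to the original string.
round_trip = encoded.decode("utf-8")

# Code points other than U+0000 never produce a zero byte in UTF-8,
# which is probably what the HOWTO meant to say.
sample = "\u00e9\u20ac\U0001d11e"  # e-acute, euro sign, musical G clef
sample_has_zero_byte = 0 in sample.encode("utf-8")

# For contrast, UTF-16 produces zero bytes even for plain ASCII text.
utf16_has_zero_byte = 0 in "a".encode("utf-16-le")

print(encoded, has_zero_byte, round_trip == text)
print(sample_has_zero_byte, utf16_has_zero_byte)
```

So the property that actually holds appears to be narrower: a zero byte shows up in UTF-8 output only where the string itself contains U+0000.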
Date | User | Action | Args
2019-05-04 00:00:17 | mbiggs | set | recipients: + mbiggs, docs@python
2019-05-04 00:00:17 | mbiggs | set | messageid: <1556928017.33.0.648089706151.issue36789@roundup.psfhosted.org>
2019-05-04 00:00:17 | mbiggs | link | issue36789 messages
2019-05-04 00:00:17 | mbiggs | create |