
Author lemburg
Date 2003-02-10.10:17:22
Content
The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes,
but not necessarily any bytes with ord(x) >= 128.
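
To illustrate the UTF-16 point, here is a minimal sketch in present-day Python. The sample name "test" and the choice of the utf-16-le codec are assumptions made for illustration; they are not taken from the test under discussion.

```python
# Illustrative only: the sample name and the codec are assumptions.
name = "test"
data = name.encode("utf-16-le")  # two bytes per character, no BOM

# Every second byte of ASCII text encoded as UTF-16-LE is \0 ...
has_nul = b"\x00" in data
# ... yet none of the bytes is >= 128, so a "high byte" check sees nothing.
has_high = any(byte >= 128 for byte in data)

print(data, has_nul, has_high)
```

A byte-wise check that only looks for values >= 128 would therefore pass this data as plain ASCII even though it is full of \0 bytes.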

I'm not sure whether other variable-length encodings, e.g. the
Asian ones, can result in \0 bytes.

There's also the possibility of the encoding mapping bytes in
the ASCII range to non-ASCII characters, e.g. ShiftJIS does this
for the Yen sign, which takes the place of the backslash (0x5C).

If you absolutely want to use the simple test, I'd at least
restrict it to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale-dependent on some
platforms, so you have to hard-code the test.
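
A rough sketch of that two-step check in present-day Python follows. The encode/decode method is not spelled out in this message, so the round-trip comparison below is an assumption about what it amounts to, and the names ASCII_ALNUM, is_ascii_alnum, round_trips and name_is_usable are hypothetical.

```python
import string

# Hard-coded ASCII alphanumerics, so the result never depends on the locale.
ASCII_ALNUM = frozenset(string.ascii_letters + string.digits)


def is_ascii_alnum(text):
    """Simple test: non-empty and made only of ASCII letters and digits."""
    return bool(text) and all(ch in ASCII_ALNUM for ch in text)


def round_trips(text, encoding):
    """Assumed fallback: encode, decode again, and compare with the input."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except (UnicodeError, LookupError):
        # Not encodable, not decodable, or the codec name is unknown.
        return False


def name_is_usable(text, encoding):
    """Try the cheap ASCII test first, then fall back to the round trip."""
    return is_ascii_alnum(text) or round_trips(text, encoding)


print(name_is_usable("abc123", "ascii"))
print(name_is_usable("\u00e9", "ascii"))
print(name_is_usable("\u00e9", "latin-1"))
```

The cheap test handles the common case, and the round trip only runs for names that contain anything outside the hard-coded set.
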
History
Date User Action Args
2007-08-23 15:20:29 admin link issue683592 messages
2007-08-23 15:20:29 admin create