Message128382
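If you have undecodable filenames on UNIX, Python 3 escapes the undecodable bytes using surrogates (the surrogateescape error handler). pydoc: HTMLDoc.index() indirectly uses os.listdir(), which performs this escaping, and the filenames are later encoded to UTF-8 (the whole HTML content is encoded to UTF-8). That encoding step fails with UnicodeEncodeError, because lone surrogates cannot be encoded to UTF-8 with the default strict error handler.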
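To make the failure concrete, here is a minimal sketch of the round trip. It assumes a UTF-8 filesystem encoding and decodes the bytes explicitly instead of calling os.listdir(), so the result does not depend on the platform; the helper names are hypothetical and are not part of pydoc.

```python
def decode_like_listdir(raw: bytes) -> str:
    """Decode a raw filename the way Python 3 does on UNIX,
    assuming the filesystem encoding is UTF-8 (an assumption)."""
    return raw.decode("utf-8", "surrogateescape")


def can_encode_utf8(name: str) -> bool:
    """Return True if the name survives a strict UTF-8 encode,
    which is the step that writing the HTML page requires."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


raw = b"caf\xe9.py"               # Latin-1 bytes, not valid UTF-8
name = decode_like_listdir(raw)   # the byte 0xE9 becomes the lone surrogate U+DCE9
print(ascii(name), can_encode_utf8(name))

# The original bytes are still recoverable with the same error handler.
assert name.encode("utf-8", "surrogateescape") == raw
```

The escaping is lossless, so the problem is only in the strict encode performed when the page is generated.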
In practice, you cannot import such a .py file; you can only run it using "python script.py", so maybe we can just ignore modules with undecodable filenames. For example:
def isUndecodableFilename(filename):
    # True if the name contains any surrogate code point (U+D800..U+DFFF)
    return any((0xD800 <= ord(ch) <= 0xDFFF) for ch in filename)
Or we can escape the surrogate characters, but I don't know how. Writing "\uDC80" in an HTML document is not a good idea, especially in a URL (e.g. Firefox replaces \ with / in URLs).
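One possible way to escape such names, offered only as a sketch and not as what pydoc does: recover the original bytes and percent-encode them for URLs, and substitute U+FFFD for the escaped bytes in visible text. It assumes a UTF-8 filesystem encoding; both function names are hypothetical, and whether a browser or server maps the resulting URL back to the file has not been checked here.

```python
import html
import urllib.parse


def url_escape_filename(name: str) -> str:
    """Percent-encode a filename for use in a URL.
    Surrogate-escaped characters are turned back into their raw bytes first
    (assuming the name was decoded as UTF-8 with surrogateescape)."""
    raw = name.encode("utf-8", "surrogateescape")
    return urllib.parse.quote(raw)


def display_filename(name: str) -> str:
    """Return HTML-safe text for the filename, showing each escaped byte
    (U+DC80..U+DCFF) as the replacement character U+FFFD."""
    shown = "".join(
        "\ufffd" if 0xDC80 <= ord(ch) <= 0xDCFF else ch for ch in name
    )
    return html.escape(shown)


name = "caf\udce9.py"
link = '<a href="%s">%s</a>' % (url_escape_filename(name), display_filename(name))
link.encode("utf-8")  # no UnicodeEncodeError: the lone surrogate is gone
```

This avoids backslashes in URLs entirely, at the cost of a display name that no longer identifies the file exactly.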
Date | User | Action | Args
2011-02-11 12:55:09 | vstinner | set | recipients: + vstinner, docs@python
2011-02-11 12:55:09 | vstinner | set | messageid: <1297428909.69.0.247564466299.issue11186@psf.upfronthosting.co.za>
2011-02-11 12:55:09 | vstinner | link | issue11186 messages
2011-02-11 12:55:08 | vstinner | create |