Message115989
Hello. I noticed that the test suite reports a WARNING on every run.
///////////////////////////////////////////////////
E:\python-dev>py3k -m test.regrtest test_os
WARNING: The filename '@test_464_tmp-共有される' CAN be encoded by the filesystem encoding (mbcs). Unicode filename tests may not be effective
(snip)
///////////////////////////////////////////////////
This happens because TESTFN_UNENCODABLE in Lib/test/support.py
*is* encodable in a Japanese environment (cp932).
It is easy to make this name really unencodable in Japanese by adding
characters such as "\u2661" or "\u2668" (the former is a heart mark, the
latter is the "Onsen" hot spring mark). I could remove the warning with
this change:
TESTFN_UNENCODABLE = TESTFN + "-\u5171\u6709\u3055\u308c\u308b\u2661\u2668"
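
To check a candidate name before committing it, something like the following sketch could be used. The helper names `unencodable_chars` and `is_unencodable` are hypothetical (they are not part of test.support), and "cp932" is used here on the assumption that it stands in for the mbcs filesystem encoding of a Japanese Windows installation.

```python
def unencodable_chars(name, encoding):
    """Return the characters of `name` that `encoding` cannot encode."""
    bad = []
    for ch in name:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


def is_unencodable(name, encoding):
    """True if at least one character of `name` cannot be encoded."""
    return bool(unencodable_chars(name, encoding))


# The current suffix versus the proposed one (hypothetical variable names).
current = "-\u5171\u6709\u3055\u308c\u308b"
proposed = current + "\u2661\u2668"

for label, candidate in (("current", current), ("proposed", proposed)):
    bad = unencodable_chars(candidate, "cp932")
    print(label, "unencodable:", [ascii(ch) for ch in bad])
```

A name is only useful for the "unencodable" tests when the list printed for it is non-empty, so the same check could be repeated for other encodings of interest.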
///////////////////////////////////////////////////
There is another issue, which happens only in test_unicode_file:
///////////////////////////////////////////////////
E:\python-dev>py3k -m test.test_unicode_file
Traceback (most recent call last):
  File "e:\python-dev\py3k\lib\test\test_unicode_file.py", line 12, in <module>
    TESTFN_UNICODE.encode(TESTFN_ENCODING)
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\python-dev\py3k\lib\runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "e:\python-dev\py3k\lib\runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "e:\python-dev\py3k\lib\test\test_unicode_file.py", line 16, in <module>
    raise unittest.SkipTest("No Unicode filesystem semantics on this platform.")
unittest.case.SkipTest: No Unicode filesystem semantics on this platform.
///////////////////////////////////////////////////
This happens because TESTFN_UNICODE cannot be encoded in a Japanese environment (cp932):
E:\python-dev>py3k
Python 3.2a2+ (py3k:84663M, Sep 10 2010, 13:24:41) [MSC v.1400 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("-\xe0\xf2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'cp932' codec can't encode character '\xe0' in position 1: illegal multibyte sequence
Interestingly, though, the byte sequence "\xe0\xf2" can be read as a
cp932 multibyte character:
E:\python-dev>python
Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print "\xe0\xf2"
瑣
>>> "\xe0\xf2".decode("cp932")
u'\u7463'
E:\python-dev>py3k
Python 3.2a2+ (py3k:84663M, Sep 10 2010, 13:24:41) [MSC v.1400 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u7463')
瑣
I believe the value "\xe0\xf2" was carried over from Python 2.x, where it
was a byte string. Maybe "\u7463" should be used here instead? I'm not sure
whether it can be encoded everywhere with other encodings, though.
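
The difference between the two readings of "\xe0\xf2" can be reproduced on any platform with a short sketch like the one below. The helper `can_encode` and the variable names are hypothetical, and the cp932 codec is named explicitly on the assumption that it matches what mbcs resolves to on a Japanese Windows system.

```python
def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# In Python 3 the literal is a str holding two code points, U+00E0 and U+00F2.
py3_literal = "\xe0\xf2"

# In Python 2 the same literal was a byte string; as cp932 bytes it forms
# a single double-byte character.
py2_bytes = b"\xe0\xf2"
as_cp932 = py2_bytes.decode("cp932")

print([hex(ord(ch)) for ch in py3_literal])   # ['0xe0', '0xf2']
print(ascii(as_cp932))                        # '\u7463'
print(can_encode(py3_literal, "cp932"))       # False
print(can_encode(as_cp932, "cp932"))          # True
```

This only shows that "\u7463" works for cp932. Whether it is encodable under the filesystem encodings of other locales would still need to be checked separately.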
Date | User | Action | Args
2010-09-10 09:39:53 | ocean-city | set | recipients: + ocean-city
2010-09-10 09:39:53 | ocean-city | set | messageid: <1284111593.27.0.664276526228.issue9819@psf.upfronthosting.co.za>
2010-09-10 09:39:51 | ocean-city | link | issue9819 messages
2010-09-10 09:39:50 | ocean-city | create |