
Author ocean-city
Recipients ocean-city
Date 2010-09-10.09:39:49
Message-id <1284111593.27.0.664276526228.issue9819@psf.upfronthosting.co.za>
Content
Hello. I noticed that the test suite reports a WARNING every time.

///////////////////////////////////////////////////

E:\python-dev>py3k -m test.regrtest test_os
WARNING: The filename '@test_464_tmp-共有される' CAN be encoded by the filesystem encoding (mbcs). Unicode filename tests may not be effective
(snip)

///////////////////////////////////////////////////

This happens because TESTFN_UNICODE_UNDECODABLE in Lib/test/support.py
*is* decodable in a Japanese environment (cp932).

It is easy to make this truly undecodable in a Japanese environment by
using characters such as "\u2661" or "\u2668" (the former is a heart mark,
the latter is the "Onsen" hot spring mark). With the following change,
the warning went away:
    TESTFN_UNENCODABLE = TESTFN + "-\u5171\u6709\u3055\u308c\u308b\u2661\u2668"
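
The effect of the extra characters can be checked outside the test suite. The following is a minimal sketch, assuming the cp932 codec stands in for the "mbcs" filesystem encoding of a Japanese Windows machine. The helper name can_encode and the "@test_tmp" prefix are made up for illustration and are not part of Lib/test/support.py.

```python
def can_encode(name, encoding="cp932"):
    """Return True if `name` can be encoded with `encoding` (hypothetical helper)."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


TESTFN = "@test_tmp"  # placeholder prefix, not the real per-process name

# Current value: kanji and hiragana only, all of which cp932 can represent.
current = TESTFN + "-\u5171\u6709\u3055\u308c\u308b"

# Proposed value: the same text plus the heart and hot spring marks.
proposed = current + "\u2661\u2668"

print(can_encode(current))
print(can_encode(proposed))
```

Naming the codec explicitly lets the comparison run on any platform, at the cost of not exercising the real filesystem encoding.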

///////////////////////////////////////////////////

There is another issue, which happens only with test_unicode_file:

///////////////////////////////////////////////////

E:\python-dev>py3k -m test.test_unicode_file
Traceback (most recent call last):
  File "e:\python-dev\py3k\lib\test\test_unicode_file.py", line 12, in <module>
    TESTFN_UNICODE.encode(TESTFN_ENCODING)
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\python-dev\py3k\lib\runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "e:\python-dev\py3k\lib\runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "e:\python-dev\py3k\lib\test\test_unicode_file.py", line 16, in <module>
    raise unittest.SkipTest("No Unicode filesystem semantics on this platform.")

unittest.case.SkipTest: No Unicode filesystem semantics on this platform.

///////////////////////////////////////////////////

This happens because TESTFN_UNICODE cannot be encoded in a Japanese environment.

E:\python-dev>py3k
Python 3.2a2+ (py3k:84663M, Sep 10 2010, 13:24:41) [MSC v.1400 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("-\xe0\xf2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'cp932' codec can't encode character '\xe0' in position 1: illegal multibyte sequence

Interestingly, the byte sequence "\xe0\xf2" can be read as a
cp932 multibyte character.

E:\python-dev>python
Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print "\xe0\xf2"
瑣
>>> "\xe0\xf2".decode("cp932")
u'\u7463'

E:\python-dev>py3k
Python 3.2a2+ (py3k:84663M, Sep 10 2010, 13:24:41) [MSC v.1400 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u7463')
瑣

I believe the value "\xe0\xf2" came from Python 2.x, so maybe "\u7463"
should be used here? I'm not sure it can be decoded everywhere using
other encodings, though.
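
The mismatch between the two interpreter sessions can be reproduced on Python 3 without a Japanese locale by naming the codec explicitly. This is only a sketch of the observation above, and the variable names are illustrative.

```python
# In Python 3, "\xe0\xf2" is a str of two code points (U+00E0, U+00F2),
# whereas in Python 2 the same literal was a two-byte string.
text = "\xe0\xf2"
raw = b"\xe0\xf2"

# The two code points have no representation in cp932.
try:
    text.encode("cp932")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# The two bytes, read as cp932, form a single kanji.
decoded = raw.decode("cp932")

print(encodable)
print(ascii(decoded))
print(decoded.encode("cp932") == raw)
```

This only shows that "\u7463" round-trips through cp932. It says nothing about whether other filesystem encodings can represent it.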