Message86994
The built-in compile() expects its source to be encoded in UTF-8.
This behavior makes it harder to implement alternative shells
such as IDLE and IPython. (http://bugs.python.org/issue1542677 and
https://bugs.launchpad.net/ipython/+bug/339642 are related bugs.)
The current behavior of compile() is shown below.
# Python's interactive shell in Windows cp932 console.
>>> "あ"
'\x82\xa0'
>>> u"あ"
u'\u3042'
# compile() fails to decode str.
>>> code = compile('u"あ"', '__interactive__', 'single')
>>> exec code
u'\x82\xa0' # u'\u3042' expected.
# compile() encodes unicode to utf-8.
>>> code = compile(u'"あ"', '__interactive__', 'single')
>>> exec code
'\xe3\x81\x82' # '\x82\xa0' (cp932) wanted, but I get utf-8.
Currently, a PEP 263 coding declaration like the one below is needed to get
code compiled in the expected encoding.
>>> code = compile('# coding: cp932\n%s' % ('"あ"',), '__interactive__', 'single')
>>> exec code
'\x82\xa0'
>>> code = compile('# coding: cp932\n%s' % ('u"あ"',), '__interactive__', 'single')
>>> exec code
u'\u3042'
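The workaround above can be wrapped in a small helper. This is only a minimal sketch: compile_with_cookie is a hypothetical name, not an existing API, and the code as written assumes Python 3 (where compile() accepts bytes and honors a PEP 263 declaration in them). It uses 'exec' mode and a namespace instead of 'single' so the result can be inspected rather than printed.

```python
def compile_with_cookie(source_bytes, filename, mode, encoding):
    """Compile byte-string source as if it began with a PEP 263 declaration.

    Hypothetical helper for illustration; not part of any library.
    Assumes source_bytes does not already carry its own declaration.
    """
    # PEP 263 requires the declaration on the first or second line,
    # so it is prepended as a new first line.
    cookie = b'# coding: ' + encoding.encode('ascii') + b'\n'
    return compile(cookie + source_bytes, filename, mode)


# Bytes as a cp932 console would deliver them for: text = u"あ"
source = u'text = u"\u3042"'.encode('cp932')

namespace = {}
exec(compile_with_cookie(source, '__interactive__', 'exec', 'cp932'), namespace)

# The literal was decoded as cp932, so it round-trips to U+3042.
print(namespace['text'] == u'\u3042')
```

One side effect of this approach is that the prepended line shifts the line numbers reported for the compiled code by one.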
But I feel that using compile() with a PEP 263 declaration is a bit of a dirty hack.
I think adding an 'encoding' argument to compile(), with 'utf-8' as its default value,
is a cleaner way, and it doesn't break backward compatibility.
The following example describes the behavior of compile() with the encoding option.
# coding: utf-8 (in utf-8 context)
code = compile('"あ"', '__foo.py', 'single')
exec code #=> '\xe3\x81\x82'
code = compile('"あ"', '__foo.py', 'single', encoding='cp932') #=> UnicodeDecodeError
code = compile(u'"あ"', '__foo.py', 'single')
exec code #=> '\xe3\x81\x82'
code = compile(u'"あ"', '__foo.py', 'single', encoding='cp932')
exec code #=> '\x82\xa0'
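Until compile() grows such an argument, part of the proposed behavior can be approximated with a wrapper. The sketch below is an assumption-laden illustration, not the proposed implementation: compile_encoded is a hypothetical name, and it targets Python 3, where string literals are always text, so only the "decode byte source with the given encoding" half of the proposal has an analogue. The second case of the proposal (re-encoding str literals of a unicode source) is not reproduced here.

```python
def compile_encoded(source, filename, mode, encoding='utf-8'):
    """Compile source, decoding byte strings with the given encoding first.

    Hypothetical wrapper for illustration; the real compile() has no
    'encoding' argument.
    """
    if isinstance(source, bytes):
        # Raises UnicodeDecodeError if the bytes are not valid in 'encoding'.
        source = source.decode(encoding)
    return compile(source, filename, mode)


# cp932 bytes compile correctly when the encoding is stated explicitly.
cp932_source = u'text = "\u3042"'.encode('cp932')
namespace = {}
exec(compile_encoded(cp932_source, '__foo.py', 'exec', encoding='cp932'), namespace)
print(namespace['text'] == u'\u3042')

# Bytes that are invalid in the default UTF-8 encoding are rejected up front.
try:
    compile_encoded(b'text = "\xff"', '__foo.py', 'exec')
except UnicodeDecodeError:
    print('UnicodeDecodeError')
```

Decoding before calling compile() keeps the original line numbers intact, which the coding-declaration workaround does not.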
Date | User | Action | Args
2009-05-03 01:55:25 | methane | set | recipients: + methane
2009-05-03 01:55:25 | methane | set | messageid: <1241315725.1.0.346952935822.issue5911@psf.upfronthosting.co.za>
2009-05-03 01:55:23 | methane | link | issue5911 messages
2009-05-03 01:55:22 | methane | create |