
Author methane
Recipients methane
Date 2009-05-03.01:55:22
Message-id <1241315725.1.0.346952935822.issue5911@psf.upfronthosting.co.za>
Content
The built-in compile() expects its source to be encoded in UTF-8.
This behavior makes it harder to implement alternative shells
such as IDLE and IPython. (http://bugs.python.org/issue1542677 and
https://bugs.launchpad.net/ipython/+bug/339642 are related bugs.)

The current behavior of compile() is shown below.

# Python's interactive shell in a Windows cp932 console.
>>> "あ"
'\x82\xa0'
>>> u"あ"
u'\u3042'

# compile() fails to decode str.
>>> code = compile('u"あ"', '__interactive__', 'single')
>>> exec code
u'\x82\xa0'  # u'\u3042' expected.

# compile() encodes unicode to utf-8.
>>> code = compile(u'"あ"', '__interactive__', 'single')
>>> exec code
'\xe3\x81\x82' # '\x82\xa0' (cp932) wanted, but I get utf-8.
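
The byte values in the session above can be checked directly. The following is only a small sketch; it is written so that it also runs on Python 3, which is an assumption, since the session above is Python 2. The variable names are illustrative.

```python
# U+3042 is the hiragana letter used in the session above.
text = u'\u3042'

# The same character has different byte sequences in each encoding.
cp932_bytes = text.encode('cp932')   # what a cp932 console delivers
utf8_bytes = text.encode('utf-8')    # what compile() emits for unicode input

assert cp932_bytes == b'\x82\xa0'
assert utf8_bytes == b'\xe3\x81\x82'

# Mapping each cp932 byte to its own code point (a latin-1 decode)
# mimics the wrong unicode result of the first compile() call.
mojibake = cp932_bytes.decode('latin-1')
assert mojibake == u'\x82\xa0'
assert mojibake != text
```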

Currently, a PEP 263 coding declaration has to be prepended to the source,
as shown below, to get code compiled in the expected encoding.

>>> code = compile('# coding: cp932\n%s' % ('"あ"',), '__interactive__', 'single')
>>> exec code
'\x82\xa0'
>>> code = compile('# coding: cp932\n%s' % ('u"あ"',), '__interactive__', 'single')
>>> exec code
u'\u3042'

But I feel that using compile() with PEP 263 is a bit of a dirty hack.
I think adding an 'encoding' argument to compile(), with 'utf-8' as its
default value, is a cleaner way, and it doesn't break backward compatibility.

The following example describes the proposed behavior of compile() with the encoding option.

# coding: utf-8 (in utf-8 context)
code = compile('"あ"', '__foo.py', 'single')
exec code #=> '\xe3\x81\x82'

code = compile('"あ"', '__foo.py', 'single', encoding='cp932') #=> UnicodeDecodeError

code = compile(u'"あ"', '__foo.py', 'single')
exec code #=> '\xe3\x81\x82'

code = compile(u'"あ"', '__foo.py', 'single', encoding='cp932')
exec code #=> '\x82\xa0'
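
Until such an argument exists, the proposed behavior can be approximated with a wrapper around the PEP 263 workaround. This is only a minimal sketch: compile_with_encoding is a hypothetical name, not an existing function, and the code assumes an interpreter where compile() accepts a byte string carrying a coding declaration (Python 3 bytes, or Python 2 str).

```python
def compile_with_encoding(source, filename, mode, encoding='utf-8'):
    """Hypothetical helper emulating the proposed `encoding` argument."""
    if not isinstance(source, bytes):
        # Text input: encode it so the declaration below describes it.
        source = source.encode(encoding)
    # PEP 263 declaration line; it shifts reported line numbers by one.
    cookie = ('# coding: %s\n' % encoding).encode('ascii')
    return compile(cookie + source, filename, mode)


# Source bytes as a cp932 console would deliver them.
raw = u'x = "\u3042"'.encode('cp932')
code = compile_with_encoding(raw, '__interactive__', 'exec', encoding='cp932')

namespace = {}
exec(code, namespace)
```

On Python 3, namespace['x'] ends up as the one-character text string; on Python 2 a plain "..." literal would instead keep the cp932 bytes, as in the session above. The extra declaration line is the main drawback of this approach, because line numbers in tracebacks are off by one.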