Author sgala
Recipients ajaksu2, inada.naoki, kbk, sgala
Date 2009-04-12.09:02:26
Message-id <1239526952.71.0.692053615077.issue1542677@psf.upfronthosting.co.za>
In-reply-to
Content
Updating the components, as the error surfaces in the compile builtin.
The compile builtin works when given unicode, but fails when given a
UTF-8 (local input encoding) string.

Rather than adding a "coding" string to compile, my guess is that
compile should be fixed or fed a unicode string. See the effects on the
shell:

>>> print len('à')
2
>>> print len(u'à')
1
>>> exec compile("print len('à')",'test', 'single')
2
>>> exec compile("print len(u'à')",'test', 'single')
2
>>> exec compile("print len('à')".decode("utf8"),'test', 'single')
2
>>> exec compile("print len(u'à')".decode("utf8"),'test', 'single')
1
>>> 

So the error disappears when the string fed to exec compile is properly
decoded to unicode.
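
The length difference in the session above can be reproduced without
compile at all. The following is a minimal sketch, not part of the
original report: it uses an escaped literal so that it does not depend on
the terminal or source encoding, and all variable names are illustrative.
The Latin-1 step is an assumption about how undecoded bytes end up being
counted one character per byte.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; the names below are hypothetical, not from idlelib.
a_grave = u'\xe0'                      # the character 'à' as one code point
utf8_bytes = a_grave.encode('utf-8')   # what a UTF-8 terminal hands over

print(len(a_grave))      # length in code points
print(len(utf8_bytes))   # length in bytes

# Decoding with the right codec restores a single character.
decoded = utf8_bytes.decode('utf-8')
print(len(decoded))

# Assumption: treating each byte as its own character (as Latin-1 does)
# keeps one character per byte, so the count stays at the byte length.
misread = utf8_bytes.decode('latin-1')
print(len(misread))
```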

In idlelib there is an attempt to encode the input to
IOBinding.encoding, but IOBinding.encoding is broken here, as
locale.nl_langinfo(locale.CODESET) gives 'ANSI_X3.4-1968', which looks
up as 'ascii', while locale.getpreferredencoding() gives 'UTF-8' (as it
should).
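
The codeset lookup described above can be checked directly. This is a
small sketch, assuming only the standard codecs and locale modules; the
variable names are illustrative, and the two locale calls print whatever
the local environment reports, which will differ from machine to machine.

```python
import codecs
import locale

# The codeset name reported in this message; the codec registry
# resolves it to its canonical codec name.
reported = 'ANSI_X3.4-1968'
codec_name = codecs.lookup(reported).name
print(codec_name)

# What the running interpreter reports (varies by platform and environment).
print(locale.getpreferredencoding())
if hasattr(locale, 'nl_langinfo'):   # not available on every platform
    print(locale.nl_langinfo(locale.CODESET))
```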


If I comment out the whole attempt, IDLE works (for this test; not fully
tested):

sgala@marlow ~ $ diff -u /tmp/PyShell.py /usr/lib64/python2.6/idlelib/PyShell.py
--- /tmp/PyShell.py	2009-04-12 11:01:01.000000000 +0200
+++ /usr/lib64/python2.6/idlelib/PyShell.py	2009-04-12 10:59:16.000000000 +0200
@@ -592,14 +592,14 @@
         self.more = 0
         self.save_warnings_filters = warnings.filters[:]
         warnings.filterwarnings(action="error", category=SyntaxWarning)
-        if isinstance(source, types.UnicodeType):
-            import IOBinding
-            try:
-                source = source.encode(IOBinding.encoding)
-            except UnicodeError:
-                self.tkconsole.resetoutput()
-                self.write("Unsupported characters in input\n")
-                return
+        #if isinstance(source, types.UnicodeType):
+        #    import IOBinding
+        #    try:
+        #        source = source.encode(IOBinding.encoding)
+        #    except UnicodeError:
+        #        self.tkconsole.resetoutput()
+        #        self.write("Unsupported characters in input\n")
+        #        return
         try:
             # InteractiveInterpreter.runsource() calls its runcode() method,
             # which is overridden (see below)
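
For reference, the effect of the encode step that the diff comments out
can be sketched in isolation. encode_source below is a hypothetical
stand-in written for this illustration, not a function in idlelib, and
the two encodings are simply the candidates mentioned in this message; the
sketch does not claim which one IDLE actually selected.

```python
def encode_source(source, encoding):
    """Hypothetical stand-in for the removed block: encode or give up."""
    try:
        return source.encode(encoding)
    except UnicodeError:
        # The original code wrote "Unsupported characters in input"
        # and returned; here None signals the same outcome.
        return None

line = u"print len(u'\xe0')"   # the test input, with 'à' escaped

ascii_result = encode_source(line, 'ascii')   # cannot represent 'à'
utf8_result = encode_source(line, 'utf-8')    # byte string, 'à' as two bytes

print(ascii_result is None)
print(len(line), len(utf8_result))
```

Either way the interpreter no longer receives the unicode object it was
given, which is the situation the compile examples earlier in this
message show going wrong.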


>>> print len('á')
2
>>> print len(u'á')
1
>>> print 'á'
á
>>> print u'á'
á
>>> 


Now using Python 2.6.1 (r261:67515, Apr 10 2009, 14:34:00) on x86_64