This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author serhiy.storchaka
Recipients bsherwood, ezio.melotti, gpolo, serhiy.storchaka, terry.reedy, vstinner
Date 2013-11-10.18:02:18
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1384106539.73.0.990805074425.issue19020@psf.upfronthosting.co.za>
In-reply-to
Content
> I am also not clear on the relation between the UnicodeDecodeError and tuple splitting. Does '_flatten((self._w, cmd)))' call split or splitlist on the tuple arg? Is so, do you know why a problem with that would lead to the UDError? Does your patch fix the leading '0' regression?

The traceback is misleading. Full statement is:

            for x in self.tk.split(
                    self.tk.call(_flatten((self._w, cmd)))):

Where cmd is ('entryconfigure', index). The UnicodeDecodeError error was raised neither by _flatten() nor call(), but by split().

When run `./python -m idlelib.idle \\0.py` call() returns and split() gets a tuple of tuples: (('-activebackground', '', '', '', ''), ('-activeforeground', '', '', '', ''), ('-accelerator', '', '', '', ''), ('-background', '', '', '', ''), ('-bitmap', '', '', '', ''), ('-columnbreak', '', '', 0, 0), ('-command', '', '', '', '3067328620open_recent_file'), ('-compound', 'compound', 'Compound', <index object: 'none'>, 'none'), ('-font', '', '', '', ''), ('-foreground', '', '', '', ''), ('-hidemargin', '', '', 0, 0), ('-image', '', '', '', ''), ('-label', '', '', '', '1 /home/serhiy/py/cpython/\\0.py'), ('-state', '', '', <index object: 'normal'>, 'normal'), ('-underline', '', '', -1, 0)). When set wantobjects in Lib/tkinter/__init__.py to 0, it will get a string r"{-activebackground {} {} {} {}} {-activeforeground {} {} {} {}} {-accelerator {} {} {} {}} {-background {} {} {} {}} {-bitmap {} {} {} {}} {-columnbreak {} {} 0 0} {-command {} {} {} 3067013228open_recent_file} {-compound compound Compound none none} {-font {} {} {} {}} {-foreground {} {} {} {}} {-hidemargin {} {} 0 0} {-image {} {} {} {}} {-label {} {} {} {1 /home/serhiy/py/cpython/\0.py}} {-state {} {} normal normal} {-underline {} {} -1 0}".  Then split() try recursively split its argument. When it splits '1 /home/serhiy/py/cpython/\\0.py' it interprets '\\0' as backslash substitution of octal code 0 which means a character with code 0. Tcl uses modified UTF-8 encoding in which null code is encoded as b'\xC0\x80'. This bytes sequence is invalid UTF-8. That is why UnicodeDecodeError was raised (patch for issue13153 handles b'\xC0\x80' more correctly). When you will try '\101.py', it will be translated by split() to 'A.py'.
History
Date User Action Args
2013-11-10 18:02:19serhiy.storchakasetrecipients: + serhiy.storchaka, terry.reedy, bsherwood, vstinner, gpolo, ezio.melotti
2013-11-10 18:02:19serhiy.storchakasetmessageid: <1384106539.73.0.990805074425.issue19020@psf.upfronthosting.co.za>
2013-11-10 18:02:19serhiy.storchakalinkissue19020 messages
2013-11-10 18:02:19serhiy.storchakacreate