Title: IDLE hangs while printing instance of Unicode subclass
msg201991 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-11-03 04:29
This showed up on StackOverflow:

They were using 32-bit Python 2.7.5 on Windows 7; I reproduced using the same Python on Windows Vista.  To reproduce, open IDLE, and enter

>>> class Foo(unicode): pass
>>> foo = Foo('bar')
>>> print foo

IDLE hangs then, and Ctrl+C is ignored.  Stranger, these variants do *not* hang:

>>> foo
>>> print str(foo)
>>> print repr(foo)

Those all work as expected.  Cute :-)

And none of these hang in a DOS-box session.
msg202003 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-11-03 07:27
Win 7, console 2.7.5+, 32 bit, compiled Aug 24, does not have the problem. Idle started with 'import idlelib.idle' does, but only for 'print foo', as Tim reported. When I close the hung process with [X], there is no error message in the console. Installed 64bit 2.7.5 fails with 'print foo' also. I actually used F and f instead of Foo and foo, so it is not name specific. A subclass of str works fine.

Current 3.4a4 Idle works fine. The SO OP also reported that there is no problem if the class is imported from another file.

We need a test on something other than Windows, preferably both mac and linux.
msg202004 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-11-03 07:45
It's reproducible on OS X as well with a 32-bit Python 2.7.5 and a 64-bit Python 2.7.6rc1.  However, the example works OK if I start IDLE with no subprocess (-n).
msg202005 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-03 08:19
This patch fixes symptoms.
msg202070 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-11-04 00:21
msg202072 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-11-04 00:28
Do we have a theory for _why_ IDLE goes nuts?  I'd like to know whether the patch is fixing the real problem, or just happens to work in this particular test case ;-)
msg202093 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-11-04 06:55
I am curious too, so I traced through the call chain.

1343: PseudoOutputFile.write(s) calls: self.shell.write(s, self.tags)
914: shell is an instance of PyShell and self.tags is 'stdout', 'stderr', or 'console'.
1291: PyShell.write(s,tags) calls:
 OutputWindow.write(self, s, tags, "iomark")
 (where 'iomark' must have been defined elsewhere, and the 'gravity' calls should not matter)

46: OutputWindow(EditorWindow).write(s,tags,mark='insert') calls: self.text.insert(mark, s, tags)
after trying to encode s if isinstance(s, str). It follows with:
but if the insert succeeds, these should not care about the source of the inserted chars.

187: self.text = MultiCallCreator(Text)(text_frame, **text_options)
304: MultiCallCreator wraps a tk widget in a MultiCall instance that adds event methods but otherwise passes calls to the tk widget.

So PseudoOutputFile(s) becomes tk.Text().insert('iomark', s, 'stdout').
which becomes (lib-tk/, 3050), 'insert', 'iomark', s) + args)

Tk handles either Latin-1 bytes or BMP unicode. It seems fine with a unicode subclass:
>>> import Tkinter as tk
>>> t = tk.Text()
>>> class F(unicode): pass

>>> f = F('foo')
>>> t.insert('1.0', u'abc', 'stdout') # 'iomark' is not defined
>>> t.insert('1.0', f, 'stdout')
>>> t.get('1.0', 'end')

I remain puzzled.
msg202096 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-04 08:40
I suppose this is related to pickling.

I was puzzled why it works with bytearray subclasses. But I have now found that print() implicitly converts str and bytearray subclasses to str and leaves unicode subclasses as is. You can reproduce this bug for str and bytearray subclasses if you use sys.stdout.write() instead of print().

Here is a patch for 2.7 which fixes the issue for str and bytearray subclasses too. 3.x needs a patch too.

>>> class U(unicode): pass

>>> class S(str): pass

>>> class BA(bytearray): pass

>>> import sys
>>> sys.stdout.write(u'\u20ac')
>>> sys.stdout.write('\xe2\x82\xac')
>>> sys.stdout.write(bytearray('\xe2\x82\xac'))
>>> sys.stdout.write(U(u'\u20ac'))
>>> sys.stdout.write(S('\xe2\x82\xac'))
>>> sys.stdout.write(BA('\xe2\x82\xac'))
msg202098 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-04 08:50
And here is a patch for 3.x. Without it the following code hangs.

>>> class S(str): pass

>>> import sys
>>> sys.stdout.write('\u20ac')
>>> sys.stdout.write(S('\u20ac'))
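
A minimal Python 3 sketch of the difference between print() and a direct write() call. The Recorder class is a hypothetical stand-in for IDLE's stdout proxy, not IDLE code; it only records the type of each argument it receives. It suggests why print() of a str subclass can work in 3.x while sys.stdout.write() of the same object reaches the proxy unconverted.

```python
class S(str):
    pass


class Recorder:
    """Hypothetical file-like object that records the type of each write."""

    def __init__(self):
        self.seen = []

    def write(self, s):
        self.seen.append(type(s))
        return len(s)


rec = Recorder()

# print() calls str() on its argument before handing it to write(),
# so the proxy receives an exact str here.
print(S('\u20ac'), file=rec)

# A direct write() passes the subclass instance through untouched.
rec.write(S('\u20ac'))

print(rec.seen[0].__name__, rec.seen[-1].__name__)
```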
msg202102 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-11-04 09:53
Pickling for the RPC protocol between the GUI process and the interpreter subprocess, which would explain why there is no problem when running idle -n (no subprocess)?
msg205732 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-09 19:29
> Pickling for the RPC protocol between the GUI process and the interpreter subprocess, which would explain why there is no problem when running idle -n (no subprocess)?

Yes, it is.

If there are no objections I'll commit these patches.
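
A rough Python 3 sketch of one plausible mechanism behind the pickling problem; the thread does not spell out the details, so treat this as an assumption. The module name user_code_demo is hypothetical and stands in for the subprocess's __main__. Pickle stores a subclass instance together with a by-name reference to its class, so a receiver that cannot import that class fails to unpickle it, while a plain str carries no such reference.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module where the user defined the class.
user_mod = types.ModuleType("user_code_demo")
exec("class S(str):\n    pass\n", user_mod.__dict__)
sys.modules["user_code_demo"] = user_mod

s = user_mod.S("foo")
subclass_payload = pickle.dumps(s)             # refers to user_code_demo.S by name
plain_payload = pickle.dumps(str.__str__(s))   # exact str, no class reference

# Simulate a receiving process that does not know the module.
del sys.modules["user_code_demo"]

try:
    pickle.loads(subclass_payload)
    receiver_error = None
except ImportError as exc:
    receiver_error = exc

plain_result = pickle.loads(plain_payload)
print(type(receiver_error).__name__, repr(plain_result))
```

This would also be consistent with the earlier observation that the hang does not occur when the class is imported from another file, since both processes could then import it.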
msg205740 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-12-09 21:10
> [2.7] print() implicitly converts str and bytearray subclasses to str and leaves unicode subclasses as is.

This strikes me as possibly a bug in print, but even if that were changed, there is still the issue of sys.stdout.write and pickle. While the patch is a great improvement, it changes the behavior of sys.stdout.write(s), which acts like it calls str.__str__(s) rather than str(s) == s.__str__

class S(str):
    def __str__(self):
        return 'S: ' + str.__str__(self)

s = S('foo')
print(s, str(s), str.__str__(s))

import sys
sys.stdout.write(s)

prints

S: foo S: foo foo
foo

on the console (hang after first line on Idle)

I am testing the patch with str(s) changed to str.__str__(s).
msg205741 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-12-09 21:29
Confirmed that the revised patch for 3.3 fixes the hang and matches the console interpreter output.
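
A minimal sketch of the normalization discussed above, assuming Python 3. The helper name normalize_for_write is hypothetical and this is not the committed patch. It shows that str.__str__(s) yields an exact str with the raw characters, whereas str(s) would go through an overridden __str__ and change what write() outputs.

```python
class S(str):
    def __str__(self):
        return 'S: ' + str.__str__(self)


def normalize_for_write(s):
    """Return an exact str with the same characters as s (hypothetical helper)."""
    if type(s) is not str:
        if not isinstance(s, str):
            raise TypeError('must be str, not ' + type(s).__name__)
        # Bypass any __str__ override so the output matches the console.
        s = str.__str__(s)
    return s


s = S('foo')
via_dunder = normalize_for_write(s)   # raw characters, exact str
via_str = str(s)                      # goes through S.__str__

print(repr(via_dunder), repr(via_str))
```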
msg205775 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-10 07:59
Good suggestion Terry. And for unicode in 2.7 we can use unicode.__getslice__(s, None, None) (because there is no unicode.__unicode__).
msg205776 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-12-10 08:07
New changeset df9596ca838c by Serhiy Storchaka in branch '2.7':
Issue #19481: print() of unicode, str or bytearray subclass instance in IDLE no more hangs.

New changeset d462b2bf875b by Serhiy Storchaka in branch '3.3':
Issue #19481: print() of string subclass instance in IDLE no more hangs.

New changeset 1d68ea8148ce by Serhiy Storchaka in branch 'default':
Issue #19481: print() of string subclass instance in IDLE no more hangs.
msg237180 - (view) Author: Martijn Pieters (mjpieters) * Date: 2015-03-04 14:54
This change causes printing BeautifulSoup NavigableString objects to fail; the code actually could never work as `unicode.__getslice__` insists on getting passed in integers, not None.

To reproduce, create a new file in IDLE and paste in:

from bs4 import BeautifulSoup
html_doc = """<title>The Dormouse's story</title>""" 
soup = BeautifulSoup(html_doc)
print soup.title.string

Then pick *Run Module* to see:

Traceback (most recent call last):
  File "/private/tmp/", line 4, in <module>
    print soup.title.string
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/idlelib/", line 1353, in write
    s = unicode.__getslice__(s, None, None)
TypeError: an integer is required

The same error can be induced with:

    unicode.__getslice__(u'', None, None)

while specifying a start and end index (0 and len(s)) should fix this.
msg237182 - (view) Author: Martijn Pieters (mjpieters) * Date: 2015-03-04 15:00
Created a new issue:
