IDLE hangs while printing instance of Unicode subclass #63680

Closed
tim-one opened this issue Nov 3, 2013 · 17 comments

@tim-one
Member

tim-one commented Nov 3, 2013

BPO 19481
Nosy @tim-one, @terryjreedy, @kbkaiser, @mjpieters, @ned-deily, @ezio-melotti, @serwy, @serhiy-storchaka
Files
  • idle_print_unicode_subclass.patch
  • idle_write_string_subclass-2.7.patch
  • idle_write_string_subclass-3.x.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2013-12-10.08:34:56.373>
    created_at = <Date 2013-11-03.04:29:15.991>
    labels = ['expert-IDLE', 'type-bug']
    title = 'IDLE hangs while printing instance of Unicode subclass'
    updated_at = <Date 2015-03-04.15:00:12.514>
    user = 'https://github.com/tim-one'

    bugs.python.org fields:

    activity = <Date 2015-03-04.15:00:12.514>
    actor = 'mjpieters'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2013-12-10.08:34:56.373>
    closer = 'serhiy.storchaka'
    components = ['IDLE']
    creation = <Date 2013-11-03.04:29:15.991>
    creator = 'tim.peters'
    dependencies = []
    files = ['32472', '32490', '32491']
    hgrepos = []
    issue_num = 19481
    keywords = ['patch']
    message_count = 17.0
    messages = ['201991', '202003', '202004', '202005', '202070', '202072', '202093', '202096', '202098', '202102', '205732', '205740', '205741', '205775', '205776', '237180', '237182']
    nosy_count = 9.0
    nosy_names = ['tim.peters', 'terry.reedy', 'kbk', 'mjpieters', 'ned.deily', 'ezio.melotti', 'roger.serwy', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue19481'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @tim-one
    Member Author

    tim-one commented Nov 3, 2013

    This showed up on StackOverflow:

    http://stackoverflow.com/questions/19749757/print-is-blocking-forever-when-printing-unicode-subclass-instance-from-idle

    They were using 32-bit Python 2.7.5 on Windows 7; I reproduced using the same Python on Windows Vista. To reproduce, open IDLE, and enter

    >>> class Foo(unicode):
            pass
    >>> foo = Foo('bar')
    >>> print foo

    IDLE hangs then, and Ctrl+C is ignored. Stranger, these variants do *not* hang:

    >>> foo
    >>> print str(foo)
    >>> print repr(foo)

    Those all work as expected. Cute :-)

    And none of these hang in a DOS-box session.

    @serhiy-storchaka added the type-bug label Nov 3, 2013
    @terryjreedy
    Member

    Win 7, console 2.7.5+, 32 bit, compiled Aug 24, does not have the problem. Idle started with 'import idlelib.idle' does, but only for 'print foo', as Tim reported. When I close the hung process with [X], there is no error message in the console. Installed 64bit 2.7.5 fails with 'print foo' also. I actually used F and f instead of Foo and foo, so it is not name specific. A subclass of str works fine.

    Current 3.4a4 IDLE works fine. The SO OP also reported that there is no problem if the class is imported from another file.

    We need a test on something other than Windows, preferably both mac and linux.

    @ned-deily
    Member

    It's reproducible on OS X as well with a 32-bit Python 2.7.5 and a 64-bit Python 2.7.6rc1. However, the example works OK if I start IDLE with no subprocess (-n).

    @serhiy-storchaka
    Member

    This patch fixes the symptoms.

    @ned-deily
    Member

    LGTM

    @tim-one
    Member Author

    tim-one commented Nov 4, 2013

    Do we have a theory for _why_ IDLE goes nuts? I'd like to know whether the patch is fixing the real problem, or just happens to work in this particular test case ;-)

    @terryjreedy
    Member

    I am curious too, so I traced through the call chain.

    In PyShell.py
    1343: PseudoOutputFile.write(s) calls: self.shell.write(s, self.tags)
    914: shell is an instance of PyShell and self.tags is 'stdout', 'stderr', or 'console'.
    1291: PyShell.write(s,tags) calls:
    OutputWindow.write(self, s, tags, "iomark")
    (where 'iomark' must have been defined elsewhere, and the 'gravity' calls should not matter)

    In OutputWindow.py
    46: OutputWindow(EditorWindow).write(s,tags,mark='insert') calls: self.text.insert(mark, s, tags)
    after trying to encode s if isinstance(s, str). It follows with:
    self.text.see(mark)
    self.text.update()
    but if the insert succeeds, these should not care about the source of the inserted chars.

    In EditorWindow.py
    187: self.text = MultiCallCreator(Text)(text_frame, **text_options)
    In MultiCall.py,
    304: MultiCallCreator wraps a tk widget in a MultiCall instance that adds event methods but otherwise passes calls to the tk widget.

    So PseudoOutputFile.write(s) becomes tk.Text().insert('iomark', s, 'stdout'),
    which becomes (lib-tk/tkinter.py, 3050)
    self.tk.call((self._w, 'insert', 'iomark', s) + args)

    Tk handles either Latin-1 bytes or BMP unicode. It seems fine with a unicode subclass:
    >>> import Tkinter as tk
    >>> t = tk.Text()
    >>> class F(unicode): pass
    
    >>> f = F('foo')
    >>> t.insert('1.0', u'abc', 'stdout') # 'iomark' is not defined
    >>> t.insert('1.0', f, 'stdout')
    >>> t.get('1.0', 'end')
    u'abcfoo\n'

    I remain puzzled.

    @serhiy-storchaka
    Member

    I suppose this is related to pickling.

    I was puzzled why it works with bytearray subclasses. I have now found that print implicitly converts instances of str and bytearray subclasses to str and leaves instances of unicode subclasses as is. You can reproduce this bug for str and bytearray subclasses if you use sys.stdout.write() instead of print.

    Here is a patch for 2.7 which fixes the issue for str and bytearray subclasses too. 3.x needs a patch too.

    >>> class U(unicode): pass

    >>> class S(str): pass

    >>> class BA(bytearray): pass

    >>> import sys
    >>> sys.stdout.write(u'\u20ac')
    €
    >>> sys.stdout.write('\xe2\x82\xac')
    €
    >>> sys.stdout.write(bytearray('\xe2\x82\xac'))
    €
    >>> sys.stdout.write(U(u'\u20ac'))
    €
    >>> sys.stdout.write(S('\xe2\x82\xac'))
    €
    >>> sys.stdout.write(BA('\xe2\x82\xac'))
    €
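The difference between print and a direct write() call can be observed without IDLE. The following is a minimal Python 3 sketch, not the IDLE code: `TypeRecorder` is a hypothetical stand-in for a stdout replacement, and the behavior shown is that of CPython 3 (the Python 2 print statement treats unicode subclasses differently, as described above).

```python
class S(str):
    """A str subclass that overrides nothing."""


class TypeRecorder:
    """Hypothetical file-like object that records the type of each write() argument."""

    def __init__(self):
        self.seen = []

    def write(self, s):
        self.seen.append(type(s))
        return len(s)


out = TypeRecorder()
print(S("\u20ac"), file=out)   # print() converts its argument with str() first
out.write(S("\u20ac"))         # a direct write() receives the subclass instance
```

After this runs, the first recorded type is plain `str` and the last one is `S`, so only the direct write() call hands the subclass instance to the stream object.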

    @serhiy-storchaka
    Member

    And here is a patch for 3.x. Without it the following code hangs.

    >>> class S(str): pass

    >>> import sys
    >>> sys.stdout.write('\u20ac')
    €1
    >>> sys.stdout.write(S('\u20ac'))
    €1

    @ned-deily
    Member

    Pickling for the RPC protocol between the GUI process and the interpreter subprocess, which would explain why there is no problem when running idle -n (no subprocess)?

    @serhiy-storchaka
    Member

    > Pickling for the RPC protocol between the GUI process and the interpreter subprocess, which would explain why there is no problem when running idle -n (no subprocess)?

    Yes, it is.

    If there are no objections I'll commit these patches.
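The thread only establishes that pickling between the two processes is involved; it does not spell out the exact failure inside IDLE. As a rough Python 3 illustration of why a subclass instance is harder to send than a plain string, the sketch below pickles an instance of a class that exists only in the sender. The module name `_demo_user_code` is hypothetical and stands in for the namespace where user code runs.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module in which the user's class is defined.
user_module = types.ModuleType("_demo_user_code")
sys.modules["_demo_user_code"] = user_module
exec("class S(str):\n    pass\n", user_module.__dict__)

text = user_module.S("\u20ac")
as_subclass = pickle.dumps(text)        # payload refers to _demo_user_code.S by name
as_plain_str = pickle.dumps(str(text))  # payload is an ordinary str

# Simulate the receiving process, which never defined or imported the class.
del sys.modules["_demo_user_code"]

try:
    pickle.loads(as_subclass)
    subclass_error = None
except ImportError as exc:              # ModuleNotFoundError is a subclass of ImportError
    subclass_error = exc

plain_result = pickle.loads(as_plain_str)
```

The receiver cannot rebuild the subclass instance because it cannot find the class, while the plain str round-trips. Converting to a built-in string type before sending is the approach the patches take.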

    @serhiy-storchaka self-assigned this Dec 9, 2013
    @terryjreedy
    Member

    > [2.7] print() implicitly converts str and bytearray subclasses to str and leaves unicode subclasses as is.

    This strikes me as possibly a bug in print, but even if that were changed, there is still the issue of sys.stdout.write and pickle. While the patch is a great improvement, it changes the behavior of sys.stdout.write(s), which on the console acts as if it calls str.__str__(s) rather than str(s), that is, s.__str__().

    ---

    class S(str):
        def __str__(self):
            return 'S: ' + str.__str__(self)
    
    s = S('foo')
    print(s, str(s), str.__str__(s))
    
    import sys
    sys.stdout.write(s)

    S: foo S: foo foo
    foo

    on the console (IDLE hangs after the first line)

    I am testing the patch with str(s) changed to str.__str__(s).

    @terryjreedy
    Member

    Confirmed that the revised patch for 3.3 fixes the hang and matches the console interpreter output.
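The conversion discussed above can be sketched for Python 3 as follows. This is an illustration rather than the committed patch; `to_exact_str` and `Loud` are hypothetical names. It uses str.__str__(s) so that an overridden __str__ in the subclass is bypassed, which matches what the console does for sys.stdout.write.

```python
def to_exact_str(s):
    """Return a plain str with the same characters as s.

    Hypothetical helper: an overridden __str__ on a subclass is bypassed.
    """
    if not isinstance(s, str):
        raise TypeError("must be str, not " + type(s).__name__)
    if type(s) is not str:
        s = str.__str__(s)   # base-class conversion, ignores the subclass override
    return s


class Loud(str):
    def __str__(self):
        return "S: " + str.__str__(self)


s = Loud("foo")
converted = to_exact_str(s)   # plain str holding the raw characters
decorated = str(s)            # goes through Loud.__str__
```

Here `converted` holds the raw characters as an exact str, while `decorated` carries the prefix added by the subclass.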

    @serhiy-storchaka
    Member

    Good suggestion Terry. And for unicode in 2.7 we can use unicode.__getslice__(s, None, None) (because there is no unicode.__unicode__).

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 10, 2013

    New changeset df9596ca838c by Serhiy Storchaka in branch '2.7':
    Issue bpo-19481: print() of unicode, str or bytearray subclass instance in IDLE
    http://hg.python.org/cpython/rev/df9596ca838c

    New changeset d462b2bf875b by Serhiy Storchaka in branch '3.3':
    Issue bpo-19481: print() of string subclass instance in IDLE no more hangs.
    http://hg.python.org/cpython/rev/d462b2bf875b

    New changeset 1d68ea8148ce by Serhiy Storchaka in branch 'default':
    Issue bpo-19481: print() of string subclass instance in IDLE no more hangs.
    http://hg.python.org/cpython/rev/1d68ea8148ce

    @mjpieters
    Mannequin

    mjpieters mannequin commented Mar 4, 2015

    This change causes printing BeautifulSoup NavigableString objects to fail; the code could never have worked, as unicode.__getslice__ insists on being passed integers, not None.

    To reproduce, create a new file in IDLE and paste in:

    from bs4 import BeautifulSoup
    html_doc = """<title>The Dormouse's story</title>""" 
    soup = BeautifulSoup(html_doc)
    print soup.title.string

    Then pick *Run Module* to see:

    Traceback (most recent call last):
      File "/private/tmp/test.py", line 4, in <module>
        print soup.title.string
      File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/idlelib/PyShell.py", line 1353, in write
        s = unicode.__getslice__(s, None, None)
    TypeError: an integer is required

    The same error can be induced with:

        unicode.__getslice__(u'', None, None)

    while specifying a start and end index (0 and len(s)) should fix this.

    @mjpieters
    Mannequin

    mjpieters mannequin commented Mar 4, 2015

    Created a new issue: http://bugs.python.org/issue23583

    @ezio-melotti transferred this issue from another repository Apr 10, 2022