Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDLE: replace ending with '\' causes crash #57261

Closed
terryjreedy opened this issue Sep 28, 2011 · 15 comments
Closed

IDLE: replace ending with '\' causes crash #57261

terryjreedy opened this issue Sep 28, 2011 · 15 comments
Assignees
Labels
topic-IDLE type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@terryjreedy
Copy link
Member

BPO 13052
Nosy @terryjreedy, @ezio-melotti, @serwy, @asvetlov
Files
  • patch13052.diff: patch
  • patch13052_re.diff: patch with re
  • issue13052.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/asvetlov'
    closed_at = <Date 2012-08-04.18:48:05.633>
    created_at = <Date 2011-09-28.02:44:02.291>
    labels = ['expert-IDLE', 'type-crash']
    title = "IDLE: replace ending with '\\' causes crash"
    updated_at = <Date 2012-08-04.18:48:05.632>
    user = 'https://github.com/terryjreedy'

    bugs.python.org fields:

    activity = <Date 2012-08-04.18:48:05.632>
    actor = 'asvetlov'
    assignee = 'asvetlov'
    closed = True
    closed_date = <Date 2012-08-04.18:48:05.633>
    closer = 'asvetlov'
    components = ['IDLE']
    creation = <Date 2011-09-28.02:44:02.291>
    creator = 'terry.reedy'
    dependencies = []
    files = ['23409', '23411', '24080']
    hgrepos = []
    issue_num = 13052
    keywords = ['patch']
    message_count = 15.0
    messages = ['144562', '144582', '145556', '145558', '145559', '145564', '145566', '150121', '151322', '151341', '151343', '151399', '151408', '167426', '167427']
    nosy_count = 5.0
    nosy_names = ['terry.reedy', 'ezio.melotti', 'roger.serwy', 'asvetlov', 'python-dev']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue13052'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @terryjreedy
    Copy link
    Member Author

    As reported by R.S.Kachanovsky on the IDLE-sig list, and comfirmed by me with 3.2.2 on Win 7, entering a replace term ending with '\', like 'test\', in the find/replace dialog box causes IDLE to crash, as in do nothing for a couple of seconds and then close all windows.

    A find term in either a find or find/replace box with a \ anywhere causes a beep. This means that one cannot search for test containing such, but that is better than crashing (which has the same effect).

    @terryjreedy terryjreedy added topic-IDLE type-crash A hard crash of the interpreter, possibly with a core dump labels Sep 28, 2011
    @ezio-melotti
    Copy link
    Member

    The problem is in Lib/idlelib/ReplaceDialog.py:141:
    m = prog.match(chars, col)
    if not prog:
    return False
    new = m.expand(self.replvar.get())

    where
    prog = re.compile('foo')  # i.e. text in the find box
    chars = '>>> "foo"\n'  # i.e. the text in the IDLE window
    col = 5
    self.replvar.get() = 'bar\'  # i.e. the var with the trailing \

    m.expand() searches for groups to expand, like \1 and \2 to refer to the first and second matching group, and fails with:

    Traceback (most recent call last):
      File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 194, in __next
        c = self.string[self.index + 1]
    IndexError: string index out of range
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/wolf/dev/py/py3k/Lib/re.py", line 278, in _expand
        template = sre_parse.parse_template(template, pattern)
      File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 729, in parse_template
        this = sget()
      File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 210, in get
        self.__next()
      File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 196, in __next
        raise error("bogus escape (end of line)")
    sre_constants.error: bogus escape (end of line)

    Using things like foo\5bar also results in a crash because the group \5 is not found.

    I'm not sure what the expected behavior should be. If numeric/named backreferences are not supposed to be supported, I guess a re.escape() might solve the problem. If they are supported the last line should be wrapped in a try/except that looks for sre_constants.error errors and possibly IndexErrors too (this might actually be fixed in sre_parse.py).

    BTW, I think that "if not prog" in the snippet I pasted should be "if not m". Pattern objects are always True (I assume prog is always a pattern object).

    Do you want to work on a patch?

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Oct 14, 2011

    I see that Ezio added me to the nosy list.

    Here's a patch that identifies bogus escape characters at the end of the replvar string and appends an extra \ to fix it.

    @ezio-melotti
    Copy link
    Member

    Thanks for the patch.

    This seems to fix the problem with the trailing \, but not the other issues. As I mentioned in my previous message I think re.escape might be a better solution.
    I also wonder if there are tests for this, and how difficult would be to add them if they are missing.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Oct 14, 2011

    I don't have the beep problem you describe under Linux using Python 3.2 (r32:88445, Mar 25 2011, 19:56:22). That may be a platform-specific bug.

    Here's a patch to use re.escape instead.

    A minor side effect of using re.escape is that it is not possible to use \n itself in the replace string. It gets escaped as \\n.

    @terryjreedy
    Copy link
    Member Author

    IDLE is sufficiently undocumented that I think we may, *if necessary*, treat the details of 'expected behavior' as undefined, or look to other programs for guidance. Certainly, 'crash' is unexpected and undesired ;-).

    Without "[] Regular expression" checked, I expect a simple 'normal mode' "find the literal string and replace with literal string" behavior. This is how Notepad and Notepad++ work. Neither has any problem with trailing '\'. The main 'special chars' I noticed are those normally significant to dialogs: the [Tab] key advances focus, [Enter] key executes the currently highlighted command. (The other two programs start with [Find] rather than [Replace] selected, but that is a minor and visually noticeable detail.) ^C and ^V work as expected in the entry boxes, other control chars beep.

    In this mode, it is an implementation detail that the re module is used, and it should be transparent that it is. So 'backreferences' are meaningless. But in regular expression mode, they should act as an experienced re user would expect. (What that is, I an not sure, as I am not such ;-).

    Notepad++ separates 'Regular expression' from the other options and has a boxed three-button radio group

    |-Search mode-------------------------|
    | () Normal |
    | () Extended (\n, \r, \t, \0, \x...) |

    () Regular expression

    These amount to treating the find/replace entries as uninterpreted string literals (as with input()*), cooked \-interpreted string literals, and as (raw literal) relational expressions. We could consider Extended mode as a new feature.

    • While raw string literals do not allow training '\', the input function does (in the standard interpreter window)
    >>> print(input())
    abc\
    abc\

    However, this does *not* work in IDLE 3.2.2, Win 7:

    >>> print(input())
    abd\  [hit Enter here]
          [which IDLE echoes here]
          [must Enter again]
          [which IDLE echoes again]
    >>>   [to get new prompt]
    >>> x=input()
    abc\
         
    
    >>> x
    ''

    Entering "a\bc" is also messed up:

    >>> x=input()
    a\bc
    
    >>> x
    ''
    >>> x=input()
    a\bc
    >>> x
    'a\\bc'

    The difference between the two above involves the IDLE history mechanism, which (wrongly, in my view) treats response to input() as part of the statement.

    I wonder if this is related to the current bug. Does the replace dialog use input()? Or is input entirely handled by tk, and should I open another report?

    Ezio's other point:
    '''
    m = prog.match(chars, col)
    if not prog:
    return False
    ...
    I think that "if not prog" in the snippet I pasted should be "if not m". Pattern objects are always True (I assume prog is always a pattern object).'''

    I would guess that you are correct and this should be fixed also.

    @ezio-melotti
    Copy link
    Member

    Defining the desired behavior is a good place where to start. Next it would be good to have tests that reflect the desired behavior, and eventually make them pass with a proper patch.

    Currently IDLE seems to understand \n, \t, etc. in the "Replace with" field, but not in the "Find" field. That means that it's possible to easily replace e.g. 4 spaces with a \t, but not the other way around.
    This works regardless of the "Regular expression" checkbox.

    When "Regular expression" is checked, backreferences like \1 also work (if used correctly).

    I think these behaviors are fine and should be preserved. For the non-regex mode, re.escape() could be used if the \n, \t, etc. still work, but I think that for the regex mode the right way to go is to catch regex errors. This will protect against wrong backreferences and trailing \, without altering the meaning of the regex.

    We could also support \n, \t, etc. in the "Find" box, but that's a separate issue.

    Regarding the print(input()) and the other example, they all seem to work fine here with 3.2/XP.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Dec 23, 2011

    bpo-13052.patch against 3.3a0 fixes the replace dialog. It also stops "Replace All" from closing the dialog. (What other application actually does this?)

    When 'Regular Expression' is not checked, the find and the replace dialogs treat the fields as raw strings. This is consistent with Terry's description of "Normal".

    When checked, the find field is treated as a RE and the replace field has "expand" applied.

    I think this behavior is a good compromise as it allows for the basic effect of "Extended" when 'Regular Expression' is checked.

    Also, I can not reproduce the beeping issue as being exclusive to '\'. I know it's caused by "text.bell()", but the bell occurs at appropriate times based on where it's located in the code (i.e. no more search results, find-again returns previous match, etc).

    @ezio-melotti
    Copy link
    Member

    How difficult would it be to add tests for this?

    @terryjreedy
    Copy link
    Member Author

    If there are any tests for Idle, I have not found them. No idlelib/test/, no test/test_idle. There is a testcode.py with random code. I want to look at tkinter/test/ someday to see if it has any models for testing Idle. Roger, do you know of anything?

    @ezio-melotti
    Copy link
    Member

    If the patch works for you, I guess you can commit it; if tests can be added too, that's even better.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Jan 16, 2012

    I'm not aware of any formal tests for IDLE. Can you show me the docs for writing proper unit tests? I'll see if I can write one for this.

    @terryjreedy
    Copy link
    Member Author

    The library manual has chapters for

    unittest: the generic testing framework (package) that any app can use

    test: a package with the Python test suite; test_x tests module x
    It has other modules and subpackages, probably not all documented.
    The test modules use unittest, doctest, and other code.

    test.support: some of the other code

    AS it says in the test chapter, the test suite imports each test_x and runs its test_main function. The tkinter package includes a test subpackage. test_tk checks that _tkinter and tk are available and if so, runs the tests in tkinter.test. (But I am not sure it really does what it should ;-).

    My thought is that we might want to do something similar with idle. However, there needs to be a test_idle module in any case, so we can start with a few tests there and migrate them to a separate idlelib.test subpackage later if we want.

    @asvetlov asvetlov self-assigned this Mar 25, 2012
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 4, 2012

    New changeset 0f38948cc6df by Andrew Svetlov in branch '3.2':
    Issue bpo-13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
    http://hg.python.org/cpython/rev/0f38948cc6df

    New changeset 9dcfba4d0357 by Andrew Svetlov in branch 'default':
    Issue bpo-13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
    http://hg.python.org/cpython/rev/9dcfba4d0357

    New changeset 44c00de723b3 by Andrew Svetlov in branch '2.7':
    Issue bpo-13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
    http://hg.python.org/cpython/rev/44c00de723b3

    @asvetlov
    Copy link
    Contributor

    asvetlov commented Aug 4, 2012

    While there are no unittests for Roger's patch I have pushed it in after strong manual testing as important enough.

    Thanks, Roger.

    @asvetlov asvetlov closed this as completed Aug 4, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-IDLE type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants