classification
Title: IDLE: replace ending with '\' causes crash
Type: crash Stage: resolved
Components: IDLE Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: asvetlov Nosy List: asvetlov, ezio.melotti, python-dev, roger.serwy, terry.reedy
Priority: high Keywords: patch

Created on 2011-09-28 02:44 by terry.reedy, last changed 2012-08-04 18:48 by asvetlov. This issue is now closed.

Files
File name Uploaded Description Edit
patch13052.diff roger.serwy, 2011-10-14 18:54 patch
patch13052_re.diff roger.serwy, 2011-10-14 19:38 patch with re
issue13052.patch roger.serwy, 2011-12-23 00:43 review
Messages (15)
msg144562 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-28 02:44
As reported by R.S.Kachanovsky on the IDLE-sig list, and comfirmed by me with 3.2.2 on Win 7, entering a replace term ending with '\', like 'test\', in the find/replace dialog box causes IDLE to crash, as in do nothing for a couple of seconds and then close all windows.

A find term in either a find or find/replace box with a \ anywhere causes a beep. This means that one cannot search for test containing such, but that is better than crashing (which has the same effect).
msg144582 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-29 01:04
The problem is in Lib/idlelib/ReplaceDialog.py:141:
m = prog.match(chars, col)
if not prog:
    return False
new = m.expand(self.replvar.get())

where
prog = re.compile('foo')  # i.e. text in the find box
chars = '>>> "foo"\n'  # i.e. the text in the IDLE window
col = 5
self.replvar.get() = 'bar\'  # i.e. the var with the trailing \

m.expand() searches for groups to expand, like \1 and \2 to refer to the first and second matching group, and fails with:

Traceback (most recent call last):
  File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 194, in __next
    c = self.string[self.index + 1]
IndexError: string index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wolf/dev/py/py3k/Lib/re.py", line 278, in _expand
    template = sre_parse.parse_template(template, pattern)
  File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 729, in parse_template
    this = sget()
  File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 210, in get
    self.__next()
  File "/home/wolf/dev/py/py3k/Lib/sre_parse.py", line 196, in __next
    raise error("bogus escape (end of line)")
sre_constants.error: bogus escape (end of line)

Using things like foo\5bar also results in a crash because the group \5 is not found.

I'm not sure what the expected behavior should be.  If numeric/named backreferences are not supposed to be supported, I guess a re.escape() might solve the problem.  If they are supported the last line should be wrapped in a try/except that looks for sre_constants.error errors and possibly IndexErrors too (this might actually be fixed in sre_parse.py).

BTW, I think that "if not prog" in the snippet I pasted should be "if not m".  Pattern objects are always True (I assume prog is always a pattern object).

Do you want to work on a patch?
msg145556 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2011-10-14 18:54
I see that Ezio added me to the nosy list.

Here's a patch that identifies bogus escape characters at the end of the replvar string and appends an extra \ to fix it.
msg145558 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-14 19:20
Thanks for the patch.

This seems to fix the problem with the trailing \, but not the other issues.  As I mentioned in my previous message I think re.escape might be a better solution.
I also wonder if there are tests for this, and how difficult would be to add them if they are missing.
msg145559 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2011-10-14 19:38
I don't have the beep problem you describe under Linux using Python 3.2 (r32:88445, Mar 25 2011, 19:56:22). That may be a platform-specific bug.

Here's a patch to use re.escape instead. 

A minor side effect of using re.escape is that it is not possible to use \n itself in the replace string. It gets escaped as \\n.
msg145564 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-10-14 21:55
IDLE is sufficiently undocumented that I think we may, *if necessary*, treat the details of 'expected behavior' as undefined, or look to other programs for guidance. Certainly, 'crash' is unexpected and undesired ;-).

Without "[] Regular expression" checked, I expect a simple 'normal mode' "find the literal string and replace with literal string" behavior. This is how Notepad and Notepad++ work. Neither has any problem with trailing '\'. The main 'special chars' I noticed are those normally significant to dialogs: the [Tab] key advances focus, [Enter] key executes the currently highlighted command. (The other two programs start with [Find] rather than [Replace] selected, but that is a minor and visually noticeable detail.) ^C and ^V work as expected in the entry boxes, other control chars beep.

In this mode, it is an implementation detail that the re module is used, and it should be transparent that it is. So 'backreferences' are meaningless. But in regular expression mode, they should act as an experienced re user would expect. (What that is, I an not sure, as I am not such ;-).

Notepad++ separates 'Regular expression' from the other options and has a boxed three-button radio group

|-Search mode-------------------------| 
| () Normal                           |
| () Extended (\n, \r, \t, \0, \x...) |
| () Regular expression               |
|-------------------------------------|

These amount to treating the find/replace entries as uninterpreted string literals (as with input()*), cooked \-interpreted string literals, and as (raw literal) relational expressions. We could consider Extended mode as a new feature.

* While raw string literals do not allow training '\', the input function does (in the standard interpreter window)

>>> print(input())
abc\
abc\

However, this does *not* work in IDLE 3.2.2, Win 7:

>>> print(input())
abd\  [hit Enter here]
      [which IDLE echoes here]
      [must Enter again]
      [which IDLE echoes again]
>>>   [to get new prompt]
>>> x=input()
abc\
     

>>> x
''

Entering "a\bc" is also messed up:

>>> x=input()
a\bc

>>> x
''
>>> x=input()
a\bc
>>> x
'a\\bc'

The difference between the two above involves the IDLE history mechanism, which (wrongly, in my view) treats response to input() as part of the statement.

I wonder if this is related to the current bug. Does the replace dialog use input()? Or is input entirely handled by tk, and should I open another report?

Ezio's other point:
'''
m = prog.match(chars, col)
if not prog:
    return False
...
I think that "if not prog" in the snippet I pasted should be "if not m".  Pattern objects are always True (I assume prog is always a pattern object).'''

I would guess that you are correct and this should be fixed also.
msg145566 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-14 22:19
Defining the desired behavior is a good place where to start.  Next it would be good to have tests that reflect the desired behavior, and eventually make them pass with a proper patch.

Currently IDLE seems to understand \n, \t, etc. in the "Replace with" field, but not in the "Find" field.  That means that it's possible to easily replace e.g. 4 spaces with a \t, but not the other way around.
This works regardless of the "Regular expression" checkbox.

When "Regular expression" is checked, backreferences like \1 also work (if used correctly).

I think these behaviors are fine and should be preserved.  For the non-regex mode, re.escape() could be used if the \n, \t, etc. still work, but I think that for the regex mode the right way to go is to catch regex errors.  This will protect against wrong backreferences and trailing \, without altering the meaning of the regex.

We could also support \n, \t, etc. in the "Find" box, but that's a separate issue.

Regarding the print(input()) and the other example, they all seem to work fine here with 3.2/XP.
msg150121 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2011-12-23 00:43
issue13052.patch against 3.3a0 fixes the replace dialog. It also stops "Replace All" from closing the dialog. (What other application actually does this?)

When 'Regular Expression' is not checked, the find and the replace dialogs treat the fields as raw strings. This is consistent with Terry's description of "Normal".

When checked, the find field is treated as a RE and the replace field has "expand" applied.

I think this behavior is a good compromise as it allows for the basic effect of "Extended" when 'Regular Expression' is checked.

Also, I can not reproduce the beeping issue as being exclusive to '\'. I know it's caused by "text.bell()", but the bell occurs at appropriate times based on where it's located in the code (i.e. no more search results, find-again returns previous match, etc).
msg151322 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-01-16 05:39
How difficult would it be to add tests for this?
msg151341 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-01-16 08:38
If there are any tests for Idle, I have not found them. No idlelib/test/, no test/test_idle. There is a testcode.py with random code. I want to look at tkinter/test/ someday to see if it has any models for testing Idle. Roger, do you know of anything?
msg151343 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-01-16 08:42
If the patch works for you, I guess you can commit it; if tests can be added too, that's even better.
msg151399 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-01-16 18:14
I'm not aware of any formal tests for IDLE. Can you show me the docs for writing proper unit tests? I'll see if I can write one for this.
msg151408 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-01-16 20:55
The library manual has chapters for

unittest: the generic testing framework (package) that any app can use

test: a package with the Python test suite; test_x tests module x
  It has other modules and subpackages, probably not all documented.
  The test modules use unittest, doctest, and other code.

test.support: some of the other code

AS it says in the test chapter, the test suite imports each test_x and runs its test_main function. The tkinter package includes a test subpackage. test_tk checks that _tkinter and tk are available and if so, runs the tests in tkinter.test. (But I am not sure it really does what it should ;-).

My thought is that we might want to do something similar with idle. However, there needs to be a test_idle module in any case, so we can start with a few tests there and migrate them to a separate idlelib.test subpackage later if we want.
msg167426 - (view) Author: Roundup Robot (python-dev) Date: 2012-08-04 18:45
New changeset 0f38948cc6df by Andrew Svetlov in branch '3.2':
Issue #13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
http://hg.python.org/cpython/rev/0f38948cc6df

New changeset 9dcfba4d0357 by Andrew Svetlov in branch 'default':
Issue #13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
http://hg.python.org/cpython/rev/9dcfba4d0357

New changeset 44c00de723b3 by Andrew Svetlov in branch '2.7':
Issue #13052: Fix IDLE crashing when replace string in Search/Replace dialog ended with '\'.
http://hg.python.org/cpython/rev/44c00de723b3
msg167427 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-08-04 18:48
While there are no unittests for Roger's patch I have pushed it in after strong manual testing as important enough.

Thanks, Roger.
History
Date User Action Args
2012-08-04 18:48:05asvetlovsetstatus: open -> closed
resolution: fixed
messages: + msg167427

stage: test needed -> resolved
2012-08-04 18:45:48python-devsetnosy: + python-dev
messages: + msg167426
2012-03-25 21:14:50asvetlovsetpriority: normal -> high
assignee: asvetlov

nosy: + asvetlov
2012-01-16 20:55:41terry.reedysetmessages: + msg151408
2012-01-16 18:14:10roger.serwysetmessages: + msg151399
2012-01-16 08:42:17ezio.melottisetmessages: + msg151343
2012-01-16 08:38:58terry.reedysetmessages: + msg151341
2012-01-16 05:39:38ezio.melottisetmessages: + msg151322
2011-12-23 00:43:38roger.serwysetfiles: + issue13052.patch

messages: + msg150121
2011-10-14 22:19:56ezio.melottisetmessages: + msg145566
stage: needs patch -> test needed
2011-10-14 21:55:48terry.reedysetmessages: + msg145564
2011-10-14 19:38:29roger.serwysetfiles: + patch13052_re.diff

messages: + msg145559
2011-10-14 19:20:26ezio.melottisetmessages: + msg145558
2011-10-14 18:54:47roger.serwysetfiles: + patch13052.diff
keywords: + patch
messages: + msg145556
2011-10-14 16:58:44ezio.melottisetnosy: + roger.serwy
2011-09-29 01:04:46ezio.melottisetnosy: + ezio.melotti
messages: + msg144582
2011-09-28 02:44:02terry.reedycreate