classification
Title: IDLE Output Window 's goto fails when path has spaces
Type: behavior Stage: resolved
Components: IDLE Versions: Python 3.0, Python 3.1, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: kbk Nosy List: ccanepa, gpolo, kbk, r.david.murray, terry.reedy
Priority: normal Keywords: 26backport

Created on 2009-03-25 03:22 by ccanepa, last changed 2011-05-13 03:42 by kbk. This issue is now closed.

Files
File name Uploaded Description Edit
unnamed ccanepa, 2009-05-03 07:14
Messages (12)
msg84142 - (view) Author: Claudio Canepa (ccanepa) Date: 2009-03-25 03:22
in windows XP, python 2.6.1, 2.6 , python 2.4
1. do an Edit | 'Find in files' [ it pop ups the Output Window with 
result]
2. Right click over one of the target lines found, click the 'goto file
\line' pop up

If the path in the target line has spaces, it will popup a window with 
title 'No special line' and message 'the line you point at doenst look 
like a valid file name followed by a line number'

posible fix:
in idlelib/OutputWindow.py
replace the literal
r'([^\s]+):\s*(\d+):'
with
r'([^\t\n\r\f\v]+):\s*(\d+):'

fair warning: seems to work in windows XP, dont know about other OSes
msg84158 - (view) Author: Guilherme Polo (gpolo) * (Python committer) Date: 2009-03-25 12:14
I see this occurring everywhere, but the proposed solution may not be
enough. I can still create a directory with a tab on it and it will fail
too.

The proper solution (in my head) would involve changing the interaction
of GrepDialog with OutupWindow, but then OutputWindow wouldn't really be
an OutputWindow. But I believe the better way to fix this right now is
to define a more correct regex, and test it :)
msg85663 - (view) Author: Claudio Canepa (ccanepa) Date: 2009-04-06 19:01
On a second look:

1. the code in OutputWindow.py for the 'goto' action looks for a match 
with the first regex in
    file_line_pats = [
        r'file "([^"]*)", line (\d+)',
        r'([^\s]+)\((\d+)\)',
        r'([^\s]+):\s*(\d+):',
    ]
and it assumes the first group gives a valid filename, the second a 
line number. 

2. the potential target lines produced by GrepDialog.py are writen by:
sys.stdout.write("%s: %s: %s\n" % (fn, lineno, line)) 

where:
  fn :a valid filename ( because an open(fn) was issued before), not 
guaranted an abspath
  lineno : unsigned int
  line: a text line in an arbitrary file, but mostly an *.py 

Clearly the 3rd regex is the only one that can hit the line produced by 
GrepDialog, and clearly will fail in any OS that allows spaces in a 
valid path: the regex breaks the first group at the first whitespace.
The tentative fix propossed in the initial post was a minimal change 
that allow spaces in the path, but was very ugly and not comunicates a 
clear intention.

I like more:
r'(?:\s*)(.+):\s+(\d+):'
wich discards leading whitespace ( feature unused by GrepDialog but 
probably handy in pyShell, wich subclasses OutputWindow  )


It is a better regex that the original in IDLE, meaning it would hit 
more positives, but can fail sometimes:
Supose the GrepDialog found a hit at line 111 in the file c:\foo.py , 
with the text
a = 1 # see 24: 32: 1

The line sent to OutputWindow would be
c:\foo.py: 111: a = 1 # see 24: 32: 1

and the regex will capture the groups:
filename = "c:\foo.py: 111: a = 1 # see 24"
linenum = "32"
The first group fails to capture the filename.

I can live with such special case failures, but anyway:
In windows, changing the regex to break the first group at the 
first ': ' would fix the thing ( ': ' cant happen in a fn that pass open
(fn) )

How about other OSes ?


------
msg86628 - (view) Author: Kurt B. Kaiser (kbk) * (Python committer) Date: 2009-04-26 23:29
Added a regex to handle win paths w/spaces.

A regex match may not be the file desired.
Try all the patterns until a valid file is found.

r71995.

Let me know if you can find a failing corner case.

port to 3 and -maint.
msg86629 - (view) Author: Guilherme Polo (gpolo) * (Python committer) Date: 2009-04-26 23:39
Wouldn't it be better to actually write tests for it ? Also, the issue's
title is a bit misleading, it is not only spaces that causes troubles.
Isn't fixing how GrepDialog <-> OutputWindow works an acceptable
solution ? Instead of just writing the output in the OutputWindow, can't
we store the results generated by GrepDialog so OutputWindow can access
it without even needing to retry some regexes till it works (or not) ?
msg86634 - (view) Author: Kurt B. Kaiser (kbk) * (Python committer) Date: 2009-04-27 00:52
I'd welcome a test, please provide one. I actually wrote a small test 
code so I could understand the problem, but didn't include it. 

The 'grep dialog' sends its output to an OutputWindow labelled *Output* 
as a series of lines.  This is entirely separate from the "go to file/
line" code and the path issue raised in this bug.  The line being 
queried may also be a PyShell traceback.

When you right click on a line (any line in an OutputWindow), the 
string is passed to OutputWindow.goto_file_line(), which uses 
_file_line_helper() to try to find a valid file. The problem was the 
regex couldn't handle spaces (or tabs) correctly, and also the code gave 
up if the first match didn't result in a valid file.  Typically, it was 
trying a path which started with the first character after the last 
embedded space.  A match, but invalid. To get to the correct pattern, 
the code must not give up.

We are just trying patterns on a string w/o going back to the *Output* 
window.  That's reasonable, because, for example, the traceback line 
looks quite different from the grep dialog output and needs a different 
regex. It's certainly not inefficient, because the human right-clicked a 
single line and won't wake up for 500 ms, at least :-)

We have a simple problem to fix, and don't want use it as an excuse to 
tear up a lot of code which is working well.

Actually, I ran into this myself this morning while hacking trunk IDLE 
on Windows from a DOS box using python 2.6, and I had the problem half 
fixed before I saw this bug!  Amazing it took this long to raise it, the 
code here hasn't changed for many, many years.

The regex I added is specifically for Win paths.  It ltrims, then takes 
everything up to the first ':'.  The trick is to be non-greedy.

Do you have a case the corrected code doesn't handle?
msg86642 - (view) Author: Guilherme Polo (gpolo) * (Python committer) Date: 2009-04-27 02:19
> Do you have a case the corrected code doesn't handle?

Create a file that starts with a space and search for something that
returns it, for example.
msg86995 - (view) Author: Kurt B. Kaiser (kbk) * (Python committer) Date: 2009-05-03 02:30
r72227.

How's your test code coming?  A relative Win filename with leading 
spaces should be found even when there's a file of same name but no 
spaces.
msg87035 - (view) Author: Claudio Canepa (ccanepa) Date: 2009-05-03 07:14
On Sat, May 2, 2009 at 11:31 PM, Kurt B. Kaiser <report@bugs.python.org>wrote:

>
> Kurt B. Kaiser <kbk@shore.net> added the comment:
>
> r72227.
>
> How's your test code coming?  A relative Win filename with leading
> spaces should be found even when there's a file of same name but no
> spaces.
>
> ----------

> keywords: +26backport
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue5559>
> _______________________________________
>

Sorry for the delay, Kurt.
Test with rev 72227 , ok.

test cases:
(one space betwwen a and b , one before second tmp, one before first x)
Searching 'hello' in d:\tmp\*.tmp ...
d:\tmp\ xx.tmp: 1: hello # see woot.py:24 :24
d:\tmp\a b\ tmp\ xx.tmp: 1: hello # see woot.py:24 :24
d:\tmp\a b\ xx.tmp: 1: hello # see woot.py:24 :24
Found 3 hits.
( all three opens the correct file)

Same changing the target lines to stress the regex:
d:\tmp\ xx.tmp: 1: hello # see woot.py: 24:24
d:\tmp\a b\ tmp\ xx.tmp: 1: hello # see woot.py: 24:24
d:\tmp\a b\ xx.tmp: 1: hello # see woot.py: 24:24
Found 3 hits.
( all three opens the correct file)

Non absolute path:
 xx.py: 1: hello # see woot.py: 24:24
xx.py: 1: hello # see woot.py: 24:24
Found 2 hits.
( both opens the correct file)

others; not listed: ok.

Thanks for taking care Kurt.
msg104448 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-04-28 18:21
Resolution should only be set when an issue is closed. So should this be closed? or unfixed and the versions updated?
msg104535 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-29 16:24
I think the issue is still open waiting for a unit test.  On the other hand, how likely is it that someone is going to write one at this point?  If someone doesn't speak up soon I think you can close it.
msg135887 - (view) Author: Kurt B. Kaiser (kbk) * (Python committer) Date: 2011-05-13 03:42
Backported to 2.6 4Oct09  56387:490190cb4a57
History
Date User Action Args
2011-05-13 03:42:57kbksetstatus: open -> closed

messages: + msg135887
stage: commit review -> resolved
2010-04-29 16:24:13r.david.murraysetnosy: + r.david.murray
messages: + msg104535
2010-04-28 18:21:12terry.reedysetnosy: + terry.reedy
messages: + msg104448
2009-05-03 07:14:14ccanepasetfiles: + unnamed

messages: + msg87035
2009-05-03 02:30:58kbksetkeywords: + 26backport

messages: + msg86995
2009-04-27 02:19:21gpolosetmessages: + msg86642
2009-04-27 00:52:14kbksetmessages: + msg86634
2009-04-26 23:39:58gpolosetmessages: + msg86629
2009-04-26 23:29:59kbksetpriority: normal

assignee: kbk

nosy: + kbk
messages: + msg86628
resolution: accepted -> fixed
stage: commit review
2009-04-06 19:01:41ccanepasetmessages: + msg85663
2009-03-25 12:14:52gpolosetversions: + Python 3.0, Python 3.1, Python 2.7
nosy: + gpolo

messages: + msg84158

resolution: accepted
2009-03-25 03:22:34ccanepacreate