classification
Title: "reindent.py" exposes bug in tokenize
Type: Stage: resolved
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: edcjones, goodger, tim.peters
Priority: normal Keywords:

Created on 2006-03-10 23:55 by edcjones, last changed 2010-07-10 19:41 by eric.araujo. This issue is now closed.

Messages (3)
msg27737 - Author: Edward C. Jones (edcjones) Date: 2006-03-10 23:55
I use up-to-date Debian unstable (i386 port) on a PC with an AMD Athlon64
+3500 chip. I compile my own copy of Python which I keep in /usr/local.

Here is a small Python program called "fixnames.py":

#! /usr/bin/env python

"""Rename files that contain unpleasant characters.

Modify this code as needed.
"""
import os, sys, optparse

usage = 'USAGE: ./fixnames.py [-h] <filelist>'
parser = optparse.OptionParser(usage=usage)
options, args = parser.parse_args()
if len(args) != 1:
    parser.print_help()
    sys.exit('an argument is required'))

# The input is a list of files to be renamed.
for name in open(args[0]), 'r'):
    # Modify these as needed.
    newname = name.replace(' ', '_')
    newname = newname.replace('@', '_at_')
    newname = newname.replace('%20', '_')
    newname = newname.replace("'", '')
    os.rename(name, newname)

If I run

python /usr/local/src/Python-2.4.2/Tools/scripts/reindent.py fixnames.py

I get
Traceback (most recent call last):
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 293, in ?
    main()
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 83, in main
    check(arg)
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 108, in check
    if r.run():
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 166, in run
    tokenize.tokenize(self.getline, self.tokeneater)
  File "/usr/local/lib/python2.4/tokenize.py", line 153, in tokenize
    tokenize_loop(readline, tokeneater)
  File "/usr/local/lib/python2.4/tokenize.py", line 159, in tokenize_loop
    for token_info in generate_tokens(readline):
  File "/usr/local/lib/python2.4/tokenize.py", line 236, in generate_tokens
    raise TokenError, ("EOF in multi-line statement", (lnum, 0))
tokenize.TokenError: ('EOF in multi-line statement', (24, 0))

msg27738 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-03-11 00:11
What do you think the bug is?  That is, what did you expect
to happen?  tokenize.py isn't a syntax checker, so this
looks like a case of garbage-in, garbage-out to me.  There
are two lines in the sample program that contain a right
parenthesis that shouldn't be there, and if those syntax
errors are repaired then tokenize.py is happy with the
program.  As is, because of the unbalanced parentheses the
net paren level isn't 0 when tokenize reaches the end of the
file, so _something_ is wrong with the file, and "EOF in
multi-line statement" is just its heuristic guess at the
most likely cause.
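
The behaviour described above can be reproduced with a minimal sketch. It assumes Python 3, where `tokenize.generate_tokens` takes a `readline` callable; the helper name `try_tokenize` and the two sample strings are illustrative only and are not part of reindent.py. The exact error message differs between Python versions, so the sketch only reports whether tokenizing succeeded and passes the message along unchanged.

```python
import io
import tokenize


def try_tokenize(source):
    """Return (ok, message) after running the tokenizer over `source`.

    Hypothetical helper for illustration; not part of reindent.py.
    """
    readline = io.StringIO(source).readline
    try:
        for _ in tokenize.generate_tokens(readline):
            pass
    except (tokenize.TokenError, SyntaxError) as exc:
        # The message text varies by Python version, so just pass it along.
        return False, str(exc)
    return True, ""


# One line from the report, with and without the stray right parenthesis.
broken = "import sys\nsys.exit('an argument is required'))\n"
fixed = "import sys\nsys.exit('an argument is required')\n"

print(try_tokenize(broken))
print(try_tokenize(fixed))
```

With the stray parenthesis the tokenizer gives up with an error; once it is removed the same line tokenizes cleanly.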

msg27739 - Author: David Goodger (goodger) (Python committer) Date: 2006-03-11 01:56
reindent.py and tokenize.py require input with correct
syntax.  The bug is in the input code.

Closing this bug report.
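
Since both tools expect syntactically valid input, one way to get a clearer diagnosis is to syntax-check a file before reindenting it. The following is a minimal sketch, assuming Python 3 and the standard `ast.parse`; the helper name `syntax_error_in` and the sample string are hypothetical and not part of reindent.py.

```python
import ast


def syntax_error_in(source, filename="<input>"):
    """Return None if `source` parses, else a short description of the first syntax error."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        # Report file, line number and the parser's own message.
        return "%s:%s: %s" % (filename, exc.lineno, exc.msg)
    return None


# The loop header from the report, with its unbalanced right parenthesis.
sample = "for name in open(args[0]), 'r'):\n    pass\n"
print(syntax_error_in(sample))
```

A check like this points at the offending line directly, whereas the tokenizer only notices the imbalance when it reaches the end of the file.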
History
Date                 User         Action  Args
2010-07-10 19:41:19  eric.araujo  set     stage: resolved
2006-03-10 23:55:16  edcjones     create