
Author gdr@garethrees.org
Recipients gdr@garethrees.org
Date 2011-08-04.22:21:28
SpamBayes Score 1.1924628e-11
Marked as misclassified No
Message-id <1312496489.46.0.0769213514508.issue12691@psf.upfronthosting.co.za>
In-reply-to
Content
tokenize.untokenize is completely broken.

    Python 3.2.1 (default, Jul 19 2011, 00:09:43) 
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tokenize, io
    >>> t = list(tokenize.tokenize(io.BytesIO('1+1'.encode('utf8')).readline))
    >>> tokenize.untokenize(t)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/tokenize.py", line 250, in untokenize
        out = ut.untokenize(iterable)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/tokenize.py", line 179, in untokenize
        self.add_whitespace(start)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/tokenize.py", line 165, in add_whitespace
        assert row <= self.prev_row
    AssertionError

The assertion is simply bogus: the <= should be >=.
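With the comparison corrected so that tokens are only required to arrive in order (the next token must not start before the position already emitted), the full-mode round trip above succeeds. A minimal sketch of the session, which assumes an interpreter where add_whitespace performs the corrected check:

```python
import io
import tokenize

source = b"1+1"
# Full mode: pass the complete 5-tuples, so untokenize() goes through
# add_whitespace() and reconstructs the original spacing from the
# start/end positions.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
result = tokenize.untokenize(tokens)
assert result == source
```

Since the first token is an ENCODING token, untokenize() returns bytes encoded with that encoding, which is why the result compares equal to the original bytes.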

The reason no one has spotted this is that the unit tests for the tokenize module only ever call untokenize() in "compatibility" mode, passing in 2-tuples instead of 5-tuples.
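For illustration, the compatibility-mode path the tests exercise looks like this; because the start/end positions are discarded, add_whitespace() is never called, so the broken assertion is never reached:

```python
import io
import tokenize

source = b"1+1"
tokens = tokenize.tokenize(io.BytesIO(source).readline)
# "Compatibility" mode: keep only (type, string) 2-tuples, dropping
# the positions that add_whitespace() would otherwise check.
pairs = [(tok.type, tok.string) for tok in tokens]
result = tokenize.untokenize(pairs)
# Spacing may differ from the input in this mode, but the result is
# still equivalent source code.
assert eval(result) == 2
```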

I propose to fix this, and add unit tests, at the same time as fixing other problems with tokenize.py (issue12675).
History
Date                 User                Action  Args
2011-08-04 22:21:29  gdr@garethrees.org  set     recipients: + gdr@garethrees.org
2011-08-04 22:21:29  gdr@garethrees.org  set     messageid: <1312496489.46.0.0769213514508.issue12691@psf.upfronthosting.co.za>
2011-08-04 22:21:28  gdr@garethrees.org  link    issue12691 messages
2011-08-04 22:21:28  gdr@garethrees.org  create