Author gumblex
Recipients Arfrever, gumblex, jaraco, terry.reedy
Date 2015-06-20.07:13:45
Message-id <1434784426.71.0.297913391897.issue20387@psf.upfronthosting.co.za>
In-reply-to
Content
Sorry for the inconvenience. I failed to find this old bug.

I think there is another problem. The documentation for `untokenize` says: "The iterable must return sequences with **at least** two elements, the token type and the token string. Any additional sequence elements are ignored." So if I feed in, say, a 3-tuple, `untokenize` should accept it and treat it as tok[:2].
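To illustrate the documented contract, here is a minimal sketch (not taken from the attached patch) that builds 3-tuples and trims them to tok[:2] before calling `untokenize`. The sample source string and the variable names are made up for illustration. Trimming on the caller's side is only a workaround that sidesteps the question of how a given Python version handles 3-tuples.

```python
import io
import tokenize

# Hypothetical sample input; any short, valid source would do.
source = "x = 1 + 2\n"

# Build 3-tuples (type, string, start) to mimic a caller that passes
# more than the two documented elements.
three_tuples = [
    (tok.type, tok.string, tok.start)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Keep only the documented minimum: token type and token string.
pairs = [tok[:2] for tok in three_tuples]

# With 2-tuples, untokenize has no position information, so spacing in
# the result may differ from the original source.
rebuilt = tokenize.untokenize(pairs)

# The round trip is only guaranteed for token types and token strings.
retokenized = [
    (tok.type, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(rebuilt).readline)
]
```

The rebuilt text need not match `source` character for character; what should hold is that `retokenized` equals `pairs`.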

The attached patch should address the problems above.

While preparing the patch, I found a bug in tokenize itself. Consider the new attached tab.py: the tabs between code and comments, and the tabs between expressions, are lost during tokenization. When untokenizing, only position information is available, so equivalent spaces are produced instead of tabs.
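A small sketch of where the tabs go, assuming a one-line input invented for this example (it is not part of tab.py): the whitespace between tokens is not a token of its own, so the tabs survive only in each token's `line` attribute and, indirectly, in the column offsets.

```python
import io
import tokenize

# Hypothetical one-line input with tabs between the tokens.
source = "a\t=\t1\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# The strings of the tokens themselves: no tab appears in any of them.
token_strings = [tok.string for tok in tokens]

# Column offsets record where each token starts, but not whether the
# gap before it was made of tabs or spaces.
start_columns = [tok.start[1] for tok in tokens[:3]]

# The original physical line, tabs included, is still attached to the token.
first_line = tokens[0].line
```

Anything that rebuilds source from `start` and `end` alone can only pad with some whitespace character of its choosing; recovering the tabs would require consulting `line` as well.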

Despite this tokenization problem, the patch produces syntactically correct code that is as faithful to the original as it can be.

PEP 8 recommends spaces for indentation, but the use of tabs should not be ignored.

The new tab.py (as a Python string literal):

'#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\ndef foo():\n\t"""\n\tTests tabs in tokenization\n\t\tfoo\n\t"""\n\tpass\n\tpass\n\tif 1:\n\t\t# not indent correctly\n\t\tpass\n\t\t# correct\ttab\n\t\tpass\n\tpass\n\tbaaz = {\'a\ttab\':\t1,\n\t\t\t\'b\': 2}\t\t# also fails\n\npass\n#if 2:\n\t#pass\n#pass\n'
History
Date User Action Args
2015-06-20 07:13:46 gumblex set recipients: + gumblex, terry.reedy, jaraco, Arfrever
2015-06-20 07:13:46 gumblex set messageid: <1434784426.71.0.297913391897.issue20387@psf.upfronthosting.co.za>
2015-06-20 07:13:46 gumblex link issue20387 messages
2015-06-20 07:13:46 gumblex create