
Title: tokenize.untokenize first token missing failure case
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.3, Python 3.4, Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: Arfrever, eric.snow, georg.brandl, python-dev, rb, takluyver, terry.reedy
Priority: normal Keywords: patch

Created on 2010-04-21 02:18 by rb, last changed 2022-04-11 14:57 by admin. This issue is now closed.

File name        Uploaded by    Date
untokenize.diff  georg.brandl   2012-10-06 12:51
Messages (7)
msg103799 - Author: (rb) Date: 2010-04-21 02:18
When tokens are altered and passed on without location information (as 2-tuples), tokenize.untokenize sometimes drops the first token. A failure case follows.

Expected output: 'import foo ,bar \n'
Actual output: 'foo ,bar \n'

$ python
Python 2.6.4 (r264:75706, Dec  7 2009, 18:43:55) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import StringIO, tokenize
>>> def strip(iterable):
...     for t_type, t_str, (srow, scol), (erow, ecol), line in iterable:
...         yield t_type, t_str
... 
>>> source = StringIO.StringIO('import foo, bar\n')
>>> print repr(tokenize.untokenize(strip(tokenize.generate_tokens(source.readline))))
'foo ,bar \n'
>>> print repr(tokenize.untokenize(tokenize.generate_tokens(source.readline)))
'import foo, bar\n'
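
For reference, the same check can be sketched in Python 3 syntax. This is a hedged adaptation of the session above, not part of the original report: io.StringIO stands in for StringIO.StringIO, and the helper name strip_positions is made up for illustration. On an interpreter that includes the fix, the leading 'import' should survive.

```python
import io
import tokenize


def strip_positions(tokens):
    """Yield (type, string) pairs only, dropping location information."""
    for tok in tokens:
        yield tok.type, tok.string


source = 'import foo, bar\n'

# Pass a generator of 2-tuples, the case that lost its first token.
token_gen = tokenize.generate_tokens(io.StringIO(source).readline)
result = tokenize.untokenize(strip_positions(token_gen))

# Spacing may differ from the source, but re-tokenizing the result should
# give back the same (type, string) pairs, including the leading 'import'.
expected_pairs = [
    (tok.type, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
actual_pairs = [
    (tok.type, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(result).readline)
]

print(repr(result))
print(actual_pairs == expected_pairs)
```

Comparing token pairs rather than exact text avoids depending on how the 2-tuple mode chooses to space its output.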
msg106450 - Author: (rb) Date: 2010-05-25 16:57
I've looked into this in some more depth.

The problem is that Untokenizer.compat assumes it can iterate over iterable again from the beginning, even though Untokenizer.untokenize has already consumed the first element. A list restarts on each new iteration, so that works; a generator does not, so the first token is lost.

In particular, untokenize is broken for any generator input that supplies only the first two elements of each token (the token type and the token string).

Workaround: never hand untokenize a generator. Expand generators to lists first instead.
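
The difference between list and generator input can be reproduced without tokenize at all. The sketch below is a simplified stand-in for the control flow described in this message, not the actual Untokenizer code; the names buggy_join and fixed_join are hypothetical, and fixed_join shows only one possible repair.

```python
def buggy_join(iterable):
    """Mimic the faulty pattern: peek at the first item, then loop again."""
    pieces = []
    for first in iterable:
        if len(first) == 2:
            # The second loop assumes `iterable` starts over. A list does,
            # but a generator resumes after the item already consumed.
            for _type, text in iterable:
                pieces.append(text)
            break
    return pieces


def fixed_join(iterable):
    """Handle the peeked item explicitly and keep using one iterator."""
    pieces = []
    it = iter(iterable)
    for first in it:
        if len(first) == 2:
            pieces.append(first[1])
            for _type, text in it:
                pieces.append(text)
            break
    return pieces


# Placeholder token types; only the 2-tuple shape matters here.
tokens = [('NAME', 'import'), ('NAME', 'foo'), ('OP', ','), ('NAME', 'bar')]

from_list = buggy_join(tokens)
from_generator = buggy_join(tok for tok in tokens)
fixed_from_generator = fixed_join(tok for tok in tokens)

print(from_list)             # all four strings
print(from_generator)        # 'import' is missing
print(fixed_from_generator)  # all four strings
```

Expanding the generator with list() before the call, as suggested above, sidesteps the problem because the second loop then really does restart.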
msg172191 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2012-10-06 12:51
Attaching patch.  Actually both versions of untokenize() were broken; the version used for "full input" (5-tuples) had a flipped inequality sign in an assert.

Other changes in the patch:

* Docs fixed to describe both modes
* Tests fixed to exercise both modes
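
As a rough illustration of the two modes (a sketch in Python 3 syntax, not taken from the patch), the same token stream can be passed to untokenize once as full 5-tuples and once as 2-tuples. Variable names here are chosen for illustration only.

```python
import io
import tokenize

source = 'import foo, bar\n'
full_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Full mode: 5-tuples carry positions, so the original spacing can be restored.
full_text = tokenize.untokenize(full_tokens)

# Compatibility mode: 2-tuples carry no positions, so spacing is regenerated
# and may differ from the source.
pairs = [(tok.type, tok.string) for tok in full_tokens]
compat_text = tokenize.untokenize(pairs)

print(repr(full_text))
print(repr(compat_text))
```

In the 2-tuple mode only the token types and strings are expected to round-trip, not the exact whitespace.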
msg180589 - Author: Thomas Kluyver (takluyver) Date: 2013-01-25 14:25
#16224 appears to be a duplicate.

There seem to be several quite major issues with untokenize - see also #12691 - with patches made to fix them. Is there anything I can do to help push these forwards?
msg211448 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-02-17 21:50
New changeset c896d292080a by Terry Jan Reedy in branch '2.7':
Untokenize: An logically incorrect assert tested user input validity.

New changeset 51e5a89afb3b by Terry Jan Reedy in branch '3.3':
Untokenize: An logically incorrect assert tested user input validity.
msg211475 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-02-18 04:17
New changeset c2517a37c13a by Terry Jan Reedy in branch '2.7':
Issue #8478: Untokenizer.compat now processes first token from iterator input.

New changeset b6d6ca792b64 by Terry Jan Reedy in branch '3.3':
Issue #8478: Untokenizer.compat now processes first token from iterator input.
msg212041 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-02-23 23:01
New changeset 8d6dd02a973f by Terry Jan Reedy in branch '3.3':
Issue #20750, Enable roundtrip tests for new 5-tuple untokenize. The
Date                 User           Action  Args
2022-04-11 14:57:00  admin          set     github: 52724
2018-08-16 22:07:35  berker.peksag  set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2014-02-23 23:01:28  python-dev     set     messages: + msg212041
2014-02-18 04:17:22  python-dev     set     messages: + msg211475
2014-02-17 21:50:19  python-dev     set     nosy: + python-dev; messages: + msg211448
2014-02-17 21:18:32  terry.reedy    set     assignee: terry.reedy; stage: patch review; nosy: + terry.reedy; versions: + Python 2.7, Python 3.3, Python 3.4, - Python 2.6
2013-03-28 10:05:04  georg.brandl   set     assignee: georg.brandl -> (no value)
2013-01-25 14:25:46  takluyver      set     messages: + msg180589
2013-01-24 01:01:59  takluyver      set     nosy: + takluyver
2012-11-13 06:44:17  eric.snow      set     nosy: + eric.snow
2012-10-06 18:03:17  Arfrever       set     nosy: + Arfrever
2012-10-06 12:51:33  georg.brandl   set     files: + untokenize.diff; keywords: + patch; messages: + msg172191
2010-05-25 16:57:26  rb             set     messages: + msg106450
2010-04-21 18:18:30  georg.brandl   set     assignee: georg.brandl; nosy: + georg.brandl
2010-04-21 17:36:41  rb             set     type: behavior
2010-04-21 02:18:30  rb             create