Issue 8478: tokenize.untokenize first token missing failure case


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/52724

classification

Title:	tokenize.untokenize first token missing failure case
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.3, Python 3.4, Python 2.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	terry.reedy	Nosy List:	Arfrever, eric.snow, georg.brandl, python-dev, rb, takluyver, terry.reedy
Priority:	normal	Keywords:	patch

Created on 2010-04-21 02:18 by rb, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
untokenize.diff	georg.brandl, 2012-10-06 12:51		review

Messages (7)
msg103799 - (view)	Author: (rb) *	Date: 2010-04-21 02:18
When altering tokens and thus not providing token location information, tokenize.untokenize sometimes misses out the first token. Failure case below.

Expected output: 'import foo ,bar\n'
Actual output: 'foo ,bar\n'

$ python
Python 2.6.4 (r264:75706, Dec 7 2009, 18:43:55)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import StringIO, tokenize
>>>
>>> def strip(iterable):
...     for t_type, t_str, (srow, scol), (erow, ecol), line in iterable:
...         yield t_type, t_str
...
>>> source = StringIO.StringIO('import foo, bar\n')
>>> print repr(tokenize.untokenize(strip(tokenize.generate_tokens(source.readline))))
'foo ,bar \n'
>>> source.seek(0)
>>> print repr(tokenize.untokenize(tokenize.generate_tokens(source.readline)))
'import foo, bar\n'
>>>
msg106450 - (view)	Author: (rb) *	Date: 2010-05-25 16:57
I've looked into this in more depth. The problem is that Untokenizer.compat assumes the iterable can be restarted from the beginning, when Untokenizer.untokenize has already consumed its first element. So it works with a list, but not with a generator. In particular, untokenize is broken for any generator input that supplies only the first two elements of each token. Workaround: never hand untokenize a generator; expand generators to lists first.
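The workaround described above can be sketched for Python 3 as follows. This is an illustration only, not code from the issue: the helper names strip_positions and untokenize_stripped are hypothetical, and io.StringIO stands in for the Python 2 StringIO module used in the original report. On interpreters that include the fix the list() call is no longer required, but it is harmless.

```python
import io
import tokenize


def strip_positions(tokens):
    """Yield (type, string) pairs, dropping position info as in the report."""
    for tok in tokens:
        yield tok[0], tok[1]


def untokenize_stripped(source):
    """Untokenize position-less tokens using the list() workaround."""
    readline = io.StringIO(source).readline
    # Workaround: materialize the generator into a list before calling
    # untokenize, so the first token cannot be lost to a consumed iterator.
    pairs = list(strip_positions(tokenize.generate_tokens(readline)))
    return tokenize.untokenize(pairs)


result = untokenize_stripped("import foo, bar\n")
print(repr(result))
```

The spacing of the result differs from the original source because 2-tuple input carries no column information, but the first token ('import') is preserved.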
msg172191 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2012-10-06 12:51
Attaching patch. Actually both versions of untokenize() were broken; the version used for "full input" (5-tuples) had a flipped inequality sign in an assert. Other changes in the patch:
* Docs fixed to describe both modes
* Tests fixed to exercise both modes
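As a rough illustration of the two modes mentioned above (not taken from the attached patch), the following Python 3 sketch feeds untokenize first with full 5-tuples and then with 2-tuples. The variable names are hypothetical, and the behaviour described assumes an interpreter that already contains the fixes from this issue.

```python
import io
import tokenize

source = "import foo, bar\n"

# Full mode: 5-tuples carry positions, so the original spacing can be rebuilt.
full_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
full_output = tokenize.untokenize(full_tokens)

# Compat mode: 2-tuples carry only (type, string), so spacing is approximate,
# but the output should tokenize back to the same (type, string) pairs.
pairs = [(tok[0], tok[1]) for tok in full_tokens]
compat_output = tokenize.untokenize(iter(pairs))

print(repr(full_output))
print(repr(compat_output))
```

Passing iter(pairs) exercises the iterator input path that this issue was about; a plain list would work as well.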
msg180589 - (view)	Author: Thomas Kluyver (takluyver) *	Date: 2013-01-25 14:25
#16224 appears to be a duplicate. There seem to be several quite major issues with untokenize - see also #12691 - with patches made to fix them. Is there anything I can do to help push these forwards?
msg211448 - (view)	Author: Roundup Robot (python-dev)	Date: 2014-02-17 21:50
New changeset c896d292080a by Terry Jan Reedy in branch '2.7':
Untokenize: An logically incorrect assert tested user input validity.
http://hg.python.org/cpython/rev/c896d292080a

New changeset 51e5a89afb3b by Terry Jan Reedy in branch '3.3':
Untokenize: An logically incorrect assert tested user input validity.
http://hg.python.org/cpython/rev/51e5a89afb3b
msg211475 - (view)	Author: Roundup Robot (python-dev)	Date: 2014-02-18 04:17
New changeset c2517a37c13a by Terry Jan Reedy in branch '2.7':
Issue #8478: Untokenizer.compat now processes first token from iterator input.
http://hg.python.org/cpython/rev/c2517a37c13a

New changeset b6d6ca792b64 by Terry Jan Reedy in branch '3.3':
Issue #8478: Untokenizer.compat now processes first token from iterator input.
http://hg.python.org/cpython/rev/b6d6ca792b64
msg212041 - (view)	Author: Roundup Robot (python-dev)	Date: 2014-02-23 23:01
New changeset 8d6dd02a973f by Terry Jan Reedy in branch '3.3':
Issue #20750, Enable roundtrip tests for new 5-tuple untokenize. The
http://hg.python.org/cpython/rev/8d6dd02a973f

History
Date	User	Action	Args
2022-04-11 14:57:00	admin	set	github: 52724
2018-08-16 22:07:35	berker.peksag	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2014-02-23 23:01:28	python-dev	set	messages: + msg212041
2014-02-18 04:17:22	python-dev	set	messages: + msg211475
2014-02-17 21:50:19	python-dev	set	nosy: + python-dev messages: + msg211448
2014-02-17 21:18:32	terry.reedy	set	assignee: terry.reedy stage: patch review nosy: + terry.reedy versions: + Python 2.7, Python 3.3, Python 3.4, - Python 2.6
2013-03-28 10:05:04	georg.brandl	set	assignee: georg.brandl -> (no value)
2013-01-25 14:25:46	takluyver	set	messages: + msg180589
2013-01-24 01:01:59	takluyver	set	nosy: + takluyver
2012-11-13 06:44:17	eric.snow	set	nosy: + eric.snow
2012-10-06 18:03:17	Arfrever	set	nosy: + Arfrever
2012-10-06 12:51:33	georg.brandl	set	files: + untokenize.diff keywords: + patch messages: + msg172191
2010-05-25 16:57:26	rb	set	messages: + msg106450
2010-04-21 18:18:30	georg.brandl	set	assignee: georg.brandl nosy: + georg.brandl
2010-04-21 17:36:41	rb	set	type: behavior
2010-04-21 02:18:30	rb	create