Message 254778 - Python tracker

Author	serhiy.storchaka
Recipients	serhiy.storchaka
Date	2015-11-17.01:27:24
Message-id	<1447723653.14.0.424824913135.issue25643@psf.upfronthosting.co.za>

Content
Here is a preliminary patch that refactors the lowest level of the Python tokenizer: reading and decoding. It splits the code into smaller, simpler functions, reduces the source size by 37 lines, and fixes bugs: issue14811, issue18961, and a number of others. I added tests for most of the fixed bugs (except leaks and others that are hard to reproduce). Fixing the other bugs may be harder, especially the issues with null bytes (issue1105770, issue20115).

Many bugs could be fixed easily if the whole Python file were read into memory instead of being read line by line. I don't know whether that is acceptable.
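
To illustrate why whole-file reading would simplify things, here is a minimal Python-level sketch. It is not the patch itself (the real tokenizer is written in C), and `read_source` is a hypothetical helper name used only for this illustration. It assumes the entire file is already in memory as bytes, so the null-byte check and the decoding each become a single operation instead of per-line bookkeeping.

```python
import io
import tokenize


def read_source(data: bytes) -> str:
    """Decode a whole Python source file held in memory (hypothetical helper)."""
    # With the full buffer available, a null byte anywhere is a single
    # check, not something to track across line-buffer boundaries.
    if b"\0" in data:
        raise ValueError("source contains a null byte")
    # tokenize.detect_encoding reads at most two lines to find a BOM or
    # a PEP 263 coding cookie, and falls back to UTF-8 otherwise.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    # Decode everything in one call, so a multi-byte character can never
    # be split between two separately decoded chunks.
    return data.decode(encoding)
```

For example, `read_source(b"\xef\xbb\xbfx = 1\n")` strips the UTF-8 BOM during decoding, while input containing `b"\0"` is rejected up front. The trade-off raised above still applies: memory use grows with file size.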

History
Date	User	Action	Args
2015-11-17 01:27:33	serhiy.storchaka	set	recipients: + serhiy.storchaka
2015-11-17 01:27:33	serhiy.storchaka	set	messageid: <1447723653.14.0.424824913135.issue25643@psf.upfronthosting.co.za>
2015-11-17 01:27:32	serhiy.storchaka	link	issue25643 messages
2015-11-17 01:27:32	serhiy.storchaka	create