Message 292059 - Python tracker


Author	Eric Appelt
Recipients	Eric Appelt, barry, vstinner
Date	2017-04-21.16:14:57
Message-id	<1492791298.02.0.628814416706.issue30117@psf.upfronthosting.co.za>
In-reply-to

Content
I added a PR to fix these two (in my opinion) spurious failure conditions in lib2to3.tests.test_parser.TestParserIdempotency, with the following changes to the test:

1. Use the same encoding found in the initial file to write a temp file for a diff. This retains the BOM if the encoding was initially utf-8-sig.

2. If the file cannot be parsed using the normal grammar, try again with the grammar that has no print statement, which should succeed for valid files that use from __future__ import print_function.

For case (1), the driver was correctly handling a BOM in a utf-8 file, but then the test was not writing a comparison file using 'utf-8-sig' to diff against, so the BOM got removed. I don't think that is the fault of the parser, and lib2to3 will retain the BOM.
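To illustrate case (1), here is a minimal sketch of the round trip. It uses the standard library's tokenize.detect_encoding rather than the lib2to3 driver, and round_trip is a hypothetical helper name, not something from the PR. It only shows that writing back with the detected encoding keeps the BOM, while hard-coding 'utf-8' drops it.

```python
import codecs
import io
import tokenize


def round_trip(raw, write_encoding=None):
    """Decode raw source bytes and re-encode them (hypothetical helper).

    If write_encoding is None, the encoding detected from the input is
    reused, which is the behaviour described for the fixed test.
    """
    # For a BOM-prefixed UTF-8 file, detect_encoding reports 'utf-8-sig'.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    text = raw.decode(encoding)  # 'utf-8-sig' strips the BOM while decoding
    return text.encode(write_encoding or encoding)


original = codecs.BOM_UTF8 + b"x = 1\n"

same_encoding = round_trip(original)            # BOM is written back
forced_utf8 = round_trip(original, "utf-8")     # BOM is lost

print(same_encoding == original)
print(forced_utf8 == original)
```

With the detected encoding the bytes compare equal to the input, so a diff against the original file is empty. With a forced 'utf-8' the first three bytes differ, which is the spurious failure described above.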

For case (2), lib2to3 pre-detects the use of from __future__ import print_function, or allows the user to force this interpretation with a -p flag, and then selects a different grammar with the print statement removed. That makes such files unfair test cases for this test, as the driver itself doesn't know which grammar to use. As a minimal fix, the test will try using a grammar with the print statement, and if that fails, fall back on a grammar without it. A more thorough handling of the idempotency test would be to parse all files using both grammars, ignore the case where one of the two fails, and otherwise check both. I didn't think this was necessary but can change it.
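The fallback in case (2) can be sketched roughly as follows. This is not the code from the PR and does not call lib2to3: ParseError, parse_default, and parse_no_print_statement are hypothetical stand-ins, and the toy "grammars" only mimic the fact that the print-statement grammar rejects some print-function calls.

```python
class ParseError(Exception):
    """Stand-in for the parser's error type (hypothetical)."""


def parse_with_fallback(source, parsers):
    """Try each (name, parse) pair in order; return the first success."""
    if not parsers:
        raise ValueError("at least one parser is required")
    last_error = None
    for name, parse in parsers:
        try:
            return name, parse(source)
        except ParseError as err:
            last_error = err  # remember the failure and try the next grammar
    raise last_error


def parse_default(source):
    # Toy stand-in for the grammar WITH the print statement: it treats a
    # keyword argument to print as a syntax error.
    if "file=" in source:
        raise ParseError("print statement grammar rejects keyword arguments")
    return source


def parse_no_print_statement(source):
    # Toy stand-in for the grammar WITHOUT the print statement.
    return source


PARSERS = [
    ("default", parse_default),
    ("no_print_statement", parse_no_print_statement),
]

used, _tree = parse_with_fallback("print('hi', file=f)\n", PARSERS)
print(used)
```

The order of PARSERS encodes the "minimal fix": the normal grammar is tried first, and the second grammar is only consulted when the first one raises.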

History
Date	User	Action	Args
2017-04-21 16:14:58	Eric Appelt	set	recipients: + Eric Appelt, barry, vstinner
2017-04-21 16:14:58	Eric Appelt	set	messageid: <1492791298.02.0.628814416706.issue30117@psf.upfronthosting.co.za>
2017-04-21 16:14:58	Eric Appelt	link	issue30117 messages
2017-04-21 16:14:57	Eric Appelt	create