Author terry.reedy
Recipients ammar2, gregory.p.smith, meador.inge, pablogsal, serhiy.storchaka, taleinat, terry.reedy
Date 2018-10-30.14:59:51
Message-id <1540911591.25.0.788709270274.issue35107@psf.upfronthosting.co.za>
In-reply-to
Content
It seems to me a bug that if the trailing '\n' is not present, tokenize adds both NL and NEWLINE tokens instead of just one of them.  Moreover, both tuples of the double correction look wrong.

If '\n' is present,
  TokenInfo(type=56 (NL), string='\n', start=(1, 1), end=(1, 2), line='#\n')
looks correct.

If NL represents a real character, the length 0 string='' in the generated
  TokenInfo(type=56 (NL), string='', start=(1, 1), end=(1, 1), line='#'),
seems wrong.  I suspect that the idea was to mis-represent NL to avoid '\n' being added by untokenize.  In
  TokenInfo(type=4 (NEWLINE), string='', start=(1, 1), end=(1, 2), line='')
string='' does not match the span, whose length is end - start = 2 - 1 = 1.  I am inclined to think that the following would be the correct added token, which should untokenize correctly
  TokenInfo(type=4 (NEWLINE), string='', start=(1, 1), end=(1, 1), line='')
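
To reproduce the comparison, a minimal sketch along these lines lists the tokens for both inputs. The helper name token_rows is hypothetical, and it uses generate_tokens on a str rather than tokenize.tokenize on bytes, so no ENCODING token appears. The tokens emitted for the input without '\n' depend on the Python version, so no output is shown for that case.

```python
import io
import tokenize


def token_rows(source):
    """Return (type name, string, start, end, line) for each token in source.

    Hypothetical helper for illustration; not part of the tokenize module.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string, tok.start, tok.end, tok.line)
        for tok in tokenize.generate_tokens(readline)
    ]


# Compare a comment-only source with and without the trailing newline.
for source in ("#\n", "#"):
    print(repr(source))
    for row in token_rows(source):
        print("  ", row)
```

For each NL or NEWLINE row, the thing to check is whether len(string) agrees with end[1] - start[1].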

ast.dump(ast.parse(s)) returns 'Module(body=[])' for both versions of 's', so no help there.
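
The ast check can be sketched as follows. The helper name parsed_body is hypothetical, and the exact text returned by ast.dump varies between Python versions, so the sketch compares the two results instead of a fixed string.

```python
import ast


def parsed_body(source):
    """Return the list of top-level statements parsed from source.

    Hypothetical helper for illustration only.
    """
    return ast.parse(source).body


# A lone comment yields an empty module body with or without the newline.
for source in ("#\n", "#"):
    print(repr(source), ast.dump(ast.parse(source)), parsed_body(source))
```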