
Author gdr@garethrees.org
Recipients benjamin.peterson, eric.snow, ezio.melotti, gdr@garethrees.org, r.david.murray, vladris
Date 2011-08-04.12:21:05
Message-id <1312460466.34.0.685267027573.issue12675@psf.upfronthosting.co.za>
Content
Having looked at some of the consumers of the tokenize module, I don't think my proposed solutions will work.

It seems that the resynchronization behaviour of tokenize.py is important for consumers that use it to transform arbitrary Python source code (like 2to3.py). These consumers rely on the "roundtrip" property that X == untokenize(tokenize(X)). So solution (1) is necessary for handling tokenization errors.
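
A minimal sketch of that roundtrip (my wording, not taken from any particular consumer), using io.BytesIO to supply the readline callable that tokenize.tokenize() expects; with full 5-tuples, untokenize() reproduces the input bytes exactly for well-formed source:

    import io
    import tokenize

    source = b"x = 1  # a comment\n"
    # tokenize() wants a readline callable yielding bytes; the first
    # token it emits is ENCODING, which untokenize() uses to re-encode.
    tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
    # Full 5-tuples carry start/end positions, so the original spacing
    # (and hence the original bytes) can be reconstructed.
    assert tokenize.untokenize(tokens) == source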

Also, the fact that TokenInfo is a 5-tuple is relied on in some places (e.g. lib2to3/patcomp.py line 38), so it can't be extended. And there are consumers (though none in the standard library) that rely on type == ERRORTOKEN as the way to detect errors in a token stream, so I can't overload that field of the structure either.
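
To make the constraints concrete, here is a sketch (mine, not lifted from any consumer) of the two usage patterns that would break:

    import io
    import tokenize

    def report_errors(source_bytes):
        readline = io.BytesIO(source_bytes).readline
        # Unpacking as a plain 5-tuple (as lib2to3/patcomp.py does)
        # would raise ValueError if TokenInfo grew a sixth field.
        for tok_type, tok_string, start, end, line in tokenize.tokenize(readline):
            # type == ERRORTOKEN is the only error signal consumers
            # see, so that field can't carry anything else.
            if tok_type == tokenize.ERRORTOKEN:
                print("error token %r at row %d, column %d"
                      % (tok_string, start[0], start[1]))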

Any good ideas for how to record the cause of an error without breaking backwards compatibility?
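
One possibility I can think of (just a sketch of an assumption, not something any consumer has asked for): keep TokenInfo a 5-tuple for indexing and unpacking, but carry the cause out-of-band as an attribute on a tuple subclass:

    from collections import namedtuple

    class TokenInfo(namedtuple('TokenInfo', 'type string start end line')):
        # Still a 5-tuple: len(), indexing and unpacking are unchanged,
        # so existing consumers keep working.
        def __new__(cls, type, string, start, end, line, error_cause=None):
            self = super().__new__(cls, type, string, start, end, line)
            # Hypothetical field name, not an existing API.
            self.error_cause = error_cause
            return self

Since the extra field is not part of the tuple, untokenize() and tuple-unpacking consumers never see it; only callers that know to look for it do.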