Author eric.snow
Recipients eric.snow
Date 2012-10-14.05:49:10
Message-id <1350193751.15.0.784217855065.issue16223@psf.upfronthosting.co.za>
In-reply-to
Content
If you pass tokenize.untokenize() an iterable of tokens and none of them is an ENCODING token, it returns a string (str) rather than bytes. This is contrary to what the docs say [1]:

   It returns bytes, encoded using the ENCODING token, which is the
   first token sequence output by tokenize().

Either the docs should be clarified or untokenize() should be fixed. My vote is to fix it. It could check that the first token is an ENCODING token and raise an exception if it is not. Alternatively, it could fall back to using 'utf-8' by default.
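
The behavior described above can be reproduced with a short script. The second half is only a sketch of the two suggested fixes, written as a wrapper rather than as a patch to the standard library; the name untokenize_to_bytes and its default_encoding and strict parameters are hypothetical and not part of the tokenize module.

```python
import io
import tokenize

source = b"x = 1\n"

# tokenize.tokenize() emits an ENCODING token first, then the real tokens.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# With the ENCODING token present, untokenize() returns bytes.
with_encoding = tokenize.untokenize(tokens)

# With the ENCODING token dropped, untokenize() returns str.
without_encoding = tokenize.untokenize(tokens[1:])

print(type(with_encoding).__name__, type(without_encoding).__name__)


def untokenize_to_bytes(tokens, default_encoding="utf-8", strict=False):
    """Hypothetical wrapper that always returns bytes.

    strict=True sketches the "raise an exception" option;
    strict=False sketches the "fall back to utf-8" option.
    """
    tokens = list(tokens)
    starts_with_encoding = bool(tokens) and tokens[0][0] == tokenize.ENCODING
    if strict and not starts_with_encoding:
        raise ValueError("first token must be an ENCODING token")
    result = tokenize.untokenize(tokens)
    if isinstance(result, str):
        # No ENCODING token was seen, so encode with the assumed default.
        result = result.encode(default_encoding)
    return result
```

Calling untokenize_to_bytes(tokens[1:]) gives bytes under the fallback option, while untokenize_to_bytes(tokens[1:], strict=True) raises ValueError. Which of the two is preferable is exactly the open question in this report.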

[1] http://docs.python.org/py3k/library/tokenize.html#tokenize.untokenize
History
Date User Action Args
2012-10-14 05:49:11 eric.snow set recipients: + eric.snow
2012-10-14 05:49:11 eric.snow set messageid: <1350193751.15.0.784217855065.issue16223@psf.upfronthosting.co.za>
2012-10-14 05:49:11 eric.snow link issue16223 messages
2012-10-14 05:49:10 eric.snow create