
Author steve.dower
Recipients Drekin, benjamin.peterson, brett.cannon, eric.araujo, georg.brandl, gvanrossum, ncoghlan, paul.moore, pitrou, steve.dower, tshepang
Date 2016-08-14.16:30:34
Message-id <1471192234.73.0.194726043861.issue17620@psf.upfronthosting.co.za>
In-reply-to
Content
I'm working on this as part of my fix for issue1602. I'm not yet sure how this will come out. Compatibility with GNU readline seems to be the biggest issue: if we want to keep it, we can't allow embedded '\0' in the encoded text (i.e. UTF-16 cannot be used, which implies that sys.stdin.encoding cannot always be used directly).
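
To make the embedded '\0' problem concrete, here is a minimal sketch in plain Python. It is not the CPython code; `as_c_string` is a hypothetical helper that only mimics how a NUL-terminated `char *` consumer, such as a readline-style interface, would see the buffer.

```python
# Illustration only: compare how one line of source looks to a C API that
# treats the first NUL byte as the end of the string.

def as_c_string(data: bytes) -> bytes:
    """Return the bytes a NUL-terminated (char *) consumer would see."""
    return data.split(b"\0", 1)[0]

line = "print('é')"

utf16 = line.encode("utf-16-le")
utf8 = line.encode("utf-8")

# Every ASCII character becomes a byte pair ending in 0x00 in UTF-16-LE,
# so the encoded text is full of embedded NUL bytes.
utf16_has_nul = b"\0" in utf16
# UTF-8 only produces a 0x00 byte for U+0000 itself.
utf8_has_nul = b"\0" in utf8

seen_utf16 = as_c_string(utf16)  # cut off after the first character
seen_utf8 = as_c_string(utf8)    # the whole line survives
```

With UTF-16 the consumer sees only the first byte of the line, while the UTF-8 form passes through intact, which is why the encoding handed to such an interface cannot simply be whatever sys.stdin.encoding reports.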

Adding __readlinehook__ as an alternative may be feasible, but it would be a decent amount of work given how we call into the current readline implementation. Unfortunately, it looks like detecting when a readline hook has been added is going to involve significant changes to the tokenizer, which I really don't want to make.

The easiest approach with respect to issue1602 seems to be to special-case the console by re-encoding from utf-16-le to utf-8 and forcing the encoding in the tokenizer to utf-8 (instead of sys.stdin.encoding) in this case. I'll start here so that at least we can parse Unicode from the interactive prompt.
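
The re-encoding step can be sketched in Python as follows. This is only an illustration of the idea, not the actual change, which would live in the C tokenizer; `reencode_console_line` is a hypothetical name, and the UTF-16-LE bytes here are produced by hand to stand in for what the Windows console would return.

```python
# Sketch under stated assumptions: the console hands back UTF-16-LE bytes,
# and the parser is told to treat its input as UTF-8.

def reencode_console_line(raw_utf16: bytes) -> bytes:
    """Convert UTF-16-LE console input to UTF-8 (hypothetical helper)."""
    return raw_utf16.decode("utf-16-le").encode("utf-8")

# Stand-in for a line typed at the interactive prompt.
typed = "s = 'žluťoučký'\n"
raw = typed.encode("utf-16-le")

utf8_line = reencode_console_line(raw)

# compile() decodes bytes source as UTF-8 by default, which plays the role
# of "forcing the encoding in the tokenizer to utf-8".
namespace = {}
exec(compile(utf8_line, "<stdin>", "exec"), namespace)
```

After this runs, `namespace["s"]` holds the original non-ASCII string, and the intermediate `utf8_line` contains no NUL bytes, so it stays compatible with a NUL-terminated interface.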
History
Date                 User         Action  Args
2016-08-14 16:30:34  steve.dower  set     recipients: + steve.dower, gvanrossum, brett.cannon, georg.brandl, paul.moore, ncoghlan, pitrou, benjamin.peterson, eric.araujo, tshepang, Drekin
2016-08-14 16:30:34  steve.dower  set     messageid: <1471192234.73.0.194726043861.issue17620@psf.upfronthosting.co.za>
2016-08-14 16:30:34  steve.dower  link    issue17620 messages
2016-08-14 16:30:34  steve.dower  create