
Author Drekin
Recipients Drekin, benjamin.peterson, brett.cannon, eric.araujo, georg.brandl, gvanrossum, ncoghlan, pitrou, steve.dower, tshepang, vstinner
Date 2014-08-29.22:57:46
Message-id <1409353067.03.0.467496982955.issue17620@psf.upfronthosting.co.za>
Content
I realized that the behavior I want can be achieved by setting PyOS_ReadlineFunctionPointer to a function that calls sys.stdin.readline(). However, I found another problem: the Python REPL does not work at all when sys.stdin.encoding is UTF-16-LE. The tokenizer (Parser/tokenizer.c:tok_nextc) reads a line using PyOS_Readline and then tries to recode it to UTF-8. The problem is that PyOS_Readline returns a plain char *, and strlen() is used to determine its length when decoding. That makes no sense for a UTF-16-LE encoded line, since it is full of null bytes.
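
To make the strlen() problem concrete, here is a minimal sketch in Python rather than C. The helper c_strlen is hypothetical and only mimics what strlen() does on a char * buffer; it is not part of CPython. It shows that a UTF-8 line survives the length check intact, while a UTF-16-LE line is cut off at the first null byte.

```python
def c_strlen(buf: bytes) -> int:
    """Hypothetical stand-in for C strlen(): count bytes before the first NUL."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


line = "x = 1\n"

utf8_bytes = line.encode("utf-8")       # no embedded NUL bytes for this text
utf16_bytes = line.encode("utf-16-le")  # every ASCII character is followed by a NUL

# For UTF-8 the measured length matches the real buffer length.
utf8_len = c_strlen(utf8_bytes)

# For UTF-16-LE the scan stops at the high byte of the first character.
utf16_len = c_strlen(utf16_bytes)

print(len(utf8_bytes), utf8_len)
print(len(utf16_bytes), utf16_len)
```

Any decoder that trusts the second length sees only a fragment of the first code unit, so the line cannot be recoded to UTF-8.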

Why does PyOS_Readline return char * rather than a Python string object? When PyOS_ReadlineFunctionPointer points to something that produces a Unicode string (e.g. my new approach to solving #1602, or the pyreadline package), the string must be encoded and cast to char * in order to be returned from PyOS_Readline; the tokenizer then decodes it and encodes it again to UTF-8.
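
The round trip described above can be sketched in Python as follows. Both function names are hypothetical and only model the data flow; the real steps happen in C inside the readline hook and the tokenizer. The sketch assumes an encoding without embedded null bytes (UTF-8 here), since that is the only case where the char * hand-off works.

```python
def hook_result_as_bytes(line: str, encoding: str) -> bytes:
    """Model of the readline hook: a Unicode line must be encoded
    before it can be returned through a char * interface."""
    return line.encode(encoding)


def tokenizer_recode(raw: bytes, encoding: str) -> bytes:
    """Model of the tokenizer step: decode the bytes using the stdin
    encoding, then encode the result again as UTF-8."""
    return raw.decode(encoding).encode("utf-8")


line = "π = 3\n"                                # the hook already has this as str
raw = hook_result_as_bytes(line, "utf-8")       # str -> bytes (first encode)
source = tokenizer_recode(raw, "utf-8")         # bytes -> str -> bytes (decode, second encode)

# The final bytes are exactly what encoding the original string once would give.
print(source == line.encode("utf-8"))
```

The text is converted three times to arrive at a result that a single encode of the original string would have produced, which is the redundancy the question points at.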
History
Date | User | Action | Args
2014-08-29 22:57:47 | Drekin | set | recipients: + Drekin, gvanrossum, brett.cannon, georg.brandl, ncoghlan, pitrou, vstinner, benjamin.peterson, eric.araujo, tshepang, steve.dower
2014-08-29 22:57:47 | Drekin | set | messageid: <1409353067.03.0.467496982955.issue17620@psf.upfronthosting.co.za>
2014-08-29 22:57:47 | Drekin | link | issue17620 messages
2014-08-29 22:57:46 | Drekin | create |