Message226098
I realized that the behavior I want can be achieved by setting PyOS_ReadlineFunctionPointer to a function that calls sys.stdin.readline(). However, I found another problem: the Python REPL simply doesn't work when sys.stdin.encoding is UTF-16-LE. The tokenizer (Parser/tokenizer.c:tok_nextc) reads a line using PyOS_Readline and then tries to recode it to UTF-8. The problem is that PyOS_Readline returns a plain char *, and strlen() is used to determine its length when decoding, which makes no sense for a UTF-16-LE encoded line, since it is full of null bytes.
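To illustrate the strlen() problem, here is a minimal Python sketch. The helper c_strlen is a hypothetical stand-in that only mimics C strlen() semantics (count bytes up to the first NUL); it is not the tokenizer's actual code.

```python
def c_strlen(buf: bytes) -> int:
    """Mimic C strlen(): number of bytes before the first NUL byte."""
    nul = buf.find(b"\x00")
    return len(buf) if nul == -1 else nul


line = "print(1)\n"
utf8 = line.encode("utf-8")
utf16 = line.encode("utf-16-le")

# UTF-8 encoded ASCII text has no NUL bytes, so strlen() sees the whole line.
utf8_len = c_strlen(utf8)

# In UTF-16-LE every ASCII character is followed by a NUL byte,
# so strlen() stops right after the first byte.
utf16_len = c_strlen(utf16)
truncated = utf16[:utf16_len]

print(len(utf8), utf8_len)    # full length vs. strlen-style length
print(len(utf16), utf16_len)
print(truncated)
```

With a UTF-16-LE line, anything that relies on strlen() sees only the first byte, so the decoder never receives the full line.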
Why does PyOS_Readline return char * rather than a Python string object? When PyOS_ReadlineFunctionPointer points to something that produces a Unicode string (e.g. my new approach to solving #1602, or the pyreadline package), the string must be encoded and cast to char * to be returned from PyOS_Readline; the tokenizer then decodes it and encodes it again to UTF-8.
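The round trip described above can be sketched in Python as follows. Both function names are hypothetical and only model the conversions; they do not correspond to real CPython functions, and latin-1 is just an assumed stdin encoding chosen for the example.

```python
def readline_hook_result(text: str, stdin_encoding: str) -> bytes:
    """Hypothetical: a Unicode-producing hook must hand back encoded bytes (char *)."""
    return text.encode(stdin_encoding)


def tokenizer_recode(raw: bytes, stdin_encoding: str) -> bytes:
    """Hypothetical: decode with the stdin encoding, then re-encode as UTF-8."""
    return raw.decode(stdin_encoding).encode("utf-8")


text = "x = '\u00e9'\n"  # the hook already holds this as a Unicode string

# Current path: str -> bytes (stdin encoding) -> str -> bytes (UTF-8)
raw = readline_hook_result(text, "latin-1")
utf8 = tokenizer_recode(raw, "latin-1")

# If a string object could be returned, a single encode would be enough.
direct = text.encode("utf-8")

print(utf8 == direct)
```

Both paths yield the same UTF-8 bytes here, but the first needs two extra conversions and breaks down whenever the intermediate encoding contains NUL bytes.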
Date | User | Action | Args
2014-08-29 22:57:47 | Drekin | set | recipients: + Drekin, gvanrossum, brett.cannon, georg.brandl, ncoghlan, pitrou, vstinner, benjamin.peterson, eric.araujo, tshepang, steve.dower
2014-08-29 22:57:47 | Drekin | set | messageid: <1409353067.03.0.467496982955.issue17620@psf.upfronthosting.co.za>
2014-08-29 22:57:47 | Drekin | link | issue17620 messages
2014-08-29 22:57:46 | Drekin | create |