Message285210
> Inada, I think you messed up the positioning of bits of the patch. E.g. there are now test methods declared inside a helper function (rather than a test class).
I'm sorry. `patch -p1` merged the previous patch into the wrong place, and the tests passed by accident.
> Since it seems other people are in favour of this API, I would like to expand it a bit to cover two use cases (see set_encoding-newline.patch):
>
> * change the error handler without affecting the main character encoding
> * set the newline encoding (also suggested by Serhiy)
+1. Since stdio is configured before the Python program starts running, TextIOWrapper should be configurable after creation, as far as possible.
> Regarding Serhiy’s other suggestion about buffering parameters, perhaps TextIOWrapper.line_buffering could become a writable attribute instead, and the class could grow a similar write_through attribute. I don’t think these affect encoding or decoding, so they could be treated independently.
Could they go into a separate new issue?
This issue is already too long to read.
> The algorithm for rewinding unread data is complicated and can fail. What is the advantage of using it? What is the use case for reading from a stream and then changing the encoding, without a guarantee that it will work?
>
> Even if it is enhanced to never “fail”, it will still have strange behaviour, such as data loss when a decoder is fed a single byte and produces multiple characters (e.g. CR newline, backslashreplace, UTF-7).
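A minimal sketch of the "single byte, multiple characters" case mentioned above, using only the standard `codecs` module. The choice of the `ascii` codec and the byte `0xff` is just an assumption for illustration; it is not taken from any of the patches.

```python
import codecs

# Incremental decoder with the backslashreplace error handler.
decoder = codecs.getincrementaldecoder("ascii")(errors="backslashreplace")

# One undecodable byte goes in ...
chars = decoder.decode(b"\xff")

# ... and four characters come out: backslash, "x", "f", "f".
# There is no per-character byte offset to rewind to inside that one byte.
byte_count = 1
char_count = len(chars)
```

So "rewind N characters" does not always map to a whole number of bytes, which is why the rewinding algorithm cannot be made fully reliable.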
When I posted set_encoding-7.patch, I hadn't read the io module deeply. I just resolved the conflicts and ran the tests.
After that, I read the code and I feel the same way (see msg285111 and msg285112).
Let's drop support for changing the encoding while reading.
Allowing the stdin encoding to be changed only before anything has been read from it is already a significant step.
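To show why "before reading anything" is the easy case, here is a small sketch of the read-ahead behaviour. It uses an in-memory `io.BytesIO` as a stand-in for a real stream (an assumption for the example; stdin itself is not involved).

```python
import io

data = "héllo\nwörld\n".encode("latin-1")
raw = io.BytesIO(data)
text = io.TextIOWrapper(raw, encoding="latin-1")

# Ask for only the first line of text.
first = text.readline()

# The wrapper has already pulled (and decoded) more than that one line
# from the underlying stream, so the second line is sitting in its buffer
# as latin-1 decoded text.
consumed = raw.tell()
first_line_bytes = len(first.encode("latin-1"))
```

Here `consumed` is larger than `first_line_bytes`, so changing the encoding at this point would require un-decoding the buffered text. If nothing has been read yet, there is nothing to undo.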
> One step in the right direction IMO would be to only support calling set_encoding() when no extra read data has been buffered (or to explicitly say that any buffered data is silently dropped). So there is no support for changing the encoding halfway through a disk file, but it may be appropriate if you can regulate the bytes being read, e.g. from a terminal (user input), pipe, socket, etc.
Totally agree.
> But I would be happy enough without set_encoding(), and with something like my rewrap() function at the bottom of <https://github.com/vadmium/data/blob/master/data.py#L526>. It returns a fresh TextIOWrapper, but when you exit the context manager you can continue to reuse the old stream with the old settings.
I want one obvious way to control the encoding and error handler from Python (not from an environment variable).
Rewrapping the stream seems a hacky way rather than the obvious way.
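For reference, the rewrapping approach looks roughly like this. This is my own sketch, not the `rewrap()` function from the linked file; the name `rewrapped` and its keyword arguments are hypothetical, and it only covers the write side.

```python
import contextlib
import io


@contextlib.contextmanager
def rewrapped(stream, **kwargs):
    """Hypothetical helper: temporarily wrap stream.buffer with new settings."""
    stream.flush()  # push out text written with the old settings first
    wrapper = io.TextIOWrapper(stream.buffer, **kwargs)
    try:
        yield wrapper
    finally:
        # detach() flushes the temporary wrapper and releases the buffer
        # without closing it, so the old stream stays usable.
        wrapper.detach()


# Usage with an in-memory buffer standing in for stdout.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="ascii", errors="strict")
out.write("a")
with rewrapped(out, encoding="ascii", errors="backslashreplace") as tmp:
    tmp.write("\xe9")  # would raise UnicodeEncodeError on `out`
out.write("b")
out.flush()
result = raw.getvalue()
```

It works, but every caller has to know about `.buffer`, flushing order and `detach()`, which is why I don't think it is the obvious way.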
Date | User | Action | Args
2017-01-11 10:35:16 | methane | set | recipients: + methane, loewis, ishimoto, ncoghlan, pitrou, vstinner, jwilk, mrabarnett, Arfrever, nikratio, rurpy2, berker.peksag, martin.panter, serhiy.storchaka, quad
2017-01-11 10:35:16 | methane | set | messageid: <1484130916.32.0.662761708206.issue15216@psf.upfronthosting.co.za>
2017-01-11 10:35:16 | methane | link | issue15216 messages
2017-01-11 10:35:16 | methane | create |