Author vstinner
Recipients ezio.melotti, lemburg, serhiy.storchaka, vstinner
Date 2017-03-10.15:41:50
> The reason for the problem is the UTF-8 decoder (and other
> decoders) expecting an extension to the codec decoder API,
> which are not implemented in its StreamReader class (it simply
> uses the base class). It's not a problem of the base class, but
> that of the codec.
> And no: it doesn't have anything to do with
> or the StreamReaderWriter class.

open("document.txt", encoding="utf-8") uses the IncrementalDecoder of
encodings.utf_8. This object doesn't seem to have the discussed issue.

IMHO the issue is that StreamReader doesn't use an incremental
decoder. I don't see how it could support multibyte encodings and
error handlers without an incremental decoder.
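To illustrate why decoder state matters for multibyte encodings, here is a minimal sketch. It only shows the general behaviour of an incremental decoder compared with a stateless decode call; it is not a reproduction of the StreamReader bug discussed in this issue, and the variable names are made up for the example.

```python
import codecs

# "é" is encoded as two bytes in UTF-8: b'\xc3\xa9'.
data = "é".encode("utf-8")

# An incremental decoder keeps the incomplete sequence in its internal
# state until the remaining bytes arrive.
decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
first = decoder.decode(data[:1])               # no complete character yet
second = decoder.decode(data[1:], final=True)  # sequence is now complete

# A stateless decode of the same first chunk fails, because the decoder
# cannot tell a truncated sequence from invalid input.
try:
    codecs.decode(data[:1], "utf-8")
    stateless_failed = False
except UnicodeDecodeError:
    stateless_failed = True
```

Here `first` is an empty string and `second` is the decoded character, while the stateless call raises UnicodeDecodeError on the truncated chunk.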

I like the TextIOWrapper design because it only handles codecs and text
buffering. Bytes buffering is done at lower-level in a different

I'm not comfortable modifying StreamReader because it combines the roles
of TextIOWrapper and BufferedReader, and so is more complex.
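The layering described above can be sketched as follows, assuming an in-memory BytesIO as a stand-in for a real file. The sample data and variable names are only for illustration.

```python
import io

# In-memory bytes standing in for a file opened in binary mode.
raw = io.BytesIO("héllo\r\nwörld\n".encode("utf-8"))

# Lower layer: byte buffering only, no knowledge of encodings.
buffered = io.BufferedReader(raw)

# Upper layer: decoding and text buffering. newline="" still splits on
# line endings but returns them untranslated.
text = io.TextIOWrapper(buffered, encoding="utf-8", newline="")

lines = text.readlines()
```

With newline="", `lines` keeps the original "\r\n" and "\n" endings.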

>> I propose to modify to reuse the io module: call with newline=''.
>> The io module is now battle-tested and handles many corner cases of
>> incremental codecs with multibyte encodings well.
> -1. People who want to use the io module should use it directly.

When porting code to Python 3, many people chose to use
to get text files using a single code base for Python 2 and Python 3.
Once the code is ported, I don't expect that anyone will replace with You know, nobody cares about the technical

>> The next step would be to deprecate the codecs.StreamReaderWriter
>> class and the But my latest attempt to deprecate them was the PEP 400
>> and it wasn't a full success, so I now prefer to move step by step :-)
> I'm still -1 on the deprecations in PEP 400. You are essentially
> suggesting to replace the complete codecs subsystem with the
> io module, but forgetting that all codecs use StreamWriter and
> StreamReader as base classes.

Can you elaborate on "all codecs use StreamWriter and StreamReader as
base classes"? Only uses StreamReader and StreamWriter,

All codecs implement a StreamReader and a StreamWriter class, but my
question is: who uses these classes?
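For reference, a small sketch of how a codec's stream and incremental classes are reached through the codecs module. It only shows where the classes live; it says nothing about how widely they are used in practice.

```python
import codecs
import io

# Each codec registers its classes in a CodecInfo object.
info = codecs.lookup("utf-8")
is_stream_reader = issubclass(info.streamreader, codecs.StreamReader)
is_incremental = issubclass(info.incrementaldecoder, codecs.IncrementalDecoder)

# codecs.getreader() returns the codec's StreamReader class, which wraps
# a binary stream directly.
reader = codecs.getreader("utf-8")(io.BytesIO("héllo\n".encode("utf-8")))
content = reader.read()
```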

> The codecs sub system has a clean design. If used correctly
> and maintained with more care, it works really well.

It seems that we lack such a maintainer: since I wrote the PEP, many
issues are still open:

See also issue #5445 (wontfix, whereas TextIOWrapper.writelines()
uses "for line in lines") and issue #12513 (this one is not fair, io
also has the same bug: issue #12215 :-)).

> I'm tired of having to fight these fights every few years.
> Can't we just stop having them, please ?

The status quo is to do nothing, but as a consequence, bugs are still
not fixed, and users are still affected by these bugs :-( I'm
trying to find a solution.
Date                 User      Action  Args
2017-03-10 15:41:50  vstinner  set     recipients: + vstinner, lemburg, ezio.melotti, serhiy.storchaka
2017-03-10 15:41:50  vstinner  link    issue29783 messages
2017-03-10 15:41:50  vstinner  create