Delayed exception using non-text encodings with TextIOWrapper #64603
Comments
TextIOWrapper doesn't check the codec type early the way the convenience methods now do, so the open() builtin currently still suffers from the problems Victor described in bpo-19619 for str.encode() and bytes.decode():

>>> with open("hex.txt", 'w') as f:
...     f.write("aabbccddeeff")
...
12
>>> print(open('hex.txt').read())
aabbccddeeff
>>> print(open('hex.txt', encoding='hex').read())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: decoder should return a string result, not 'bytes'

These codecs are never going to work correctly with TextIOWrapper, so we should add the appropriate compatibility check in the constructor. |
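For readers following along, here is a rough Python-level sketch of what such an up-front check could look like. It is not the patch itself: the helper names `is_text_encoding` and `wrap` are hypothetical, and the sketch assumes the private `_is_text_encoding` attribute that CPython sets on `CodecInfo` objects (3.4 and later), which is an implementation detail rather than public API.

```python
import codecs
import io


def is_text_encoding(name):
    """Return True if the named codec is flagged as a str <-> bytes text encoding."""
    info = codecs.lookup(name)
    # _is_text_encoding is a private CPython attribute (assumption: 3.4+).
    # Treat codecs that lack the flag as text encodings.
    return getattr(info, "_is_text_encoding", True)


def wrap(raw, encoding):
    """Hypothetical wrapper: reject non-text codecs before building the stream."""
    if not is_text_encoding(encoding):
        raise LookupError(f"{encoding!r} is not a text encoding")
    return io.TextIOWrapper(raw, encoding=encoding)
```

With a check like this, a bytes-to-bytes codec such as `hex_codec` is rejected when the stream is created, rather than on the first read() or write().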
I also created bpo-20405 as an RFE for 3.5, since it seems there is a possible gap in capability relative to Python 2 here. |
Attached patch checks the requested encoding is a valid text encoding in TextIOWrapper.__init__. Two additional test changes were needed:
Currently, this adds a third lookup of the encoding name to the process of creating a TextIOWrapper instance. This could be reduced to just one by changing the retrieval of the encoder and decoder to look in the retrieved codec info tuple, rather than doing the lookup by name again. |
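The single-lookup idea can be illustrated from Python, although the actual change is in the C implementation. In this sketch the function name `coders_from_single_lookup` is made up for illustration; `codecs.lookup()` and the `incrementalencoder` / `incrementaldecoder` attributes of the returned `CodecInfo` are the documented API.

```python
import codecs


def coders_from_single_lookup(name, errors="strict"):
    """Build an incremental encoder and decoder from one codec registry lookup."""
    info = codecs.lookup(name)  # the only lookup by name
    # Reuse the factories stored on the CodecInfo object instead of calling
    # codecs.getincrementalencoder(name) / codecs.getincrementaldecoder(name),
    # each of which would look the name up again.
    encoder = info.incrementalencoder(errors)
    decoder = info.incrementaldecoder(errors)
    return info, encoder, decoder
```

The returned `info` object is also where a text-encoding check would be made, so one lookup can serve all three purposes.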
Revised patch that avoids doing multiple lookups of the same codec name while creating the stream. Absent any comments, I'll commit this version with appropriate NEWS and What's New updates tomorrow. |
Ah, just noticed the test case is still using the overly specific check for the exception wording. I'll fix that, too. |
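As an illustration of the less specific style of check (not the test from the patch), a test can assert only on the exception type and leave the message wording free to change. The class and method names below are hypothetical, and the sketch assumes an interpreter that already includes the constructor check, where a non-text codec raises LookupError.

```python
import io
import unittest


class NonTextEncodingTest(unittest.TestCase):
    def test_rejected_at_construction(self):
        # Only the exception type is checked, not the wording of the message.
        with self.assertRaises(LookupError):
            io.TextIOWrapper(io.BytesIO(), encoding="hex_codec")
```

Using `assertRaisesRegex` with the full message would tie the test to the exact error text.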
The _PyCodec_GetIncrementalDecoder name looks too similar to PyCodec_IncrementalDecoder. It would be better to use a more distinct name. Please also note my minor comments on Rietveld. |
v3:
|
LGTM. |
New changeset f3ec00d2b75e by Nick Coghlan in branch 'default': |
Here is the patch backported to 3.3. |
New changeset 140a69d950eb by Georg Brandl in branch '3.3': |
New changeset cf6e782a7f94 by Serhiy Storchaka in branch '2.7': |