Message263409
The openhook for fileinput is currently not called when the input
comes from sys.stdin. As a result, if the input contains invalid UTF-8
sequences, a program whose hook specifies errors='replace' will
not behave as expected:
$ cat x.py
import fileinput
import sys

def hook(filename, mode):
    print('hook called')
    return open(filename, mode, errors='replace')

for line in fileinput.input(openhook=hook):
    sys.stdout.write(line)
$ echo -e "foo\x80bar" >in.txt
$ python3 x.py in.txt
hook called
foo�bar
Good. Hook is called, and replacement character is observed.
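The replacement character comes from the decoder's error handler, not from fileinput itself. A minimal sketch, assuming in.txt holds exactly the bytes written by the echo command above (the variable names are illustrative only):

```python
# Assumed contents of in.txt: "foo", the lone byte 0x80, "bar", newline.
data = b"foo\x80bar\n"

# errors='replace' substitutes U+FFFD for the undecodable byte.
replaced = data.decode("utf-8", errors="replace")

# The default handler, errors='strict', raises instead of substituting.
try:
    data.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc
```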
$ python3 x.py <in.txt
Traceback (most recent call last):
  File "x.py", line 8, in <module>
    for line in fileinput.input(openhook=hook):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 263, in __next__
    line = self.readline()
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 363, in readline
    self._buffer = self._file.readlines(self._bufsize)
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3: invalid start byte
Hook was not called, and so we get the UnicodeDecodeError.
Should fileinput attempt to apply the hook code to stdin?
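Until that is decided, one possible workaround is to bypass the default text layer of stdin and decode its underlying binary buffer with the desired error handler. This is only a sketch: lenient_lines is a hypothetical helper, not part of fileinput, and it assumes the input is UTF-8.

```python
import io


def lenient_lines(binary_stream, encoding="utf-8", errors="replace"):
    """Yield text lines decoded from a binary stream (hypothetical helper)."""
    text = io.TextIOWrapper(binary_stream, encoding=encoding, errors=errors)
    try:
        for line in text:
            yield line
    finally:
        # Detach so that discarding the wrapper does not close the
        # caller's underlying binary stream.
        text.detach()
```

In a script it could be driven with `for line in lenient_lines(sys.stdin.buffer): sys.stdout.write(line)`. On Python 3.7 and later, `sys.stdin.reconfigure(errors='replace')` is another option, although it was not available in the 3.4.3 interpreter shown in the traceback.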

Date | User | Action | Args
2016-04-14 14:32:27 | jmb236 | set | recipients: + jmb236
2016-04-14 14:32:27 | jmb236 | set | messageid: <1460644347.16.0.552330506095.issue26756@psf.upfronthosting.co.za>
2016-04-14 14:32:27 | jmb236 | link | issue26756 messages
2016-04-14 14:32:26 | jmb236 | create |