Title: fileinput inplace clobbers file without leaving backup on decode errors
Type: behavior
Components: IO
Versions: Python 3.6
Status: open
Dependencies:
Superseder:
Nosy List: switchnode
Priority: normal

Created on 2017-06-19 04:14 by switchnode

Messages (1)
msg296304 - Author: switchnode (switchnode) Date: 2017-06-19 04:14
Consider the script:

$ cat
#!/usr/bin/env python
import fileinput
srt = fileinput.input(inplace=True)
print(srt.readline(), end='')
for line in srt:
    print(line, end='')

Called on text files, it has no visible effect: every line is written back unchanged.

$ ls -alh test.*
-rw-r--r-- 1 501 utmp 1.3G Jun 18 22:17 test.mp4
-rw-r--r-- 1 501 utmp  71K Jun 18  2017
$ ./
$ ls -alh test.*
-rw-r--r-- 1 501 utmp 1.3G Jun 18 22:17 test.mp4
-rw-r--r-- 1 501 utmp  71K Jun 18  2017

However, if the user accidentally supplies the filename of a video instead of the associated srt...

$ ./ test.mp4
Traceback (most recent call last):
  File "./", line 4, in <module>
    print(srt.readline(), end='')
  File "/usr/lib/python3.6/", line 299, in readline
    line = self._readline()
  File "/usr/lib/python3.6/", line 364, in _readline
    return self._readline()
  File "/usr/lib/python3.6/", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 43: invalid start byte
$ ls -alh test.*
-rw-r--r-- 1 501 utmp    0 Jun 18  2017 test.mp4
-rw-r--r-- 1 501 utmp  71K Jun 18  2017
$ ls -alh * | grep 'bak'

Oops! It is gone.
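
The loss can probably be reproduced without a large video file. The following is a minimal sketch under two assumptions that are not part of the report: a RuntimeError raised inside the loop stands in for the UnicodeDecodeError, and an explicit close() stands in for whatever cleanup runs when the interpreter exits. The helper name reproduce and the file name sample.txt are hypothetical.

```python
import fileinput
import os
import tempfile


def reproduce():
    """Return (size of the original file, whether a .bak file survives)."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "sample.txt")  # hypothetical test file
    with open(path, "w") as handle:
        handle.write("first\nsecond\n")

    # Same call as in the report, but on an explicit FileInput instance
    # so the module-level state of fileinput is left alone.
    stream = fileinput.FileInput(files=[path], inplace=True)
    try:
        for line in stream:
            # Assumption: stands in for the decode error in the report.
            raise RuntimeError("simulated read failure")
    except RuntimeError:
        pass
    finally:
        # Assumption: stands in for the cleanup at interpreter exit.
        stream.close()

    return os.path.getsize(path), os.path.exists(path + ".bak")


size, backup_exists = reproduce()
print(size, backup_exists)
```

If the assumptions hold, the original file comes back empty and no .bak file is left next to it, which matches the listing above.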

I'm not sure why this happens. (Since the script does not use the context-manager syntax, I would expect the program to exit with the exception, never close the FileInput, and leave the backup file behind. That would certainly be the more merciful outcome.)
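
Until the behavior is settled, one possible defensive pattern is to request a named backup through the documented backup argument and restore it when anything goes wrong. This is only a sketch of a workaround, not a fix for fileinput itself; the function name rewrite_in_place and its parameters are hypothetical, and it assumes that a FileInput created with a non-empty backup extension keeps the backup file when it is closed.

```python
import fileinput
import os


def rewrite_in_place(path, transform, backup_ext=".bak"):
    """Rewrite path line by line; restore the original on any failure.

    Hypothetical helper: transform is any callable mapping a line to a line.
    """
    backup_path = path + backup_ext
    try:
        # A non-empty backup extension asks fileinput to keep the backup.
        with fileinput.FileInput(files=[path], inplace=True,
                                 backup=backup_ext) as stream:
            for line in stream:
                # stdout is redirected into the file while iterating.
                print(transform(line), end="")
    except Exception:
        # The with block has already closed the stream and restored stdout.
        if os.path.exists(backup_path):
            os.replace(backup_path, path)  # put the original contents back
        raise
    else:
        # Success: drop the backup, mimicking the default behavior.
        if os.path.exists(backup_path):
            os.remove(backup_path)
```

Called as rewrite_in_place(name, lambda line: line), this behaves like the script above on text files, while a failure partway through should leave the input as it was instead of truncated.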
