The `f.seek()` method may silently wrap around at offsets greater than
`1 << 64` (on AMD64) while still returning the original seek offset:
$ strace -e lseek python3
[...] bunch of strace output
>>> f = open("/tmp/whatever", "w")
[...] bunch of strace output
>>> f.seek((1 << 64) + 1234)
lseek(3, 1234, SEEK_SET) = 1234
18446744073709552850
>>> _ == (1 << 64) + 1234
True
When the MSB is set to `1` (i.e. the value represents a negative
`long` number), the seek does overflow and raises an error:
>>> f.seek((1<<64) - 1)
lseek(3, -1, SEEK_SET) = -1 EINVAL (Invalid argument)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument
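The two `lseek` arguments seen above (`1234` and `-1`) are consistent with the Python offset being reduced to its low 64 bits and then read as a signed value. Here is a minimal sketch of that arithmetic, assuming a 64-bit two's complement `off_t`. The helper `as_signed_64` is hypothetical and only illustrates the wrap-around; it is not part of CPython:

```python
def as_signed_64(offset):
    """Keep the low 64 bits of `offset` and reinterpret them as signed."""
    low = offset & ((1 << 64) - 1)  # drop everything above bit 63
    # If bit 63 (the MSB) is set, the value is negative in two's complement.
    return low - (1 << 64) if low >= (1 << 63) else low


print(as_signed_64((1 << 64) + 1234))  # 1234
print(as_signed_64((1 << 64) - 1))     # -1
```

The first value is a valid offset, so `lseek` succeeds; the second is negative, so `lseek` fails with `EINVAL`.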
This results in confusing behavior: an erroneously large seek offset
may or may not produce an error, depending on the value of the MSB.
The expected behavior would be for both of the above calls to fail, in
accordance with the Zen of Python:
> Errors should never pass silently.
> Unless explicitly silenced.
The issue is only present for text mode files; binary files raise an
error, as expected:
ValueError: cannot fit 'int' into an offset-sized integer
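To compare the two modes side by side, something like the following sketch may help. It assumes a throwaway file in a temporary directory instead of `/tmp/whatever`, and `try_seek` is a hypothetical helper name. The text mode result may differ between Python versions, so the script only reports what happens and does not assume either outcome:

```python
import os
import tempfile

BIG_OFFSET = (1 << 64) + 1234  # same offset as in the session above


def try_seek(mode, offset=BIG_OFFSET):
    """Open a scratch file in `mode`, seek to `offset`, describe the outcome."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "whatever")
        with open(path, mode) as f:
            try:
                result = f.seek(offset)
            except (OSError, ValueError, OverflowError) as exc:
                return "raised {}: {}".format(type(exc).__name__, exc)
            return "returned {}".format(result)


if __name__ == "__main__":
    for mode in ("w", "wb"):  # text mode first, then binary mode
        print(mode, "->", try_seek(mode))
```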
After some digging I found that the issue comes from `TextIOWrapper`,
specifically from `textiowrapper_parse_cookie(cookie_type *cookie,
PyObject *cookieObj)`, which calls `PyNumber_Long` and silently
truncates the incoming offset to 64 bits. The issue can also be
reproduced on Python 2.7 when using `io.open` in text mode.