This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Non-zero `offset`s are no longer acceptable with SEEK_END/SEEK_CUR implementation of `seek` in python3 when in text mode, breaking py 2.x behavior/POSIX
Type:
Stage: resolved
Components: IO
Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 3.4
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: methane, ngie, serhiy.storchaka, steven.daprano
Priority: normal
Keywords:

Created on 2019-02-25 22:12 by ngie, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg336564 - Author: Enji Cooper (ngie) Date: 2019-02-25 22:12
I tried using os.SEEK_END in a technical interview, but unfortunately, that didn't work with Python 3.x:

pinklady:cpython ngie$ python3
Python 3.7.2 (default, Feb 12 2019, 08:15:36) 
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> fp = open("configure"); fp.seek(-100, os.SEEK_END)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: can't do nonzero end-relative seeks

It does, however, work with 2.x, which matches the POSIX specification, as shown below:

pinklady:cpython ngie$ python
Python 2.7.15 (default, Oct  2 2018, 11:47:18) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> fp = open("configure"); fp.seek(-100, os.SEEK_END)
>>> fp.tell()
501076
>>> os.stat("configure").st_size
501176
>>>
msg336565 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2019-02-25 22:33
I believe you will find that this is because you opened the file in text mode, which means Unicode, not bytes. If you open it in binary mode, the POSIX spec applies:

py> fp = open("sample", "rb"); fp.seek(-100, os.SEEK_END)
350

Supported values for seeking in text (Unicode) files are documented here:

https://docs.python.org/3/library/io.html#io.TextIOBase.seek

I don't believe this is a bug, or that it can be changed. Do you still think otherwise? If not, we should close this ticket.
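
A minimal sketch of the difference described above, assuming a throwaway 200-byte ASCII file created just for the demonstration (the file and the variable names are illustrative and not from the original report). The text-mode seek is rejected, while the same call on a binary-mode file returns the new byte position:

```python
import io
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"0123456789" * 20)  # 200 bytes of ASCII
    sample_path = tmp.name

# Text mode: a nonzero end-relative seek raises io.UnsupportedOperation.
with open(sample_path, "r") as fp:
    try:
        fp.seek(-100, os.SEEK_END)
        text_mode_error = None
    except io.UnsupportedOperation as exc:
        text_mode_error = str(exc)

# Binary mode: the same seek succeeds and returns the absolute byte offset.
with open(sample_path, "rb") as fp:
    binary_pos = fp.seek(-100, os.SEEK_END)

os.remove(sample_path)

print(text_mode_error)
print(binary_pos)
```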
msg336567 - Author: Enji Cooper (ngie) Date: 2019-02-25 22:42
?!

Being blunt: why should opening a file in binary vs text mode matter? POSIX doesn't make this distinction.

Per the pydoc (https://docs.python.org/2/library/functions.html#open):

> The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading.

If this is one of the only differentiators between binary and text mode, why should certain types of seeking be made impossible?

Having to stat the file and then set the cursor to the file size minus the offset breaks the 'seek(..)' interface, and having to use 'rb' and then convert from bytes to unicode overcomplicates things :(.
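
For reference, the 'rb'-then-decode route mentioned above can be sketched roughly as follows. `read_tail` is a hypothetical helper name, UTF-8 is an assumed encoding, and `errors="replace"` is one possible way to cope with a tail that starts in the middle of a multi-byte character; none of this comes from the original thread:

```python
import os
import tempfile


def read_tail(path, nbytes, encoding="utf-8"):
    """Return roughly the last `nbytes` bytes of `path`, decoded as text.

    Hypothetical helper: the slice is taken in bytes, so it may begin in
    the middle of a multi-byte character; such fragments are decoded as
    U+FFFD because of errors="replace".
    """
    with open(path, "rb") as fp:
        size = fp.seek(0, os.SEEK_END)        # a zero end-relative seek returns the file size
        fp.seek(max(size - nbytes, 0), os.SEEK_SET)
        data = fp.read()
    return data.decode(encoding, errors="replace")


# Demonstration on a throwaway file whose final character takes two bytes in UTF-8.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(("abc" * 10 + "é").encode("utf-8"))
    demo_path = tmp.name

tail_whole = read_tail(demo_path, 5)   # the cut falls on a character boundary
tail_split = read_tail(demo_path, 1)   # the cut falls inside the two-byte "é"
os.remove(demo_path)
```

The second call shows why a byte offset is not always a meaningful position in a decoded text stream: one byte from the end lands inside a character.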
msg336588 - Author: Enji Cooper (ngie) Date: 2019-02-26 00:29
Opening and seeking using SEEK_END worked in text mode with Python 2.7. I'm not sure why 3.x should depart from this behavior:

>>> fp = open("configure", "rt"); fp.seek(-100, os.SEEK_END)
>>> fp.tell()
501076
msg336606 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2019-02-26 05:26
This is unrelated to POSIX, since POSIX says nothing about Unicode files. "Text mode" in POSIX means binary files with converted newlines. This mode is not supported in Python 3.
msg336617 - Author: Inada Naoki (methane) (Python committer) Date: 2019-02-26 06:32
If you want byte IO, you can use "rb" mode.  You can seek on it.
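
For text-mode files, the seeks that remain available are the ones listed in the io.TextIOBase.seek documentation linked earlier: a zero offset with SEEK_CUR or SEEK_END, or an absolute seek to a value previously returned by tell(). A small sketch, assuming a throwaway three-line file (file contents and variable names are illustrative only):

```python
import os
import tempfile

# Create a small throwaway text file for the demonstration.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird\n")
    text_path = tmp.name

with open(text_path, "r", encoding="utf-8") as fp:
    first = fp.readline()
    cookie = fp.tell()            # opaque position; only meaningful to seek()
    second = fp.readline()

    fp.seek(0, os.SEEK_END)       # zero offset from the end is allowed
    at_end = fp.read()            # nothing left to read at end of file

    fp.seek(cookie)               # absolute seek back to a tell() value
    again = fp.readline()         # re-reads the second line

os.remove(text_path)
```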
History
Date User Action Args
2022-04-11 14:59:11 admin set github: 80292
2019-02-26 06:32:35 methane set nosy: + methane
    messages: + msg336617
2019-02-26 06:31:08 methane set status: open -> closed
    resolution: not a bug
    stage: resolved
2019-02-26 05:26:43 serhiy.storchaka set nosy: + serhiy.storchaka
    messages: + msg336606
2019-02-26 03:43:53 ngie set title: Non-zero `offset`s are no longer acceptable with implementation of `seek` in some cases with python3 when in text mode; should be per POSIX -> Non-zero `offset`s are no longer acceptable with SEEK_END/SEEK_CUR implementation of `seek` in python3 when in text mode, breaking py 2.x behavior/POSIX
2019-02-26 03:43:02 ngie set title: Negative `offset` values are no longer acceptable with implementation of `seek` with python3 when in text mode; should be per POSIX -> Non-zero `offset`s are no longer acceptable with implementation of `seek` in some cases with python3 when in text mode; should be per POSIX
2019-02-26 00:30:14 ngie set versions: + Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
2019-02-26 00:29:37 ngie set messages: + msg336588
2019-02-25 22:44:15 ngie set title: Negative `offset` values are no longer acceptable with implementation of `seek` with python3; should be per POSIX -> Negative `offset` values are no longer acceptable with implementation of `seek` with python3 when in text mode; should be per POSIX
2019-02-25 22:42:10 ngie set messages: + msg336567
2019-02-25 22:33:23 steven.daprano set nosy: + steven.daprano
    messages: + msg336565
2019-02-25 22:12:02 ngie create