
classification
Title: io.Buffer*.seek() doesn't seek if "seeking leaves us inside the current buffer"
Components: IO
Versions: Python 3.3
process
Status: closed
Resolution: not a bug
Nosy List: pitrou, vstinner
Priority: normal

Created on 2011-05-19 16:04 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg136296 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-05-19 16:04
Example:

with open("setup.py", "rb") as f:
    # read smaller than the file size to fill the readahead buffer
    f.read(1)
    # seek doesn't seek
    f.seek(0)
    print("f pos=", f.tell())
    print("f.raw pos=", f.raw.tell())

Output:

f pos= 0
f.raw pos= 4096

I expect f.raw.tell() to be 0.
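
The example above depends on a local setup.py. A self-contained variant, offered only as a sketch, might look like the following. The scratch file, its size, and the explicit buffering=4096 are assumptions chosen to resemble the output shown above; the exact raw position depends on the buffer size and the Python build.

```python
import os
import tempfile

# Hypothetical scratch file standing in for setup.py.
# It is deliberately larger than the read buffer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10000)
    path = tmp.name

try:
    # buffering=4096 is an assumption made to mirror the report.
    with open(path, "rb", buffering=4096) as f:
        f.read(1)                  # fills the readahead buffer
        f.seek(0)                  # may be served from the buffer only
        buffered_pos = f.tell()    # logical position of the buffered file
        raw_pos = f.raw.tell()     # position of the underlying raw file
finally:
    os.remove(path)

print("f pos=", buffered_pos)
print("f.raw pos=", raw_pos)
```

If the buffered seek is served from the readahead buffer, the two printed positions differ, which is the behaviour this report describes.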

Extract of Modules/_io/buffered.c:

    if (whence != 2 && self->readable) {
        Py_off_t current, avail;
        /* Check if seeking leaves us inside the current buffer,
           so as to return quickly if possible. Also, we needn't take the
           lock in this fast path.
           Don't know how to do that when whence == 2, though. */
        /* NOTE: RAW_TELL() can release the GIL but the object is in a stable
           state at this point. */
        current = RAW_TELL(self);
        avail = READAHEAD(self);
        printf("current=%"  PY_PRIdOFF ", avail=%"  PY_PRIdOFF "\n", current, avail);
        if (avail > 0) {
            Py_off_t offset;
            if (whence == 0)
                offset = target - (current - RAW_OFFSET(self));
            else
                offset = target;
            printf("offset=%"  PY_PRIdOFF "\n", offset);
            if (offset >= -self->pos && offset <= avail) {
                printf("NO SEEK!\n");
                self->pos += offset;
                return PyLong_FromOff_t(current - avail + offset);
            }
        }
    }
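
To make the condition in the extract easier to follow, here is a simplified Python model of the same check. It is only a sketch: the function name and parameters (fast_path_seek, raw_tell, buf_pos, read_end) are hypothetical, and it assumes the raw file position sits exactly at the end of the buffered data, which is the situation right after a read fills the buffer.

```python
def fast_path_seek(target, whence, raw_tell, buf_pos, read_end):
    """Return the new logical position if the seek stays inside the
    buffer, or None if a real seek on the raw file would be needed.

    Hypothetical model only. Assumes the raw position equals the end
    of the valid buffered data.
    """
    if whence == 2:
        return None                      # the extract skips whence == 2
    avail = read_end - buf_pos           # bytes readable without a raw read
    if avail <= 0:
        return None
    logical_pos = raw_tell - avail       # position seen by the caller
    if whence == 0:
        offset = target - logical_pos    # absolute seek, made relative
    else:
        offset = target                  # whence == 1 is already relative
    if -buf_pos <= offset <= avail:
        return logical_pos + offset      # only the buffer index would move
    return None


# The case from the report: 4096 bytes buffered, 1 byte consumed, seek(0).
print(fast_path_seek(0, 0, raw_tell=4096, buf_pos=1, read_end=4096))
# A target beyond the buffered data falls through to a real seek.
print(fast_path_seek(5000, 0, raw_tell=4096, buf_pos=1, read_end=4096))
```

In this model, seek(0) after read(1) yields offset -1, which is inside the range [-buf_pos, avail], so the raw file is never touched.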

I found this weird behaviour when trying to understand why:

        with open("setup.py", 'rb') as f:
            encoding, lines = tokenize.detect_encoding(f.readline)
        with open("setup.py", 'r', encoding=encoding) as f:
            imp.load_module("setup", f, "setup.py", (".py", "r", imp.PY_SOURCE))

behaves differently from:

        with tokenize.open("setup.py") as f:
            imp.load_module("setup", f, "setup.py", (".py", "r", imp.PY_SOURCE))

imp.load_module() clones the file using something like fd = os.dup(f.fileno()); clone = os.fdopen(fd, "r").
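
The effect of such a clone can be sketched without imp. A descriptor created by os.dup() shares its file offset with the original, so the clone starts wherever the raw file currently is, not where the buffered object reports. The scratch file and buffering=4096 below are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file: first 4096 bytes are "a", the next 4096 are "b".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a" * 4096 + b"b" * 4096)
    path = tmp.name

try:
    with open(path, "rb", buffering=4096) as f:
        f.read(1)    # raw file advances by a whole buffer
        f.seek(0)    # logical position is 0 again
        # Clone the file the way the message describes.
        with os.fdopen(os.dup(f.fileno()), "rb") as clone:
            clone_first = clone.read(1)   # read from the shared raw offset
        original_first = f.read(1)        # served from f's own buffer
finally:
    os.remove(path)

print("original reads:", original_first)
print("clone reads:", clone_first)
```

When the buffered seek leaves the raw offset untouched, the clone does not see the start of the file even though the original reports position 0.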

For tokenize.open(), a workaround is to replace:
   buffer.seek(0)
by
   buffer.seek(0); buffer.raw.seek(0)
msg136297 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-05-19 16:07
Note: _pyio.BufferedReader(), _pyio.BufferedWriter(), _pyio.BufferedRandom() don't use this optimization. They might be patched too.
msg136298 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2011-05-19 16:16
This is by design.
msg136306 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-05-19 16:39
And how can I seek the raw file to zero?

After calling buffer.raw.seek(0), buffer.tell() becomes inconsistent:

$ ./python 
Python 3.2.1b1 (3.2:bd5e4d8c8080, May 15 2011, 10:22:54) 
>>> buffer=open('setup.py', 'rb')
>>> buffer.read(1)
>>> buffer.tell()
1
>>> buffer.raw.tell()
4096
>>> buffer.raw.seek(0)
0
>>> buffer.raw.tell()
0
>>> buffer.tell()
-4095

Same problem with os.lseek():

$ ./python 
Python 3.2.1b1 (3.2:bd5e4d8c8080, May 15 2011, 10:22:54) 
>>> import os
>>> buffer=open("setup.py", "rb")
>>> buffer.read(1)
>>> os.lseek(buffer.fileno(), 0, 0)
0
>>> buffer.raw.tell()
0
>>> buffer.tell()
-4095
msg136309 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2011-05-19 16:44
Simple: you are not supposed to use the raw file if you wrapped it inside a buffered file.
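
One option that is not raised in this thread, shown here only as a sketch, is to stop using the buffered wrapper before touching the raw file. BufferedReader.detach() returns the raw stream and leaves the buffered object unusable, so the two can no longer disagree. The scratch file is an assumption made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10000)
    path = tmp.name

f = open(path, "rb")
try:
    f.read(1)
    # After detach() the buffered object must not be used any more,
    # which is why f is not managed by a with statement here.
    raw = f.detach()
    try:
        raw.seek(0)                      # a real seek on the raw file
        raw_pos_after_seek = raw.tell()
        first_byte = raw.read(1)
    finally:
        raw.close()
finally:
    os.remove(path)

print("raw pos=", raw_pos_after_seek)
print("first byte=", first_byte)
```

Whether this fits a given caller depends on whether the buffered object is still needed afterwards; reopening the file by name is another way to avoid sharing a descriptor.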
History
Date                 User      Action  Args
2022-04-11 14:57:17  admin     set     github: 56325
2011-05-19 16:44:17  pitrou    set     messages: + msg136309
2011-05-19 16:39:34  vstinner  set     messages: + msg136306
2011-05-19 16:16:41  pitrou    set     status: open -> closed; resolution: not a bug; messages: + msg136298
2011-05-19 16:07:19  vstinner  set     messages: + msg136297
2011-05-19 16:04:45  vstinner  create