classification
Title: os.lseek() and FileIO.seek() does not support offset larger than 2^63-1
Type: Stage:
Components: IO Versions: Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Kuberan.Naganathan, jcea, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2011-07-13 03:29 by Kuberan.Naganathan, last changed 2020-11-17 21:36 by neologix.

Files
File name Uploaded Description Edit
lseek_negative.diff neologix, 2011-07-22 20:35 review
lseek_unsigned.patch vstinner, 2013-06-06 20:16 review
lseek_unsigned_test.py vstinner, 2013-06-06 20:29
Messages (21)
msg140222 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-13 03:29
The lseek function can legitimately return a code less then zero ( except for -1 ) when seeking beyond an offset of 2^63.  This behavior should be supported in order to permit the python interpreter to seek in files with valid data at locations greater than or equal to 2^63.  This can happen in a sparse file or in the /proc file system address space file.  

The fix is simple.  In the posix_lseek function check for result != -1 instead of checking for result < 0 in return code checks of the value returned by lseek.
msg140223 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-13 03:50
In addition I would like the posix_lseek function to accept a value larger than 2^63 as a seek offset.  Currently I seem to need to seek multiple times within a file to achieve an offset larger than 2^63 ( assuming the return type check on the return value of lseek is fixed. )
msg140250 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-13 12:09
> In addition I would like the posix_lseek function to accept a value
> larger than 2^63 as a seek offset

How would it work? The C lseek() takes a signed (64-bit) offset as argument, so we would have to call it multiple times anyway.
msg140251 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-13 13:08
"... A negative file offset may be valid for some devices in some implementations.

The POSIX.1-1990 standard did not specifically prohibit lseek() from returning a negative offset. Therefore, an application was required to clear errno prior to the call and check errno upon return to determine whether a return value of ( off_t)-1 is a negative offset or an indication of an error condition. The standard developers did not wish to require this action on the part of a conforming application, and chose to require that errno be set to [EINVAL] when the resulting file offset would be negative for a regular file, block special file, or directory."

Extract of:
http://pubs.opengroup.org/onlinepubs/009695399/functions/lseek.html
msg140253 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-13 14:21
> In the posix_lseek function check for result != -1 instead of checking
> for result < 0 in return code checks of the value returned by lseek.

+1
This can be useful when parsing /dev/mem on x86_64, for example.

> In addition I would like the posix_lseek function to accept a value
> larger than 2^63 as a seek offset

Like Antoine, I'm a little bit puzzled here.
msg140261 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-13 14:45
> How would it work? The C lseek() takes a signed (64-bit) offset
> as argument, so we would have to call it multiple times anyway.

Python does already call multiple times the same function if the input is larger than the type used by the function. Some examples:

 - mbcs codec: work on chunk on INT_MAX bytes (input size type: Py_ssize_t)
 - zlib: crc32() works on chunk on UINT_MAX bytes (input size type: Py_ssize_t)

We have also functions processing fewer data if the function cannot handle all data: FileIO.write() clamps the buffer size of INT_MAX on Windows for example.

If we call lseek() multiple times, the implementation will depend on the whence value: SEEK_SET will use SEEK_SET and then SEEK_CUR. For SEEK_CUR, all calls can use SEEK_CUR. For SEEK_END... hum? SEEK_END and then SEEK_CUR?

I vote -1 for this feature because I bet that the behaviour of lseek() with an file position > 2^63 or offset > 2^63 highly depend on the platform (kernel, libc) version.

You can implement it in Python for your usecase. I prefer to keep a thin wrapper with an known and reliable behaviour.
msg140270 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-13 15:24
Im most familiar with solaris.  Atleast on solaris lseek interprets its
signed arg as the unsigned value with the same bit pattern.  I cannot be
certain that this is common across other operating systems.
On Jul 13, 2011 8:09 AM, "Antoine Pitrou" <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> In addition I would like the posix_lseek function to accept a value
>> larger than 2^63 as a seek offset
>
> How would it work? The C lseek() takes a signed (64-bit) offset as
argument, so we would have to call it multiple times anyway.
>
> ----------
> nosy: +neologix, pitrou
> versions: +Python 3.2, Python 3.3
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12545>
> _______________________________________
msg140272 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-13 15:31
> Im most familiar with solaris.  Atleast on solaris lseek interprets its
> signed arg as the unsigned value with the same bit pattern.

Only with SEEK_SET, I assume?
msg140289 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-13 16:58
I did some testing on this.  Solaris seems to treat files as a large
circular buffer of size 2^64 with respect to seeks in the /proc file system
address space file. No error results with seeks of greater than 2^63 with
any of SEEK_SET, SEEK_CUR, SEEK_END.  This is not true of a regular file
where I get errno=22 when seeking to an offset greater than or equal to
2^63.

So even on solaris the behavior seems to be filesystem dependent.
msg140605 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-18 17:15
> So even on solaris the behavior seems to be filesystem dependent.
>

So  I'd suggest forgetting about this part.
Would you like to write a patch for the first point?
msg140715 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-20 02:07
Hi.  I can submit a patch for the first part.  Should I submit on this issue tracker item?
msg140721 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-20 06:46
> Hi.  I can submit a patch for the first part.  Should I submit on this issue tracker item?
>

Sure.
msg140723 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-20 08:18
> So I'd suggest forgetting about this part.

If someone really wants this feature (relative seek of more than 2^63), he/she should open a new issue.

This issue remembers me a mktime() issue: mktime() may return -1 even if it is not an error. time_mktime() uses now a sentinel to detect an error:

    buf.tm_wday = -1;  /* sentinel; original value ignored */
    tt = mktime(&buf);
    /* Return value of -1 does not necessarily mean an error, but tm_wday
     * cannot remain set to -1 if mktime succeeded. */
    if (tt == (time_t)(-1) && buf.tm_wday == -1) /* OverflowError */

For lseek, we can rely on errno. Try something like that:

errno = 0;
offset = lseek(...);
if (offset == (off_t)-1 && errno) /* error */

We can write a test using a sparse file... Or maybe a mmap object?
msg140807 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-21 11:07
Hi.  I'm unable ( or have to jump through lots of hoops ) to submit this patch myself for regulatory reasons.  Can someone else submit this please?  Thanks.
msg140904 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-22 20:35
Patch attached.

> For lseek, we can rely on errno. Try something like that:
>
> errno = 0;
> offset = lseek(...);
> if (offset == (off_t)-1 && errno) /* error */
>

It's a little bit overkill :-) (for mktime, time_t can overflow easily
on 32-bit).

> We can write a test using a sparse file... Or maybe a mmap object?
>

I'm not sure it's easily testable, because it's really a corner case
not addressed by POSIX. On my Linux box, I can't get lseek to return a
negative value, I get EINVAL - which does make sense (the Linux kernel
doesn't accept or return negative file offsets, see
http://lwn.net/Articles/138063/).

Kuberan, on what operating system did you notice this problem? Solaris?
msg140923 - (view) Author: Kuberan Naganathan (Kuberan.Naganathan) Date: 2011-07-23 01:16
yes.  i noticed problem on solaris on the /proc/<pid>/as file which usually
has mapped regions beyond 2^63 in most process files
On Jul 22, 2011 4:35 PM, "Charles-François Natali" <report@bugs.python.org>
wrote:
>
> Charles-François Natali <neologix@free.fr> added the comment:
>
> Patch attached.
>
>> For lseek, we can rely on errno. Try something like that:
>>
>> errno = 0;
>> offset = lseek(...);
>> if (offset == (off_t)-1 && errno) /* error */
>>
>
> It's a little bit overkill :-) (for mktime, time_t can overflow easily
> on 32-bit).
>
>> We can write a test using a sparse file... Or maybe a mmap object?
>>
>
> I'm not sure it's easily testable, because it's really a corner case
> not addressed by POSIX. On my Linux box, I can't get lseek to return a
> negative value, I get EINVAL - which does make sense (the Linux kernel
> doesn't accept or return negative file offsets, see
> http://lwn.net/Articles/138063/).
>
> Kuberan, on what operating system did you notice this problem? Solaris?
>
> ----------
> keywords: +patch
> Added file: http://bugs.python.org/file22719/lseek_negative.diff
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12545>
> _______________________________________
msg190724 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-06-06 20:16
Oh, I just reproduced the issue on my gdb.py program, which is part of the python-ptrace program. When searching for a pattern in the memory of another process, /proc/pid/maps is used to get the list of all memory mappings of this process. On Linux x64, the last mapping is "ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]", machine code used for system calls. gdb.py opens /proc/pid/mem in binary mode and tries to seek at 0xffffffffff600000 but FileIO.seek() fails because it gets an offset bigger than 2^63, which is seen as a negative number and FileIO.seek() considers a negative number as an error.

Attached patch fixes this issue by checking not only the result of lseek(), but also errno:

    errno = 0;
    Py_BEGIN_ALLOW_THREADS
#if defined(MS_WIN64) || defined(MS_WINDOWS)
    res = _lseeki64(fd, pos, how);
#else
    res = lseek(fd, pos, how);
#endif
    Py_END_ALLOW_THREADS
    if (res == (off_t)-1 && errno)
        return posix_error();

PyLong_AsUnsignedLong() is tried if PyLong_AsLong() failed with an OverflowError. I kept PyLong_AsLong() for relative seek with a negative offset, and I prefer to try first PyLong_AsLong() because the overflow case is very unlikely compared to negative offsets for relative seek.

The result is always casted to an unsigned number. I never seen a device or file descriptor supporting negative offset.

I don't know how to test this code path automatically. I only know the "[vsyscall]" mapping use case on Linux, and it requires to parse /proc/self/maps which is not trivial and I would prefer sometimes simpler (more reliable with less maintenance). I propose to not add an unit test, except someone find how to write a simple and reliable test.

> This is not true of a regular file where I get errno=22
> when seeking to an offset greater than or equal to 2^63.

Same behaviour on Linux: OSError(22, "Invalid argument") when I try file.seek(1 << 63) on a regular file.
msg190725 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-06-06 20:17
About the version field: If we decide to change os.lseek() and FileIO.seek(), it is safer to only do it in Python 3.4.
msg190727 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-06-06 20:27
lseek_negative.diff is not correct: 0x8000000000000000 is a valid offset for a /proc/pid/mem file.
msg190728 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-06-06 20:29
lseek_unsigned_test.py: script to test manually the patch on Linux. It may work on Solaris, I didn't try.
msg190729 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-06-06 20:32
The patch is not complete: _iomodule.h defines PyLong_AsOff_t() and PyLong_FromOff_t(), but these functions don't support unsigned off_t (larger than 2^63-1).

#if defined(MS_WIN64) || defined(MS_WINDOWS)

/* Windows uses long long for offsets */
typedef PY_LONG_LONG Py_off_t;
# define PyLong_AsOff_t     PyLong_AsLongLong
# define PyLong_FromOff_t   PyLong_FromLongLong
# define PY_OFF_T_MAX       PY_LLONG_MAX
# define PY_OFF_T_MIN       PY_LLONG_MIN
# define PY_OFF_T_COMPAT    PY_LONG_LONG /* type compatible with off_t */
# define PY_PRIdOFF         "lld"        /* format to use for that type */

#else

/* Other platforms use off_t */
typedef off_t Py_off_t;
#if (SIZEOF_OFF_T == SIZEOF_SIZE_T)
# define PyLong_AsOff_t     PyLong_AsSsize_t
# define PyLong_FromOff_t   PyLong_FromSsize_t
# define PY_OFF_T_MAX       PY_SSIZE_T_MAX
# define PY_OFF_T_MIN       PY_SSIZE_T_MIN
# define PY_OFF_T_COMPAT    Py_ssize_t
# define PY_PRIdOFF         "zd"
#elif (HAVE_LONG_LONG && SIZEOF_OFF_T == SIZEOF_LONG_LONG)
# define PyLong_AsOff_t     PyLong_AsLongLong
# define PyLong_FromOff_t   PyLong_FromLongLong
# define PY_OFF_T_MAX       PY_LLONG_MAX
# define PY_OFF_T_MIN       PY_LLONG_MIN
# define PY_OFF_T_COMPAT    PY_LONG_LONG
# define PY_PRIdOFF         "lld"
#elif (SIZEOF_OFF_T == SIZEOF_LONG)
# define PyLong_AsOff_t     PyLong_AsLong
# define PyLong_FromOff_t   PyLong_FromLong
# define PY_OFF_T_MAX       LONG_MAX
# define PY_OFF_T_MIN       LONG_MIN
# define PY_OFF_T_COMPAT    long
# define PY_PRIdOFF         "ld"
#else
# error off_t does not match either size_t, long, or long long!
#endif

#endif

My patch fixes FileIO.seek(), but no BufferedReader & friends. So seek() works on open("/proc/self/mem", "rb", 0), but not on open("/proc/self/mem", "rb").

I didn't notice immediatly because I'm using an unbuffered file in my python-pytrace project.
History
Date User Action Args
2020-11-17 21:36:19neologixsetnosy: - neologix
2020-11-17 19:07:45iritkatrielsetversions: + Python 3.9, Python 3.10, - Python 3.4
2013-06-06 20:32:25vstinnersetmessages: + msg190729
2013-06-06 20:29:35vstinnersetfiles: + lseek_unsigned_test.py

messages: + msg190728
2013-06-06 20:27:58vstinnersetmessages: + msg190727
2013-06-06 20:18:23vstinnersetfiles: - unnamed
2013-06-06 20:18:21vstinnersetfiles: - unnamed
2013-06-06 20:18:20vstinnersetfiles: - unnamed
2013-06-06 20:17:55vstinnersettitle: Incorrect handling of return codes in the posix_lseek function in posixmodule.c -> os.lseek() and FileIO.seek() does not support offset larger than 2^63-1
2013-06-06 20:17:32vstinnersetmessages: + msg190725
versions: + Python 3.4, - Python 2.7, Python 3.2, Python 3.3
2013-06-06 20:16:32vstinnersetfiles: + lseek_unsigned.patch

messages: + msg190724
2011-07-23 01:16:31Kuberan.Naganathansetfiles: + unnamed

messages: + msg140923
2011-07-22 20:35:06neologixsetfiles: + lseek_negative.diff
keywords: + patch
messages: + msg140904
2011-07-21 11:07:23Kuberan.Naganathansetmessages: + msg140807
2011-07-20 08:18:45vstinnersetmessages: + msg140723
2011-07-20 06:46:46neologixsetmessages: + msg140721
2011-07-20 02:07:30Kuberan.Naganathansetmessages: + msg140715
2011-07-18 17:15:50neologixsetmessages: + msg140605
2011-07-13 18:05:58jceasetnosy: + jcea
2011-07-13 16:58:55Kuberan.Naganathansetfiles: + unnamed

messages: + msg140289
2011-07-13 15:31:51pitrousetmessages: + msg140272
2011-07-13 15:24:29Kuberan.Naganathansetfiles: + unnamed

messages: + msg140270
2011-07-13 14:45:13vstinnersetmessages: + msg140261
2011-07-13 14:21:16neologixsetmessages: + msg140253
2011-07-13 13:08:28vstinnersetmessages: + msg140251
2011-07-13 12:09:55pitrousetnosy: + neologix, pitrou

messages: + msg140250
versions: + Python 3.2, Python 3.3
2011-07-13 12:05:56pitrousetnosy: + vstinner
2011-07-13 03:50:43Kuberan.Naganathansetmessages: + msg140223
2011-07-13 03:29:18Kuberan.Naganathancreate