classification
Title: os.path.isfile() with big number cause OverflowError: fd is greater than maximum
Type: behavior Stage: resolved
Components: Argument Clinic Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Anj-A, aldwinaldwin, larry, paul.moore, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware, zufuliu
Priority: normal Keywords: easy (C)

Created on 2019-06-28 01:53 by zufuliu, last changed 2019-11-12 14:54 by serhiy.storchaka. This issue is now closed.

Messages (12)
msg346792 - (view) Author: Zufu Liu (zufuliu) Date: 2019-06-28 01:53
I think it's should be TypeError as Python 2.7.

The doc https://docs.python.org/3/library/os.path.html
doesn't says it accept file descriptor.


Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os.path.isfile(123)
False
>>> os.path.isfile(2**32)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Dev\Python3\lib\genericpath.py", line 30, in isfile
    st = os.stat(path)
OverflowError: fd is greater than maximum
>>>

Python 3.8.0b1 (tags/v3.8.0b1:3b5deb0116, Jun  4 2019, 19:52:55) [MSC v.1916 64 bit (AMD64)] on win32
>>> import os.path
>>> os.path.isfile(123)
False
>>> os.path.isfile(2**32)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\obj\Windows-Release\38amd64_Release\msi_python\zip_amd64\genericpath.py", line 30, in isfile
OverflowError: fd is greater than maximum
>>>

Python 2.7.16 (v2.7.16:413a49145e, Mar  4 2019, 01:37:19) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os.path.isfile(123)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Dev\Python\lib\genericpath.py", line 37, in isfile
    st = os.stat(path)
TypeError: coercing to Unicode: need string or buffer, int found
>>>
msg346801 - (view) Author: Aldwin Pollefeyt (aldwinaldwin) * Date: 2019-06-28 05:13
Integers seems to be accepted, so not a TypeError? Maybe documentation needs some more information. 


Documentation os.path under os.path.exists(path):

Changed in version 3.3: path can now be an integer: True is returned if it is an open file descriptor, False otherwise.


in posixmodule.c:

 * path_converter also optionally accepts signed
 * integers (representing open file descriptors) instead
 * of path strings.


although i dunno how to use then:

$ echo "test" > 123
$ ls -i 123
8012691 123
$ ./python -c "import os.path; print(os.path.isfile(8012691))"
False
$ ./python -c "import os.path; print(os.path.isfile('123'))"
True
msg347038 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-07-01 16:22
You use this by getting the file descriptor/fileno for an open file, and then testing whether it's a file or not:

>>> f=open("python.bat")
>>> f.fileno()
3
>>> import os
>>> os.path.isfile(3)
True

This is very uncommon on Windows to begin with (file descriptors are emulated by the C Runtime for POSIX compatibility), but it is supported.

Enabling Modules/posixmodule.c#L2446 to be able to optionally handle the OverflowError would allow the "exists" type functions to just return False instead of raising, while letting the others continue to raise.

I'm going to mark this as "easy (C)" so that we consider it for sprints. But in principle, I'm okay with returning False for invalid values in exists-type functions rather than raising.
msg347097 - (view) Author: Zufu Liu (zufuliu) Date: 2019-07-02 00:43
Personally I prefer TypeError. Use file descriptor is rare, I never used it in Python script.
I encountered this because arguments passed in wrong order in my function, then I passed the wrong argument to isfile() without checking.
Make isfile() returns False would make it harder to find bugs in user program.
msg347098 - (view) Author: Zufu Liu (zufuliu) Date: 2019-07-02 02:49
Please also consider following pieces of code.
They (value of int sub classes) can't be returned by fileno().

Python 3.8.0b1 (tags/v3.8.0b1:3b5deb0116, Jun  4 2019, 19:52:55) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyInt(int):pass
...
>>> a=MyInt(6)
>>> a
6
>>> import os.path
>>> os.path.isfile(a)
False
>>> os.path.isfile(True)
False
>>> os.path.isfile(False)
False
>>>
msg355963 - (view) Author: (Anj-A) * Date: 2019-11-04 16:51
Hi all, I'm a newcomer interested in doing a small fix. Wondering if anyone's working on this at the moment?
msg355964 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-04 16:56
This issue reminds me bpo-33721:

"os.path functions that return a boolean result like exists(), lexists(), isdir(), isfile(), islink(), and ismount() now return False instead of raising ValueError or its subclasses UnicodeEncodeError and UnicodeDecodeError for paths that contain characters or bytes unrepresentable at the OS level. (Contributed by Serhiy Storchaka in bpo-33721.)"

https://docs.python.org/dev/whatsnew/3.8.html#os-path
msg355966 - (view) Author: (Anj-A) * Date: 2019-11-04 17:31
Hey, I'm not exactly clear what the required fix is here and would appreciate some guidance, is it in the documentation or in the way the error is handled?
msg355968 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-11-04 18:12
I am not sure there is much in common between this and issue33721. There are many ways to get a file path unrepresentable at the OS level. But I do not know any valid case for getting an out-of-range file descriptor. I am not convinced there is a bug here.
msg356446 - (view) Author: (Anj-A) * Date: 2019-11-12 12:10
Hey, if there is no bug here, could we get this issue closed?
Alternatively, I'd be interested in doing the required change in documentation/error type if that's seen to be the right solution.

Personally, I think returning False instead of raising an error would indeed mask bugs as mentioned by Zufu. As for the error type to be raised, the argument type itself isn't wrong, it's just greater than the maximum so I feel an OverflowError type is right instead of TypeError. So maybe just clearer documentation is needed as suggested by Aldwin and no code change? Just my two cents though.
msg356460 - (view) Author: Zufu Liu (zufuliu) Date: 2019-11-12 14:39
I'm fine with this been closed.
Maybe it's better to have a RawFd type like Julia:

julia> RawFD
RawFD

julia> typeof(RawFD)
DataType

julia> RawFD(0)
RawFD(0x00000000)

julia> RawFD(0)==0
false

Rust has RawFd as c_int.
msg356464 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-11-12 14:54
I considered this. There is a limited number of functions that can produce RawFD (os.open(), os.dup(), etc), and this is a single source of file descriptors in an isolated program. File descriptors can be inherited by child processes. But there are other methods of interprocess communication, and depending of the serialization method the type of the file descriptor can be lost. So there is a possibility to break existing programs. The transition can be made smooth, with corresponding deprecation period if we decide that the final befit is larger than the transition cost.

In any case this is a large change. It should be discussed on the Python-ideas or Python-Dev mailing lists.
History
Date User Action Args
2019-11-12 14:54:06serhiy.storchakasetmessages: + msg356464
2019-11-12 14:39:47zufuliusetmessages: + msg356460
2019-11-12 12:26:21serhiy.storchakasetstatus: open -> closed
resolution: not a bug
stage: resolved
2019-11-12 12:10:17Anj-Asetmessages: + msg356446
2019-11-04 18:12:04serhiy.storchakasetmessages: + msg355968
2019-11-04 17:31:41Anj-Asetmessages: + msg355966
2019-11-04 16:56:01vstinnersetnosy: + vstinner, serhiy.storchaka
messages: + msg355964
2019-11-04 16:51:11Anj-Asetnosy: + Anj-A
messages: + msg355963
2019-07-02 02:49:01zufuliusetmessages: + msg347098
2019-07-02 00:43:44zufuliusetmessages: + msg347097
2019-07-01 16:22:35steve.dowersetversions: + Python 3.9, - Python 3.7
nosy: + larry

messages: + msg347038

components: + Argument Clinic, - Library (Lib), Windows
keywords: + easy (C)
2019-06-28 05:13:58aldwinaldwinsetnosy: + aldwinaldwin
messages: + msg346801
2019-06-28 01:53:43zufuliucreate