msg288355 - (view) |
Author: Mba (mba) |
Date: 2017-02-22 12:24 |
ino_t is of type 'unsigned long' or 'unsigned long long'. Casting it to signed type means that negative values are incorrectly returned for large inodes.
|
msg289213 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-03-08 07:36 |
Any reason our _Py_stat_struct on Windows uses a signed type to represent st_ino?
|
msg289216 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-03-08 08:07 |
On Windows, _Py_attribute_data_to_stat() converts BY_HANDLE_FILE_INFORMATION to _Py_stat_struct, and then _pystat_fromstructstat() creates Python objects.
The file index in BY_HANDLE_FILE_INFORMATION is made of two DWORD, so yes, it's unsigned.
On Linux, stat.st_ino type is ino_t which is unsigned too.
So I created a pull request to fix the bug, even if I don't think that a filesystem produce inodes larger than 2^63-1. Not sure if it's worth it to backport the fix to Python 2.7, 3.5 and 3.6?
|
msg289221 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-03-08 09:24 |
I think it is worth to backport this at least to 3.6.
|
msg289288 - (view) |
Author: Mba (mba) |
Date: 2017-03-09 12:52 |
> I don't think that a filesystem produce inodes larger than 2^63-1
The problem is real: the large inode number was assigned to a file created by MacOS on a share exported via CIFS, and then stated by Linux using NFS export.
|
msg290247 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-03-24 22:37 |
New changeset 68d29809405dc766966b2b973b8597212fbc3dbd by Victor Stinner in branch '3.6':
bpo-29619: Convert st_ino using unsigned integer (#557) (#584)
https://github.com/python/cpython/commit/68d29809405dc766966b2b973b8597212fbc3dbd
|
msg290250 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-03-24 22:38 |
New changeset 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c by Victor Stinner in branch 'master':
bpo-29619: Convert st_ino using unsigned integer (#557)
https://github.com/python/cpython/commit/0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c
|
msg292141 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-04-22 22:26 |
The compilation of Modules/posixmodule.c fails now on Android architecture 'x86' at api level 21, after the above change 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c:
/opt/android-ndk/toolchains/llvm/prebuilt/linux-x86_64/bin/clang --sysroot=/opt/android-ndk/platforms/android-21/arch-x86 -target i686-none-linux-androideabi -gcc-toolchain /opt/android-ndk/toolchains/x86-4.9/prebuilt/linux-x86_64 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wno-unused-value -Wno-empty-body -Qunused-arguments -Wno-parentheses-equality -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -IObjects -IInclude -IPython -I. -I./Include -I/opt/android-ndk/platforms/android-21/arch-x86/usr/include -DPy_BUILD_CORE -c ./Modules/posixmodule.c -o Modules/posixmodule.o
./Modules/posixmodule.c:1935:5: error: array size is negative
Py_BUILD_ASSERT(sizeof(unsigned long) >= sizeof(st->st_ino));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./Include/pymacro.h:43:15: note: expanded from macro 'Py_BUILD_ASSERT'
(void)Py_BUILD_ASSERT_EXPR(cond); \
^~~~~~~~~~~~~~~~~~~~~~~~~~
./Include/pymacro.h:40:19: note: expanded from macro 'Py_BUILD_ASSERT_EXPR'
(sizeof(char [1 - 2*!(cond)]) - 1)
^~~~~~~~~~~~~
1 error generated.
make: *** [Makefile:1720: Modules/posixmodule.o] Error 1
The following code:
#include <stdio.h>
#include <sys/stat.h>
int main()
{
struct stat *st;
printf("sizeof(unsigned long): %d - sizeof(st->st_ino): %d\n",
sizeof(unsigned long), sizeof(st->st_ino));
return 0;
}
prints the following result on this same system:
sizeof(unsigned long): 4 - sizeof(st->st_ino): 8
|
msg292142 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-04-22 22:48 |
Would it work with unsigned long long?
What is the value of HAVE_LARGEFILE_SUPPORT?
Maybe the code should be simplified to always use unsigned long long.
|
msg292174 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-04-23 17:18 |
sizeof(unsigned long long) is 8 on Android x86 and HAVE_LARGEFILE_SUPPORT is undefined.
According to msg280053 HAVE_LARGEFILE_SUPPORT is also undefined on Android x86_64 (and also on Linux x86_64).
> Maybe the code should be simplified to always use unsigned long long.
Yes. If I understand correcty, one could only use 'unsigned long long' to assert at build time and use the smallest of the two unsigned long that fits with the size of st->st_ino at runtime.
|
msg293859 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-17 16:11 |
In configure.ac, HAVE_LARGEFILE_SUPPORT is defined if (off_t > long && longlong >= off_t) and thus, when it is not defined, it means that off_t <= long (since longlong < off_t is false) so an off_t should fit into a long. But on Android the size of off_t is different from the size of ino_t so one cannot use the value of HAVE_LARGEFILE_SUPPORT to know if an ino_t would fit into an unsigned long.
On Android architecture 'x86' at api level 21:
sizeof(st->st_ino) = 8
and from the output of configure:
checking size of off_t... 4
checking size of long... 4
checking size of long long... 8
|
msg293873 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-05-17 18:48 |
Before 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c, the code used PyLong_FromLong((long)st->st_ino), so the change doesn't introduce a regression but detected a real bug, right?
> On Android architecture 'x86' at api level 21:
> sizeof(st->st_ino) = 8
> and from the output of configure:
> checking size of off_t... 4
> checking size of long... 4
> checking size of long long... 8
What is the type of st_ino? It's not off_t?
A workaround would be to force the usage unsigned long long on Android, right?
Can you please try the following patch on Android, and propose a PR if it works?
diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c
index e8c15a9..78b3cb9 100644
--- a/Modules/posixmodule.c
+++ b/Modules/posixmodule.c
@@ -1927,7 +1927,7 @@ _pystat_fromstructstat(STRUCT_STAT *st)
return NULL;
PyStructSequence_SET_ITEM(v, 0, PyLong_FromLong((long)st->st_mode));
-#if defined(HAVE_LARGEFILE_SUPPORT) || defined(MS_WINDOWS)
+#if defined(HAVE_LARGEFILE_SUPPORT) || defined(MS_WINDOWS) || defined(defined(__ANDROID__))
Py_BUILD_ASSERT(sizeof(unsigned long long) >= sizeof(st->st_ino));
PyStructSequence_SET_ITEM(v, 1,
PyLong_FromUnsignedLongLong(st->st_ino));
|
msg293883 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-17 19:55 |
> What is the type of st_ino? It's not off_t?
The type of st_ino is unsigned long long as defined in sys/stat.h for x86 and off_t is defined in sys/types.h this way;, quoting:
/* This historical accident means that we had a 32-bit off_t on 32-bit architectures. */
typedef __kernel_off_t off_t;
There is also a related section in the bionic github repo [1] quoted here:
=========================
This probably belongs in the NDK documentation rather than here, but these are the known ABI bugs in the 32-bit ABI:
time_t is 32-bit. http://b/5819737. In the 64-bit ABI, time_t is 64-bit.
off_t is 32-bit. There is off64_t, and in newer releases there is almost-complete support for _FILE_OFFSET_BITS. Unfortunately our stdio implementation uses 32-bit offsets and -- worse -- function pointers to functions that use 32-bit offsets, so there's no good way to implement the last few pieces http://b/24807045. In the 64-bit ABI, off_t is off64_t.
sigset_t is too small on ARM and x86 (but correct on MIPS), so support for real-time signals is broken. http://b/5828899 In the 64-bit ABI, sigset_t is the correct size for every architecture.
=========================
I will test your patch on x86 and x86_64 on Android and propose a patch if it works.
[1] https://android.googlesource.com/platform/bionic.git/#32_bit-ABI-bugs
|
msg293919 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-18 11:39 |
With the patch proposed by Victor in msg293873 the cross-compilation of posixmodule.c succeeds on android-21-x86 and android-21-x86_64.
Following my suggestion in msg292174, I propose the following patch instead:
diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c
index e8c15a9473..d7ff3c78bd 100644
--- a/Modules/posixmodule.c
+++ b/Modules/posixmodule.c
@@ -1927,14 +1927,12 @@ _pystat_fromstructstat(STRUCT_STAT *st)
return NULL;
PyStructSequence_SET_ITEM(v, 0, PyLong_FromLong((long)st->st_mode));
-#if defined(HAVE_LARGEFILE_SUPPORT) || defined(MS_WINDOWS)
Py_BUILD_ASSERT(sizeof(unsigned long long) >= sizeof(st->st_ino));
- PyStructSequence_SET_ITEM(v, 1,
+ if (sizeof(unsigned long) >= sizeof(st->st_ino))
+ PyStructSequence_SET_ITEM(v, 1, PyLong_FromUnsignedLong(st->st_ino));
+ else
+ PyStructSequence_SET_ITEM(v, 1,
PyLong_FromUnsignedLongLong(st->st_ino));
-#else
- Py_BUILD_ASSERT(sizeof(unsigned long) >= sizeof(st->st_ino));
- PyStructSequence_SET_ITEM(v, 1, PyLong_FromUnsignedLong(st->st_ino));
-#endif
#ifdef MS_WINDOWS
PyStructSequence_SET_ITEM(v, 2, PyLong_FromUnsignedLong(st->st_dev));
#else
This patch removes the conditional on HAVE_LARGEFILE_SUPPORT (explicit is better than implicit) and fixes the problem for any operating system where the size of off_t and the size of st_ino are different.
The compiler removes the dead branch of the <if (sizeof(unsigned long)...> conditional, so actually it compiles to the same binary as the other patch (gcc does this without any optimization configured on the command line, I didn't check this for clang).
The same kind of fix could also be applied elsewhere in posixmodule.c where behind the use of HAVE_LARGEFILE_SUPPORT there is an incorrect assumption that the size of off_t is the same as the size of another type:
* In _pystat_fromstructstat() - a little bit further down this patch - where the other type is st_size.
* In os_DirEntry_inode_impl() where the other type is ino_t.
Maybe this should be done in a new issue then ?
|
msg293920 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-18 11:45 |
> The same kind of fix could also be applied elsewhere in posixmodule.c
By also adding a Py_BUILD_ASSERT() statement which is a welcome improvement on the previous code that allowed to pinpoint the problem on Android.
|
msg293924 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-05-18 14:51 |
Please propose the change as a PR.
|
msg293959 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-19 15:24 |
PR 1666 added.
With Unified Headers introduced in android-ndk-r14, the size of off_t for 32 bits Android devices is 8 instead of 4 and the cross-compilation of Python on Android does not fail in this case, see msg293956.
|
msg293987 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2017-05-20 00:09 |
What is the advantage of compiling calls to both Long and LongLong converters? Isn’t it simpler to blindly call PyLong_FromUnsignedLongLong or similar?
|
msg294030 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-20 14:12 |
Good point.
PyLong_FromLong() is faster than PyLong_FromLongLong(), IMO the fastest should be used when it is possible.
PyLong_FromUnsignedLong() and PyLong_FromUnsignedLongLong() are almost identical, they implement the same algorithm. The first one uses unsigned long (on the stack and for a register) while the other uses unsigned long long, so the difference in performance may be in the order of nano seconds on current processors. Should only PyLong_FromUnsignedLongLong() be used then or should this be left as is for consistency with the handling of PyLong_FromLong() ?
Another point (not related) is that the d_ino member of the DirEntry structure is of type ino_t which is unsigned and PR 1666 should be corrected to use unsigned instead in the last modifications made by the PR in os_DirEntry_inode_impl(). I will push this change when a decision has been taken on the point raised by Martin.
|
msg294034 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-05-20 15:23 |
I'm fine with always using unsigned long long. I don't think that it's even
possible to measure the perf difference and if it's doable, I expect don't
expect it to be significant... especially compared to the cost of the stat
() syscall!
Well, most computers are 64-bit. Even smartphones, no?
|
msg294135 - (view) |
Author: Xavier de Gaye (xdegaye) * |
Date: 2017-05-22 09:15 |
New changeset 50e86033de85294d87b7e942701d456342abde8e by xdegaye in branch 'master':
bpo-29619: Do not use HAVE_LARGEFILE_SUPPORT for type conversions (GH-1666).
https://github.com/python/cpython/commit/50e86033de85294d87b7e942701d456342abde8e
|
msg300412 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-08-17 12:33 |
New changeset ffbb6f7334ccf54f09dcc9e760766d861928f13e by Victor Stinner in branch '3.6':
bpo-29619: Do not use HAVE_LARGEFILE_SUPPORT for type conversions (GH-1666) (#3102)
https://github.com/python/cpython/commit/ffbb6f7334ccf54f09dcc9e760766d861928f13e
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:43 | admin | set | github: 73805 |
2017-08-17 12:33:09 | vstinner | set | messages:
+ msg300412 |
2017-08-16 09:12:25 | vstinner | set | pull_requests:
+ pull_request3141 |
2017-05-22 09:17:51 | xdegaye | set | status: open -> closed resolution: fixed |
2017-05-22 09:15:10 | xdegaye | set | messages:
+ msg294135 |
2017-05-20 15:23:49 | vstinner | set | messages:
+ msg294034 |
2017-05-20 14:12:02 | xdegaye | set | messages:
+ msg294030 |
2017-05-20 00:09:46 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg293987
|
2017-05-19 15:24:14 | xdegaye | set | messages:
+ msg293959 |
2017-05-19 15:16:25 | xdegaye | set | pull_requests:
+ pull_request1761 |
2017-05-18 14:51:24 | vstinner | set | messages:
+ msg293924 |
2017-05-18 11:45:11 | xdegaye | set | messages:
+ msg293920 |
2017-05-18 11:39:51 | xdegaye | set | messages:
+ msg293919 |
2017-05-17 19:55:53 | xdegaye | set | messages:
+ msg293883 |
2017-05-17 18:48:50 | vstinner | set | status: closed -> open resolution: fixed -> (no value) messages:
+ msg293873
|
2017-05-17 16:11:47 | xdegaye | set | messages:
+ msg293859 |
2017-04-23 17:18:28 | xdegaye | set | messages:
+ msg292174 |
2017-04-22 22:48:43 | vstinner | set | messages:
+ msg292142 |
2017-04-22 22:26:39 | xdegaye | set | nosy:
+ xdegaye messages:
+ msg292141
|
2017-03-24 22:58:04 | xiang.zhang | set | status: open -> closed stage: resolved resolution: fixed versions:
- Python 2.7, Python 3.5 |
2017-03-24 22:38:32 | vstinner | set | messages:
+ msg290250 |
2017-03-24 22:37:58 | vstinner | set | messages:
+ msg290247 |
2017-03-18 06:32:45 | serhiy.storchaka | set | pull_requests:
- pull_request599 |
2017-03-17 21:00:33 | larry | set | pull_requests:
+ pull_request599 |
2017-03-09 16:41:36 | vstinner | set | pull_requests:
+ pull_request481 |
2017-03-09 12:52:08 | mba | set | messages:
+ msg289288 |
2017-03-08 09:24:49 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg289221
|
2017-03-08 08:08:00 | vstinner | set | nosy:
+ vstinner messages:
+ msg289216
|
2017-03-08 08:04:54 | vstinner | set | pull_requests:
+ pull_request458 |
2017-03-08 07:36:06 | xiang.zhang | set | nosy:
+ eryksun, steve.dower messages:
+ msg289213
|
2017-02-22 12:55:43 | xiang.zhang | set | nosy:
+ xiang.zhang
versions:
- Python 3.3, Python 3.4 |
2017-02-22 12:39:48 | mba | set | components:
- Library (Lib) |
2017-02-22 12:24:02 | mba | create | |