Issue11933
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2011-04-27 04:29 by jsjgruber, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
setuptest | jsjgruber, 2011-04-27 15:09 | |||
mtime.py | jsjgruber, 2011-04-27 19:47 | |||
fix-newer-mtime.diff | eric.araujo, 2011-06-03 16:39 |
Messages (14) | |||
---|---|---|---|
msg134526 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-04-27 04:29 | |
In researching a bug I was surprised that a newly created file was being replaced when being processed a second time (it shouldn't have been processed a second time). I tracked the surprise to diff Lib/distutils/dep_util.py @ 57642:9211a5d7d0b4 where a comparison with a resolution of 1 second was replaced by a floating point comparison, though the new file was copied by file_util.py which tried to preserve the time using a method accurate to only 1 second (truncating the fraction). Since a new file with a truncated modification time looks older than an older file with a full precision stamp the problem resulted. (For convenience--stat.st_mtime is floating point, stat[ST_MTIME] is integer seconds unless os.stat_float_times has been called to change the floating point behavior. Using all floating point doesn't resolve the issue as OS timestamped files can have more resolution than python floating point may hold, again causing truncation and confusion. For reference see Python issue 10148. In my humble opinion time setting and comparison should all be done consistently, and, if sub-second comparisons are desired, some fuzz should be used such that the comparison makes sense for the platform with the least amount of precision available with its floating point implementation. I briefly looked at branches other than 2.7 and they don't seem to have the above change. Probably of minor importance in most cases. Thanks all for the good work, all. I've been learning python and I love it! |
|||
msg134536 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-04-27 08:39 | |
Hi! Welcome to the joy of Python and thanks for the bug report and analysis. > In researching a bug Is it public? I’d be curious to look at it. > I was surprised that a newly created file was being replaced when > being processed a second time What created the file? What king of “processing” happened after? > I tracked the surprise to diff Lib/distutils/dep_util.py @ 57642:9211a5d7d0b4 > where a comparison with a resolution of 1 second was replaced by a > floating point comparison Indeed. The whole distutils codebase should use ST_MTIME or st_mtime consistently. > Using all floating point doesn't resolve the issue as OS timestamped > files can have more resolution than python floating point may hold, If I understand this bug correctly, it’s a platform-specific limitation. We can try changing all distutils files to use st_mtime and see if tests break. Would you like to work on a additional test case to reproduce your bug? |
|||
msg134576 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-04-27 15:09 | |
Thanks for considering my report so quickly. Attached is a simple test to reproduce the bug, as you suggested. Please note that I am not suggesting the code base use stat.st_mtime. Running the attached with ext4, which keeps sub-second file timestamps: gruber@gruber-Satellite-L355D:~$ ~/setuptest running build running build_py creating build creating build/lib.linux-i686-2.7 copying charley.py -> build/lib.linux-i686-2.7 running build running build_py copying charley.py -> build/lib.linux-i686-2.7 Under an ext3 filesystem which only accounts to the nearest second: gruber@gruber-Satellite-L355D:~$ pushd /media/EXT3/ /media/EXT3 ~ gruber@gruber-Satellite-L355D:/media/EXT3$ ~/setuptest running build running build_py copying charley.py -> build/lib.linux-i686-2.7 running build running build_py You can see under ext3 the newer() function of dep_util.py works correctly and the redundant copy of charley.py to the build directory, when setup.py is run a second time, is skipped. Above, with ext4, the mtime comparison was confused. As I said in the report, not a big problem--but it obviously isn't what was intended so I hoped a problem report would be worthwhile to the distutils developers. Thanks. |
|||
msg134579 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-04-27 15:25 | |
The original bug report is at: https://bugs.launchpad.net/ubuntu/+source/python-distutils-extra/+bug/770566 As you can see it had to do with a symbolic link created by distutils-extra before distutils was called upon to copy anything. Since this was done behind distutils back, LP: 770566 really was a distutils-extra problem and Martin's fix means this issue no longer arises there. To recreate the problem in that environment distutils-extra had to be modified to coerce processing to happen in a particular order so setup.py would complete, and then results compared between an ext4 run and the build farm's presumably ext3 run. In the ext4 case the symbolic link was predictably replaced by the file itself. The test case I posted above should be easier to follow and doesn't involve any smoke and mirrors being done behind distutils back! Please let me know if I provide any further information. |
|||
msg134581 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-04-27 15:31 | |
> Please note that I am not suggesting the code base use stat.st_mtime. Yep, I did :) Do you have another idea for a fix? We have to walk on eggshells with the distutils codebase: it has often happened that changes made to fix bugs were causing problems in third-party code relying on undocumented behavior (and/or bugs), so we want to fix bugs without causing undesired effects. (In the next-generation version, distutils2, we can break compatibility for the better.) I think using only ST_MTIME or only st_mtime could prevent the bug you’re reporting without doing harm otherwise. > Running the attached with ext4, which keeps sub-second file timestamps: I think filesystems with this precision are becoming widespread. Python 3 doesn’t have the bug AFAICT but Python 2.7 will be used for years, so it’s a worthwhile bug to fix. > You can see under ext3 the newer() function of dep_util.py works > correctly and the redundant copy of charley.py to the build > directory, when setup.py is run a second time, is skipped. This would be particularly annoying with larger projects involving C extensions. Would you be willing to turn your test file to a patch, and then fix the code? We have some basic unit tests in test_dep_util.py and test_build_py.py, you could see how they work and add a test for this bug. If you’re inclined to do so, please read the developpers guide at http://docs.python.org/devguide . It not, thanks for what you’ve already done, and I’ll take up from here. |
|||
msg134603 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-04-27 17:36 | |
I can understand what you are saying about side effects. I was trying to suggest that the move to stat.st_mtime in dep_util.py in the hg commit I mentioned be reverted back to use stat[ST_MTIME], thereby matching the other python releases and the old behavior (and matching file_util.py). It seems to me that would be the safest course when it comes to side-effects. If it were desired to determine which file was newer using sub-second values, perhaps that would make a reasonable change in distutils2, but files created with a few microseconds would have to be considered equivalent due to the reduced precision available in python floats (53 bits on my platform, if I understand correctly) as compared to the 64 bit precision used by some operating systems and file systems. However, I think I'd prefer to turn the decision and further process over to you, if you don't mind. I thank you and your colleagues for your excellent work with python, and with making it platform independent--a very difficult part of the work indeed. |
|||
msg134610 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-04-27 19:47 | |
As I thought about it, I suppose I should demonstrate the problem with stat.st_mtime. Here's an example and its output on an ext4 file system: gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933305.5525582, 1303933305.5525582) (1303933305.552558, 1303933305.552558) (1303933305.552557, 1303933305.552557) (1303933305.552556, 1303933305.552556) (1303933305.5525581837 1303933305.5525581837) (1303933305.5525579453 1303933305.5525579453) (1303933305.5525569916 1303933305.5525569916) (1303933305.5525560379 1303933305.5525560379) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:45.552558258 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:45.552558000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:45.552557000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:45.552556000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 28, in <module> assert timetargetthree == timetargettwo AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933306.3885624, 1303933306.3885624) (1303933306.388562, 1303933306.388562) (1303933306.388561, 1303933306.388561) (1303933306.388561, 1303933306.388561) (1303933306.3885624409 1303933306.3885624409) (1303933306.3885619640 1303933306.3885619640) (1303933306.3885610104 1303933306.3885610104) (1303933306.3885610104 1303933306.3885610104) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:46.388562342 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:46.388562000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:46.388561000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:46.388561000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 29, in <module> assert timetargettwo == timetargetone AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933307.0525656, 1303933307.0525656) (1303933307.052565, 1303933307.052565) (1303933307.052565, 1303933307.052565) (1303933307.052565, 1303933307.052565) (1303933307.0525655746 1303933307.0525655746) (1303933307.0525650978 1303933307.0525650978) (1303933307.0525650978 1303933307.0525650978) (1303933307.0525650978 1303933307.0525650978) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:47.052565592 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:47.052565000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:47.052565000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:47.052565000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 30, in <module> assert timetargetone == timeone AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933308.2285714, 1303933308.2285714) (1303933308.228571, 1303933308.228571) (1303933308.22857, 1303933308.22857) (1303933308.228569, 1303933308.228569) (1303933308.2285714149 1303933308.2285714149) (1303933308.2285709381 1303933308.2285709381) (1303933308.2285699844 1303933308.2285699844) (1303933308.2285690308 1303933308.2285690308) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:48.228571344 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:48.228571000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:48.228570000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:48.228569000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 28, in <module> assert timetargetthree == timetargettwo AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933309.0685754, 1303933309.0685754) (1303933309.068575, 1303933309.068575) (1303933309.068574, 1303933309.068574) (1303933309.068573, 1303933309.068573) (1303933309.0685753822 1303933309.0685753822) (1303933309.0685749054 1303933309.0685749054) (1303933309.0685739517 1303933309.0685739517) (1303933309.0685729980 1303933309.0685729980) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.068575446 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.068575000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.068574000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.068573000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 28, in <module> assert timetargetthree == timetargettwo AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933309.6765785, 1303933309.6765785) (1303933309.676578, 1303933309.676578) (1303933309.676578, 1303933309.676578) (1303933309.676578, 1303933309.676578) (1303933309.6765785217 1303933309.6765785217) (1303933309.6765780449 1303933309.6765780449) (1303933309.6765780449 1303933309.6765780449) (1303933309.6765780449 1303933309.6765780449) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.676578416 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.676578000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.676578000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:49.676578000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 30, in <module> assert timetargetone == timeone AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933310.396582, 1303933310.396582) (1303933310.396581, 1303933310.396581) (1303933310.39658, 1303933310.39658) (1303933310.396579, 1303933310.396579) (1303933310.3965818882 1303933310.3965818882) (1303933310.3965809345 1303933310.3965809345) (1303933310.3965799809 1303933310.3965799809) (1303933310.3965790272 1303933310.3965790272) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:50.396581938 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:50.396581000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:50.396580000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:50.396579000 -0400 target3 Traceback (most recent call last): File "./mtime.py", line 28, in <module> assert timetargetthree == timetargettwo AssertionError gruber@gruber-Satellite-L355D:~$ ./mtime.py (1303933311.056585, 1303933311.056585) (1303933311.056585, 1303933311.056585) (1303933311.056585, 1303933311.056585) (1303933311.056585, 1303933311.056585) (1303933311.0565850735 1303933311.0565850735) (1303933311.0565850735 1303933311.0565850735) (1303933311.0565850735 1303933311.0565850735) (1303933311.0565850735 1303933311.0565850735) total 0 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:51.056585165 -0400 one -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:51.056585000 -0400 target1 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:51.056585000 -0400 target2 -rw-r--r-- 1 gruber gruber 0 2011-04-27 15:41:51.056585000 -0400 target3 gruber@gruber-Satellite-L355D:~$ Please excuse my total lack of Python style. |
|||
msg134634 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-04-28 01:39 | |
> I was trying to suggest that the move to stat.st_mtime in dep_util.py > in the hg commit I mentioned be reverted back to use stat[ST_MTIME], > thereby matching the other python releases and the old behavior (and > matching file_util.py). It seems to me that would be the safest > course when it comes to side-effects. Agreed. > If it were desired to determine which file was newer using sub-second > values, perhaps that would make a reasonable change in distutils2, > but files created with a few microseconds would have to be considered > equivalent due to the reduced precision available in python floats > (53 bits on my platform, if I understand correctly) as compared to > the 64 bit precision used by some operating systems and file systems. Interesting. Can you open another bug, using type “feature request” and component “distutils2” so that we discuss that? Alternatively, send an email to the-fellowship-of-the-packaging Google group (more people (including distutils wizards) follow the group than the bug tracker). > However, I think I'd prefer to turn the decision and further process > over to you, if you don't mind. No problem. Thanks to your patch and diagnosis, it should be easy to write a test and fix. I hope you’ll be available to run the test on your machine. > I thank you and your colleagues for your excellent work with python, Thank you for doing your part as a user! (It’s weird to see “colleagues” used with something that’s not a job. We do this for fun, you know :) > and with making it platform independent--a very difficult part of the > work indeed. Heh, the more I learn about Python, the more I hate hardware and operating systems :) |
|||
msg135394 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-05-07 01:01 | |
As Éric suggested I opened issue 11993 a couple of days ago. |
|||
msg137552 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-06-03 16:39 | |
Can you test this patch? |
|||
msg138046 - (view) | Author: John S. Gruber (jsjgruber) | Date: 2011-06-10 04:19 | |
The patch worked--it eliminated the redundant copy caused by this issue. Thank you very much, Éric. On Fri, Jun 3, 2011 at 12:39 PM, Éric Araujo <report@bugs.python.org> wrote: > > Éric Araujo <merwok@netwok.org> added the comment: > > Can you test this patch? > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue11933> > _______________________________________ > |
|||
msg138931 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-06-24 12:15 | |
Thanks, I’ll commit shortly. |
|||
msg141574 - (view) | Author: Roundup Robot (python-dev) | Date: 2011-08-02 18:02 | |
New changeset aebe3243bb2c by Éric Araujo in branch '2.7': Fix incorrect mtime comparison in distutils (#11933). http://hg.python.org/cpython/rev/aebe3243bb2c |
|||
msg141579 - (view) | Author: Éric Araujo (eric.araujo) * | Date: 2011-08-02 19:36 | |
Thanks again! |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:16 | admin | set | github: 56142 |
2011-08-02 19:36:13 | eric.araujo | set | status: open -> closed resolution: fixed messages: + msg141579 stage: needs patch -> resolved |
2011-08-02 18:02:43 | python-dev | set | nosy:
+ python-dev messages: + msg141574 |
2011-06-24 12:15:39 | eric.araujo | set | assignee: tarek -> eric.araujo messages: + msg138931 |
2011-06-10 04:19:37 | jsjgruber | set | messages:
+ msg138046 title: newer() function in dep_util.py mixes up new vs. old files due stat.st_mtime vs stat[ST_MTIME] -> newer() function in dep_util.py mixes up new vs. old files due stat.st_mtime vs stat |
2011-06-03 16:39:16 | eric.araujo | set | files:
+ fix-newer-mtime.diff keywords: + patch messages: + msg137552 |
2011-05-07 01:01:58 | jsjgruber | set | messages: + msg135394 |
2011-04-28 01:39:58 | eric.araujo | set | messages: + msg134634 |
2011-04-27 19:47:47 | jsjgruber | set | files:
+ mtime.py messages: + msg134610 |
2011-04-27 17:36:06 | jsjgruber | set | messages: + msg134603 |
2011-04-27 15:31:24 | eric.araujo | set | messages:
+ msg134581 stage: needs patch |
2011-04-27 15:25:28 | jsjgruber | set | messages: + msg134579 |
2011-04-27 15:09:53 | jsjgruber | set | files:
+ setuptest messages: + msg134576 |
2011-04-27 08:39:21 | eric.araujo | set | messages: + msg134536 |
2011-04-27 04:29:50 | jsjgruber | create |