Author chanke
Recipients chanke
Date 2020-07-21.07:29:02
Message-id <1595316543.83.0.558279011146.issue41354@roundup.psfhosted.org>
In-reply-to
Content
help(filecmp.cmp) says:

"""
cmp(f1, f2, shallow=True)
    Compare two files.
    
    Arguments:
    
    f1 -- First file name
    
    f2 -- Second file name
    
    shallow -- Just check stat signature (do not read the files).
               defaults to True.
    
    Return value:
    
    True if the files are the same, False otherwise.
    
    This function uses a cache for past comparisons and the results,
    with cache entries invalidated if their stat information
    changes.  The cache may be cleared by calling clear_cache().
"""

However, looking at the code, the shallow argument is only taken into account if the signatures are the same:
"""
    s1 = _sig(os.stat(f1))
    s2 = _sig(os.stat(f2))
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False
    if shallow and s1 == s2:
        return True
    if s1[1] != s2[1]:
        return False

    outcome = _cache.get((f1, f2, s1, s2))
    if outcome is None:
        outcome = _do_cmp(f1, f2)
        if len(_cache) > 100:      # limit the maximum size of the cache
            clear_cache()
        _cache[f1, f2, s1, s2] = outcome
    return outcome
"""

Therefore, if I call cmp with shallow=True and the stat signatures differ (but the sizes match),
cmp actually does a "deep" compare.
This "deep" compare, however, checks only the file size (s1[1]), not the full stat signature.
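
The behaviour described above can be reproduced with a small script. This is only a sketch: the file names, contents, and timestamps are made up for illustration, and the result noted in the comment assumes filecmp.cmp is implemented as in the code quoted above.

```python
import filecmp
import os
import tempfile


def demo_shallow_falls_back_to_deep():
    """Compare two files with equal content but different mtimes, shallow=True."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.txt")  # hypothetical file names
        b = os.path.join(tmp, "b.txt")
        for path in (a, b):
            with open(path, "w") as fh:
                fh.write("same content\n")
        # Force different modification times so the stat signatures differ
        # while the sizes stay equal.
        os.utime(a, (1_000_000_000, 1_000_000_000))
        os.utime(b, (1_000_000_100, 1_000_000_100))
        filecmp.clear_cache()
        # The signatures differ, so the shallow shortcut is skipped and the
        # file contents are read; with the code quoted above this is True.
        return filecmp.cmp(a, b, shallow=True)


if __name__ == "__main__":
    print(demo_shallow_falls_back_to_deep())
```

So shallow=True does not guarantee that the files are left unread; it only short-circuits when the full signatures already match.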

Thus I propose the following patch:
cmp always checks the "full" signature and returns False if the signatures differ.
It returns True if shallow is set and the above test passed.
It does not make sense to me that a "deep" compare checks only the size
but not the mtime.


--- filecmp.py.orig     2020-07-16 12:00:57.000000000 +0200
+++ filecmp.py  2020-07-16 12:00:30.000000000 +0200
@@ -52,10 +52,10 @@
     s2 = _sig(os.stat(f2))
     if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
         return False
-    if shallow and s1 == s2:
-        return True
-    if s1[1] != s2[1]:
+    if s1 != s2:
         return False
+    if shallow:
+        return True
 
     outcome = _cache.get((f1, f2, s1, s2))
     if outcome is None:
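
To make the effect of the patch easier to see without editing the standard library, here is a standalone sketch of the proposed logic. The names cmp_patched, _signature, and write_with_mtime are hypothetical helpers for this example, not part of filecmp; _signature is assumed to mirror what _sig returns (file type, size, mtime), and the deep comparison is delegated to the existing filecmp.cmp(..., shallow=False).

```python
import filecmp
import os
import stat
import tempfile


def _signature(st):
    # Assumed equivalent of filecmp._sig: (file type, size, mtime).
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def cmp_patched(f1, f2, shallow=True):
    """Sketch of the proposed behaviour, not the real filecmp.cmp."""
    s1 = _signature(os.stat(f1))
    s2 = _signature(os.stat(f2))
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False
    if s1 != s2:
        # Proposed change: any signature mismatch means "different".
        return False
    if shallow:
        return True
    # Signatures match, so read and compare the contents.
    return filecmp.cmp(f1, f2, shallow=False)


def write_with_mtime(path, text, mtime):
    # Helper for the demo: write a file and pin its access/modification time.
    with open(path, "w") as fh:
        fh.write(text)
    os.utime(path, (mtime, mtime))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.txt")
        b = os.path.join(tmp, "b.txt")
        write_with_mtime(a, "same content\n", 1_000_000_000)
        write_with_mtime(b, "same content\n", 1_000_000_100)
        print(cmp_patched(a, b, shallow=True))   # mtimes differ
        print(cmp_patched(a, b, shallow=False))  # mtimes differ
```

Note the consequence of this design: with the patch, two files with identical contents but different mtimes compare as different even when shallow=False.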