classification
Title: md5 sums are different between Solaris and Windows XP SP1
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: georg.brandl, josiahcarlson, sunmountain (3)
Priority: normal Keywords

Created on 2006-08-21 08:21 by sunmountain, last changed 2006-08-27 17:37 by josiahcarlson.

Files
File name Uploaded Description
md5sum.py sunmountain, 2006-08-21 08:21
Messages (4)
msg29586 - Author: Stefan Sonnenberg (sunmountain) Date: 2006-08-21 08:21
The following program produces different md5 sums for the
same file under Solaris and Windows XP, but the sums are
consistent across runs on the same platform.

#!/opt/ASpy23/bin/python
import sys
import md5
import getopt
import os
import re

try:
    opts,args = getopt.getopt(sys.argv[1:],'c:f:h')
except getopt.GetoptError,e:
    print 'Parsing command line arguments failed. (%s)' % str(e)
    sys.exit(1)

md5file = None
fname = None

for o,a in opts:
   if o in '-c':
       if fname is not None:
           print '-c and -f are mutually exclusive'
           sys.exit(1)
       md5file = a
   if o in '-f':
       if md5file is not None:
           print '-c and -f are mutually exclusive'
           sys.exit(1)
       fname = a
   if o in '-h':
       print 'Usage: md5 filename. (%s)' % str(e)
       sys.exit(1)

if md5file is not None and os.path.isfile(md5file):
    try:
        lines = open(md5file,'r').readlines()
    except IOError,e: 
        print 'Could not read MD5 sum file %s. (%s)' % (md5file,str(e))
        sys.exit(1)
    for line in lines:
        line = line[:-1]
        try:
            res = re.compile('MD5[ |\t]+\((.+)\)[ |\t]+?\=[ |\t]+(.+)').findall(line)[0]
        except Exception,e:
            print 'Could not parse line. (%s)' % str(e)
            sys.exit(1)
        if os.path.isfile(res[0]):
            try:
                f = open(res[0],'r')
            except IOError,e:
                print 'Could not open file %s. (%s)' % (res[0],str(e))
                sys.exit(1)
            sum = md5.new()
            try:
                sum.update(f.read())
            except Exception,e:
                print 'Could not update MD5 sum. (%s)' % str(e)
                sys.exit(1)
            #print sum.hexdigest(),res[1][2:],res[0],line
            if sum.hexdigest() == res[1][2:]:
                print 'MD5 sum of file %s is OK' % res[0]
            else:
                print 'MD5 sum of file %s DIFFERS' % res[0]
            f.close()
            sum = None
    sys.exit(0)

sum = md5.new()
try:
    f = open(fname,'r')
except IOError,e:
    print 'Could not open %s. (%s)' % ( fname,str(e) )
try:
    sum.update(f.read())
except Exception,e:
    print 'Could not update md5 sum. (%s)' % str(e)
print 'MD5  (%s) = 0x%s' % (fname,sum.hexdigest())
f.close()

Python version Solaris:
Python 2.3.5 (#1, Feb  9 2005, 14:45:39) [C] on sunos5

Python version Windows XP:
Python 2.3.5 (#62, Feb  9 2005, 16:17:08) [MSC v.1200 32 bit (Intel)] on win32
msg29587 - Author: Georg Brandl (georg.brandl) Date: 2006-08-21 09:06
You're opening files in text mode, which is likely to return
different file contents on Windows and Unix.
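
To illustrate the point about text mode, here is a minimal sketch in modern Python, where hashlib replaces the md5 module used in the report. It assumes the file contains '\r\n' line endings, and it uses newline=None to imitate on any platform the newline translation that Windows text mode applies. The helper md5_of_file and the sample content are hypothetical, not part of the attached md5sum.py.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, binary):
    """Return the hex MD5 of a file read in binary or text mode."""
    if binary:
        with open(path, 'rb') as f:
            data = f.read()
    else:
        # newline=None enables newline translation ('\r\n' -> '\n'),
        # imitating what Windows text mode does to the data.
        with open(path, 'r', newline=None, encoding='latin-1') as f:
            data = f.read().encode('latin-1')
    return hashlib.md5(data).hexdigest()

# Hypothetical sample content with Windows-style line endings.
SAMPLE = b'first line\r\nsecond line\r\n'

fd, sample_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(SAMPLE)
    binary_sum = md5_of_file(sample_path, binary=True)
    text_sum = md5_of_file(sample_path, binary=False)
finally:
    os.remove(sample_path)

print('binary mode:', binary_sum)
print('text mode:  ', text_sum)
print('sums match: ', binary_sum == text_sum)
```

The binary-mode sum is computed over the bytes actually stored on disk, which is what external md5 tools hash. The text-mode sum is computed over translated data, so it only agrees when no translation takes place.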
msg29588 - Author: Stefan Sonnenberg (sunmountain) Date: 2006-08-21 09:06
I double checked the behaviour.
With Python the md5 sums differ between Solaris and Windows.
Using external tools for generating md5 sums, these are
always equal, under both systems, as it should be with md5.
I did not try other python versions.
msg29589 - Author: Josiah Carlson (josiahcarlson) Date: 2006-08-27 17:37
Open your files with the 'rb' flag and the md5 sums will be
identical.
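
As a sketch of that suggestion in modern Python (hashlib rather than the md5 module from the report), the function below opens the file with 'rb' and feeds it to the digest in chunks. The name md5_hexdigest, the chunk size, and the throwaway demo file are assumptions made for illustration.

```python
import hashlib
import os
import tempfile

def md5_hexdigest(path, chunk_size=64 * 1024):
    """Hash a file's raw bytes; 'rb' disables any newline translation."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file whose bytes text mode could alter on Windows.
payload = b'a\r\nb\x1ac\n'
fd, demo_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(payload)
    demo_sum = md5_hexdigest(demo_path)
finally:
    os.remove(demo_path)

# Same output format as the script in the report.
print('MD5  (%s) = 0x%s' % (demo_path, demo_sum))
```

Reading in chunks is optional for small files; it only keeps memory use bounded on large ones. The part that matters for cross-platform agreement is the 'rb' mode.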
History
Date User Action Args
2006-08-21 08:21:06 sunmountain create