Downloaded the testdata.txt file, and yes, it's UTF-8:

facundo@pomcat:~/devel$ file testdata.txt 
testdata.txt: UTF-8 Unicode text

But I read and parsed it without any problem:

Python 2.5.1 (r251:54863, May  2 2007, 16:56:35) 
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.dom.minidom as dom
>>> data = open('testdata.txt','r').read()
>>> mydom = dom.parseString(data)
>>> mydom
<xml.dom.minidom.Document instance at 0xb7c03b0c>
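
For anyone who wants to try the same check without the attached file, here is a minimal sketch. The inline document is a made-up stand-in for testdata.txt (not the reporter's actual data), and the names `data` and `text` are just for illustration. It builds a small XML document with non-ASCII text, encodes it as UTF-8, and hands the bytes to parseString the same way as above:

```python
import xml.dom.minidom as dom

# Hypothetical stand-in for testdata.txt: a tiny XML document containing
# a non-ASCII character, encoded to UTF-8 bytes as if read from a file.
data = u'<?xml version="1.0" encoding="UTF-8"?><root>caf\u00e9</root>'.encode('utf-8')

# Parse the UTF-8 byte string; the parser decodes it using the declared encoding.
mydom = dom.parseString(data)

# The text node comes back as a unicode string.
text = mydom.documentElement.firstChild.data
```

If this parses on your machine but the real file doesn't, the problem is more likely in the file contents or the platform than in minidom's UTF-8 handling in general.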

Which platform are you working on?

And yes, you have absolute permission to fix it, patches are always welcome!
