classification
Title: test_sax: multiple failures on Windows desktop
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: brian.curtin, christian.heimes, pitrou, python-dev, r.david.murray, serhiy.storchaka, terry.reedy, tim.peters
Priority: normal Keywords: patch

Created on 2013-08-31 01:33 by terry.reedy, last changed 2013-09-01 22:15 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
188xx_text_sax.diff terry.reedy, 2013-08-31 01:33 review
eol-patch.txt tim.peters, 2013-09-01 18:04 Fiddle .hgeol to treat xmltestdata's files as binary review
Messages (14)
msg196606 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-08-31 01:33
These are due to \r in line endings, as with #12037.
  File "F:\Python\dev\py34\lib\test\test_sax.py", line 634, in test_expat_file
    self.assertEqual(result.getvalue(), xml_test_out)

Assert fails because off many \rs in xml_text_out. Same in

line 649, in test_expat_file_nonascii
line 778, in test_expat_inpsource_filename
line 788, in test_expat_inpsource_sysid
line 803, in test_expat_inpsource_sysid_nonascii
line 816, in test_expat_inpsource_stream

A fix that works is essentially the same as for #12037:
  xml_test_out = xml_test_out.replace(b'r', b'')
just after xml_test_out is read from the file.
(.translate(None, b'\r') also works, seems slightly slower)

Another would be to not put \r in the file in the first place;-).

Any objection to this?
msg196608 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-08-31 02:00
On #12037, David suggests instead the equivalent of this:

with open(TEST_XMLFILE_OUT) as f:
    xml_test_out = f.read().encode('ascii')
msg196643 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-31 16:12
Why Lib/test/xmltestdata/test.xml.out contains '\r'? AFAIK Windows buildbots are happy with current tests.

A nitpick for first proposed solution: you can append ".replace(b'\r', b'')" just to "f.read()".

A nitpick for second proposed solution: encoding should be "iso-8859-1" (for both open() and encode()).
msg196655 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-08-31 17:32
As I said on #12037, I would rather \r was not in the file. I do not know if it is present in the master repository or added by hg when cloning.

Buildbots and developer desktops are different environments. The tests must work on both.

Nit 1: right, I should have seen that. On #12037, I originally did the conversion in the test call.

Nit 2: why? The test passes as is.

Please pick a solution, test it on non-Windows if you can, and apply it so I can have the test suite pass on Windows after I apply a patch.
msg196722 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-09-01 17:15
Suggest caution here.  test_sax fails the same way for me too (Windows Vista), under both the released 3.3.2 and a Python built from the current hg default branch.

However, these files (test.xml and test.xml.out) have not changed since the Python 2.7 line - the \r\n line endings have _always_ been there, and test_sax works fine under (e.g.) Python 2.7.5 on Windows.

So it's not that the files have changed, it must be that Python is behaving differently.

Unfortunately, I don't know anything about what test_sax is trying to do.  I do see that under 2.7, test_sax does

xml_test_out = open(TEST_XMLFILE_OUT).read()

but 3.3 does

with open(TEST_XMLFILE_OUT, 'rb') as f:
    xml_test_out = f.read()

That is, 2.7 opens the file for reading in text mode, but 3.3 opens it in binary mode.  That makes a big difference for text files under Windows.

It's also disturbing that the Windows buildbots don't fail.  There is no change in "environment" that should affect the bytes seen when a file is read - and the buildbots "should be" seeing, byte for byte, the same files developers and users see.
msg196726 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-09-01 18:04
Seeing as Python 3 _does_ open these files in binary mode, I fiddled my local .hgeol to mark the test files as BIN (then deleted the test data directory and did an "hg revert" on the directory).  Then test_sax passes.  

I don't know whether that's the proper fix, but it works ;-)
msg196730 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-09-01 19:44
Two things have happened with Python: we switched bytes from 'text or binary' to 'binary (or text)' in 3.0. We switched from svn to hg in 3.2. When did test_sax start failing? I have no idea. Until recently, there were much worse problems with the test suite on desktop windows; it sometimes quit before getting to 'sax'.

Email and its tests (which had the same eol issue) were heavily revised in 3.3; I do not know if the test that failed in 3.3 failed earlier or not. Another failing email test was added in 3.4.

David, Antoine: Tim questions/challenges the solution pushed in #12037.
msg196733 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-09-01 20:04
Marking the files as "binary" (really "don't touch") in .hgeol sounds reasonable to me.
msg196734 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-01 20:26
> Nit 2: why? The test passes as is.

Currently test.xml and test.xml.out contain only ascii characters. But this can be changed in future. Actually I had added non-ascii characters to other test data and this had exposed some bugs in test suite.

> When did test_sax start failing? I have no idea.

I guess it becomes after my commit in issue1470548 a half year ago.

All proposed fixes are good enough. Feel free to commit what you prefers.
msg196735 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-09-01 20:33
test_email still passed on Windows under 3.2.5 (but test_sax failed).  test_email and test_sax both fail under 3.3.2.

I'll just push the change to .hgeol - minimally invasive, fixes the immediate problem, and it appears these "really are" binary files.
msg196737 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-09-01 20:53
Having the files be the same on all systems seems the right solution. Changing .hgeol should do that, at least for the repository. Let's hope the Windows .msi installer leave line endings alone. I should install the a,b,c releases just to run the test suite with them.

Side note: just above the quoted context in .hgeol is
Lib/email/test/data/msg_26.txt = BIN
which does not exist now.
I suspect it should be removed as the obsolete old version of
Lib/test/test_email/data/msg_26.txt = BIN
which is in the quoted context.
msg196738 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-09-01 21:02
New changeset 8efcf3c823f9 by Tim Peters in branch '3.3':
Fix issue 18889: test_sax: multiple failures on Windows desktop.
http://hg.python.org/cpython/rev/8efcf3c823f9

New changeset 25211a22228b by Tim Peters in branch 'default':
Merge fix from 3.3 into default.
http://hg.python.org/cpython/rev/25211a22228b
msg196739 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-09-01 21:06
Terry, yes, the installer won't change line endings.  I think - we'll find out for sure next time a release is cut - LOL ;-)

Agreed that Lib/email/test/data/msg_26.txt is probably obsolete.  Fix it, if you like!  It's helpful to get rid of obsolete cruft.
msg196743 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-09-01 22:15
For test_email, the fix was correct because the *test* didn't care about what line ending the source file had.  I can't speak for sax.
History
Date User Action Args
2013-09-01 22:15:30r.david.murraysetmessages: + msg196743
2013-09-01 21:06:53tim.peterssetmessages: + msg196739
2013-09-01 21:03:11tim.peterssetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2013-09-01 21:02:15python-devsetnosy: + python-dev
messages: + msg196738
2013-09-01 20:53:17terry.reedysetmessages: + msg196737
2013-09-01 20:33:03tim.peterssetmessages: + msg196735
2013-09-01 20:26:50serhiy.storchakasetmessages: + msg196734
2013-09-01 20:04:42pitrousetmessages: + msg196733
2013-09-01 19:44:51terry.reedysetnosy: + pitrou, r.david.murray
messages: + msg196730
2013-09-01 18:04:21tim.peterssetfiles: + eol-patch.txt

messages: + msg196726
2013-09-01 17:15:43tim.peterssetnosy: + tim.peters
messages: + msg196722
2013-08-31 17:32:25terry.reedysetmessages: + msg196655
2013-08-31 16:12:01serhiy.storchakasetmessages: + msg196643
2013-08-31 15:56:29pitrousetnosy: + brian.curtin
2013-08-31 15:54:13pitrousetcomponents: + Tests
stage: commit review -> patch review
2013-08-31 15:54:05pitrousetnosy: + serhiy.storchaka
2013-08-31 02:00:06terry.reedysetmessages: + msg196608
2013-08-31 01:33:31terry.reedycreate