classification
Title: Make assertMultilineEqual default for unicode string comparison in
Type:
Stage: resolved
Components:
Versions: Python 3.2, Python 2.7

process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: michael.foord
Nosy List: michael.foord, pitrou
Priority: normal
Keywords:

Created on 2009-10-01 22:29 by michael.foord, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (6)
msg93424 - Author: Michael Foord (michael.foord) (Python committer) Date: 2009-10-01 22:29
unittest.TestCase.assertEqual uses the new type equality functions for
comparing containers. 

In Python 2.7 assertMultilineEqual should be the default comparison
method for unicode strings and in Python 3.2 for comparing strings.

assertMultilineEqual should only use difflib for showing differences for
strings above a certain length. (For short strings the extra output is
actually more confusing than helpful.)
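
The length cutoff described above could look roughly like the following. This is only a sketch of the idea, not the code that was committed: `describe_difference`, `DIFF_MIN_LENGTH` and its value are hypothetical names picked for illustration, since the issue does not say what length counts as "short".

```python
import difflib

# Hypothetical cutoff; the issue does not specify an actual value.
DIFF_MIN_LENGTH = 30


def describe_difference(first, second, min_length=DIFF_MIN_LENGTH):
    """Build a failure message for two unequal strings.

    Short strings get a plain repr-based message. Longer strings also
    get a line-by-line listing produced by difflib.ndiff.
    """
    summary = "%r != %r" % (first, second)
    if max(len(first), len(second)) < min_length:
        # For short strings the diff output is more noise than help.
        return summary
    diff = difflib.ndiff(first.splitlines(True), second.splitlines(True))
    return summary + "\n" + "".join(diff)
```

With this sketch, comparing "a" and "b" yields only the one-line summary, while two multi-line strings longer than the cutoff also get the ndiff listing appended.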
msg93432 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2009-10-02 10:04
Why only unicode strings?
msg93435 - Author: Michael Foord (michael.foord) (Python committer) Date: 2009-10-02 10:15
Because diffing binary data isn't useful...

This is the reason that assertMultilineEqual isn't already the default
for comparing strings - because in Python 2 when you have strings you
don't know if the intention is for them to contain textual information
or binary information.
msg93436 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2009-10-02 10:41
> Because diffing binary data isn't useful...

But often it's non-binary data ;)

> This is the reason that assertMultilineEqual isn't already the default
> for comparing strings - because in Python 2 when you have strings you
> don't know if the intention is for them to contain textual information
> or binary information.

You could have a heuristic which counts the number of "\n" bytes and, if
there are more than 1/80th of them, you're likely to have some text.

(80 being the typical max line length)
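
For illustration, the newline-density heuristic might be sketched as below. `looks_like_text` is a hypothetical helper, not part of unittest, and treating empty input as "not text" is an assumption made here.

```python
def looks_like_text(data, max_line_length=80):
    """Guess whether a byte string holds text, based on newline density.

    Hypothetical helper: data counts as text when more than one byte in
    every max_line_length bytes is a newline.
    """
    if not data:
        # Assumption: nothing to diff, so do not treat it as text.
        return False
    # Integer form of: count / len(data) > 1 / max_line_length
    return data.count(b"\n") * max_line_length > len(data)
```

Ordinary short lines pass this check easily, while a blob that happens to contain only a few 0x0A bytes does not.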
msg93437 - Author: Michael Foord (michael.foord) (Python committer) Date: 2009-10-02 10:58
Heh - all ascii would be a better heuristic, or zero null characters
perhaps. But 2.X is destined to die anyway so I'm happy for it to only
be the default for unicode strings without implementing potentially
complex, wrong and slow heuristics.

Users can always register it themselves using addTypeEqualityFunc if
they want.
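
As a rough sketch of that opt-in, written for Python 3, where the released method is spelled `assertMultiLineEqual` and only accepts `str`: the class and the wrapper method below are hypothetical names, and decoding the bytes as latin-1 is an assumption made so that the diff machinery can be reused.

```python
import unittest


class BytesAsTextTestCase(unittest.TestCase):
    """Hypothetical base class that diffs bytes as if they were text."""

    def setUp(self):
        # assertEqual only consults this when both arguments are
        # exactly of type bytes.
        self.addTypeEqualityFunc(bytes, self.assertBytesMultiLineEqual)

    def assertBytesMultiLineEqual(self, first, second, msg=None):
        # Assumption: the caller knows these bytes are text-like.
        # latin-1 maps every byte to exactly one code point, so
        # distinct byte strings never decode to equal text.
        self.assertMultiLineEqual(
            first.decode("latin-1"), second.decode("latin-1"), msg)
```

Tests that subclass this and call `assertEqual` on two `bytes` values then get a line-by-line diff on failure. The opt-in stays per test class, which matches the point above about not guessing on the user's behalf.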
msg99071 - Author: Michael Foord (michael.foord) (Python committer) Date: 2010-02-08 23:31
Revision 78116.
History
Date                 User           Action  Args
2022-04-11 14:56:53  admin          set     github: 51281
2010-02-08 23:31:53  michael.foord  set     status: open -> closed; resolution: accepted; messages: + msg99071; stage: resolved
2009-10-02 10:58:17  michael.foord  set     messages: + msg93437
2009-10-02 10:41:28  pitrou         set     messages: + msg93436
2009-10-02 10:15:33  michael.foord  set     messages: + msg93435
2009-10-02 10:04:54  pitrou         set     nosy: + pitrou; messages: + msg93432
2009-10-01 22:29:44  michael.foord  create