Author serhiy.storchaka
Recipients Ankur.Ankan, Elena.Oat, Jacek.Bzdak, Puneeth.Chaganti, ankurankan, ezio.melotti, michael.foord, nnja, pitrou, serhiy.storchaka, vstinner
Date 2014-08-10.09:53:50
Message-id <3659375.UCUqjatPJT@raxxla>
In-reply-to <1407663002.65.0.0356318367991.issue19217@psf.upfronthosting.co.za>
Content
> 1) try to have a single threshold for all types, and use line-based counting
> for strings (so if the threshold is 32, this means 32 elements in a list,
> 32 items in a dict, 32 lines in a string);

You forgot about strings with few but very long lines. We should hide or
truncate lines that are too long, and this is not a trivial issue. Actually we
need more control over difflib's machinery, and should use something like
_common_shorten_repr to truncate similar lines appropriately.
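
As an illustration only, here is a minimal sketch of shortening two similar
long lines so that the part where they differ stays visible. The name
shorten_pair, its parameters and the "[N chars]" marker are hypothetical and
chosen for this example; this is not the code of _common_shorten_repr.

```python
from os.path import commonprefix


def shorten_pair(a, b, max_len=40, keep=5):
    """Shorten two similar lines, keeping the region where they differ.

    Hypothetical helper: max_len bounds the text kept after the shared
    prefix, keep is how much of the shared prefix is retained as context.
    """
    if max(len(a), len(b)) <= max_len:
        return a, b  # short enough, nothing to hide

    # Length of the character-wise common prefix of both lines.
    prefix_len = len(commonprefix([a, b]))
    skip = max(prefix_len - keep, 0)

    def cut(s):
        head = '[%d chars]' % skip if skip else ''
        body = s[skip:]
        if len(body) > max_len:
            body = body[:max_len] + '[%d chars]' % (len(body) - max_len)
        return head + body

    return cut(a), cut(b)


if __name__ == '__main__':
    first, second = shorten_pair('x' * 100 + 'A', 'x' * 100 + 'B')
    print(first)
    print(second)
```

Lines shortened this way could then be passed to difflib, so that the diff
stays readable even when the original lines are very long.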

> Option a) might be doable, and even if it introduces a change in behavior it
> might be acceptable since it affects the output of the messages in case of
> failure, and I don't think anyone is relying on an exact output (also
> because tests shouldn't be failing).  Moreover, the most common usage of
> maxDiff is setting it to None, and having the threshold to None means that
> the full diff will be computed and printed, leaving the behavior unchanged.

This is too much for a bug fix. We should fix this issue (do not calculate diffs
between sequences that are too long) and preserve as much detail as possible.
Omitting the diff entirely when the current code does output it (albeit very
slowly) is a regression. It would be better to output a truncated diff.
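
One possible shape of a truncated diff, as a sketch under stated assumptions
rather than a proposed patch: find the first mismatch with a cheap linear
scan, then run difflib only on a bounded window around it. The function name
truncated_diff and the max_lines and context parameters are hypothetical.

```python
import difflib


def truncated_diff(first, second, max_lines=32, context=3):
    """Diff only a bounded window that starts near the first mismatch.

    Hypothetical helper: max_lines caps how many items of each sequence
    are handed to difflib, context is how many equal items precede the
    first mismatch in the window.
    """
    # Linear scan for the first differing index; no difflib involved yet.
    start = 0
    for start, (x, y) in enumerate(zip(first, second)):
        if x != y:
            break
    else:
        # No mismatch in the common part (or one sequence is empty).
        start = min(len(first), len(second))

    lo = max(start - context, 0)
    a = first[lo:lo + max_lines]
    b = second[lo:lo + max_lines]

    # difflib now sees at most max_lines items per side.
    lines = list(difflib.ndiff(a, b))

    omitted = max(len(first), len(second)) - (lo + max_lines)
    if omitted > 0:
        lines.append('[%d more lines not compared]' % omitted)
    return lines


if __name__ == '__main__':
    old = ['line %d' % i for i in range(1000)]
    new = list(old)
    new[500] = 'changed'
    print('\n'.join(truncated_diff(old, new)))
```

The cost of the difflib call here depends on max_lines rather than on the
full length of the inputs, at the price of not reporting differences beyond
the window.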

Then we can refactor and improve diff reporting in other issues.