
Author ncoghlan
Recipients barry, durin42, gward, ncoghlan, pitrou, r.david.murray, terry.reedy
Date 2013-03-18.23:04:44
Message-id <1363647884.25.0.80711762618.issue17445@psf.upfronthosting.co.za>
In-reply-to
Content
Since we don't need to worry about ASCII-incompatible encodings (difflib will already have issues with such files due to its assumptions about newlines), it should be possible to use the same approach as urllib.parse (decode bytes input to str on the way in, encode the result back to bytes on the way out), but based on latin-1 rather than ascii.

It's the least bad option for this kind of use case (surrogateescape can be good too, but it doesn't work properly here, where the two inputs may be in different encodings and we want to compare the raw bytes directly).
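
A minimal sketch of that idea, not an actual difflib patch: the helper name diff_bytes_latin1 and its parameters are hypothetical. It relies on latin-1 mapping every byte value 0-255 to the code point with the same number, so decoding and re-encoding is lossless, and on difflib adding only ASCII characters to its output.

```python
import difflib


def diff_bytes_latin1(a_lines, b_lines, fromfile=b"a", tofile=b"b"):
    """Hypothetical helper: unified diff over two lists of bytes lines.

    Each bytes line is decoded with latin-1 (a lossless 1:1 byte to
    code point mapping), diffed as str, and encoded back to bytes.
    """
    def decode(data):
        return data.decode("latin-1")

    a = [decode(line) for line in a_lines]
    b = [decode(line) for line in b_lines]
    for line in difflib.unified_diff(a, b, decode(fromfile), decode(tofile)):
        # difflib only adds ASCII markers, so every character is still
        # in the range U+0000..U+00FF and latin-1 encoding cannot fail.
        yield line.encode("latin-1")


# The two inputs deliberately use different encodings for the same text.
old = [b"caf\xe9\n", b"same\n"]        # latin-1 encoded
new = [b"caf\xc3\xa9\n", b"same\n"]    # utf-8 encoded
result = list(diff_bytes_latin1(old, new))
```

The raw bytes come back unchanged in the "-" and "+" lines, so the diff reflects the byte-level difference regardless of which encoding each file used. Splitting into lines is assumed to happen on the bytes (for example with bytes.splitlines(keepends=True)) before decoding.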

(changed scope of issue to reflect the subsequent discussion)
History
Date                 User      Action  Args
2013-03-18 23:04:44  ncoghlan  set     recipients: + ncoghlan, barry, gward, terry.reedy, pitrou, durin42, r.david.murray
2013-03-18 23:04:44  ncoghlan  set     messageid: <1363647884.25.0.80711762618.issue17445@psf.upfronthosting.co.za>
2013-03-18 23:04:44  ncoghlan  link    issue17445 messages
2013-03-18 23:04:44  ncoghlan  create