Message 184547 - Python tracker

Author	ncoghlan
Recipients	barry, durin42, gward, ncoghlan, pitrou, r.david.murray, terry.reedy
Date	2013-03-18.23:04:44
Message-id	<1363647884.25.0.80711762618.issue17445@psf.upfronthosting.co.za>
In-reply-to

Content
Since we don't need to worry about ASCII-incompatible encodings (difflib will already have issues with such files due to its assumptions about newlines), it should be possible to use the same approach as urllib.parse, but based on latin-1 rather than ascii.
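
As a rough illustration of that idea (not a proposed patch): urllib.parse coerces bytes arguments to str, does its work on text, and encodes the results back to bytes. A minimal sketch of the same pattern for difflib, assuming latin-1 as the codec, could look like the following. The helper name diff_bytes_lines and the sample data are hypothetical and exist only for this example.

```python
import difflib


def diff_bytes_lines(a, b, fromfile=b"a", tofile=b"b"):
    """Hypothetical helper: unified diff of two lists of bytes lines.

    latin-1 maps every byte 0x00-0xFF to exactly one code point, so the
    decode/encode round trip is lossless for arbitrary input bytes.
    """
    def to_text(s):
        return s.decode("latin-1")

    text_diff = difflib.unified_diff(
        [to_text(line) for line in a],
        [to_text(line) for line in b],
        fromfile=to_text(fromfile),
        tofile=to_text(tofile),
    )
    # Encode the generated diff lines back to bytes with the same codec.
    return [line.encode("latin-1") for line in text_diff]


# The same word stored as latin-1 in one file and as UTF-8 in the other.
old = [b"caf\xe9\n", b"plain\n"]
new = [b"caf\xc3\xa9\n", b"plain\n"]
result = diff_bytes_lines(old, new)
```

Because each byte becomes exactly one character, two lines compare equal as text exactly when their raw bytes are equal, and the changed lines come back out of the diff byte for byte.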

It's the least bad option for this kind of use case (surrogateescape can be good too, but it doesn't work properly here, where the two encodings may differ and we want to compare the raw bytes directly).
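
One possible reading of the surrogateescape caveat, sketched under the assumption that each file would be decoded with its own encoding: the resulting strings no longer track the raw bytes, so identical bytes can compare unequal and different bytes can compare equal. The sample values below are made up for illustration.

```python
raw = b"caf\xe9\n"

# Identical bytes, decoded under two different assumed encodings.
as_utf8 = raw.decode("utf-8", "surrogateescape")      # 0xE9 becomes a lone surrogate
as_latin1 = raw.decode("latin-1", "surrogateescape")  # 0xE9 becomes U+00E9

same_bytes_compare_equal = (as_utf8 == as_latin1)

# Different bytes that decode to the same text under different encodings.
utf8_bytes = b"caf\xc3\xa9\n"
different_bytes_compare_equal = (
    utf8_bytes.decode("utf-8", "surrogateescape") == as_latin1
)

# Each decode is still reversible on its own, but only with the matching codec.
round_trip_ok = as_utf8.encode("utf-8", "surrogateescape") == raw
```

Here same_bytes_compare_equal is False and different_bytes_compare_equal is True, which is the opposite of what a raw-bytes comparison needs. Decoding both inputs with one fixed single-byte codec such as latin-1 avoids this mismatch.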

(changed scope of issue to reflect the subsequent discussion)

History
Date	User	Action	Args
2013-03-18 23:04:44	ncoghlan	set	recipients: + ncoghlan, barry, gward, terry.reedy, pitrou, durin42, r.david.murray
2013-03-18 23:04:44	ncoghlan	set	messageid: <1363647884.25.0.80711762618.issue17445@psf.upfronthosting.co.za>
2013-03-18 23:04:44	ncoghlan	link	issue17445 messages
2013-03-18 23:04:44	ncoghlan	create