Message 224659 - Python tracker


Author	lemburg
Recipients	Frank.van.Dijk, docs@python, doerwalter, lemburg, vstinner
Date	2014-08-03.20:45:50
Message-id	<1407098750.65.0.384582388806.issue22128@psf.upfronthosting.co.za>

Content

Pointing people to io.open() as an alternative to codecs.open() is a good idea, but that doesn't make codecs.open() less useful.

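The reason why codecs.open() uses binary mode is to avoid issues with automatic newline conversion getting in the way of the file's encoding. Think of, e.g., UTF-16 encoded files that contain newlines: there, a newline is a two-byte sequence, so any conversion applied at the byte level breaks the encoding.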
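
To illustrate the UTF-16 case, here is a minimal sketch. It assumes the newline conversion happens at the byte level (as a text-mode file on Windows would do in Python 2) and simulates it with a plain bytes.replace(); the variable names are made up for the example.

```python
# Sketch: byte-level newline translation applied to UTF-16 data.
text = u"a\nb"
data = text.encode("utf-16-le")   # b"a\x00\n\x00b\x00": the newline is two bytes

# Simulated (assumed) text-mode translation of the single byte b"\n" to b"\r\n".
mangled = data.replace(b"\n", b"\r\n")

# The untouched bytes round-trip cleanly.
roundtrip_ok = data.decode("utf-16-le") == text

# The translated bytes have an odd length and no longer decode as UTF-16.
try:
    mangled.decode("utf-16-le")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Opening the file in binary mode and leaving line ends to the codec layer avoids this kind of corruption.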

Note that codecs allow handling newlines on a line-by-line basis via the keepends parameter of .readline(), so issues with Windows vs. Unix line ends can be worked around explicitly. Since the default is to keep line ends, no data loss occurs and application code can deal with line ends as it sees fit.
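
A small sketch of the keepends behavior, using a codecs StreamReader over an in-memory byte stream rather than a real file (the sample data and the read_lines helper are made up for the example):

```python
import codecs
import io

# Mixed Windows and Unix line ends, encoded as UTF-16 (little endian, no BOM).
raw = u"one\r\ntwo\nthree".encode("utf-16-le")

def read_lines(keepends):
    """Read all lines via StreamReader.readline() with the given keepends."""
    reader = codecs.getreader("utf-16-le")(io.BytesIO(raw))
    lines = []
    while True:
        line = reader.readline(keepends=keepends)
        # An empty string signals end of data. With keepends=False an empty
        # line looks the same, so this loop is only safe because the sample
        # data contains no empty lines.
        if not line:
            break
        lines.append(line)
    return lines

kept = read_lines(True)       # line ends preserved exactly as stored
stripped = read_lines(False)  # line ends removed by the reader
```

With keepends=True the application sees the original "\r\n" and "\n" and can normalize them itself; with keepends=False the reader drops them.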

As it stands, I'm -1 on this patch, but would be +1 on mentioning io.open() as an alternative to codecs.open() with a slightly different approach to line ends.
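
For comparison, a sketch of how io.open() treats line ends: by default it translates them on reading (universal newlines), while newline="" returns them untranslated. The temporary file and variable names are only for the example.

```python
import io
import os
import tempfile

# Write UTF-16 bytes with mixed line ends to a temporary file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "wb") as f:
        f.write(u"one\r\ntwo\n".encode("utf-16"))

    # Default newline handling: "\r\n" is translated to "\n" on reading.
    with io.open(path, "r", encoding="utf-16") as f:
        translated = f.read()

    # newline="" disables translation, so line ends come back unchanged.
    with io.open(path, "r", encoding="utf-16", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)
```

Here the translation happens after decoding, so it does not corrupt the encoding, but the default does change the line ends the application sees.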

I don't think it's useful to tell people:
* use codecs.open() on Python 2.4, 2.5, 2.6
* use io.open() on Python 2.7 (io is too slow on 2.6 to be a real alternative to codecs.open())
* use open() on Python 3.4+

codecs.open() works the same across all these Python versions.

History
Date	User	Action	Args
2014-08-03 20:45:50	lemburg	set	recipients: + lemburg, doerwalter, vstinner, docs@python, Frank.van.Dijk
2014-08-03 20:45:50	lemburg	set	messageid: <1407098750.65.0.384582388806.issue22128@psf.upfronthosting.co.za>
2014-08-03 20:45:50	lemburg	link	issue22128 messages
2014-08-03 20:45:50	lemburg	create