# HG changeset patch
# User Frank van Dijk <fwvdijk@gmail.org>
# Date 1407064865 -7200
#      Sun Aug 03 13:21:05 2014 +0200
# Branch 2.7
# Node ID 425f3144941f9fe213f2d61ed9e9e766d0cd87da
# Parent  133ee2b48e52cb891b1898a16e3c187ab18f4310
steer readers away from codecs.open: it always opens files in binary mode, so text files get no newline translation
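
For reviewers, a minimal sketch of the behaviour the new note in
Doc/library/codecs.rst describes. The helper name ``read_both`` and the sample
bytes are illustrative assumptions, not part of the patch. It is written to run
on Python 2.7 and should also run on Python 3, where ``codecs.open`` is still
available.

```python
import codecs
import io
import os
import tempfile


def read_both(raw):
    """Write raw bytes to a temp file, then read it back with both APIs.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Write the bytes exactly as given, with no newline translation.
        with os.fdopen(fd, 'wb') as f:
            f.write(raw)
        # codecs.open always uses binary mode underneath, so '\r\n' survives.
        with codecs.open(path, encoding='utf-8') as f:
            via_codecs = f.read()
        # io.open uses text mode with universal newlines by default.
        with io.open(path, encoding='utf-8') as f:
            via_io = f.read()
    finally:
        os.remove(path)
    return via_codecs, via_io


via_codecs, via_io = read_both(b'one\r\ntwo\r\n')
# via_codecs keeps the Windows line endings: u'one\r\ntwo\r\n'
# via_io has them translated to '\n':        u'one\ntwo\n'
```

Both calls return Unicode strings; only the newline handling differs, which is
why the howto now points at ``io.open`` for text files.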

diff -r 133ee2b48e52 -r 425f3144941f Doc/howto/unicode.rst
--- a/Doc/howto/unicode.rst	Fri Aug 01 23:51:51 2014 -0700
+++ b/Doc/howto/unicode.rst	Sun Aug 03 13:21:05 2014 +0200
@@ -365,10 +365,6 @@
 interfaces, but implementing encodings is a specialized task that also won't be
 covered here.  Consult the Python documentation to learn more about this module.
 
-The most commonly used part of the :mod:`codecs` module is the
-:func:`codecs.open` function which will be discussed in the section on input and
-output.
-
 
 Unicode Literals in Python Source Code
 --------------------------------------
@@ -534,33 +530,31 @@
 
 The solution would be to use the low-level decoding interface to catch the case
 of partial coding sequences.  The work of implementing this has already been
-done for you: the :mod:`codecs` module includes a version of the :func:`open`
-function that returns a file-like object that assumes the file's contents are in
-a specified encoding and accepts Unicode parameters for methods such as
-``.read()`` and ``.write()``.
+done for you: the :func:`io.open` function returns a file-like object that
+decodes the file's contents using a specified encoding; ``.read()`` returns
+Unicode strings and ``.write()`` accepts only Unicode strings.
 
-The function's parameters are ``open(filename, mode='rb', encoding=None,
-errors='strict', buffering=1)``.  ``mode`` can be ``'r'``, ``'w'``, or ``'a'``,
-just like the corresponding parameter to the regular built-in ``open()``
-function; add a ``'+'`` to update the file.  ``buffering`` is similarly parallel
-to the standard function's parameter.  ``encoding`` is a string giving the
-encoding to use; if it's left as ``None``, a regular Python file object that
-accepts 8-bit strings is returned.  Otherwise, a wrapper object is returned, and
-data written to or read from the wrapper object will be converted as needed.
-``errors`` specifies the action for encoding errors and can be one of the usual
-values of 'strict', 'ignore', and 'replace'.
+The function's parameters are ``io.open(file, mode='r', buffering=-1,
+encoding=None, errors=None, newline=None, closefd=True)``.  ``mode`` can be
+``'r'``, ``'w'``, or ``'a'``, just like the corresponding parameter to the
+regular built-in ``open()`` function; add a ``'+'`` to update the file.
+``buffering`` is similarly parallel to the standard function's parameter.
+``encoding`` is a string giving the encoding to use; if it's left as ``None``,
+the platform's default encoding is used.  Data written to or read from the
+stream will be converted as needed.  ``errors`` specifies the action for
+encoding errors and can be ``'strict'``, ``'ignore'``, or ``'replace'``.
 
 Reading Unicode from a file is therefore simple::
 
-    import codecs
-    f = codecs.open('unicode.rst', encoding='utf-8')
+    import io
+    f = io.open('unicode.rst', encoding='utf-8')
     for line in f:
         print repr(line)
 
 It's also possible to open files in update mode, allowing both reading and
 writing::
 
-    f = codecs.open('test', encoding='utf-8', mode='w+')
+    f = io.open('test', encoding='utf-8', mode='w+')
     f.write(u'\u4500 blah blah blah\n')
     f.seek(0)
     print repr(f.readline()[:1])
diff -r 133ee2b48e52 -r 425f3144941f Doc/library/codecs.rst
--- a/Doc/library/codecs.rst	Fri Aug 01 23:51:51 2014 -0700
+++ b/Doc/library/codecs.rst	Sun Aug 03 13:21:05 2014 +0200
@@ -246,17 +246,17 @@
 
    .. note::
 
+      Files are always opened in binary mode, even if no binary mode was
+      specified.  This means that no automatic conversion of ``b'\n'`` is done
+      on reading and writing.  To open text files with transparent
+      encoding/decoding, use the :func:`io.open` function instead.
+
+   .. note::
+
       The wrapped version will only accept the object format defined by the codecs,
       i.e. Unicode objects for most built-in codecs.  Output is also codec-dependent
       and will usually be Unicode as well.
 
-   .. note::
-
-      Files are always opened in binary mode, even if no binary mode was
-      specified.  This is done to avoid data loss due to encodings using 8-bit
-      values.  This means that no automatic conversion of ``'\n'`` is done
-      on reading and writing.
-
    *encoding* specifies the encoding which is to be used for the file.
 
    *errors* may be given to define the error handling. It defaults to ``'strict'``