classification
Title: surrogateescape largely missing from documentation
Type:
Stage: resolved
Components: Documentation, Unicode
Versions: Python 3.2, Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Jim.Jewett, akuchling, cvrebert, docs@python, ezio.melotti, python-dev
Priority: normal
Keywords:

Created on 2012-02-14 18:56 by Jim.Jewett, last changed 2013-06-16 17:00 by python-dev. This issue is now closed.

Files
File name Uploaded Description
patch14015.txt akuchling, 2013-06-09 16:05
Messages (3)
msg153359 - Author: Jim Jewett (Jim.Jewett) (Python triager) Date: 2012-02-14 18:56
Recent discussion on the mailing lists and in http://bugs.python.org/issue13997 makes it clear that the best way to get Python 2 results for "ASCII-in-the-parts-I-might-process-or-change" is to replace

    f = open(fname)
with
    f = open(fname, encoding="ascii", errors="surrogateescape")

Unfortunately, surrogateescape (let alone this recipe) is not easily discoverable.  
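
For illustration, here is a minimal sketch of what the recipe above buys you: bytes that are not valid ASCII survive a read/edit/write cycle unchanged. The file name, the sample bytes, and the edit are all made up for the example; `newline=""` is added only so that line endings are not translated on any platform.

```python
import os
import tempfile

# Hypothetical sample data: mostly ASCII, with one non-ASCII byte (0xE9).
raw = b"name=caf\xe9\nkey=value\n"

with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "sample.txt")  # made-up file name
    with open(fname, "wb") as f:
        f.write(raw)

    # Undecodable bytes are mapped to lone surrogates (0xE9 -> U+DCE9)
    # instead of raising UnicodeDecodeError.
    with open(fname, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Touch only the ASCII part of the content.
    text = text.replace("key=value", "key=changed")

    # Encoding with the same handler turns the surrogates back into
    # the original bytes.
    with open(fname, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(fname, "rb") as f:
        result = f.read()

print(result)
```

The escaped characters are only safe to pass through; trying to encode `text` without the `surrogateescape` handler (for example when printing to a UTF-8 terminal) raises `UnicodeEncodeError`.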

http://docs.python.org/dev/library/functions.html#open lists 5 error-handlers -- but not this one.  It says that other error handlers are possible if they are registered with http://docs.python.org/dev/library/codecs.html#codecs.register_error but I haven't found a way to determine which error handlers are already registered.
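
As a partial workaround, `codecs.lookup_error()` can at least confirm whether a given name is registered, since it raises `LookupError` for unknown names. It does not enumerate handlers, so the sketch below probes a hand-written list of candidates; the helper name `is_registered` and the candidate list are assumptions made for this example.

```python
import codecs


def is_registered(name):
    """Return True if an error handler called `name` is registered."""
    try:
        codecs.lookup_error(name)
    except LookupError:
        return False
    return True


# Hand-picked names to probe; the last one is deliberately bogus.
candidates = [
    "strict",
    "ignore",
    "replace",
    "xmlcharrefreplace",
    "backslashreplace",
    "surrogateescape",
    "no-such-handler",
]

status = {name: is_registered(name) for name in candidates}
for name, found in status.items():
    print(f"{name}: {'registered' if found else 'not registered'}")
```

This only answers "is this particular name known?", so it helps once you already suspect a handler exists, which is exactly the discoverability gap described here.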

The codecs.register (as opposed to register_error) documentation does list it as a possible value, but that is the only reference.

The other 5 error handlers are also available as module-level functions within the codecs module, and have their own documentation sections within http://docs.python.org/dev/library/codecs.html

Neither help(open) nor import codecs; help(codecs) provides any hints of the existence of surrogateescape.  Both explicitly suggest that it does not exist, by enumerating other values.
msg190860 - Author: A.M. Kuchling (akuchling) (Python committer) Date: 2013-06-09 16:05
Here's a proposed patch that touches the Sphinx documentation and a docstring in codecs.py.  The text is slightly revised from my current revisions to the Unicode howto.

help(open) says "See the documentation for codecs.register for a list of the permitted encoding error strings".  This is strictly correct: the Sphinx documentation features this info.  But help(codecs.register) doesn't; it's more helpful to look at help(codecs.Codec).  

So maybe the docstring for open() should say: "See help(codecs.Codec) for a list of the permitted..." instead.
msg191273 - Author: Roundup Robot (python-dev) (Python triager) Date: 2013-06-16 17:00
New changeset 55f611f55952 by Andrew Kuchling in branch '3.3':
Describe 'surrogateescape' in the documentation.
http://hg.python.org/cpython/rev/55f611f55952
History
Date                 User         Action  Args
2013-06-16 17:00:53  python-dev   set     status: open -> closed; nosy: + python-dev; messages: + msg191273; resolution: fixed; stage: resolved
2013-06-09 16:05:28  akuchling    set     files: + patch14015.txt; messages: + msg190860
2013-06-08 17:50:41  akuchling    set     nosy: + akuchling
2012-02-15 15:23:05  eric.araujo  set     versions: - Python 3.1
2012-02-14 23:27:26  cvrebert     set     nosy: + cvrebert
2012-02-14 18:56:29  Jim.Jewett   create