Author Jim.Jewett
Recipients Jim.Jewett, docs@python, ezio.melotti
Date 2012-02-14.18:56:29
Message-id <1329245790.15.0.430373154575.issue14015@psf.upfronthosting.co.za>
In-reply-to
Content
Recent discussion on the mailing lists and in http://bugs.python.org/issue13997 makes it clear that the best way to get Python 2 behavior for files that are "ASCII-in-the-parts-I-might-process-or-change" is to replace

    f = open(fname)
with
    f = open(fname, encoding="ascii", errors="surrogateescape")
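
A minimal sketch of what this recipe buys you, assuming a small file that contains one non-ASCII byte. The file names and contents are made up for illustration, and newline="" is added only so the bytes compare identically on every platform. The undecodable byte is read as a lone surrogate and written back unchanged:

```python
import os
import tempfile

# Hypothetical sample data: 0xE9 is not valid ASCII.
raw = b"name = caf\xe9\n"

with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "sample.txt")
    with open(fname, "wb") as f:
        f.write(raw)

    # Read with the recipe; the bad byte becomes the lone surrogate U+DCE9.
    with open(fname, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Change only the ASCII part, then write back with the same settings.
    out = os.path.join(tmp, "copy.txt")
    with open(out, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(text.replace("name", "title"))

    with open(out, "rb") as f:
        round_tripped = f.read()

print(ascii(text))
print(round_tripped)
```

Writing the same text without errors="surrogateescape" would raise UnicodeEncodeError, because the lone surrogate cannot be encoded as ASCII.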

Unfortunately, surrogateescape (let alone this recipe) is not easily discoverable.  

http://docs.python.org/dev/library/functions.html#open lists 5 error handlers -- but not this one.  It says that other error handlers are possible if they are registered with http://docs.python.org/dev/library/codecs.html#codecs.register_error, but I haven't found a way to determine which error handlers are already registered.
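
As a partial workaround, codecs.lookup_error() can at least probe for a handler by name. It does not enumerate the registry, so this only helps if you already know what name to ask for. A rough sketch, where is_registered and the candidate list are hypothetical helpers for illustration:

```python
import codecs

def is_registered(name):
    """Return True if an error handler is registered under this name."""
    try:
        codecs.lookup_error(name)
    except LookupError:
        return False
    return True

# Names to probe; the last one is deliberately bogus.
candidates = [
    "strict",
    "ignore",
    "replace",
    "xmlcharrefreplace",
    "backslashreplace",
    "surrogateescape",
    "no-such-handler",
]

status = {name: is_registered(name) for name in candidates}
for name, found in status.items():
    print(name, found)
```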

The codecs.register (as opposed to register_error) documentation does list it as a possible value, but that is the only reference.

The other 5 error handlers are also available as module-level functions within the codecs module, and have their own documentation sections within http://docs.python.org/dev/library/codecs.html
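
For reference, a small sketch of those module-level functions. Each one takes the exception describing the failure and returns a replacement plus the position at which to resume. The exception below is built by hand purely for illustration:

```python
import codecs

# The five handlers that are exposed as functions in the codecs module.
names = [
    "strict_errors",
    "ignore_errors",
    "replace_errors",
    "xmlcharrefreplace_errors",
    "backslashreplace_errors",
]
handlers = {name: getattr(codecs, name) for name in names}

# Hand-built error: U+00E9 at index 3 cannot be encoded as ASCII.
exc = UnicodeEncodeError("ascii", "caf\u00e9", 3, 4, "ordinal not in range(128)")

# Each handler returns (replacement, position to resume encoding at).
replaced = codecs.replace_errors(exc)
xml_ref = codecs.xmlcharrefreplace_errors(exc)
print(replaced, xml_ref)
```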

Neither help(open) nor import codecs; help(codecs) provides any hints of the existence of surrogateescape.  Both explicitly suggest that it does not exist, by enumerating other values.
History
Date                 User        Action  Args
2012-02-14 18:56:30  Jim.Jewett  set     recipients: + Jim.Jewett, ezio.melotti, docs@python
2012-02-14 18:56:30  Jim.Jewett  set     messageid: <1329245790.15.0.430373154575.issue14015@psf.upfronthosting.co.za>
2012-02-14 18:56:29  Jim.Jewett  link    issue14015 messages
2012-02-14 18:56:29  Jim.Jewett  create