Author lemburg
Date 2002-04-17.19:40:44
Content
Sorry for the late response.

About the difference between encoding and decoding: you shouldn't
just look at the case where you work with Unicode and strings.
Take, for example, the rot-13 codec, which works on strings only,
or other codecs which translate objects into strings and vice versa.
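
To illustrate the rot-13 point, here is a minimal sketch using the rot_13 codec as it ships with current Python 3, where it maps text to text and is reached through codecs.encode()/codecs.decode() rather than str.encode(). This shows today's behaviour, not necessarily the codec as it stood when this comment was written.

```python
import codecs

# rot_13 is a text-to-text codec: both input and output are str,
# so no Unicode <-> bytes conversion is involved at all.
scrambled = codecs.encode("Hello", "rot_13")
restored = codecs.decode(scrambled, "rot_13")

print(scrambled)
print(restored)
```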

Error handling has to be flexible enough to handle all these 
situations. Since the codecs know best how to handle the situations,
I'd make this an implementation detail of the codec and leave the
behaviour undefined in the general case.

For the existing codecs, backward compatibility should be
maintained if at all possible. If the patch gets overly complicated
because of this, we may have to provide a downgrade solution
for this particular problem (I don't think replace is used in any
computational context, though, since you can never be sure
how many replacement characters get inserted, so the case
may not be that realistic).

Raising an exception for the charmap codec is the right
way to go, IMHO. I would consider the current behaviour
a bug.
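
As a rough illustration of the "raise an exception" behaviour, the sketch below calls codecs.charmap_encode() directly with a made-up one-entry mapping. The mapping is purely hypothetical, and the output reflects how current CPython behaves, not the 2002 code under discussion.

```python
import codecs

# Hypothetical mapping for illustration: only "a" can be encoded.
mapping = {ord("a"): ord("x")}

# Every character is mapped, so encoding succeeds.
encoded, consumed = codecs.charmap_encode("aa", "strict", mapping)

# "b" has no entry in the mapping, so strict mode raises.
try:
    codecs.charmap_encode("ab", "strict", mapping)
except UnicodeEncodeError as exc:
    failed = exc.object[exc.start:exc.end]
else:
    failed = None

print(encoded, consumed, failed)
```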

For new codecs, I think we should suggest that replace
tries to collect as much illegal data as possible before
invoking the error handler. The handler should be aware
of the fact that it won't necessarily get all the broken data
in one call.
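
A minimal sketch of such a handler, written against the codecs.register_error() API as it exists in current Python. The handler name "hexmark" and its output format are made up for illustration. It processes whatever slice it is handed, so the result is the same whether the codec reports a run of bad characters in one call or in several.

```python
import codecs

def hexmark(exc):
    """Hypothetical handler: mark each unencodable character as <U+XXXX>."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # Only the slice [start:end] is known to be bad; more broken data
    # may follow and would arrive in a later call.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Tell the codec to resume right after the slice we handled.
    return replacement, exc.end

codecs.register_error("hexmark", hexmark)

result = "a\xe4\xf6b".encode("ascii", "hexmark")
print(result)
```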

About the codec error handling registry:
You seem to be using a Unicode-specific approach
here. I'd rather see a generic approach which uses
the API we discussed earlier. Would that be possible?
In that case, the codec API should probably be called
codecs.register_error('myhandler', myhandler).

Does that make sense?
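
For reference, a sketch of what the proposed call looks like with the codecs.register_error() function that current Python provides. The handler body is an assumption for illustration: it handles both directions by checking the exception type, and the replacement strings are arbitrary choices.

```python
import codecs

def myhandler(exc):
    """Illustrative handler that serves both encoding and decoding."""
    span = exc.end - exc.start
    if isinstance(exc, UnicodeEncodeError):
        # One "?" per unencodable character.
        return "?" * span, exc.end
    if isinstance(exc, UnicodeDecodeError):
        # One U+FFFD per undecodable byte.
        return "\ufffd" * span, exc.end
    raise exc

codecs.register_error('myhandler', myhandler)

encoded = "a\xe4b".encode("ascii", "myhandler")
decoded = b"a\xffb".decode("ascii", "myhandler")
print(encoded, decoded)
```

The registered callable can be fetched again by name with codecs.lookup_error('myhandler').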

BTW, the patch which uses the callback registry does not seem
to be available on this SF page (the last patch still converts
the errors argument to a PyObject, which shouldn't be needed
anymore with the new approach). Can you please upload your
latest version?

Note that the highlighting codec would make a nice example
for the new feature.

Thanks.
History
Date User Action Args
2007-08-23 15:06:07 admin link issue432401 messages
2007-08-23 15:06:07 admin create