Message64189
I don't see exactly what James is proposing.
> For my needs, I would like the decoding parts of the utf_8 module
> to treat an initial BOM as an optional signature and skip it if
> there is one (just like the utf_8_sig decoder). In fact I have
> a working patch that replaces the utf_8_sig decode,
> IncrementalDecoder and StreamReader components by direct
> transplants from utf_8_sig (as recently repaired -- there was a
> StreamReader error).
If you want a decoder that behaves like the utf-8-sig decoder, use the
utf-8-sig decoder. I don't see how changing the utf-8 decoder helps here.
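For reference, the difference between the two decoders can be sketched as follows. This is only an illustration, not code from the patch under discussion; the variable names are made up for the example.

```python
import codecs

# UTF-8 bytes for "abc" preceded by the BOM signature (EF BB BF).
data = codecs.BOM_UTF8 + b"abc"

# The plain utf-8 decoder passes the BOM through to the caller as U+FEFF.
as_utf8 = data.decode("utf-8")

# The utf-8-sig decoder treats a leading BOM as an optional signature
# and drops it.
as_utf8_sig = data.decode("utf-8-sig")

# Input without a BOM decodes the same way with utf-8-sig as with utf-8.
no_bom = b"abc".decode("utf-8-sig")
```

So a caller that wants to see the BOM picks utf-8, and a caller that wants it skipped picks utf-8-sig.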
> I can imagine there might be utf_8 client code out there which
> expects to see a leading U+feff as (perhaps) a clue that the
> output should be returned with a BOM-signature (say) to
> accommodate the guessed input requirements of the remote
> correspondent.
In this case use UTF-8: The leading BOM will be passed to the application.
> I can just live with code like
> if input[0] == u"\ufeff":
>     input = input[1:]
> spread around, and of course slightly different for incremental
> and stream inputs.
Can you post an example that requires this code?
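As a rough sketch of the incremental and stream cases, assuming the utf-8-sig codec is acceptable for the input, the BOM check does not have to be written by hand. The sample data and variable names below are illustrative only.

```python
import codecs
import io

# Sample input: a BOM followed by UTF-8 text containing a two-byte character.
data = codecs.BOM_UTF8 + u"h\u00e9llo".encode("utf-8")

# Incremental decoding: feed one byte at a time, so the BOM (and the
# two-byte character) are split across several decode() calls.
decoder = codecs.getincrementaldecoder("utf-8-sig")()
chunks = [data[i:i + 1] for i in range(len(data))]
incremental = u"".join(decoder.decode(chunk) for chunk in chunks)
incremental += decoder.decode(b"", final=True)

# Stream decoding: wrap a byte stream in the codec's StreamReader.
reader = codecs.getreader("utf-8-sig")(io.BytesIO(data))
streamed = reader.read()
```

In both cases the decoded text comes back without the leading U+FEFF.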
| Date | User | Action | Args |
| 2008-03-20 18:16:13 | doerwalter | set | spambayes_score: 0.163629 -> 0.163629 recipients: + doerwalter, jafo, jgsack, gagenellina, Rhamphoryncus |
| 2008-03-20 18:16:13 | doerwalter | set | spambayes_score: 0.163629 -> 0.163629 messageid: <1206036973.18.0.778841029058.issue1328@psf.upfronthosting.co.za> |
| 2008-03-20 18:16:12 | doerwalter | link | issue1328 messages |
| 2008-03-20 18:16:11 | doerwalter | create | |