
Author pitrou
Recipients Dmitry.Jemerov, giampaolo.rodola, ncoghlan, pitrou, r.david.murray
Date 2010-09-19.21:09:36
Message-id <1284930573.3205.13.camel@localhost.localdomain>
In-reply-to <1284929657.45.0.0473281470654.issue9360@psf.upfronthosting.co.za>
Content
> To make the distinction easier to remember, would it help if the
> methods that are currently set to return bytes instead accepted the
> typical encoding+errors parameters, with parallel *b APIs to get at
> the raw bytes?

Not really, no. For raw messages, which encoding+errors must be used
depends on the returned contents; it's not something the client can know
up front. Moreover, different parts of the returned bytes may need
decoding using different encodings (for example if there are several
MIME parts to the message). People should use the email package to parse
the raw messages, as I assume they already do in 2.x.
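
To illustrate why a single encoding+errors pair can't work for raw
messages, here is a minimal sketch using the email package. The message
bytes are made up for the example (they don't come from a real server),
and the two parts deliberately declare different charsets:

```python
import email

# Hypothetical raw article body, as bytes, with two MIME parts that
# declare different charsets (made-up data for illustration only).
raw = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="sep"\r\n'
    b"\r\n"
    b"--sep\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
    b"--sep\r\n"
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"na\xc3\xafve\r\n"
    b"--sep--\r\n"
)

msg = email.message_from_bytes(raw)

decoded_parts = []
for part in msg.walk():
    if part.is_multipart():
        continue  # skip the container, only leaf parts carry text
    payload = part.get_payload(decode=True)  # raw bytes of this part
    # Fall back to ASCII when the part declares no charset.
    charset = part.get_content_charset() or "ascii"
    decoded_parts.append(payload.decode(charset, errors="replace"))

for text in decoded_parts:
    print(text.strip())
```

Each part is decoded with the charset it declares itself, which is
information the client only has after parsing the returned bytes.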

Apart from raw message bodies, NNTP data has well-defined encodings and
that's why I can take and return unicode (although as stated, I also use
surrogateescape to be fault-tolerant in the face of broken servers).
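
As a rough sketch of what surrogateescape buys you here (plain str/bytes
methods, not the nntplib internals; the response line is invented and
UTF-8 is assumed as the expected encoding):

```python
# Hypothetical line from a broken server: it sends Latin-1 where
# UTF-8 is expected (made-up data for illustration only).
raw_line = b"Subject: caf\xe9"

# Decoding doesn't raise; the undecodable byte is smuggled through
# as a lone surrogate code point (U+DCE9 for byte 0xE9).
text = raw_line.decode("utf-8", errors="surrogateescape")

# Encoding with the same error handler restores the original bytes,
# so nothing is lost and the caller can retry with another encoding.
recovered = text.encode("utf-8", errors="surrogateescape")
fixed = recovered.decode("latin-1")
```

The point is that the str result still round-trips to the exact bytes
the server sent, so a caller who knows better can recover them.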

> My concern with the current API is that there isn't a clear indicator
> during normal programming as to which APIs return strings and which
> return the raw bytes and hence require further decoding.

That's a documentation issue. I haven't touched the docs yet :)