Author ncoghlan
Recipients eric.araujo, eric.smith, ncoghlan, pitrou, r.david.murray, vstinner
Date 2010-09-30.10:54:30
From a function *user* perspective, the latter API (bytes->bytes, str->str) is exactly what I'm doing.

Antoine's point is that there are two ways to achieve that:

Option 1 (what my patch currently does):
- provide bytes and str variants of all constants
- choose which set to use at the start of each function
- be careful never to index, only slice (even for single characters)
- a few other traps that I don't remember off the top of my head
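
A minimal sketch of what option 1 looks like in practice. The helper names (split_scheme, starts_with_slash) and the constants are hypothetical, made up for illustration and not taken from the patch:

```python
# Hypothetical constants: one str and one bytes variant of each.
_COLON_STR = ":"
_COLON_BYTES = b":"


def split_scheme(url):
    """Split 'scheme:rest' at the first colon, returning the input's own type."""
    # Choose which set of constants to use at the start of the function.
    colon = _COLON_BYTES if isinstance(url, bytes) else _COLON_STR
    empty = url[:0]  # "" or b"", matching the input type
    pos = url.find(colon)
    if pos <= 0:
        return empty, url
    return url[:pos], url[pos + 1:]


def starts_with_slash(path):
    slash = b"/" if isinstance(path, bytes) else "/"
    # Slice, never index: b"/x"[0] is the int 47, while b"/x"[:1] is b"/".
    return path[:1] == slash
```

The slicing rule is the easy one to forget, since path[0] == slash works for str and silently returns False for bytes.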

Option 2 (the alternative Antoine suggested and I'm considering):
- "decode" the ASCII-compatible bytes to str objects by treating them as nominally latin-1
- use the same str-based constants as are used to handle actual str inputs
- be able to index to your heart's content inside the algorithm
- *ensure* that any bytes-as-pseudo-str objects are "encoded" back to actual bytes before they are returned
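
A rough sketch of option 2, assuming a made-up helper (first_path_segment is hypothetical, not part of the patch). The algorithm in the middle only ever sees str, so indexing is safe, and the type handling lives at the two edges of the function:

```python
def first_path_segment(path):
    """Return the first non-empty segment of a '/'-separated path."""
    # Edge 1: parameter handling. latin-1 maps every byte to exactly one
    # code point, so the round trip back to bytes is lossless.
    is_bytes = isinstance(path, bytes)
    if is_bytes:
        path = path.decode("latin-1")

    # Str-only algorithm: free to index and compare against str constants.
    start = 0
    while start < len(path) and path[start] == "/":
        start += 1
    end = start
    while end < len(path) and path[end] != "/":
        end += 1
    result = path[start:end]

    # Edge 2: the return statement. Pseudo-str must become bytes again.
    if is_bytes:
        result = result.encode("latin-1")
    return result
```

Called with "/a/b" this returns "a", and called with b"//foo/bar" it returns b"foo", so the caller cannot tell that a str was used internally.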

From outside the function, a user shouldn't be able to tell which approach we're using internally.

The nice thing about option 2 is that, to make sure you're doing it correctly, you only need to check three kinds of location:
- the initial parameter handling in each function
- any return or raise statements that allow a value to leave the function
- any yield expressions (both input and output)

The effects of option 1 are scattered all over your algorithms, so it's hard to be sure you've caught everything.

The downside of option 2 is that if you make a mistake and let your bytes-as-pseudo-str objects escape from the confines of your function, you're going to see some very strange behaviour.
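
To illustrate the kind of strangeness I mean, here is a deliberately broken sketch (buggy_strip_slash is a hypothetical function written for this example) that forgets to encode on the way out:

```python
def buggy_strip_slash(path):
    if isinstance(path, bytes):
        path = path.decode("latin-1")
    # Bug: a bytes input leaves the function as a pseudo-str.
    return path.strip("/")


# The caller passes the UTF-8 encoding of "café" wrapped in slashes.
leaked = buggy_strip_slash(b"/caf\xc3\xa9/")

print(type(leaked).__name__)             # str, although bytes went in
print(leaked == "caf\u00e9")             # False: it is mojibake, not "café"
print(leaked.encode("latin-1") == b"caf\xc3\xa9")  # True: the data is intact
```

Nothing fails at the point of the mistake. The caller just gets a str that looks like text but only makes sense once it is encoded back as latin-1.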
Linked to issue9873