
Author ncoghlan
Recipients loewis, ncoghlan, skrah, teoliphant
Date 2012-08-16.01:04:08
Message-id <1345079050.28.0.028981423401.issue15625@psf.upfronthosting.co.za>
In-reply-to
Content
I admit that the main thing that bothers me about the proposal in PEP 3118 is the inconsistency: c -> bytes, while u, w -> str.

This was less of an issue in 2.x (which was the main frame of reference when the PEP was written), with implicit str/unicode interoperability, but seems quite jarring in the 3.x world.

Status quo:
struct module: 'c' = individual bytes, 's' = multi-byte sequence
array module: 'u' typecode may be either 2 bytes or 4 bytes (Py_UNICODE) (the addition of the 'w' typecode has been reverted)
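
To make the status quo concrete, here is a minimal sketch of the two behaviours described above. It assumes a CPython 3.x interpreter; the 'u' typecode is deprecated in later releases and may be missing altogether, so the array part is guarded and its result varies by platform.

```python
import array
import struct
import warnings

# struct 'c': one byte per field, each unpacked as a length-1 bytes object.
packed_c = struct.pack("3c", b"a", b"b", b"c")
fields_c = struct.unpack("3c", packed_c)

# struct 's' with a count: a single multi-byte field.
packed_s = struct.pack("3s", b"abc")
fields_s = struct.unpack("3s", packed_s)

# array 'u': item size follows the platform's wide character type.
# Guarded because the typecode is deprecated and may be removed.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    try:
        u_itemsize = array.array("u").itemsize
    except ValueError:
        u_itemsize = None  # 'u' typecode not available in this interpreter

print(fields_c, fields_s, u_itemsize)
```

Both format strings produce the same three bytes on the wire; only the shape of the unpacked result differs.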

My current inclination is still to apply Victor's patch from #13072 (which changes array to export the appropriate integer typecodes for 'u' arrays) and otherwise punt on this for 3.3 and try to sort out the mess for 3.4.
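
As a rough illustration of what "exporting the appropriate integer typecodes" could mean, the sketch below picks a native unsigned integer format code whose size matches a given item size. The helper name `unsigned_code_for` is hypothetical and this is not the patch from #13072, only an approximation of the idea.

```python
import struct

def unsigned_code_for(itemsize):
    """Hypothetical helper: return a native unsigned integer format
    code whose size in bytes equals itemsize."""
    for code in "BHILQ":
        if struct.calcsize(code) == itemsize:
            return code
    raise ValueError("no native unsigned integer of size %d" % itemsize)

# The two item sizes a 'u' array can have, per the status quo above.
code_for_2_bytes = unsigned_code_for(2)
code_for_4_bytes = unsigned_code_for(4)
print(code_for_2_bytes, code_for_4_bytes)
```

A consumer of the buffer would then see plain integers of a known width rather than a character code whose meaning depends on the build.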

For 3.4, I'm inclined to favour Stefan's proposal of C, U, W mapping to multi-code-point sequences of UCS-1, UCS-2 and UCS-4 code points respectively (with corresponding typecodes in the array module).
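
The fixed widths behind that proposal can be sketched with existing codecs. The C, U, W codes are only proposed and are not accepted by struct here; the codecs below are stand-ins, and utf-16-le matches UCS-2 only for code points in the Basic Multilingual Plane.

```python
# Two code points, both below U+0100 so every width can represent them.
text = "a\u00f1"

ucs1 = text.encode("latin-1")    # 1 byte per code point
ucs2 = text.encode("utf-16-le")  # 2 bytes per code point (BMP only)
ucs4 = text.encode("utf-32-le")  # 4 bytes per code point

widths = [len(data) // len(text) for data in (ucs1, ucs2, ucs4)]
print(widths)
```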

Support for lowercase 'u' would then never become an official part of the buffer API, existing only as an array typecode.
History
Date                 User      Action  Args
2012-08-16 01:04:10  ncoghlan  set     recipients: + ncoghlan, loewis, teoliphant, skrah
2012-08-16 01:04:10  ncoghlan  set     messageid: <1345079050.28.0.028981423401.issue15625@psf.upfronthosting.co.za>
2012-08-16 01:04:09  ncoghlan  link    issue15625 messages
2012-08-16 01:04:08  ncoghlan  create