
Author Rhamphoryncus
Recipients Rhamphoryncus, ajaksu2, amaury.forgeotdarc, benjamin.peterson, collinwinter, eric.smith, ezio.melotti, gvanrossum, jafo, jimjjewett, lemburg, orivej, pitrou, rhettinger, terry.reedy
Date 2010-01-10.19:52:11
Message-id <1263153135.09.0.0405786287299.issue1943@psf.upfronthosting.co.za>
In-reply-to
Content
Points against the subclassing argument:

* We have a null-termination invariant.  For byte strings this was part of the public API, and I'm not sure that's changed for unicode strings; aren't you arguing that we should maximize how much of our implementation is a public API?  The invariant prevents lazy slicing, since a slice that shares its parent's buffer generally isn't followed by a null.

* UTF-16 and UTF-32 are rarely used encodings, especially for longer strings (i.e. files).  For shorter strings (APIs) the unicode object overhead is more significant, and we'd need a way to tie the buffer's lifetime to that of the unicode object (hard to do).  For longer strings UTF-8 would be much more useful, but that's been shot down before.

* Subclassing unicode so you can change the meaning of the fields (i.e. allocating your own buffer) is a gross hack.  It relies far too much on fine details of the implementation and is fragile (what if you miss the dummy byte needed by fastsearch?).  Most of the possible options could, if they function correctly, be applied directly to the base type as a patch, so it's moot.

* If you dislike PyVarObject in general (I think the API is ugly too), you should argue for a general policy discouraging future use of it, not just get in the way of the one place where it's most appropriate.
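
To illustrate the null-termination point above, here is a rough sketch in plain C.  The lazy_slice type and both helpers are hypothetical names made up for this example; nothing here is CPython API.  It only shows why a slice that borrows its parent's buffer usually can't satisfy the invariant without copying.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lazy slice: borrows the parent's buffer instead of copying. */
typedef struct {
    const char *data;   /* points into the parent's buffer */
    size_t length;      /* number of bytes covered by the slice */
} lazy_slice;

/* Build a slice [start, stop) over a null-terminated parent buffer.
 * The caller must ensure stop does not exceed the parent's length. */
static lazy_slice make_lazy_slice(const char *parent, size_t start, size_t stop)
{
    assert(start <= stop);
    lazy_slice s = { parent + start, stop - start };
    return s;
}

/* Returns 1 if the byte just past the slice is '\0', which is what the
 * null-termination invariant would require of a real string object. */
static int slice_is_null_terminated(lazy_slice s)
{
    return s.data[s.length] == '\0';
}
```

A slice of the first five bytes of "hello world" is followed by a space, not a null, so it fails the check; only a slice that runs to the end of the parent passes.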

Terry: PyVarObjects would be much easier to subclass if the type object stored an offset to the beginning of the variable section, so it could be automatically recalculated for subclasses based on the size of the struct.  This would mean the PyBytesObject struct would no longer end with a char ob_sval[1].  The downside is a tiny bit more math when accessing the variable section (as the offset is no longer constant).
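
A minimal sketch of the offset idea, using simplified stand-in structs rather than CPython's real ones.  Every toy_* name and the var_offset field are assumptions invented for illustration; the actual type object has no such field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type object: records where the variable section starts. */
typedef struct toy_type {
    size_t var_offset;   /* bytes from object start to the variable section */
} toy_type;

typedef struct {
    const toy_type *type;
    size_t size;         /* number of bytes in the variable section */
} toy_var_head;

/* Base layout: fixed fields only, no trailing char ob_sval[1]. */
typedef struct {
    toy_var_head head;
    long hash;
} toy_bytes;

/* A subclass adds a field; only the offset stored in its type differs. */
typedef struct {
    toy_bytes base;
    int extra;
} toy_bytes_sub;

/* The offset is recalculated per type from the size of its struct. */
static const toy_type toy_bytes_type = { sizeof(toy_bytes) };
static const toy_type toy_bytes_sub_type = { sizeof(toy_bytes_sub) };

/* The extra math: one load from the type and one add per access. */
static char *toy_var_data(void *obj)
{
    toy_var_head *head = obj;
    return (char *)obj + head->type->var_offset;
}

/* Allocate fixed part plus variable part in one block and copy text in. */
static void *toy_new(const toy_type *type, const char *text)
{
    size_t n = strlen(text);
    toy_var_head *head = malloc(type->var_offset + n + 1);
    if (head == NULL)
        return NULL;
    head->type = type;
    head->size = n;
    memcpy(toy_var_data(head), text, n + 1);   /* includes the null */
    return head;
}
```

In this sketch the same toy_var_data() accessor works for both the base type and the subclass, which is the point of storing the offset in the type rather than baking it into the struct.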
History
Date | User | Action | Args
2010-01-10 19:52:15 | Rhamphoryncus | set | recipients: + Rhamphoryncus, lemburg, gvanrossum, collinwinter, rhettinger, terry.reedy, jafo, jimjjewett, amaury.forgeotdarc, pitrou, eric.smith, ajaksu2, benjamin.peterson, orivej, ezio.melotti
2010-01-10 19:52:15 | Rhamphoryncus | set | messageid: <1263153135.09.0.0405786287299.issue1943@psf.upfronthosting.co.za>
2010-01-10 19:52:13 | Rhamphoryncus | link | issue1943 messages
2010-01-10 19:52:11 | Rhamphoryncus | create |