Author josiahcarlson
Recipients
Date 2004-09-06.20:42:10
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to
Content
I believe there should be a mechanism to load and
unload arbitrarily large integers via the struct
module.  Currently, one would likely start with the 'Q'
format character, creating the integer in a block-wise
fashion with multiplies and shifts.

This is OK, though it tends to lend itself to certain
kinds of bugs.

There is currently another method for getting large
integers from strings and going back without the struct
module:

long(stri.encode('hex'), 16)
hex(inte)[2:].decode('hex')


Arguably, such things shouldn't be done for the packing
and unpacking of binary data in general (the string
slicing especially).


I propose a new format character for the struct module,
specifically because the struct module is to "Interpret
strings as packed binary data".  Perhaps 'g' and 'G'
(eg. biGint) is sufficient, though any reasonable
character should suffice.  Endianness should be
handled, and the number of bytes representing the
object would be the same as with the 's' formatting
code.  That is, '>60G' would be an unsigned big-endian
integer represented by 60 bytes (null filled if the
magnitude of the passed integer is not large enough).

The only reason why one wouldn't want this
functionality in the struct module is "This module
performs conversions between Python values and C
structs represented as Python strings." and arbitrarily
large integers are not traditionally part of a C struct
(though I am sure many of us have implemented arbitrary
precision integers with structs).  The reason "not a C
type" has been used to quash the 'bit' and 'nibble'
format character, because "masks and shifts" are able
to emulate them, and though "masks and shifts" could
also be used here, I have heard myself and others state
that there should be an easy method for converting
between large longs and strings.


A side-effect for allowing arbitrarily large integers
to be represented in this fashion is that its
functionality could, if desired, subsume the other
integer type characters, as well as fill in the gaps
for nonstandard size integers (3, 5, 6, 7 etc. byte
integers), that I (and I am sure others) have used in
various applications.


Currently no implementation exists, and I don't have
time to do one now.  Having taken a look at
longobject.c and structmodule.c, I would likely be able
to make a patch to the documentation, structmodule.c,
and test_struct.py around mid October, if this
functionality is desireable to others and accepted. 
While I doubt that a PEP for this is required, if
necessary I would write one up with a sample
implementation around mid October.
History
Date User Action Args
2007-08-23 16:08:21adminlinkissue1023290 messages
2007-08-23 16:08:21admincreate