classification
Title: int and float accept bytes, complex does not
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 3.0
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ggenellina, mark.dickinson, ncoghlan
Priority: normal
Keywords:

Created on 2008-03-25 16:22 by mark.dickinson, last changed 2008-04-16 18:36 by mark.dickinson. This issue is now closed.

Messages (6)
msg64494 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-03-25 16:21
In 3.0, the int and float constructors accept bytes instances as well as
strings:

>>> int(b'1')
1
>>> float(b'1')
1.0

but the complex constructor doesn't:

>>> complex(b'1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: complex() argument must be a string or a number

I'd suggest that at least one of these three results is a bug,
but I'm not sure which.

From a purity point of view, I think int() and float() shouldn't accept 
bytes.  Is this a case of practicality beats purity?  What are the
practical reasons to have int() and float() accept bytes?

Once this is resolved, the behaviors of Decimal and Fraction should also 
be considered.
msg64498 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-03-25 16:57
I have a couple of use cases for bytes-as-ASCII-text -> int, one of
which also touches on floats. The other numeric types also accepting
bytes as representing ASCII encoded strings would then follow from a
consistency of behaviour argument.

Use case 1: Decimal implementation

The simplest way to retain the 2.x series decimal performance in 3.0 is
to switch the mantissa storage from a string to a bytes object. This is
only possible if the int constructor accepts bytes objects and treats
them as an ASCII-encoded string.



Use case 2: Serial protocols with embedded ASCII text

I work with a lot of control protocols for different pieces of hardware,
and one way the hardware vendors avoid having to write a custom control
interface for their hardware is to make their serial interface human
readable (so a terminal program like Hyperterminal or Miniterm becomes
their user interface). Writing automated control software for these
devices is essentially an exercise in screen-scraping the ASCII strings
received on the serial port. Having to go through Unicode to convert
ASCII digits embedded in these strings to numbers would be a major pain.
While these numbers are mostly integers, you do get the occasional
floating point value turning up as well, so bytes->float can also be useful.
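
A minimal sketch of use case 2, assuming the int() and float() behaviour described at the top of this issue (both accept bytes holding ASCII text). The reply format, REPLY and parse_reply are hypothetical names made up for illustration, not part of any real device protocol:

```python
import re

# Hypothetical reply from a device with a human-readable serial interface.
REPLY = b"TEMP=23.5 COUNT=42\r\n"


def parse_reply(line: bytes) -> dict:
    """Pull NAME=number fields out of an ASCII reply received as bytes."""
    fields = {}
    for name, value in re.findall(rb"([A-Z]+)=([-+]?[0-9.]+)", line):
        # The numeric text is passed to int()/float() as bytes, with no
        # intermediate str object for the digits.
        number = float(value) if b"." in value else int(value)
        fields[name.decode("ascii")] = number
    return fields


result = parse_reply(REPLY)
print(result)
```

If int() and float() rejected bytes, the only change needed here would be to call value.decode("ascii") before converting.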
msg64503 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-03-25 18:28
For both these use cases, don't you still have a problem going
the other way, from an integer back to bytes? Is there an easier way
than

bytes(str(n), 'ascii')

?
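
For comparison, a short sketch of the integer-to-bytes direction. The first two spellings work in 3.0; the third relies on bytes %-formatting, which was only added later (Python 3.5, PEP 461), so it was not an option when this question was asked:

```python
n = 1234

# The spelling from the message above.
via_bytes_ctor = bytes(str(n), "ascii")

# Equivalent, using str.encode.
via_encode = str(n).encode("ascii")

# Assumes Python 3.5 or later: %-formatting on bytes (PEP 461).
via_percent = b"%d" % n

print(via_bytes_ctor, via_encode, via_percent)
```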
msg64542 - (view) Author: Gabriel Genellina (ggenellina) Date: 2008-03-26 06:36
Are numbers so special that they get to break the rules? Why stop here?
What about other types that may want to accept ASCII bytes instead of
characters? Isn't this like going back to the 2.x world?

The protocol with embedded ASCII numbers isn't a very convincing case
for me. One can read a binary integer in C using a single function call.
In Python 2.x this can't be done in a single call; one has to use
struct.unpack to decode the bytes read, and there were no complaints that
I know of. In 3.0 the same happens for ASCII numbers too: one will have
to decode them first. The conversion may look like a stupid step, but
it's as stupid as having to use struct.unpack to convert some bits to
the *same* bits inside the integer object.

Writing int(str(value,'ascii')) doesn't look so terrible.

And one may argue that int(b'1234') should return 0x34333231 instead of
1234; b'1234' is the binary representation of 0x34333231 in
little-endian format.
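
The two readings of b'1234' can be compared directly. This sketch uses struct.unpack, as mentioned above, and int.from_bytes, which did not exist at the time of this discussion (it was added in Python 3.2):

```python
import struct

raw = b"1234"  # the bytes 0x31 0x32 0x33 0x34

# Reading 1: the bytes are ASCII digits.
as_text = int(raw.decode("ascii"))

# Reading 2: the bytes are a 32-bit little-endian binary integer.
as_struct = struct.unpack("<I", raw)[0]
as_little = int.from_bytes(raw, "little")  # assumes Python 3.2 or later

print(as_text, hex(as_struct), hex(as_little))
```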
msg64543 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-03-26 07:00
Agreed - I've been convinced that the right thing to do is reject bytes
in int() and float() as well.

If we decide we still want to support a fast-path conversion it should
be via separate methods (e.g. an int.from_ascii class method and an
int.to_ascii instance method).
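
A rough sketch of what such methods could look like, written as an int subclass because int.from_ascii and int.to_ascii do not exist in Python. AsciiInt is a hypothetical name, and this version simply goes through str rather than providing any real fast path:

```python
class AsciiInt(int):
    """Hypothetical int subclass illustrating the proposed method pair."""

    @classmethod
    def from_ascii(cls, data: bytes) -> "AsciiInt":
        # decode("ascii") raises UnicodeDecodeError for any byte above 0x7F,
        # so only genuine ASCII text reaches the int constructor.
        return cls(data.decode("ascii"))

    def to_ascii(self) -> bytes:
        # Decimal digits and a leading '-' are always ASCII.
        return str(int(self)).encode("ascii")


value = AsciiInt.from_ascii(b"1234")
print(value + 1, value.to_ascii())
```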
msg65533 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-04-15 22:32
Closing this:  the consensus seems to be that things are fine as they 
are.

See the thread at

http://mail.python.org/pipermail/python-3000/2008-April/013100.html

for discussion.
History
Date                 User            Action  Args
2008-04-16 18:36:44  mark.dickinson  set     status: open -> closed
2008-04-15 22:32:58  mark.dickinson  set     resolution: not a bug; messages: + msg65533
2008-03-26 07:00:08  ncoghlan        set     messages: + msg64543
2008-03-26 06:37:00  ggenellina      set     nosy: + ggenellina; messages: + msg64542
2008-03-25 18:28:53  mark.dickinson  set     messages: + msg64503
2008-03-25 16:57:57  ncoghlan        set     nosy: + ncoghlan; messages: + msg64498
2008-03-25 16:22:00  mark.dickinson  create