Author lemburg
Recipients eric.smith, ezio.melotti, ggenellina, lemburg, loewis, mark.dickinson, pitrou
Date 2009-09-22.16:53:06
Martin v. Löwis wrote:
> Martin v. Löwis added the comment:
>> int()/float() use the decimal codec for numbers - this only supports
>> base-10 numbers. For hex numbers, we'd need a new hex codec (only
>> the encoder part, actually), otherwise, int('a') would start to return
>> 10.
> That's not true. PyUnicode_EncodeDecimal could happily accept hexdigits,
> and int(u'a') would still be rejected. In fact, PyUnicode_EncodeDecimal
> *already* accepts arbitrary Latin-1 characters, whether they represent
> digits or not. I suppose this is to support non-decimal bases, so it
> would only be consequential to widen this to all characters that
> reasonably have the Hex_Digit property (although I'm unsure which ones
> are excluded at the moment).

The codec currently doesn't look at the base at all - and shouldn't
need to:

It simply converts input characters that have a decimal digit value
associated with them, to the usual ASCII digits in preparation
for parsing them using the standard number parsing tools we have in

This is to support number representations that use non-ASCII code
points for digits (e.g. Japanese or Sanskrit numbers).

All other Latin-1 characters are passed through as-is, so you
can already use the codec to e.g. prepare parsing of hex

Also note that we already have a hex codec in Python 2.x
which converts between the hex representations of a string
and its regular form. This was removed in 3.x for some reason
I don't understand (probably just an oversight).
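
For reference, the kind of conversion that hex codec performs can be
sketched with the binascii module, which is available in both 2.x and 3.x.
Using binascii here is an assumption made for illustration; it is not a
claim about how the 2.x codec was spelled or implemented.

```python
import binascii

raw = b"abc"

# Regular form -> hex representation
hexed = binascii.hexlify(raw)
print(hexed)

# Hex representation -> regular form
restored = binascii.unhexlify(hexed)
print(restored)
```

This converts between a byte string and its hex representation. It has
nothing to do with parsing hex numbers via int().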
Date                 User     Action  Args
2009-09-22 16:53:08  lemburg  set     recipients: + lemburg, loewis, mark.dickinson, ggenellina, pitrou, eric.smith, ezio.melotti
2009-09-22 16:53:06  lemburg  link    issue6632 messages
2009-09-22 16:53:06  lemburg  create