Author belopolsky
Recipients belopolsky, eric.smith, ezio.melotti, lemburg, mark.dickinson, vstinner
Date 2010-11-28.17:34:02
Message-id <AANLkTi=GYQ0CnUogPoFO_N4P+LpfZhVMen6_rQLgt93y@mail.gmail.com>
In-reply-to <1290963609.51.0.300256939956.issue10557@psf.upfronthosting.co.za>
Content
On Sun, Nov 28, 2010 at 12:00 PM, Mark Dickinson <report@bugs.python.org> wrote:
>
> Mark Dickinson <dickinsm@gmail.com> added the comment:
>
> About Alexander's solution:  might it make more sense to have PyUnicode_EncodeDecimal raise
> for inputs like this?

No, I think PyOS_string_to_double() can generate better error messages
than PyUnicode_EncodeDecimal().  It is important to pass a losslessly
encoded string to PyOS_string_to_double() for proper error reporting.
Otherwise, we will have to catch the error in PyFloat_FromString()
just to add the string value to the message, and we may lose other
information such as the precise location of the offending character.
(AFAICT, we don't make use of it now, but this would be a meaningful
improvement.)
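
A rough Python-level sketch of the error-reporting idea, not the C
code under discussion: decimal digits are mapped to ASCII one for one,
every other character is left untouched, and the parser can then name
the offending character and its index.  The names to_ascii_digits()
and parse_float() are hypothetical, and the sketch ignores whitespace,
"inf" and "nan".

```python
import re
import unicodedata

# ASCII-only float syntax; re.ASCII keeps \d from matching non-ASCII digits.
_FLOAT_RE = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", re.ASCII)


def to_ascii_digits(text):
    """Map Unicode decimal digits to ASCII; keep all other characters as-is."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


def parse_float(text):
    """Parse text as a float, reporting where parsing stopped on failure."""
    normalized = to_ascii_digits(text)
    match = _FLOAT_RE.match(normalized)
    end = match.end() if match else 0
    if end != len(normalized):
        # The mapping is one character to one character, so `end` is also
        # a valid index into the original, unmodified text.
        raise ValueError(
            f"could not convert string to float: {text!r} "
            f"(unexpected {text[end]!r} at index {end})"
        )
    return float(normalized)
```

Because nothing is dropped or replaced along the way, the message can
quote the original input and point at the exact character that stopped
the parse.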

> I see it as PyUnicode_EncodeDecimal's job to turn the unicode input into usable ASCII
> (or raise an exception);  it looks like that's not happening here.

UTF-8 is quite usable by PyOS_string_to_double().  The UTF-8 encoder is
extremely fast and will only get faster over time.  In my opinion,
PyUnicode_EncodeDecimal() is either unnecessary or should be exposed
as a codec.
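
A small illustration of why UTF-8 input is safe for a byte-oriented
ASCII parser: every byte of a non-ASCII character is 0x80 or above, so
it can never be mistaken for a digit, sign or decimal point, and the
original text is still recoverable by decoding.  The helper name
split_at_first_non_ascii() is hypothetical.

```python
def split_at_first_non_ascii(text):
    """Encode text as UTF-8 and split it at the first non-ASCII byte.

    Returns (ascii_prefix, rest); rest is empty for pure-ASCII input.
    """
    data = text.encode("utf-8")
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return data[:offset], data[offset:]
    return data, b""


prefix, rest = split_at_first_non_ascii("1.5\u20ac")
# An ASCII parser would consume `prefix` and stop at the first byte of `rest`.
original = (prefix + rest).decode("utf-8")
```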
History
Date User Action Args
2010-11-28 17:34:07 belopolsky set recipients: + belopolsky, lemburg, mark.dickinson, vstinner, eric.smith, ezio.melotti
2010-11-28 17:34:03 belopolsky link issue10557 messages
2010-11-28 17:34:02 belopolsky create