Message 122688 - Python tracker

Author	belopolsky
Recipients	belopolsky, eric.smith, ezio.melotti, lemburg, mark.dickinson, vstinner
Date	2010-11-28.17:34:02
Message-id	<AANLkTi=GYQ0CnUogPoFO_N4P+LpfZhVMen6_rQLgt93y@mail.gmail.com>
In-reply-to	<1290963609.51.0.300256939956.issue10557@psf.upfronthosting.co.za>

Content

On Sun, Nov 28, 2010 at 12:00 PM, Mark Dickinson <report@bugs.python.org> wrote:
>
> Mark Dickinson <dickinsm@gmail.com> added the comment:
>
> About Alexander's solution:  might it make more sense to have PyUnicode_EncodeDecimal raise
> for inputs like this?

No, I think PyOS_string_to_double() can generate better error messages
than PyUnicode_EncodeDecimal().  It is important to pass a losslessly
encoded string to PyOS_string_to_double() for proper error reporting.
Otherwise, we will have to catch the error in PyFloat_FromString()
just to add the string value to the message, and we may lose other
information such as the precise location of the offending character.
(AFAICT, we don't make use of it now, but this would be a meaningful
improvement.)
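
To illustrate the error-reporting point, here is a rough Python sketch of
what a parser can report when it still has the original, losslessly
preserved string.  It is not CPython's C implementation: the helper name
parse_float_with_position() and the simplified float grammar (no "inf" or
"nan") are assumptions made for this example only.

```python
import re

# Simplified float grammar, ASCII digits only. This is an assumption for
# illustration and is not the grammar used by PyOS_string_to_double().
_FLOAT_PREFIX = re.compile(
    r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", re.ASCII
)


def parse_float_with_position(text):
    """Hypothetical helper: parse text or name the offending character."""
    lead = len(text) - len(text.lstrip())  # offset of first non-space char
    stripped = text.strip()
    if not stripped:
        raise ValueError(
            f"could not convert string to float: {text!r} (empty input)"
        )
    match = _FLOAT_PREFIX.match(stripped)
    end = match.end() if match else 0  # length of the longest valid prefix
    if end < len(stripped):
        # The original string is intact, so both the full value and the
        # exact position of the bad character can go into the message.
        raise ValueError(
            f"could not convert string to float: {text!r} "
            f"(offending character {stripped[end]!r} at index {lead + end})"
        )
    return float(stripped)
```

Had the non-ASCII characters been replaced or dropped before parsing, the
message could only show the mangled form, not the character the user typed.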

> I see it as PyUnicode_EncodeDecimal's job to turn the unicode input into usable ASCII
> (or raise an exception);  it looks like that's not happening here.

UTF-8 is quite usable by PyOS_string_to_double().  The UTF-8 encoder is
extremely fast and will only get faster over time.  In my opinion,
PyUnicode_EncodeDecimal() is either unnecessary or should be exposed
as a codec.
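
As a rough sketch of the lossless alternative, the following Python maps
Unicode decimal digits to their ASCII equivalents, leaves every other
character untouched, and encodes the result as UTF-8.  The function name
digits_to_ascii_utf8() is hypothetical, and this is only an assumption
about how such a conversion could look, not what CPython does.

```python
import unicodedata


def digits_to_ascii_utf8(text):
    """Hypothetical helper: ASCII-fy decimal digits, keep the rest as-is."""
    out = []
    for ch in text:
        # unicodedata.decimal() returns the digit value for any Unicode
        # decimal digit, or the supplied default for other characters.
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    # UTF-8 can represent every character, so nothing is lost here.
    return "".join(out).encode("utf-8")
```

With this approach, an input such as ARABIC-INDIC digits becomes plain
ASCII digits, while an invalid character survives in the byte string and
can still be reported by whatever parses it.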

History
Date	User	Action	Args
2010-11-28 17:34:07	belopolsky	set	recipients: + belopolsky, lemburg, mark.dickinson, vstinner, eric.smith, ezio.melotti
2010-11-28 17:34:03	belopolsky	link	issue10557 messages
2010-11-28 17:34:02	belopolsky	create