Message 115820 - Python tracker

Author	vstinner
Recipients	amaury.forgeotdarc, vstinner
Date	2010-09-07.23:21:46
Message-id	<1283901708.12.0.503970600634.issue9769@psf.upfronthosting.co.za>

Content

> PyUnicode_FromFormat("%s", text) expects a utf-8 buffer.

Really? I don't see how "*s++ = *f;" (where s is Py_UNICODE* and f is char*) can decode utf-8. It looks more like ISO-8859-1.
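To illustrate the point, here is a rough Python model of what a per-byte copy like "*s++ = *f;" does. The helper name widen_bytes is made up for this sketch, and it assumes each char is read as an unsigned value in the range 0-255 (a signed char in C could behave differently for bytes above 0x7F). Under that assumption the result matches ISO-8859-1 decoding, not utf-8 decoding:

```python
def widen_bytes(buf):
    """Hypothetical model of the C loop "*s++ = *f;".

    Each input byte becomes one code point with the same numeric value.
    Assumes bytes are treated as unsigned (0..255).
    """
    return "".join(chr(b) for b in buf)


# U+00E9 (e with acute accent) encoded as utf-8 is two bytes: 0xC3 0xA9.
data = "\u00e9".encode("utf-8")

widened = widen_bytes(data)

# The per-byte copy yields two characters, exactly like an ISO-8859-1 decode.
print(widened == data.decode("latin-1"))  # True
# A real utf-8 decoder would yield the single original character instead.
print(widened == data.decode("utf-8"))    # False
```

For pure ASCII input the two decodings agree, which is why the difference only shows up with bytes above 0x7F.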

> Very recently (r84472, r84485), some C files of CPython source code
> were converted to utf-8

Python source code (C and Python) is written in ASCII, except maybe some headers, or some tests written in Python with a #coding:xxx header (or without the header, but in UTF-8, for Python 3). I don't think any C file calls PyErr_Format() or PyUnicode_FromFormat(V)() with a non-ASCII format string.
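One way to check a claim like this is to scan source text for bytes outside the ASCII range. The following is only a sketch: non_ascii_lines is a hypothetical helper, and the sample buffer is invented for illustration, not taken from the CPython tree.

```python
def non_ascii_lines(source):
    """Return (line_number, line) pairs for lines containing a byte > 0x7F.

    `source` is the raw bytes of a file; line numbers start at 1.
    """
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), start=1)
        if any(byte > 0x7F for byte in line)
    ]


# Invented sample: an ASCII-only call followed by a comment with utf-8 bytes.
sample = b'PyErr_Format(exc, "bad value");\n/* caf\xc3\xa9 */\n'

for number, line in non_ascii_lines(sample):
    print(number, line)
```

Running it over the bytes of a real C file (read with open(path, "rb").read()) would show whether any format string in that file is non-ASCII.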

History
Date	User	Action	Args
2010-09-07 23:21:48	vstinner	set	recipients: + vstinner, amaury.forgeotdarc
2010-09-07 23:21:48	vstinner	set	messageid: <1283901708.12.0.503970600634.issue9769@psf.upfronthosting.co.za>
2010-09-07 23:21:46	vstinner	link	issue9769 messages
2010-09-07 23:21:46	vstinner	create