Message 78917 - Python tracker


Author	brett.cannon
Recipients	amaury.forgeotdarc, brett.cannon, sjmachin
Date	2009-01-03.01:13:44
Message-id	<1230945228.36.0.298587550738.issue4626@psf.upfronthosting.co.za>

Content
Here is what I have found out so far.
Python/bltinmodule.c:builtin_compile takes in a PyObject, gets the
char * representation of that object, and passes it to
Python/pythonrun.c:Py_CompileStringFlags. Unfortunately, no other
information is passed along in the call, including what the encoding
happens to be. That is a pity, because builtin_compile makes sure that
the char * data is encoded using the default encoding before calling
Py_CompileStringFlags, so the encoding is known but lost in the call.
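
To make the problem concrete, here is a small Python-level sketch of why a bare byte buffer with no encoding attached is ambiguous. It is only an analogy for the char * hand-off described above, not the C code itself; the sample text and the decode_with helper are made up for illustration.

```python
# Hypothetical illustration: the same text produces different bytes
# depending on the encoding, and the bytes alone do not say which one
# was used.
text = "caf\u00e9"

as_utf8 = text.encode("utf-8")      # two bytes for the accented letter
as_latin1 = text.encode("latin-1")  # one byte for the accented letter


def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes under an encoding the receiver has to guess."""
    return raw.decode(encoding)


# A receiver that guesses wrong silently gets different text.
misread = decode_with(as_utf8, "latin-1")
```

A receiver that is handed only the bytes cannot tell as_utf8 from as_latin1, which is the position the tokenizer is in when it receives the char * with nothing else.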

I just tried setting a PyCF flag to denote that the char* data is
encoded using the default encoding, but Parser/tokenizer.c is not
compiled against unicodeobject.c and thus one cannot use
PyUnicode_GetDefaultEncoding() to know what the data is stored as.

I'm going to try to explicitly convert to UTF-8 and see if that works.
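
As a rough sketch of the idea (again in Python rather than in the C code, and not the actual patch), fixing the encoding to UTF-8 at the boundary means the receiver never has to ask what the default encoding is. The helper name compile_as_utf8 is made up for illustration, and the sketch assumes Python 3, where bytes source without a coding declaration is read as UTF-8.

```python
def compile_as_utf8(source: str, filename: str = "<string>", mode: str = "exec"):
    """Hypothetical helper: always hand the compiler UTF-8 bytes.

    Encoding explicitly removes the dependency on a "default encoding"
    that the receiving side may not be able to look up.
    """
    raw = source.encode("utf-8")
    # Assumption: Python 3, where bytes source with no coding cookie
    # is decoded as UTF-8.
    return compile(raw, filename, mode)


namespace = {}
exec(compile_as_utf8("value = 'caf\u00e9'\n"), namespace)
```

After this runs, namespace["value"] holds the original text, because both sides agree on UTF-8 without any extra information being passed.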

History
Date	User	Action	Args
2009-01-03 01:13:48	brett.cannon	set	recipients: + brett.cannon, sjmachin, amaury.forgeotdarc
2009-01-03 01:13:48	brett.cannon	set	messageid: <1230945228.36.0.298587550738.issue4626@psf.upfronthosting.co.za>
2009-01-03 01:13:47	brett.cannon	link	issue4626 messages
2009-01-03 01:13:45	brett.cannon	create