Message 83356 - Python tracker


Author	jdwhitley
Recipients	georg.brandl, jaywalker, jdwhitley, pitrou, sjmachin, vstinner
Date	2009-03-09.05:11:29
Message-id	<1236575492.47.0.0523080771295.issue4847@psf.upfronthosting.co.za>

Content

Hi all,

This patch takes the approach of assuming UTF-8 encoding
for files opened in 'rb' (binary) mode.

That is:

1. Check whether each line is a unicode or a bytes object.
2. If it is bytes, get a char* reference to its internal buffer.
3. Use PyUnicode_FromString to create a new unicode object from the
   char*. This step assumes UTF-8.
4. Get a Py_UNICODE reference to the new unicode object's internal
   buffer and continue as before.
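
For illustration only, here is a rough Python-level sketch of the same
idea. It is not the C patch itself: the helper name iter_text_lines is
hypothetical, and an io.BytesIO object stands in for a file opened with
'rb'. It only shows the per-line check and the UTF-8 assumption before
the lines reach csv.reader.

```python
import csv
import io


def iter_text_lines(lines, encoding="utf-8"):
    """Yield str lines, decoding any bytes line with the assumed encoding.

    Hypothetical helper: mirrors steps 1-3 at the Python level.
    """
    for line in lines:
        # Step 1: is this line bytes or already text?
        if isinstance(line, bytes):
            # Steps 2-3: build a str from the raw bytes, assuming UTF-8.
            # Input that is not valid UTF-8 raises UnicodeDecodeError.
            line = line.decode(encoding)
        # Step 4: hand the text line on and continue as before.
        yield line


# A binary stream standing in for open(path, 'rb').
binary_file = io.BytesIO("name,city\r\nJos\u00e9,Z\u00fcrich\r\n".encode("utf-8"))
rows = list(csv.reader(iter_text_lines(binary_file)))
```

Lines that are already str pass through unchanged, so the same reader
code would handle files opened in either text or binary mode, provided
the binary data really is UTF-8.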

Is this in the right direction at all?

Cheers,

Jervis

History
Date	User	Action	Args
2009-03-09 05:11:33	jdwhitley	set	recipients: + jdwhitley, georg.brandl, sjmachin, pitrou, vstinner, jaywalker
2009-03-09 05:11:32	jdwhitley	set	messageid: <1236575492.47.0.0523080771295.issue4847@psf.upfronthosting.co.za>
2009-03-09 05:11:31	jdwhitley	link	issue4847 messages
2009-03-09 05:11:30	jdwhitley	create