
Author r.david.murray
Recipients Devin Jeanpierre, benjamin.peterson, petri.lehtinen, r.david.murray, tim.peters
Date 2011-06-24.14:16:25
SpamBayes Score 6.087747e-11
Marked as misclassified No
Message-id <1308924985.95.0.24293192739.issue11909@psf.upfronthosting.co.za>
In-reply-to
Content
I agree that having a unicode API for tokenize seems to make sense, and that would indeed require a separate issue.

That's a good point about doctest not otherwise supporting coding cookies.  Those really only apply to source files, so no doctest fragment ought to contain a coding cookie at the start, and your patch ought to be fine.  But I'm not familiar with the doctest internals, so having some tests to prove everything works would be great.

Your code could use the tokenize sniffer to make sure the fragment reads as utf-8 and raise an error otherwise.  But using a unicode interface to tokenize would probably be cleaner, since I suspect it would mimic what doctest otherwise does (ignore coding cookies).  But I don't *know* that, so your checking it would be appreciated.
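For reference, a minimal sketch of the sniffer approach described above, using tokenize.detect_encoding (the helper name check_fragment_is_utf8 is hypothetical, not anything in the patch):

```python
import io
import tokenize

def check_fragment_is_utf8(fragment_bytes):
    # Hypothetical helper: detect_encoding reads up to two lines looking
    # for a PEP 263 coding cookie or a UTF-8 BOM, defaulting to utf-8
    # when neither is present.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(fragment_bytes).readline)
    if encoding not in ("utf-8", "utf-8-sig"):
        raise ValueError(
            "doctest fragment declares a non-utf-8 encoding: %s" % encoding)
    return encoding
```

A fragment with no cookie passes (detect_encoding defaults to utf-8), while one starting with, say, a latin-1 cookie would raise.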
History
Date User Action Args
2011-06-24 14:16:26  r.david.murray  set  recipients: + r.david.murray, tim.peters, benjamin.peterson, Devin Jeanpierre, petri.lehtinen
2011-06-24 14:16:25  r.david.murray  set  messageid: <1308924985.95.0.24293192739.issue11909@psf.upfronthosting.co.za>
2011-06-24 14:16:25  r.david.murray  link  issue11909 messages
2011-06-24 14:16:25  r.david.murray  create