
Author lregebro
Recipients lregebro
Date 2010-04-20.15:41:11
Message-id <1271778073.88.0.928922878854.issue8471@psf.upfronthosting.co.za>
In-reply-to
Content
If a doctest example produces unicode output, SpoofOut's buf variable is silently converted to unicode. All subsequent output is then converted to unicode as well, and that conversion fails if the output contains non-ASCII characters.

That means that this doctest

    >>> print u'\xe9'.encode('utf-8')
    é

will work just fine, but this one

    >>> print u'abc'
    abc

    >>> print u'\xe9'.encode('utf-8')
    é

will fail on the second example.

The reason for this is that "resetting" the doctest output only does a truncate(0), so the buf variable stays unicode. I include tests and a patch that sets self.buf to '' if it is empty after being truncated. Other options are also possible, like changing the .truncate(0) to a .buf = '' (but that's ugly), or adding a reset() method to SpoofOut.
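
To make the mechanism concrete, here is a rough toy model of it, not the actual doctest code and not the attached patch. It is written to run on Python 3, so `bytes` stands in for Python 2 `str` and `str` stands in for Python 2 `unicode`. `ToyBuffer`, `py2_concat`, `run_examples` and the `reset_on_empty` flag are hypothetical names made up for this sketch, and the real StringIO internals are simplified to a single buf attribute.

```python
# Toy model only: bytes plays the role of Python 2 `str`,
# str plays the role of Python 2 `unicode`.

def py2_concat(left, right):
    """Concatenate roughly the way Python 2 did: when the two types are
    mixed, the byte string is implicitly decoded as ASCII."""
    if isinstance(left, bytes) and isinstance(right, bytes):
        return left + right
    if isinstance(left, bytes):
        left = left.decode("ascii")
    if isinstance(right, bytes):
        right = right.decode("ascii")  # raises on non-ASCII bytes
    return left + right


class ToyBuffer(object):
    """Hypothetical stand-in for SpoofOut, reduced to the buf attribute."""

    def __init__(self, reset_on_empty=False):
        self.buf = b""
        self.reset_on_empty = reset_on_empty

    def write(self, data):
        self.buf = py2_concat(self.buf, data)

    def truncate(self, size):
        # Slicing keeps the current type, so a "unicode" buf stays "unicode".
        self.buf = self.buf[:size]
        if self.reset_on_empty and not self.buf:
            # The idea behind the proposed fix: go back to a plain byte string.
            self.buf = b""


def run_examples(out):
    out.write("abc\n")                   # first example prints unicode
    out.truncate(0)                      # output is "reset" between examples
    out.write("\xe9".encode("utf-8"))    # next example prints UTF-8 bytes
    return out.buf
```

In this model, `run_examples(ToyBuffer())` raises UnicodeDecodeError on the last write, while `run_examples(ToyBuffer(reset_on_empty=True))` returns the UTF-8 bytes unchanged.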
History
Date                 User      Action  Args
2010-04-20 15:41:13  lregebro  set     recipients: + lregebro
2010-04-20 15:41:13  lregebro  set     messageid: <1271778073.88.0.928922878854.issue8471@psf.upfronthosting.co.za>
2010-04-20 15:41:12  lregebro  link    issue8471 messages
2010-04-20 15:41:11  lregebro  create