Issue 3955: maybe doctest doesn't understand unicode_literals?



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/48205

classification

Title:	maybe doctest doesn't understand unicode_literals?
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 2.6

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	Matthis Thorade, christoph, georg.brandl, mark, r.david.murray, tim.peters
Priority:	normal	Keywords:

Created on 2008-09-24 12:37 by mark, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
test.py	christoph, 2009-06-30 15:03	Test case revealing Unicode literal weakness

Messages (8)
msg73710 - (view)	Author: Mark Summerfield (mark) *	Date: 2008-09-24 12:37
    # This program works fine with Python 2.5 and 2.6:
    def f():
        """
        >>> f()
        'xyz'
        """
        return "xyz"

    if __name__ == "__main__":
        import doctest
        doctest.testmod()

But if you put the statement "from __future__ import unicode_literals" at the start then it fails:

    File "/tmp/test.py", line 5, in __main__.f
    Failed example:
        f()
    Expected:
        'xyz'
    Got:
        u'xyz'

I don't know if it is a bug or a feature but I didn't see any mention of it in the bugs or docs so thought I'd mention it.
msg73728 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-09-24 16:29
It certainly isn't a feature. I don't immediately see how to fix it, though. unicode_literals doesn't change the repr() of unicode objects (it obviously can't, since that change would not be module-local). Let's try to get a comment from Uncle Timmy...
msg89874 - (view)	Author: Christoph Burgmer (christoph)	Date: 2009-06-29 19:19
OutputChecker.check_output() seems to be responsible for comparing 'example.want' and 'got' literals and this is obviously done literally. So as "u'1'" is different to "'1'" this is reflected in the result. This gets more complicated with literals like "[u'1', u'2']" I believe.

So, eval() could be used for testing for equality:

    >>> repr(['1', '2']) == repr([u'1', u'2'])
    False

but

    >>> eval(repr(['1', '2'])) == eval(repr([u'1', u'2']))
    True

doctests are already compiled and executed, but evaluating the doctest code's result is probably a security issue, so a method doing the inverse of repr() could be used, that only works on variables; something like Pickle, but without its own protocol.
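
The "inverse of repr()" idea can be sketched with ast.literal_eval, which parses Python literals without executing arbitrary code. This is only an illustration under stated assumptions: LiteralOutputChecker is a hypothetical name, not part of doctest, and the fallback only helps when both the expected and the actual output are plain literals.

```python
import ast
import doctest


class LiteralOutputChecker(doctest.OutputChecker):
    """Hypothetical checker: if the textual comparison fails, compare parsed literals."""

    def check_output(self, want, got, optionflags):
        # First try doctest's normal textual comparison.
        if doctest.OutputChecker.check_output(self, want, got, optionflags):
            return True
        # Fall back to comparing the values the two texts denote.
        # literal_eval accepts only literals (strings, numbers, lists, ...).
        try:
            return ast.literal_eval(want.strip()) == ast.literal_eval(got.strip())
        except (ValueError, SyntaxError, TypeError):
            return False


want = "[u'1', u'2']\n"
got = "['1', '2']\n"

text_match = doctest.OutputChecker().check_output(want, got, 0)
literal_match = LiteralOutputChecker().check_output(want, got, 0)
print(text_match, literal_match)
```

Such a checker would have to be passed to a doctest.DocTestRunner explicitly via its checker argument; doctest.testmod() does not pick it up on its own.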
msg89927 - (view)	Author: Christoph Burgmer (christoph)	Date: 2009-06-30 15:03
This problem seems more severe as the appended test case shows. That gives me:

    Expected:
        u'ī'
    Got:
        u'\u012b'

Both literals are the same.

Unicode literals in doc strings are not treated as other escaped characters:

    >>> repr(r'\n')
    "'\\\\n'"
    >>> repr('\n')
    "'\\n'"

but:

    >>> repr(ur'\u012b')
    "u'\\u012b'"
    >>> repr(u'\u012b')
    "u'\\u012b'"

So there is no workaround in the docstring's reference itself. I file this here, even though the problems are not strictly equal. I do believe though that there is or should be a common solution to these issues. Both results need to be interpreted on a more abstract scale.
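
To illustrate "interpreted on a more abstract scale": the two output texts differ character by character but denote the same string value. A small sketch, assuming Python 3 and using ast.literal_eval as the interpreter of the output text (the variable names are made up for this example):

```python
import ast

# In Python 3 source, "\u012b" is the character itself,
# while "\\u012b" is the six-character escape sequence as text.
expected_text = "u'\u012b'"   # what the docstring contains
got_text = "u'\\u012b'"       # what repr() printed on Python 2

same_text = expected_text == got_text
same_value = ast.literal_eval(expected_text) == ast.literal_eval(got_text)
print(same_text, same_value)
```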
msg89997 - (view)	Author: Christoph Burgmer (christoph)	Date: 2009-07-01 20:25
JFTR: To yield the results of my last comment, you need to apply the patch posted in http://bugs.python.org/issue1293741
msg162577 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2012-06-10 02:23
I fail to see the problem here. If the module has "from __future__ import unicode_literals", then the docstring output clauses would need to be changed to reflect the fact that the input literals are now unicode. What am I missing?
msg162724 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2012-06-13 19:19
Yeah, I don't really remember now what my point was.
msg287530 - (view)	Author: Matthis Thorade (Matthis Thorade)	Date: 2017-02-10 13:10
I found this bug when trying to write a doctest that passes on Python 3.5 and Python 2.7.9. The following adapted example passes on Python 2, but fails on Python 3:

    # -- coding: utf-8 --
    from __future__ import unicode_literals

    def f():
        """
        >>> f()
        u'xyz'
        """
        return "xyz"

    if __name__ == "__main__":
        import doctest
        doctest.testmod()

I think a nice solution could be to add a new directive so that I can use the following:

    def myUnic():
        """
        This is a small demo that just returns a string.
        >>> myUnic()
        u'abc' # doctest: +ALLOW_UNICODE
        """
        return 'abc'

I asked the same question here: http://stackoverflow.com/questions/42158733/unicode-literals-and-doctest-in-python-2-7-and-python-3-5
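
A directive like this can be approximated today with doctest.register_optionflag and a custom output checker. The following is a rough sketch, not a stdlib feature: ALLOW_UNICODE is registered here by the example itself, UnicodeTolerantChecker and run_docstring are hypothetical names, and the regular expression is a simple heuristic that strips a u prefix in front of a quote. Note that doctest only parses directives on the source line, so the comment is placed after the call rather than after the expected output.

```python
import doctest
import re

# Register a custom option flag; this name is NOT built into doctest.
ALLOW_UNICODE = doctest.register_optionflag("ALLOW_UNICODE")

# Heuristic: a "u"/"U" prefix directly before a (possibly raw) string quote.
_U_PREFIX = re.compile(r"""(\W|^)[uU]([rR]?['"])""")


class UnicodeTolerantChecker(doctest.OutputChecker):
    """Hypothetical checker: ignores u'' prefixes when ALLOW_UNICODE is set."""

    def check_output(self, want, got, optionflags):
        if doctest.OutputChecker.check_output(self, want, got, optionflags):
            return True
        if not (optionflags & ALLOW_UNICODE):
            return False
        want = _U_PREFIX.sub(r"\1\2", want)
        got = _U_PREFIX.sub(r"\1\2", got)
        return doctest.OutputChecker.check_output(self, want, got, optionflags)


def my_unic():
    """
    >>> my_unic()  # doctest: +ALLOW_UNICODE
    u'abc'
    """
    return "abc"


def run_docstring(docstring):
    """Run one docstring's examples with the tolerant checker, discarding reports."""
    test = doctest.DocTestParser().get_doctest(
        docstring, {"my_unic": my_unic}, "sketch", "<sketch>", 0
    )
    runner = doctest.DocTestRunner(checker=UnicodeTolerantChecker(), verbose=False)
    return runner.run(test, out=lambda text: None)


with_flag = run_docstring(my_unic.__doc__)
without_flag = run_docstring(">>> my_unic()\nu'abc'\n")
print(with_flag, without_flag)
```

On Python 3 the first run should pass because the prefix is stripped from the expected output; the second run has no directive, so the checker behaves like the default one and reports a failure.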

History
Date	User	Action	Args
2022-04-11 14:56:39	admin	set	github: 48205
2017-02-10 13:10:47	Matthis Thorade	set	nosy: + Matthis Thorade messages: + msg287530
2012-06-13 19:19:46	georg.brandl	set	status: pending -> closed messages: + msg162724
2012-06-10 02:23:07	r.david.murray	set	status: open -> pending assignee: tim.peters -> nosy: + r.david.murray messages: + msg162577 resolution: not a bug stage: resolved
2009-07-01 20:25:04	christoph	set	messages: + msg89997
2009-06-30 15:03:19	christoph	set	files: + test.py messages: + msg89927
2009-06-29 19:19:40	christoph	set	nosy: + christoph messages: + msg89874
2008-09-24 16:29:09	georg.brandl	set	assignee: tim.peters messages: + msg73728 nosy: + georg.brandl, tim.peters
2008-09-24 12:37:20	mark	create