classification
Title: Improve explanation of tab expansion in doctests
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: r.david.murray, rbp, techtonik
Priority: normal Keywords: patch

Created on 2009-12-27 16:50 by techtonik, last changed 2010-06-01 01:49 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
doctabtest.py techtonik, 2009-12-27 16:50
issue7583.doctest.tabs.diff techtonik, 2010-04-01 12:59 patch
doctest-tabs-doc.patch r.david.murray, 2010-05-06 00:54
Messages (12)
msg96914 - (view) Author: anatoly techtonik (techtonik) Date: 2009-12-27 16:50
Since 2.4, doctest expands all tabs in the test source to spaces, using 
8-column tab stops. It should do the same with the output it receives for 
comparison. Right now there is no way to write a correct doctest if the 
output includes a tab character. See the attached doctabtest.py.

This change would be backwards compatible, because all tests with tabs 
either fail or are run with the doctest.NORMALIZE_WHITESPACE flag, which is 
usually inappropriate for doctest cases that involve tab formatting.
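
A minimal sketch of the behavior described above, assuming a current Python 3 (the attached doctabtest.py is not reproduced here). The names TEST_TEXT, run, and "tab_demo" are made up for illustration. The example prints a real tab at run time, while the expected output is written with spaces, the way an editor would display it:

```python
import doctest

# Doctest text: the example emits a real tab when executed, but the
# expected output below it is written with spaces instead of a tab.
TEST_TEXT = r'''
>>> print("a\tb")
a       b
'''

def run(optionflags=0):
    """Run TEST_TEXT once and return doctest's TestResults tuple."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(TEST_TEXT, {}, "tab_demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report so the sketch stays quiet.
    return runner.run(test, out=lambda text: None)

strict = run()
relaxed = run(doctest.NORMALIZE_WHITESPACE)
print(strict.failed, relaxed.failed)
```

The strict run should report one failure, because the tab in the actual output is compared as-is against the spaces in the expected output. The NORMALIZE_WHITESPACE run should pass, at the cost of no longer checking the tab formatting at all.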
msg102048 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-01 00:02
http://codereview.appspot.com/848043/show
msg102096 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-01 14:51
Removed [patch] from title as patch is set on the keywords.  Removed 2.5 from versions because it is in security fix only mode (we use versions for where things will be fixed, not where they are broken).  Changed component to Library as this is not a bug in the tests themselves but in a library package, doctest.

I didn't set stage to patch review only because I'd like other opinions on the root issue.  Normalizing tabs to 8 spaces sounds wrong to me (including in the source).
msg102097 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-01 14:55
http://docs.python.org/library/doctest.html#how-are-docstring-examples-recognized

"Changed in version 2.4: Expanding tabs to spaces is new; previous versions tried to preserve hard tabs, with confusing results"

Unfortunately, no examples of those confusing results survive to show what was wrong.
msg102317 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-04 03:30
I think I can see how it would cause confusion, and in general it shouldn't be necessary to use real tabs in a doctest.  So as you say the output should be fixed to match.

However, I don't think the patch is quite correct.  It looks to me like the expandtabs call should be made unconditionally on the output string just like it is on the string pulled from the file.  The reason is that the tabs in the output should be expanded relative to the start of the output line.  The indentation is meaningless in that context.  The way you did it, the output in your test appears to be indented incorrectly.  If the expandtabs were done on the raw output string, it would be indented eight spaces from the start of the test indent, and that would look correct.

Your change of 'iff' to 'if' is also most likely incorrect.  I'm pretty sure that 'iff' is intentional (it is an abbreviation for "if and only if".)
msg102322 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-04 07:03
Having thought about it some more, I see why you did the patch the way you did.

The fact that there are two completely different ways to expand tabs in the output that are equally valid and have their advantages and disadvantages makes me wonder if this should be fixed at all.  Perhaps it is better to just say that you can only handle tabs in output by ignoring whitespace.
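
To illustrate the two ways, here is a small sketch using only str.expandtabs. It is one reading of the discussion, not the code from the patch; the 4-space INDENT is an assumption standing in for the docstring indentation.

```python
# Output produced by the code under test: a real tab between two letters.
raw_output = "a\tb\n"

# Hypothetical 4-space indent, standing in for the docstring indentation.
INDENT = "    "

# Way 1: expand tabs relative to the start of the output line.
flush = raw_output.expandtabs()

# Way 2: indent the output first, expand tabs, then strip the indent again.
indented = (INDENT + raw_output).expandtabs()[len(INDENT):]

print(repr(flush))     # 'a' followed by 7 spaces, then 'b\n'
print(repr(indented))  # 'a' followed by 3 spaces, then 'b\n'
```

The same tab expands to a different number of spaces depending on the column it starts in, so the two approaches need different sample output in the docstring.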
msg102378 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-05 15:37
Could you be more specific about why users should not be allowed to use tabs in docstrings? An example use case/user story would help me a lot.

For performance reasons, I added a precondition that checks for the presence of tabs before expanding them.

Indenting the output before tab expansion is necessary to format the output correctly and to show the result when a test fails.

I agree to close this bug as "won't fix" if we can clear the confusion by describing the situation in more detail and providing examples.
msg104253 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-26 17:22
The problem is that it would be equally reasonable for someone to want to put real tab characters in the output section (which results in strange looking doctest text) or to put expanded spaces in the doctest output section based on the assumption that output starts in the column under the first > of the >>> and that doctest will expand the tabs in the output it receives from the executed test using a tabstop of 8.  Neither of these is a good solution (the first gives you messed up looking doctests, the second means the output you document isn't really the output the test gives).  The second solution would also mean a significant rewrite of the doctest processing loop.  So the best course, since doctests are *primarily* about documentation, is to allow tabs in the output only with whitespace normalization.  (If you really want to test for the presence of tabs in the output, as opposed to just creating documentation, you can capture the output and test it using string comparison.)
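
A rough sketch of the "capture the output and test it using string comparison" approach, assuming Python 3.4 or later for contextlib.redirect_stdout. The names emit_row and captured_output are hypothetical helpers, not part of doctest:

```python
import contextlib
import io

def emit_row():
    """Hypothetical function under test: prints two tab-separated fields."""
    print("name\tvalue")

def captured_output(func):
    """Call func() and return everything it wrote to sys.stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()

output = captured_output(emit_row)
# The comparison is done on the string itself, so the tab is checked exactly.
assert output == "name\tvalue\n"
assert "\t" in output
```

Inside a doctest, the same idea reduces to an example that evaluates the comparison and expects True, which avoids putting a literal tab in the expected output.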

Personally I think the 'too bad' language in the docs is not really appropriate, so if you can think of a succinct way to document the above, I'll see about getting it into the docs.
msg105057 - (view) Author: Rodrigo Bernardo Pimentel (rbp) (Python committer) Date: 2010-05-05 16:43
I've just been bitten by this, and I agree the language in the docs is very inappropriate (made me angry for a minute :)). 

One suggestion: "While not everyone might believe tabs should mean that, doctests are primarily aimed at documentation, and, since it's very hard to get tabs to look consistent, keeping hard tabs would be potentially confusing. If you absolutely need to test for the presence of tabs at the output, you can capture it and use string comparison, or write your own DocTestParser class."
msg105117 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-05-06 00:54
I tried your suggestion, but it seemed to me that it made the first paragraph of that section be all about tabs, and get even farther away from its original focus, which was introducing the example.

I've attached a patch that instead moves the entire discussion of tabs to the 'fine print' section, where it arguably belongs anyway, and expands the discussion somewhat.  Feedback welcome.
msg105146 - (view) Author: anatoly techtonik (techtonik) Date: 2010-05-06 13:43
Sorry for not being able to follow up on this issue. I believe we need to expand on the problem of handling tabs in doctests with use cases, and expose the problem outside this issue tracker item. I still remember that at some point I had a patch somewhere that allowed tabs to be used in doctests without too much explanation. I can't find it in this tracker, though.
msg106822 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-06-01 01:49
In the absence of feedback about the doc patch, I have applied it in r81634.

@techtonik: if I recall correctly I explained in your issue that had the patch what the problem was.  Short summary: there are two equally valid ways in which tabs in the output and the spacing in the sample output could be reconciled; both have their problems.  Therefore it is better to have the user handle it explicitly one way or another.
History
Date User Action Args
2010-06-01 01:49:15 r.david.murray set status: open -> closed
resolution: fixed
messages: + msg106822
stage: patch review -> resolved
2010-05-06 13:43:43 techtonik set messages: + msg105146
2010-05-06 00:54:23 r.david.murray set files: + doctest-tabs-doc.patch
type: enhancement -> behavior
messages: + msg105117
stage: needs patch -> patch review
2010-05-05 16:43:52 rbp set nosy: + rbp
messages: + msg105057
2010-04-26 17:22:44 r.david.murray set title: doctest should normalize tabs when comparing output -> Improve explanation of tab expansion in doctests
nosy: techtonik, r.david.murray
messages: + msg104253
components: + Documentation, - Library (Lib)
type: behavior -> enhancement
stage: patch review -> needs patch
2010-04-07 17:53:27 r.david.murray unlink issue7585 dependencies
2010-04-05 15:37:39 techtonik set messages: + msg102378
2010-04-04 07:03:05 r.david.murray set resolution: accepted -> (no value)
messages: + msg102322
2010-04-04 03:30:33 r.david.murray set assignee: r.david.murray
resolution: accepted
messages: + msg102317
stage: patch review
2010-04-01 14:57:16 r.david.murray link issue7585 dependencies
2010-04-01 14:55:33 techtonik set messages: + msg102097
2010-04-01 14:51:50 r.david.murray set priority: normal
type: behavior
components: + Library (Lib), - Tests
title: [patch] doctest should normalize tabs when comparing output -> doctest should normalize tabs when comparing output
nosy: + r.david.murray
versions: - Python 2.5
messages: + msg102096
2010-04-01 13:00:20 techtonik set files: - issue7583.doctest.tabs.diff
2010-04-01 12:59:56 techtonik set files: + issue7583.doctest.tabs.diff
2010-04-01 00:02:39 techtonik set messages: + msg102048
2010-03-31 23:36:49 techtonik set title: doctest should normalize tabs when comparing output -> [patch] doctest should normalize tabs when comparing output
2010-03-31 23:32:10 techtonik set files: + issue7583.doctest.tabs.diff
keywords: + patch
2009-12-27 16:50:30 techtonik create