Author chris.jerdonek
Recipients chris.jerdonek
Date 2012-07-31.01:26:17
Message-id <1343697979.41.0.629576632811.issue15510@psf.upfronthosting.co.za>
Content
While working on issue 1859, I found that textwrap.wrap() returns an empty list when passed the empty string:

>>> from textwrap import wrap
>>> wrap('')
[]

rather than a list containing the empty string, which is what I expected:

['']

I originally accepted this as intended behavior, but for the following reasons I now think it is a bug that we should consider fixing:

1. The textwrap documentation says that wrap() "Wraps the single paragraph in text (a string) so every line is at most width characters long. Returns a list of output lines, without final newlines." The empty string is shorter than width characters, so according to the documentation, wrapping should leave it unchanged and return it as the only output line.

2. It is known that wrap() is broken for strings containing newlines (i.e. strings containing more than one paragraph). Indeed, this is the problem that issue 1859 is meant to fix.

The commonly recommended workaround is to call splitlines() on the incoming string, pass each piece to wrap(), and then join the returned lines on newlines.

However, the behavior described in this issue prevents this workaround from behaving sensibly when the original string contains blank lines between paragraphs. See this message, for example:

http://bugs.python.org/issue1859#msg166627

Currently, this workaround returns "a\nb" for both "a\nb" and "a\n\nb", when common sense says it should preserve the paragraph break and return "a\nb" and "a\n\nb", respectively.

3. In addition, the behavior leads to the following inconsistency:

>>> repr(wrap('  ', drop_whitespace=False))
"['  ']"
>>> repr(wrap('  ', drop_whitespace=True))
'[]'

The documentation says this about drop_whitespace: "If true, whitespace that, after wrapping, happens to end up at the beginning or end of a line is dropped." If the first case is correct, then the second case should at least return a non-empty list, because a list containing the empty string is what results from dropping the whitespace from the string '  ' in the first list. This is how drop_whitespace already behaves when the string contains non-whitespace characters:

>>> repr(wrap('a  ', drop_whitespace=False))
"['a  ']"
>>> repr(wrap('a  ', drop_whitespace=True))
"['a']"

4. There is no unit test for the case of the empty string, and existing tests still pass after fixing the issue.
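To make point 2 concrete, here is a minimal sketch of the splitlines() workaround. The function name wrap_paragraphs is made up for illustration, and collecting the wrapped lines into one list before joining is only one plausible reading of the recommendation. It assumes an interpreter on which wrap('') returns [] as shown above.

```python
from textwrap import wrap


def wrap_paragraphs(text, width=70):
    """Hypothetical helper: wrap each line of text separately, then rejoin."""
    lines = []
    for paragraph in text.splitlines():
        # wrap('') contributes no lines at all, so a blank
        # line between paragraphs silently disappears here.
        lines.extend(wrap(paragraph, width=width))
    return "\n".join(lines)


print(repr(wrap_paragraphs("a\nb")))
print(repr(wrap_paragraphs("a\n\nb")))
```

While wrap('') returns [], both calls produce the same string, so the paragraph break in the second input is lost.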

If we cannot correct this behavior, then I feel we should at least document the inconsistent behavior, and then work around it in the fix for issue 1859.
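One possible shape for such a workaround is sketched below. Both helper names (wrap_or_blank and wrap_paragraphs) are hypothetical and are not part of textwrap or of any proposed patch; the sketch only assumes that an empty result from wrap() should be treated as a single empty line.

```python
from textwrap import wrap


def wrap_or_blank(text, **kwargs):
    """Hypothetical helper: like wrap(), but never returns an empty list."""
    # An empty list is falsy, so fall back to one empty output line.
    return wrap(text, **kwargs) or ['']


def wrap_paragraphs(text, **kwargs):
    """Hypothetical helper: wrap each line separately, keeping blank lines."""
    lines = []
    for paragraph in text.splitlines():
        lines.extend(wrap_or_blank(paragraph, **kwargs))
    return "\n".join(lines)


print(repr(wrap_or_blank('')))
print(repr(wrap_paragraphs("a\n\nb")))
```

Because the fallback applies only when wrap() returns nothing, input that contains non-whitespace text is wrapped exactly as before.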

I am marking this issue as one to be resolved, either way, before issue 1859 is fixed. I am happy to prepare the patch.