classification
Title: TextWrapper fails to split 'two-and-a-half-hour' correctly
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: samwyse, serhiy.storchaka
Priority: normal
Keywords:

Created on 2015-11-29 02:07 by samwyse, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg255558 - Author: Samwyse (samwyse) Date: 2015-11-29 02:07
Single-character words in a hyphenated phrase are not split correctly.  The root issue is the wordsep_re class variable.  To reproduce, run the following:

>>> import textwrap
>>> textwrap.TextWrapper.wordsep_re.split('two-and-a-half-hour')
['', 'two-', 'and-a', '-half-', 'hour']

It works if 'a' is replaced with two or more alphabetic characters.

>>> textwrap.TextWrapper.wordsep_re.split('two-and-aa-half-hour')
['', 'two-', '', 'and-', '', 'aa-', '', 'half-', 'hour']

The problem is in this part of the pattern:  (?=\w+[^0-9\W])

I confess that I don't understand what situation would require so complicated a pattern.  Why wouldn't (?=\w) work?
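
To make the lookahead question concrete, here is a minimal sketch that compares the two lookaheads on the phrase from the report. It does not use the real wordsep_re: the patterns REPORTED and SIMPLER and the helper chunks() are hypothetical stand-ins, and everything in them other than the quoted lookahead is an assumption, chosen so that the first pattern reproduces the split shown above.

```python
import re

# Assumption: a simplified stand-in for the hyphenated-word branch of
# wordsep_re. Only the lookahead (?=\w+[^0-9\W]) is quoted from the report;
# the rest is reconstructed so the output matches the reported split.
REPORTED = re.compile(r'([^\s\w]*\w+[^0-9\W]-(?=\w+[^0-9\W]))')

# The same stand-in with the simpler lookahead suggested in the report.
SIMPLER = re.compile(r'([^\s\w]*\w+[^0-9\W]-(?=\w))')


def chunks(pattern, text):
    """Split text on pattern and drop the empty strings re.split() leaves."""
    return [c for c in pattern.split(text) if c]


text = 'two-and-a-half-hour'
reported_chunks = chunks(REPORTED, text)
simpler_chunks = chunks(SIMPLER, text)
print(reported_chunks)  # ['two-', 'and-a', '-half-', 'hour']
print(simpler_chunks)   # ['two-', 'and-', 'a', '-half-', 'hour']
```

In this sketch the simpler lookahead does allow a break after 'and-', but 'a' is still separated from its hyphen. The part of the pattern before the hyphen, \w+[^0-9\W], also needs at least two word characters, so changing the lookahead alone would not be enough to produce 'a-' as a chunk.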
msg255561 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-11-29 06:20
Already fixed in issue22687.
History
Date                 User              Action  Args
2022-04-11 14:58:24  admin             set     github: 69946
2015-11-29 06:20:55  serhiy.storchaka  set     status: open -> closed; nosy: + serhiy.storchaka; messages: + msg255561; resolution: out of date; stage: resolved
2015-11-29 02:07:24  samwyse           create