Message398836
I think the fix to make `drop_whitespace=False` stable can be as simple as adding two lines in `_munge_whitespace()`:
+ text = re.sub(r' \n', ' ', text)
+ text = re.sub(r'\n ', ' ', text)
text = text.translate(self.unicode_whitespace_trans)
The perf impact is not small though, 12%:
2892 (~/opensource/cpython) % ./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\nbar baz", 5)'
5000 loops, best of 5: 60.2 usec per loop
2893 (~/opensource/cpython) % r
./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\nbar baz", 5)'
5000 loops, best of 5: 52.9 usec per loop
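For anyone who wants to repeat the measurement from Python rather than the shell, here is a rough equivalent using the `timeit` module. The helper name `time_wrap` and the loop counts are my own choices, not part of any patch, and the absolute numbers will differ per machine and build:

```python
import timeit


def time_wrap(number=5000, repeat=5):
    """Return the best per-call time of textwrap.wrap() in microseconds."""
    timer = timeit.Timer(
        # Raw string so the timed statement contains the "\n" escape,
        # exactly like the -mtimeit command line does.
        stmt=r'textwrap.wrap("abc foo\nbar baz", 5)',
        setup="import textwrap",
    )
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


if __name__ == "__main__":
    print(f"{time_wrap():.1f} usec per loop")
```

Run it once on an unpatched build and once on a patched one, then compare the two figures.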
I don't know if it's worth doing, but if yes, the options are:
- Apply this change only when `drop_whitespace=False`, which is not the default, so the perf regression will not affect the default usage of `wrap()`.
- Add a new arg that only has an effect when `drop_whitespace=False` and runs these 2 lines. The name could be something like `collapse_space_newline`. It's hard to think of a good name.
If '\r\n' is handled, it needs one additional `sub()` line, and the perf difference is 22%.
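For illustration, the first option could be sketched as a subclass rather than a patch to `textwrap` itself. `StableWrapper` and the helper `collapse_space_newline()` are made-up names, and overriding the private `_munge_whitespace()` is only for demonstration; it is not a supported extension point:

```python
import re
import textwrap


def collapse_space_newline(text):
    """Apply the two proposed substitutions (hypothetical helper)."""
    text = re.sub(r' \n', ' ', text)  # space followed by newline -> one space
    text = re.sub(r'\n ', ' ', text)  # newline followed by space -> one space
    return text


class StableWrapper(textwrap.TextWrapper):
    """Sketch of option 1: only pay the cost when drop_whitespace=False."""

    def _munge_whitespace(self, text):
        if not self.drop_whitespace:
            text = collapse_space_newline(text)
        # The stock implementation then expands tabs and translates
        # the remaining whitespace characters to spaces.
        return super()._munge_whitespace(text)


if __name__ == "__main__":
    wrapper = StableWrapper(width=5, drop_whitespace=False)
    print(repr(wrapper._munge_whitespace("abc foo \nbar")))
```

With `drop_whitespace=False`, `"abc foo \nbar"` is munged to `"abc foo bar"` (one space) instead of `"abc foo  bar"` (two spaces). The default `drop_whitespace=True` path is left untouched.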
Date | User | Action | Args
2021-08-03 16:04:21 | andrei.avk | set | recipients: + andrei.avk, larry, serhiy.storchaka, mdk, cheryl.sabella
2021-08-03 16:04:21 | andrei.avk | set | messageid: <1628006661.01.0.0697169356812.issue32397@roundup.psfhosted.org>
2021-08-03 16:04:21 | andrei.avk | link | issue32397 messages
2021-08-03 16:04:20 | andrei.avk | create |