Message 398846 - Python tracker


Author	andrei.avk
Recipients	andrei.avk, cheryl.sabella, iritkatriel, larry, mdk, serhiy.storchaka
Date	2021-08-03.17:53:10
Message-id	<1628013192.27.0.601030918266.issue32397@roundup.psfhosted.org>
In-reply-to

Content
Irit: I assume you mean r' \r?\n'. That's a great idea; it's much faster than adding a separate replacement step.

The latest version I came up with is this:

    if re.search(r' \r?\n', text):
        text = re.sub(r' \r?\n', ' ', text)
    if re.search(r'\r?\n ', text):
        text = re.sub(r'\r?\n ', ' ', text)

This optimizes the case where there are no newlines, which is likely the most common case for small fragments of text. It may be the less common case for larger fragments, where performance matters more, so I'm not sure it's worth it.
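To experiment with the snippet above outside of textwrap, here is a minimal standalone sketch. The function name collapse_space_adjacent_newlines is hypothetical and is not part of textwrap or of any patch in this issue; only the two regular expressions and the search-then-sub guard come from the message.

```python
import re


def collapse_space_adjacent_newlines(text):
    """Replace a newline that touches a space with a single space.

    Hypothetical helper for illustration only; it wraps the guarded
    search()/sub() pairs quoted in the message above.
    """
    # Space followed by a newline (LF or CRLF): skip sub() if nothing matches.
    if re.search(r' \r?\n', text):
        text = re.sub(r' \r?\n', ' ', text)
    # Newline (LF or CRLF) followed by a space.
    if re.search(r'\r?\n ', text):
        text = re.sub(r'\r?\n ', ' ', text)
    return text


if __name__ == "__main__":
    # The newline is followed by a space, so the second sub() runs.
    print(repr(collapse_space_adjacent_newlines("abc foo\n bar baz")))
    # No space next to the newline: both searches fail and the text is unchanged.
    print(repr(collapse_space_adjacent_newlines("abc foo\nbar baz")))
```

A newline with no adjacent space is left untouched by this helper, which matches the second timing case below where search() runs but sub() does not.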

Timings:
# sub() has to run
(~/opensource/cpython) % ./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\n bar baz", 5)'
5000 loops, best of 5: 67.6 usec per loop

# search() runs; but sub() does NOT because there's no adjacent space
(~/opensource/cpython) % ./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\nbar baz", 5)'
5000 loops, best of 5: 60.3 usec per loop
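The timings above go through textwrap.wrap() on a patched build. To look at the guard in isolation, something like the following rough sketch could be used. The guarded and unguarded function names and the sample strings are assumptions made for illustration, and the numbers it prints will vary by machine, so no output is shown.

```python
import re
import timeit


def unguarded(text):
    # Always run both substitutions, with no search() check first.
    text = re.sub(r' \r?\n', ' ', text)
    return re.sub(r'\r?\n ', ' ', text)


def guarded(text):
    # Only call sub() when search() finds something to replace.
    if re.search(r' \r?\n', text):
        text = re.sub(r' \r?\n', ' ', text)
    if re.search(r'\r?\n ', text):
        text = re.sub(r'\r?\n ', ' ', text)
    return text


# Hypothetical inputs: one without newlines, one where sub() must run.
samples = {
    "no newline": "abc foo bar baz",
    "newline + space": "abc foo\n bar baz",
}

results = {}
for label, text in samples.items():
    for func in (unguarded, guarded):
        elapsed = timeit.timeit(lambda: func(text), number=10_000)
        results[(label, func.__name__)] = elapsed
        print(f"{label:16} {func.__name__:10} {elapsed:.4f} s per 10,000 calls")
```

This measures only the regex step, not wrapping as a whole, so it is not directly comparable to the per-loop figures above.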

History
Date	User	Action	Args
2021-08-03 17:53:12	andrei.avk	set	recipients: + andrei.avk, larry, serhiy.storchaka, mdk, cheryl.sabella, iritkatriel
2021-08-03 17:53:12	andrei.avk	set	messageid: <1628013192.27.0.601030918266.issue32397@roundup.psfhosted.org>
2021-08-03 17:53:12	andrei.avk	link	issue32397 messages
2021-08-03 17:53:10	andrei.avk	create