Message 359283 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	apalala
Recipients	apalala, ezio.melotti, mrabarnett, rhettinger, serhiy.storchaka, terry.reedy
Date	2020-01-04.12:46:25
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1578141986.26.0.513518260867.issue39165@roundup.psfhosted.org>
In-reply-to

Content
There's no way to assert that `findall(...)[0]` is efficient enough in most cases. It is easy to see that that it is risky in every case, as runtime may be exponential, and memory O(len(input)). A mistake in the regular expression may easily result in an out-of-memory, which can only be debugged with a series of tests using `search()`. A problem with `re.search(...)` is that id doesn't have the return value semantics of `findall(...)[0]`, and those semantics seem to be what appeal to Python programmers. It takes several lines of code (the ones in `findalliter()`) to have the same result as `findal(...)[0]` when using `search()`. `findall()` is the sole, lonely function in `re` with its return-value semantics. Also this proposal embeds `first()` within the body of `findfirst(...)`, but by the implementation one should consider if `first()` shouldn't be part of `itertools`, perhaps with a different name, like `take_one()`. One should also consider that although third-party extensions to `itertools` already provide the equivalent of `first()`, `findalliter()` and `findfirst()` do not belong there, and there are no mainstream third-party extensions to `re` where they would fit.

There's no way to assert that `findall(...)[0]` is efficient enough in most cases. It is easy to see that that it is risky in every case, as runtime may be exponential, and memory O(len(input)). A mistake in the regular expression may easily result in an out-of-memory, which can only be debugged with a series of tests using `search()`.

A problem with `re.search(...)` is that id doesn't have the return value semantics of `findall(...)[0]`, and those semantics seem to be what appeal to Python programmers. It takes several lines of code (the ones in `findalliter()`) to have the same result as `findal(...)[0]` when using `search()`. `findall()` is the sole, lonely function in `re` with its return-value semantics.

Also this proposal embeds `first()` within the body of `findfirst(...)`, but by the implementation one should consider if `first()` shouldn't be part of `itertools`, perhaps with a different name, like `take_one()`.

One should also consider that although third-party extensions to `itertools` already provide the equivalent of `first()`, `findalliter()` and `findfirst()` do not belong there, and there are no mainstream third-party extensions to `re` where they would fit.

History
Date	User	Action	Args
2020-01-04 12:46:26	apalala	set	recipients: + apalala, rhettinger, terry.reedy, ezio.melotti, mrabarnett, serhiy.storchaka
2020-01-04 12:46:26	apalala	set	messageid: <1578141986.26.0.513518260867.issue39165@roundup.psfhosted.org>
2020-01-04 12:46:26	apalala	link	issue39165 messages
2020-01-04 12:46:25	apalala	create