Message 94058 - Python tracker



Author	verdy_p
Recipients	ezio.melotti, r.david.murray, verdy_p
Date	2009-10-15.00:13:57
Message-id	<1255565639.68.0.980406007472.issue7132@psf.upfronthosting.co.za>

Content

> a "general" regex (e.g. for an ipv6 address)

I know this problem, and I have already written about it. It is not
possible to parse such an address with a single regexp unless the regexp
is written without using repetitions. But in that case, the regexp
becomes really HUGE, and the number of groups in the returned match
object is prohibitive. That's why a dedicated module for IPv6 addresses
had to be written for Perl and published on CPAN.
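
To make the size problem concrete, here is a minimal sketch in Python. It
assumes a simplified hex-only form with no "::" compression, and the names
HEX and unrolled_pattern are hypothetical, chosen only for this
illustration. It builds one alternative per component count so that no
capturing group is repeated, then counts the resulting groups:

```python
import re

# One hex component of an IPv6 address (1 to 4 hex digits).
HEX = r"[0-9A-Fa-f]{1,4}"

def unrolled_pattern(max_parts=8):
    """Build one alternative per component count (8, 7, ..., 1).

    No repetition operator is applied to a capturing group, so every
    component gets its own numbered group. Assumption: "::" compression
    and embedded dotted-decimal parts are deliberately ignored.
    """
    alternatives = [
        ":".join([f"({HEX})"] * n) for n in range(max_parts, 0, -1)
    ]
    return re.compile("|".join(alternatives))

pattern = unrolled_pattern()
print(pattern.groups)  # 36 capturing groups (8 + 7 + ... + 1)

m = pattern.fullmatch("2001:db8:0:0:0:0:0:1")
# Most groups are unused for any given input; keep only those that matched.
print([g for g in m.groups() if g is not None])
```

Even this reduced form needs 36 groups, most of them empty for any given
input. Supporting "::" compression and the mixed hex/decimal form would
multiply the alternatives further.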

Such a module could be reduced to just a couple of lines with a single
regexp if the capturing groups correctly returned ALL their occurrences
from the regexp engine: no further processing or analysis would be
required, and the data could be reassembled cleanly, just from the
returned groups (as lists):
- \1 and \2 (for the hex components of an IPv6 address in hex-only
format, where \1 can occur 0 or 1 times, and \2 can occur 0 to 7 times)
- or \1 to \2 and \3 to \4 (for the hex components in \1..\2, where \1
occurs 0 or 1 times and \2 occurs 0 to 5 times, and for the decimal
components in \3..\4, where \3 occurs once and \4 occurs exactly 3
times).
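
As an illustration of the first case above, the following sketch shows what
Python's standard re module returns today for a repeated group (only the
last occurrence) next to an emulation of list-valued groups. The pattern is
an assumption made for this example: it covers the hex-only form without
"::" compression, and the helper names last_only and all_occurrences are
hypothetical.

```python
import re

# One hex component of an IPv6 address (1 to 4 hex digits).
HEX = r"[0-9A-Fa-f]{1,4}"

# \1 occurs 0 or 1 times, \2 occurs 0 to 7 times (hex-only form).
# Assumption: "::" compression is not handled in this sketch.
PATTERN = re.compile(rf"({HEX})?(?::({HEX})){{0,7}}")

def last_only(address):
    """Return what `re` gives today: a single value per group."""
    m = PATTERN.fullmatch(address)
    return None if m is None else m.groups()

def all_occurrences(address):
    """Emulate list-valued groups by re-scanning the matched text."""
    m = PATTERN.fullmatch(address)
    if m is None:
        return None
    first = [m.group(1)] if m.group(1) is not None else []
    # Skip past \1 (if it matched) and collect every ":hex" occurrence.
    start = m.end(1) if m.group(1) is not None else 0
    rest = re.findall(rf":({HEX})", address[start:])
    return first, rest

print(last_only("2001:db8:0:0:0:0:0:1"))
# ('2001', '1')
print(all_occurrences("2001:db8:0:0:0:0:0:1"))
# (['2001'], ['db8', '0', '0', '0', '0', '0', '1'])
```

The second scan in all_occurrences is exactly the extra processing that
would become unnecessary if the match object exposed every occurrence of
\2 directly.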

History
Date	User	Action	Args
2009-10-15 00:14:00	verdy_p	set	recipients: + verdy_p, ezio.melotti, r.david.murray
2009-10-15 00:13:59	verdy_p	set	messageid: <1255565639.68.0.980406007472.issue7132@psf.upfronthosting.co.za>
2009-10-15 00:13:58	verdy_p	link	issue7132 messages
2009-10-15 00:13:57	verdy_p	create