Title: email.feedparser regex duplicate
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8
Created on 2008-04-24 13:52 by jimjjewett, last changed 2022-04-11 14:56 by admin.

2679.patch moijes12, 2012-09-06 05:33 Change in patch: NLCRE_crack = NLCRE_bol
Messages (2)
msg65723 - Author: Jim Jewett (jimjjewett) Date: 2008-04-24 13:52
feedparser defines four regexes for end-of-line, but two are redundant.

NLCRE checks for the three common line endings.
NLCRE_crack also captures the line ending.
NLCRE_eol also adds a $ to ensure it is at the end.
NLCRE_bol ... is identical to NLCRE_crack.
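
To make the duplication concrete, here is a minimal sketch. The patterns are reconstructed from the description above and are an assumption; the actual definitions in email.feedparser may differ in detail (for example in how the end anchor is spelled).

```python
import re

# Assumed patterns, reconstructed from the description in this report.
NLCRE = re.compile(r'\r\n|\r|\n')          # matches a line ending
NLCRE_crack = re.compile(r'(\r\n|\r|\n)')  # same, but captures it
NLCRE_eol = re.compile(r'(\r\n|\r|\n)$')   # captures it, anchored at the end
NLCRE_bol = re.compile(r'(\r\n|\r|\n)')    # same pattern text as NLCRE_crack

# The two patterns are textually identical.
same_pattern = NLCRE_bol.pattern == NLCRE_crack.pattern

# The capturing group is what makes split() keep the line endings.
without_endings = NLCRE.split('a\r\nb\nc')
with_endings = NLCRE_crack.split('a\r\nb\nc')
```

With these assumed definitions, NLCRE.split drops the separators while NLCRE_crack.split returns them as separate list items.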

It should either use a ^ to insist on line-start, or be explicitly the 
same (e.g., NLCRE_bol = NLCRE_crack).  (It gets away with not listing the ^ 
because the current code only uses NLCRE_bol.match.)
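
A short sketch of why the missing ^ goes unnoticed: .match only tries the pattern at the start of the string, so it behaves as if anchored, while .search does not. The pattern is the assumed one from this report, and `anchored` is a hypothetical variant added only for comparison.

```python
import re

# Assumed pattern for NLCRE_bol, as described in this report.
NLCRE_bol = re.compile(r'(\r\n|\r|\n)')
# Hypothetical explicitly anchored variant, for comparison only.
anchored = re.compile(r'^(\r\n|\r|\n)')

text = 'abc\n'

# .match only succeeds at position 0, so no ^ is needed here.
match_result = NLCRE_bol.match(text)
# .search scans the whole string and finds the trailing newline.
search_result = NLCRE_bol.search(text)
# With an explicit ^ (and no MULTILINE flag), .search is restricted
# to the start of the string as well.
anchored_result = anchored.search(text)

# A string that really starts with a line ending matches either way.
leading = NLCRE_bol.match('\nabc')
```

The unanchored pattern is only safe as long as every caller uses .match; a future caller using .search would silently get different behaviour.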

(Actually, if the regexes are considered private, then the current code 
could just use the bound methods directly ... setting NLCRE_bol to the 
.match method, NLCRE_eol to the .search method, and NLCRE_crack to the 
.split method.)
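
The bound-method idea could look roughly like the sketch below. This is an illustration under the same assumed patterns, not the code in email.feedparser; `_nl` and `_nl_eol` are hypothetical helper names.

```python
import re

# Hypothetical private compiled patterns (assumed from this report).
_nl = re.compile(r'(\r\n|\r|\n)')
_nl_eol = re.compile(r'(\r\n|\r|\n)$')

# Each public name becomes the one bound method the code actually calls.
NLCRE_bol = _nl.match        # line ending at the start of the string?
NLCRE_eol = _nl_eol.search   # line ending at the end of the string?
NLCRE_crack = _nl.split      # split into lines, keeping the endings

starts_with_newline = NLCRE_bol('\nbody') is not None
eol = NLCRE_eol('Subject: hi\r\n').group(1)
parts = NLCRE_crack('one\ntwo\r\nthree')
```

Callers would then write NLCRE_bol(line) instead of NLCRE_bol.match(line), so this changes every call site and only makes sense if the names are treated as private.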
msg169904 - Author: moijes12 (moijes12) Date: 2012-09-06 05:36

I've attached a patch. It's a simple one that sets NLCRE_crack = NLCRE_bol. I found this in Python 3.3.0b2+, so I've added version 3.3. I ran "./python -m test" after making the change and no failures were reported (340 OK, 30 skipped). Looking forward to your comments.
