Author srael
Recipients ezio.melotti, mrabarnett, srael
Date 2020-05-04.09:54:09
Message-id <1588586050.03.0.673760016961.issue40496@roundup.psfhosted.org>
In-reply-to
Content
I have found what looks like a deadlock (the call does not return) in Python 3.6.10 that seems to have been fixed in 3.7.x. It is probably related to capture groups. To reproduce it, run something like this:

import re

# Raw string for the pattern so the backslashes reach the regex engine unchanged.
re.findall(
    r'\[et_pb_image(?:\w|=|"|\d|\.| |_|\/)*src="(https?:\/\/(?:www\.)?\w*\.\w*(?:\/|\w|\d|\.|-)*\.(?:png|jpg|jpeg|gif))"(?:\w|=|"|\d|\.| |_|\/|%|\|)*(?:\/?\])(?:\[\/et_pb_image\])?',
    '[et_pb_image _builder_version="3.27.2" src="https://www.somewhere.com/wp-content/uploads/2019/08/stabilizers.jpg" box_shadow_horizontal_tablet="0px" box_shadow_vertical_tablet="0px" box_shadow_blur_tablet="40px" box_shadow_spread_tablet="0px" z_index_tablet="500" url="https://youtu.be/fTrC5gkyYBM" url_new_window="on" /]',
)
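To check for the hang without blocking the current interpreter, the call can be run in a child process with a timeout. This is only a minimal sketch; the helper name `finishes_within` and the timeout values are my own choices, not part of the report.

```python
import subprocess
import sys


def finishes_within(code, seconds):
    """Run `code` in a fresh interpreter.

    Returns True if the child exits successfully before the timeout,
    False if it had to be killed because it ran too long.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run([sys.executable, "-c", code], timeout=seconds, check=True)
    except subprocess.TimeoutExpired:
        return False
    return True
```

Passing the `re.findall(...)` call above as the `code` string should return False on an affected interpreter and True on one where the call completes.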

I noticed that the problem is related to having two URLs in the content. The regex only looks for the one starting with src=, so the one starting with url= should be ignored. If url="XXX" is removed from the tag, it works fine.
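A possible workaround, sketched under my own assumptions: the groups such as `(?:\w|=|"|\d|...)*` contain overlapping alternatives (`\w` also matches digits and `_`), and the `:` inside the url= value is not in the trailing group, so the overall match has to fail after trying many combinations. Rewriting each group as a single character class accepts the same characters without the overlapping branches. The rewritten pattern and the shortened sample tags below are illustrative, not taken from the report.

```python
import re

# Hypothetical rewrite: every (?:a|b|...)* group becomes one character class.
PATTERN = re.compile(
    r'\[et_pb_image[\w=". /]*'
    r'src="(https?://(?:www\.)?\w*\.\w*[/\w.-]*\.(?:png|jpg|jpeg|gif))"'
    r'[\w=". /%|]*/?\](?:\[/et_pb_image\])?'
)

# Shortened versions of the tag from the report (fewer attributes).
SRC = 'src="https://www.somewhere.com/wp-content/uploads/2019/08/stabilizers.jpg"'
WITH_URL = (
    '[et_pb_image _builder_version="3.27.2" ' + SRC
    + ' url="https://youtu.be/fTrC5gkyYBM" url_new_window="on" /]'
)
WITHOUT_URL = '[et_pb_image _builder_version="3.27.2" ' + SRC + ' url_new_window="on" /]'

# The ':' in the url= value is not allowed after src="...", so no match here.
with_url_result = PATTERN.findall(WITH_URL)
# Without the url= attribute the tag matches and the src URL is captured.
without_url_result = PATTERN.findall(WITHOUT_URL)

print(with_url_result)
print(without_url_result)
```

If tags carrying a url= attribute should match as well, the trailing character class would additionally need to allow `:`; that is a change to the pattern's intent rather than to the engine.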
History
Date                 User   Action  Args
2020-05-04 09:54:10  srael  set     recipients: + srael, ezio.melotti, mrabarnett
2020-05-04 09:54:10  srael  set     messageid: <1588586050.03.0.673760016961.issue40496@roundup.psfhosted.org>
2020-05-04 09:54:10  srael  link    issue40496 messages
2020-05-04 09:54:09  srael  create