Author gregsmith
Date 2007-06-18.17:23:38
Content
First off, don't expect other threads to run during re execution. Multi-threading in Python mainly allows one thread to run while the others are waiting for I/O, sleeping in time.sleep(), or doing something similar. Switching between runnable threads only occurs in the interpreter loop.
There may be exceptions that allow switching during some really long core operations (a mutex needs to be released and taken again), but this has to be done under certain conditions so that threads won't mess up each other's data.
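
To illustrate the point about waiting threads, here is a minimal sketch (the function name and the sleep duration are made up for the example). Four threads each sleep for 0.2 seconds; because a sleeping thread gives up the interpreter lock, the whole run should take roughly 0.2 seconds rather than 0.8. A long-running re match would not overlap like this.

```python
import threading
import time

def wait_a_bit(seconds):
    # time.sleep() releases the interpreter lock, so other threads can run.
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=wait_a_bit, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The sleeps overlap, so this is expected to be well under 4 * 0.2 seconds.
print(f"4 threads sleeping 0.2s each took {elapsed:.2f}s")
```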

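So, on to the r.e.: first, try changing all the .*? to just .* and see whether the runtime changes. The ? is not redundant: it makes the * non-greedy, which changes the order in which the engine tries the possible match lengths, and that may be increasing the number of permutations that are being tried.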

But I think your real trouble is all of these: img src=\"(.*?)\"
This allows anything at all between the first " and the second ", including any number of quoted strings.
Your combination of several of these may be causing the RE engine to spend a huge amount of time looking at many different combinations for the first few .*?, all of which fail by the time it gets to the last one.
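
A rough way to see this effect, assuming a made-up pattern and input rather than the ones from the original report: the pattern below has two lazy quoted groups followed by a literal END that never appears, so every search fails. The time taken to fail should grow much faster than the input length, because the engine tries every way of pairing up the quotes before giving up.

```python
import re
import time

# Hypothetical pattern: two lazy quoted groups, then a literal that never occurs.
pattern = re.compile(r'"(.*?)".*?"(.*?)"END')

def time_failed_search(repeats):
    # Build a string with 2 * repeats quote characters and no "END".
    text = 'k="v" ' * repeats
    start = time.perf_counter()
    result = pattern.search(text)
    return result, time.perf_counter() - start

for repeats in (5, 10, 20):
    result, elapsed = time_failed_search(repeats)
    # result is always None; only the elapsed time changes.
    print(repeats, result, f"{elapsed:.6f}s")
```

The input sizes are kept small on purpose so the sketch finishes quickly; adding more lazy groups or longer input makes the failure time climb steeply.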

Try   img src=\"([^"]*)\"  instead; this will only match the pair of " with no " in between.
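
As a small illustration of the difference (the HTML string and the two-attribute pattern are invented for the example), compare what each form captures when the first img tag has no alt attribute:

```python
import re

# Hypothetical input: two img tags, only the second has an alt attribute.
html = '<img src="a.png" title="t"><img src="b.png" alt="logo">'

lazy = re.compile(r'img src="(.*?)" alt="(.*?)"')
strict = re.compile(r'img src="([^"]*)" alt="([^"]*)"')

lazy_match = lazy.search(html)
strict_match = strict.search(html)

# The lazy group runs across quotes and tags until ' alt=' finally matches.
print(lazy_match.group(1))    # a.png" title="t"><img src="b.png
# The negated class cannot cross a quote, so only the second tag matches.
print(strict_match.group(1))  # b.png
```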

Likewise, in .*?> the .* will match any number of '>' chars if this is needed to make the whole thing match, which is probably not what you want.
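
The same thing can be sketched for '>' (again with an invented snippet and pattern). The lazy form is free to run past the end of the opening tag, while [^>]* is not:

```python
import re

# Hypothetical input: an anchor tag with nested markup inside it.
html = '<a href="x"><b>bold</b></a>'

lazy_tag = re.compile(r'<a (.*?)></a>')
strict_tag = re.compile(r'<a ([^>]*)></a>')

lazy_result = lazy_tag.search(html)
strict_result = strict_tag.search(html)

# The lazy group swallows several '>' characters to make the match succeed.
print(lazy_result.group(1))  # href="x"><b>bold</b
# The negated class stops at the first '>', so there is no match at all.
print(strict_result)         # None
```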

You might get it to work just by turning off 'greedy' matching for '*' (which is what the ? in .*? does), but the [^"]* form above is the more reliable fix.








History
Date User Action Args
2007-08-23 14:55:01  admin  link  issue1737127 messages
2007-08-23 14:55:01  admin  create