
Author timehorse
Recipients akuchling, amaury.forgeotdarc, collinwinter, ezio.melotti, georg.brandl, jaylogan, jimjjewett, loewis, mark, moreati, mrabarnett, nneonneo, pitrou, rsc, timehorse
Date 2009-03-09.15:15:47
Message-id <1236611755.37.0.12931687253.issue2636@psf.upfronthosting.co.za>
In-reply-to
Content
Martin and Matthew,

I've been far too busy in the new year to keep up with all your updates 
to this issue, but since Martin wanted some clarification on direction 
and copyright: Matthew and I are co-developers, but there is a clear 
delineation between our work.  The patches uploaded by Matthew 
(mrabarnett) were uploaded by him and are entirely a product of his 
work.  The ones uploaded by me are more complicated, as I have always 
intended this to be a piecemeal project, not one patch fixes all, which 
is why I created the Bazaar repository hierarchy 
(https://launchpad.net/~pythonregexp2.7) with 36 or so branches of 
mostly independent development at various stages of completion.  Here is 
where the copyrights get more complicated, but not much so.  As I said, 
there are branches where multiple issues are combined (with the plus 
operator (+)).  In general, I consider primary development the single-
number branch and only create combined branches where I feel there may 
be a cross-dependency between one branch and the other.  Working this 
way is VERY time consuming: one spends more time merging branches than 
actually developing.  Matthew, on the other hand, has worked fairly 
linearly so his branches generally have long number trains to indicate 
all the issues solved in each.  What's more, the last time I updated the 
repository was last summer, so none of Matthew's latest patches have 
been catalogued and documented.  But everything there that is more or 
less 100% Matthew's copyright, thanks to his diligent work, always 
contains his first contribution, the new RegExp engine, thread 09-02.  
So any items which contain ...+09-02+... are pretty much Matthew's work 
and the rest are mine.

All that said, I personally like having all this development in one 
place, but also like having the separate branch development model I've 
set up in Bazaar.  If new issues are created from this one, I would thus 
hope they would still follow the outline specified on the Launchpad 
page.  I prefer keeping everything in one issue though as IMHO it makes 
things easier to keep track of.

As for the stuff I've worked on, I should first warn that there is a 
root patch at 
(https://code.launchpad.net/~pythonregexp2.7/python/issue2636) and as 
issue2636.patch in the tar.bz2 patch library I posted last June.  This 
patch contains various code cleanups and, most notably, a realignment of 
the documentation to follow a 72-column rule.  I know Python's 
documentation is supposed to be 80-column, but some of the lines were 
going out even past that, and making it 72 allows for incremental 
expansion before having to reformat any lines.  However, textually, the 
issue2636 version of re.rst is no different from the last version it's 
based on, which I verified by generating Sphinx hierarchies for 
both versions.  I therefore suggest this as the only change which is 
'massive restructuring', as it does not affect the actual documentation; 
it just makes it more legible in reStructuredText form.  This and other 
suggested changes in the root issue2636 thread are intended to be 
applied if at least 1 of the other issues is accepted, and as such it is 
the root branch of every other branch.  Understanding that even these 
small changes may not in fact be acceptable, I have always generated 2 
sets of patches for each issue: one diff'ed against the python snapshot 
stored in base 
(https://code.launchpad.net/~pythonregexp2.7/python/base) and one that 
is diff'ed against the issue2636 root, so if the changes in the 
issue2636 root are nonetheless unacceptable, they can easily be 
disregarded.

Now, with respect to work ready for analysis and merging prepared by me, 
I have 4 threads ready for analysis, with documentation updated and test 
cases written and passing:

1: Atomic Grouping / Possessive Quantifiers

5: Added a Python-specific RegExp comment group, (?P#...) which supports  
parenthetical nesting (see the issue for details)

7: Better caching algorithm for the RegExp compiler with more entries in 
the cache and reduced possibility of thrashing.

12: Clarify the Python documentation for RegExp comments; this was only 
a change in re.rst.
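
As a rough illustration of what item 1 means in practice (a sketch for 
readers, not code taken from any of the patches): an atomic group or 
possessive quantifier refuses to give back characters once it has 
matched them.  In engines that support the syntax this would be written 
(?>a+)ab or a++ab.  The sketch below assumes only the existing re syntax 
and emulates the behaviour with the well-known lookahead plus 
backreference idiom; the variable names are made up for the example.

```python
import re

# Ordinary greedy quantifier: a+ first grabs "aaa", then gives one "a"
# back so that the trailing "ab" can still match.
greedy = re.compile(r"a+ab")

# Emulated atomic group: a lookahead is matched as a unit, so once
# (?=(a+)) has captured the longest run of "a", \1 consumes exactly that
# run and the engine cannot backtrack into it.
atomic_like = re.compile(r"(?=(a+))\1ab")

greedy_result = greedy.match("aaab")
atomic_result = atomic_like.match("aaab")

print(greedy_result.group(0))  # the greedy pattern matches the whole string
print(atomic_result)           # the atomic-style pattern fails to match
```

The second pattern failing is the point: no "a" is left over for the 
trailing "ab", which is what lets such constructs avoid pathological 
backtracking.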

The branches 09-01 and 09-01-01 are engine redesigns that I used to 
better understand the current RegExp engine but neither is faster than 
the existing engine so they will probably be abandoned.

10 is also nearly complete and affects the implementation of 01 (hence 
01+10) if accepted, but I have not done a final analysis to determine if 
any other variables can be consolidated to be defined only in one place.

Thread 2 is in a near-complete form, but has been snagged by a decision 
as to what the interface to it should be -- see the discussion above and 
specifically http://bugs.python.org/msg68336 and 
http://bugs.python.org/msg68399.  The stand-alone patch by me is the 
latest version and implements the version called (a) in those notes.  I 
prefer to implement (e).

I don't think I've had a chance to do any significant work on any of the 
other threads, as I got really bogged down with changing thread 2 as 
described above, trying to maintain threads for Matthew, and just 
performing all those merges in Bazaar!

So that's the news from me, and nothing new to contribute at this time, 
but if you want separate, piecemeal solutions, feel free to crack open 
http://bugs.python.org/file10645/issue2636-patches.tar.bz2 and grab them 
for at least items 1, 5, 7 and 12.
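
For anyone unfamiliar with the thrashing concern behind item 7, here is 
a minimal sketch.  It is not the algorithm from the patch.  It assumes, 
as a baseline, a cache that is emptied completely once it is full, and 
compares it with a least-recently-used policy.  The class names, the 
tiny cache size and the workload are all hypothetical, chosen only to 
make the effect visible.

```python
import re
from collections import OrderedDict


class ClearAllCache:
    """Drops every entry once the cache is full (assumed baseline)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}
        self.compiles = 0  # number of cache misses, i.e. real compiles

    def get(self, pattern):
        if pattern not in self.data:
            if len(self.data) >= self.maxsize:
                self.data.clear()  # hot patterns are thrown away too
            self.data[pattern] = re.compile(pattern)
            self.compiles += 1
        return self.data[pattern]


class LRUCache:
    """Evicts only the least recently used entry when full."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()
        self.compiles = 0

    def get(self, pattern):
        if pattern in self.data:
            self.data.move_to_end(pattern)  # mark as recently used
            return self.data[pattern]
        if len(self.data) >= self.maxsize:
            self.data.popitem(last=False)  # drop the oldest entry only
        self.data[pattern] = re.compile(pattern)
        self.compiles += 1
        return self.data[pattern]


def run(cache):
    # Two "hot" patterns used every round, plus one one-off pattern.
    for i in range(20):
        cache.get(r"\d+")
        cache.get(r"\w+")
        cache.get("once%d" % i)
    return cache.compiles


clear_all_compiles = run(ClearAllCache(maxsize=4))
lru_compiles = run(LRUCache(maxsize=4))
print(clear_all_compiles, lru_compiles)
```

With this workload the LRU variant compiles each distinct pattern once, 
while the clear-everything variant keeps recompiling the two hot 
patterns after each flush.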
History
Date User Action Args
2009-03-09 15:15:55  timehorse  set  recipients: + timehorse, loewis, akuchling, georg.brandl, collinwinter, jimjjewett, amaury.forgeotdarc, pitrou, nneonneo, rsc, mark, ezio.melotti, mrabarnett, jaylogan, moreati
2009-03-09 15:15:55  timehorse  set  messageid: <1236611755.37.0.12931687253.issue2636@psf.upfronthosting.co.za>
2009-03-09 15:15:54  timehorse  link  issue2636 messages
2009-03-09 15:15:47  timehorse  create