Title: difflib.SequenceMatcher: expose junk sets, deprecate undocumented isb... functions.
Type: enhancement
Stage: needs patch
Components:
Versions: Python 3.2
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: terry.reedy
Nosy List: eli.bendersky, georg.brandl, hodgestar, terry.reedy
Priority: normal
Keywords: patch

Created on 2010-11-25 20:20 by terry.reedy, last changed 2022-04-11 14:57 by admin. This issue is now closed.

File name | Uploaded | Description
difflib.10534.diff | terry.reedy, 2010-12-03 01:19 | Expose junk & popular; doc along with b2j
issue10534.2.patch | eli.bendersky, 2010-12-04 07:36 |
Messages (8)
msg122400 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-11-25 20:20
Expose and document the junk and popular sets as attributes of the SequenceMatcher object.

    self.junk = junk
    self.popular = popular

Deprecate the then unneeded and undocumented isbjunk and isbpopular functions, currently defined as
    self.isbjunk = junk.__contains__
    self.isbpopular = popular.__contains__
(and slightly modify the matcher function that localizes and uses one of the above).

Question: (how) do we document the deprecation/removal of an undocumented function?

In discussions that included Tim Peters, the idea of exposing the tuning parameters of the heuristic was discussed. Now that the heuristic can be turned off, I think this is YAGNI until someone specifically requests it.
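For readers unfamiliar with the heuristic, here is a small sketch of what turning it off changes. It assumes Python 3.2 or later, where SequenceMatcher accepts an autojunk argument, and it inspects only b2j; the sample strings are made up for illustration.

```python
from difflib import SequenceMatcher

# 200 items in b, with "x" occurring far more often than the other elements.
b = "x" * 198 + "yz"

with_heuristic = SequenceMatcher(None, "xyz", b)  # autojunk defaults to True
without_heuristic = SequenceMatcher(None, "xyz", b, autojunk=False)

# With the heuristic on, the very frequent element is dropped from the index.
print("x" in with_heuristic.b2j)
# With the heuristic off, every element of b stays indexed.
print("x" in without_heuristic.b2j)
```

The heuristic only kicks in for sequences of at least 200 items, which is why b is padded to that length here.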
msg123152 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-12-03 01:19
Here is a pretty minimal patch to expose bjunk and bpopular as attributes and document them along with b2j, which is already exposed but not documented.
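A minimal sketch of how the three attributes might be inspected once exposed, assuming Python 3.2 or later; the junk predicate and the input strings are illustrative only.

```python
from difflib import SequenceMatcher

# Treat spaces as junk; disable the popularity heuristic to keep the example small.
sm = SequenceMatcher(lambda ch: ch == " ", "abcd", "ab cd ef", autojunk=False)

print(sm.bjunk)        # elements of b that the isjunk predicate flagged
print(sm.bpopular)     # elements dropped by the heuristic (none here)
print(sorted(sm.b2j))  # remaining elements of b that are still indexed
print(sm.b2j["c"])     # positions of "c" in b
```

Junk elements are absent from b2j, so membership in bjunk is the way to ask whether an element was filtered out.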

I suppose the proposed paragraph could be formatted as a list, perhaps after 
   ":class:`SequenceMatcher` objects have the following methods:"
with 'methods' expanded to 'attributes and methods'. But I do not know how to do that and personally prefer what I wrote.

However, Georg, I am adding you as nosy so you can comment on the doc addition *before* I commit this tomorrow, to get it in the beta.

Still to do: improve the doctests, including those added as part of #2986, to use the exposed sets (they are already implicitly tested by the existing tests); deprecate the two obsolete function attributes, as discussed on pydev.

The docstrings and code comments might stand revision, and there is an XXX todo in the doc yet.
msg123194 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-12-03 07:07
Don't worry about doc additions too much; all doc changes can be considered bug fixes.

In this patch, it appears that two of the three attributes are new?  They should get a versionadded.
msg123280 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-12-03 18:58
Added versionadded and committed as r86983.
msg123299 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-12-03 22:32
Deprecated the isbjunk and isbpopular methods, ran the doc and unit tests, and committed as r87000. Still need to add a 'gone in 3.3' test when revising the unit tests.
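For illustration, one common way to deprecate such a method is to keep it as a thin wrapper that warns and delegates to the set. The sketch below uses a hypothetical MatcherSketch class and message text; it is not the code committed in r87000.

```python
import warnings


class MatcherSketch:
    """Hypothetical stand-in for a matcher that exposes its junk set."""

    def __init__(self, junk):
        self.bjunk = set(junk)

    def isbjunk(self, item):
        """Deprecated; use ``item in matcher.bjunk`` instead."""
        warnings.warn(
            "isbjunk() is deprecated; use 'item in matcher.bjunk' instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
        return item in self.bjunk


# Record warnings so the deprecation can be checked, as a unit test might.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = MatcherSketch(" ").isbjunk(" ")

print(result, caught[0].category.__name__)
```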
msg123304 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-12-03 22:50
News entry for both commits: r87001
msg123317 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-12-04 07:36

Attaching a patch with the following:

1. Added unittest assertions for bjunk and bpopular data attributes. 
2. Minor markup & formatting fixes in one paragraph of the doc difflib.rst
msg124065 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-12-15 20:23
Added a tweak to .__chain_b to avoid creating lists of b2j.keys() and .items() by deleting from b2j in a separate loop after creating the sets. A timeit test suggests the time is about the same with 1% deletion. r87276
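The pattern described above can be sketched as a standalone function. This is an approximation modeled on the behaviour of current difflib, not the r87276 diff itself; chain_b_sketch is a hypothetical name, and the popularity threshold is taken from the released module.

```python
from difflib import SequenceMatcher


def chain_b_sketch(b, isjunk=None, autojunk=True):
    """Index b, then drop junk and popular elements in separate passes."""
    b2j = {}
    for i, elt in enumerate(b):
        b2j.setdefault(elt, []).append(i)

    # Pass 1: collect junk keys while iterating the dict view (no list copy).
    junk = set()
    if isjunk:
        for elt in b2j.keys():
            if isjunk(elt):
                junk.add(elt)
        # Pass 2: delete afterwards, since a dict cannot shrink mid-iteration.
        for elt in junk:
            del b2j[elt]

    # Same two-pass pattern for the popularity heuristic.
    popular = set()
    n = len(b)
    if autojunk and n >= 200:
        ntest = n // 100 + 1
        for elt, idxs in b2j.items():
            if len(idxs) > ntest:
                popular.add(elt)
        for elt in popular:
            del b2j[elt]
    return b2j, junk, popular


def is_space(ch):
    return ch == " "


b = "x" * 198 + " y"
b2j, junk, popular = chain_b_sketch(b, is_space)
sm = SequenceMatcher(is_space, "", b)
print(b2j == sm.b2j, junk == sm.bjunk, popular == sm.bpopular)
```

Collecting into a set first costs one extra small loop, but avoids materializing the full key or item list when only a few entries are deleted.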
History
Date | User | Action | Args
2022-04-11 14:57:09 | admin | set | github: 54743
2010-12-15 20:23:44 | terry.reedy | set | status: open -> closed; messages: + msg124065; resolution: fixed; nosy: georg.brandl, terry.reedy, hodgestar, eli.bendersky
2010-12-04 07:36:03 | eli.bendersky | set | files: + issue10534.2.patch; messages: + msg123317
2010-12-03 22:50:56 | terry.reedy | set | messages: + msg123304
2010-12-03 22:32:08 | terry.reedy | set | messages: + msg123299
2010-12-03 18:58:54 | terry.reedy | set | stage: commit review -> needs patch
2010-12-03 18:58:23 | terry.reedy | set | messages: + msg123280
2010-12-03 07:07:48 | georg.brandl | set | messages: + msg123194
2010-12-03 01:19:23 | terry.reedy | set | files: + difflib.10534.diff; nosy: + georg.brandl; messages: + msg123152; keywords: + patch; stage: test needed -> commit review
2010-11-25 20:23:36 | terry.reedy | link | issue2986 superseder
2010-11-25 20:20:56 | terry.reedy | create |