classification
Title: difflib new cli interface
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Claudiu.Popa, berker.peksag, rhettinger, tshepang, vstinner
Priority: normal Keywords: patch

Created on 2014-03-22 20:44 by Claudiu.Popa, last changed 2014-05-22 22:56 by rhettinger. This issue is now closed.

Files
File name Uploaded Description
difflib_cli.patch Claudiu.Popa, 2014-03-22 20:44
difflib_cli.patch Claudiu.Popa, 2014-03-22 20:51 Fix failure on Windows.
issue21027.patch Claudiu.Popa, 2014-03-22 21:21
issue21027_1.patch Claudiu.Popa, 2014-05-14 08:21 A couple of doc fixes.
issue21027_2.patch Claudiu.Popa, 2014-05-16 12:16
Messages (11)
msg214516 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-03-22 20:44
Hello!

The attached patch proposes a new command-line interface for the difflib module.
Currently, `python -m difflib` does nothing useful: it only runs the doctest suite for the difflib module.
There are already a couple of modules in the standard library that provide
helpful command-line interfaces, for instance inspect for analyzing an object, compileall for compiling Python files,
and json.tool for validating and pretty-printing JSON. Tools/scripts/ also contains a small utility called diff.py,
which uses difflib to implement a simple diff-like tool. This issue proposes deprecating it
in favor of `-m difflib`, for the following reasons:

- On Windows, `py -3 -m difflib` is easier to use. Yes, Tools/Scripts can be added to PATH so that diff.py can be used from there, but we can't always do that. At work I have a couple of machines where I can't modify PATH because of user restrictions. Having `py -3 -m difflib` as a handy diff tool is invaluable on such systems.

- Continuing the same argument, you can't always install a proper diff tool, because of the same restrictions. Having a simple one built into the stdlib is more than useful! Also, you can't always use a version control system in order to use its diff feature.

- Tools/Scripts/diff.py is not tested at all.

- diff.py was added before the `-m` option existed. Now `-m difflib` is the more natural way, and I hope to see even more modules providing useful command-line interfaces, like compileall or inspect.
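
To illustrate the kind of interface being discussed, here is a minimal sketch of a front end built on `difflib.unified_diff`. It is not the code from the attached patch; the function names (`read_lines`, `build_diff`, `main`) and the `-n/--lines` option are hypothetical and chosen only for this example.

```python
import argparse
import difflib
import sys


def read_lines(path):
    # Read a file into a list of lines, keeping line endings for difflib.
    with open(path, encoding="utf-8") as handle:
        return handle.readlines()


def build_diff(from_lines, to_lines, from_name, to_name, context=3):
    # unified_diff yields lines that already end with a newline.
    return list(difflib.unified_diff(
        from_lines, to_lines,
        fromfile=from_name, tofile=to_name, n=context))


def main(argv=None):
    # Hypothetical command line: <fromfile> <tofile> [-n LINES]
    parser = argparse.ArgumentParser(description="Sketch of a difflib front end")
    parser.add_argument("fromfile")
    parser.add_argument("tofile")
    parser.add_argument("-n", "--lines", type=int, default=3,
                        help="number of context lines")
    args = parser.parse_args(argv)
    diff = build_diff(read_lines(args.fromfile), read_lines(args.tofile),
                      args.fromfile, args.tofile, args.lines)
    sys.stdout.writelines(diff)
    # Return 1 when the files differ and 0 otherwise.
    return 1 if diff else 0
```

Calling `main(["a.py", "b.py"])` prints a unified diff of the two files; a real module would call `main()` from its `__main__` guard.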

Thanks in advance!
msg214518 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-03-22 21:21
Here's a new patch which addresses the comments of berker.peksag. Thank you for the review!
msg214547 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-03-23 04:48
This looks promising to me.  I'll look in more detail shortly.
msg216299 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-04-15 14:48
Raymond, any news on this?
msg218584 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-14 22:34
$ ./python -m difflib -u a.py b.py 
...
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='a.py' mode='r' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='b.py' mode='r' encoding='UTF-8'>

It looks like files are not closed.
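
For reference, a warning like this usually goes away when the input files are opened in `with` blocks. The sketch below is an assumption about how such a fix could look, not the code from the patch; `diff_files` is a hypothetical helper name.

```python
import difflib
import os
import tempfile


def diff_files(from_path, to_path):
    # Each with block closes its file as soon as the lines are read,
    # so no unclosed file object is left behind for ResourceWarning.
    with open(from_path, encoding="utf-8") as from_file:
        from_lines = from_file.readlines()
    with open(to_path, encoding="utf-8") as to_file:
        to_lines = to_file.readlines()
    return list(difflib.unified_diff(
        from_lines, to_lines, fromfile=from_path, tofile=to_path))


# Small demonstration with two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    a_path = os.path.join(tmp, "a.py")
    b_path = os.path.join(tmp, "b.py")
    with open(a_path, "w", encoding="utf-8") as f:
        f.write("x = 1\n")
    with open(b_path, "w", encoding="utf-8") as f:
        f.write("x = 2\n")
    demo_diff = diff_files(a_path, b_path)

print("".join(demo_diff), end="")
```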

I would prefer to have the unified output (-u) by default. I don't think that Python should mimic UNIX tools exactly.

tarfile now has a command line interface, but its options are different. For example, it's -e to extract instead of -x.
msg218585 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-14 22:40
The HTML output contains an encoding in the <head>:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

The problem is that sys.stdout may use a different encoding. For example, my locale encoding is UTF-8. So "python -m difflib -um a.py b.py > test.html" produces an HTML file that is encoded in UTF-8 but declares ISO-8859-1 in its header. There are different options to fix this issue:

* drop the --html command line option
* drop the invalid Content-Type header
* add an option to write the diff into a file, use UTF-8 to encode this file and emit a correct HTTP header (announce UTF-8)
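
A rough sketch of the last option, assuming Python 3.5 or later, where `HtmlDiff.make_file` accepts a keyword-only `charset` argument. The helper name `write_html_diff` and the file names are hypothetical; the point is only that the declared charset and the encoding used to write the file are the same.

```python
import difflib
import os
import tempfile


def write_html_diff(from_lines, to_lines, out_path, charset="utf-8"):
    # charset sets the value declared in the generated <meta> tag
    # (keyword-only argument of make_file, available from Python 3.5).
    html = difflib.HtmlDiff().make_file(
        from_lines, to_lines, "a.py", "b.py", charset=charset)
    # Write the file with the same encoding that the document announces.
    with open(out_path, "w", encoding=charset,
              errors="xmlcharrefreplace") as out:
        out.write(html)
    return html


# Small demonstration with non-ASCII content.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "test.html")
    html = write_html_diff(["# caf\u00e9 1\n"], ["# caf\u00e9 2\n"], out_path)
    with open(out_path, "rb") as f:
        raw = f.read()
```

Writing to a file instead of sys.stdout avoids depending on the locale encoding of the terminal or of a shell redirection.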
msg218587 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-05-14 23:13
> The HTML output contains an encoding in the <head>:
> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

See issue2052 for this.
msg218591 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-05-15 04:51
After more thought, I think this should remain in tools as a demo.

We don't have an objective to create command-line tools to compete with existing well developed, tested tools.  For the most part, our command line tools exposed through the -m option are few in number and mostly limited to things that aid development or don't have a readily available alternative.

I recommend closing this to avoid feature creep.
msg218596 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-15 07:25
> After more thought, I think this should remain in tools as a demo.

I disagree, I like the command line interface. It's very useful on Windows for example. It's also useful on UNIX embedded devices where Python is installed, but only a few UNIX tools are available.

If you are not convinced, please see this amazing talk by David Beazley at PyCon 2014:
http://pyvideo.org/video/2645/discovering-python

The patch only adds a few lines to difflib.py.

IMO difflib CLI is even more useful than tarfile CLI ;-)

It's not like no other Python module has a CLI. Modules with a CLI in Python 3.5:

aifc
base64
calendar
cgi
code
compileall
cProfile
dis
doctest
filecmp
fileinput
formatter
ftplib
getopt
gzip
imaplib
imghdr
inspect
locale
mailcap
mimetypes
modulefinder
netrc
nntplib
pdb
pickle
pickletools
platform
poplib
pprint
profile
pstats
pyclbr
py_compile
pydoc
quopri
random
runpy
shlex
site
smtpd
smtplib
sndhdr
sre_constants
symbol
symtable
sysconfig
tabnanny
tarfile
telnetlib
textwrap
timeit
tokenize
token
trace
turtle
uu
webbrowser
zipfile
msg218656 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-05-16 12:16
Attached the new version of the patch which removes the resource warnings. 
Raymond, I disagree on certain points. `-m difflib` does help development, especially on platforms where there aren't many readily available alternatives (like Windows). I gave an example of this in my first message, where you can neither modify the PATH nor install additional software. Also, you say that this should remain in tools as a demo. Wouldn't it be better to have that demo well tested in the stdlib, in a place where you can easily access it? This way, users don't have to reinvent the wheel every time they need the differences between two files. And we are not competing with well developed, tested tools. By that argument, having `-m zipfile` and `-m tarfile` is redundant, because we can always use zip and tar instead.
msg218929 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-05-22 22:56
Sorry guys, I appreciate your enthusiasm, but when I designed the code, I intentionally put it in the Tools/scripts section rather than as a command-line option for a library module.  As the author of context_diff and unified_diff, I was concerned that it wasn't competitive with a real diff tool in a number of ways.  IIRC, Guido had problems with it and couldn't get it to work with "patch", and Uncle Timmy noted some algorithmic differences from other diffs.

I'm going to close this because I think it is not a good idea to offer an "attractive nuisance".

Victor, I attended David's talk and enjoyed it thoroughly.  However, if you thought he was arguing for the standard library to be turned into a suite of command-line unix replacements, you may have missed the point.  In addition, most of his "magic" was done by writing scripts that incorporated the tools (for example, the diffs were part of a script that reconstructed the change history for a series of files).
History
Date User Action Args
2014-05-22 22:56:02 rhettinger set status: open -> closed; resolution: rejected; messages: + msg218929
2014-05-16 12:16:58 Claudiu.Popa set files: + issue21027_2.patch; messages: + msg218656
2014-05-15 07:25:05 vstinner set messages: + msg218596
2014-05-15 04:51:06 rhettinger set messages: + msg218591
2014-05-14 23:13:20 berker.peksag set messages: + msg218587
2014-05-14 22:40:28 vstinner set messages: + msg218585
2014-05-14 22:34:37 vstinner set nosy: + vstinner; messages: + msg218584
2014-05-14 08:21:45 Claudiu.Popa set files: + issue21027_1.patch
2014-04-15 14:48:37 Claudiu.Popa set messages: + msg216299
2014-04-02 17:44:38 tshepang set nosy: + tshepang
2014-03-23 04:48:24 rhettinger set assignee: rhettinger; messages: + msg214547; nosy: + rhettinger
2014-03-22 22:17:48 berker.peksag set nosy: + berker.peksag; stage: patch review
2014-03-22 21:21:10 Claudiu.Popa set files: + issue21027.patch; messages: + msg214518
2014-03-22 20:51:00 Claudiu.Popa set files: + difflib_cli.patch
2014-03-22 20:44:16 Claudiu.Popa create