classification
Title: difflib should separate filename from timestamp with tab
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: brian.curtin, pitrou, r.david.murray, techtonik, tim.peters
Priority: normal Keywords: patch

Created on 2009-12-27 20:57 by techtonik, last changed 2010-05-13 16:24 by techtonik. This issue is now closed.

Files
File name Uploaded Description Edit
difflib.tab_separated_filename.diff techtonik, 2009-12-27 20:57
example.py brian.curtin, 2009-12-29 04:33
issue7585.difflib-tab-separator.diff techtonik, 2010-04-01 20:05 base patch
issue7585.difflib-tab-separator-no-trail.diff techtonik, 2010-04-01 20:22 no trailing whitespace
issue7585.difflib-tab-update-docs.diff techtonik, 2010-04-02 11:41 (optional) update docs
issue7585.difflib-tab-updoc-iso8601.diff techtonik, 2010-04-02 11:43 (opt) alt docs update
issue7585.difflib-tab-unittest.diff techtonik, 2010-04-07 10:14 with unittest
Messages (21)
msg96924 - (view) Author: anatoly techtonik (techtonik) Date: 2009-12-27 20:57
The patch inserts \t character between filename and timestamp in unified 
and context diff headers. 

According to specification by Guido Van Rossum =)
http://www.artima.com/weblogs/viewpost.jsp?thread=164293

And de-facto output from various tools
http://code.google.com/p/python-patch/source/browse/#svn/trunk/doc

And the common sense
--- that it is easier to split	this stuff
+++ than this one into filename + timestamp

--- diff.py Sun Dec 27 16:08:28 2009
+++ trunk/diff.py	Sun Dec 27 15:46:58 2009
msg96961 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2009-12-28 17:38
I don't think the conditional checks around the timestamps are
necessary. Couldn't you just put the \t directly in the string that gets
yielded? That way the chunk headers always follow the same format.
msg96980 - (view) Author: anatoly techtonik (techtonik) Date: 2009-12-29 00:34
Conditional checks are required to prevent leaving trailing whitespace in 
filename when date component is not present. Such trailing whitespace may 
confuse patch tools. [1]

[1] http://code.google.com/p/python-patch/issues/detail?id=2
msg96986 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2009-12-29 04:33
Shouldn't the patch tool handle that? FWIW, both the "svn diff" and *nix
diff utilities produce headers with the parts split by a tab character.

Would the code in example.py work in your tool to handle both tabs and
spaces?
msg96989 - (view) Author: anatoly techtonik (techtonik) Date: 2009-12-29 10:28
Filenames may contain spaces too.

--- handle dis Sun Dec 27 16:08:28 2009
--- или вот это пон, дек 27 16:08:28 2009

The last line is space separated filename and date in Russian locale. 
Patch tool should handle that, but as you may see it is not always 
possible. That's why difflib modification with \t separator will greatly 
improve interoperability of difflib patches regardless of timestamp 
format.

Stripping trailing whitespace when there is no timestamp serves the same 
purpose. We can assume that patch tools are perfect, but I wouldn't 
write my tool if that was true, so its better to be friendly on difflib 
side and generate the output that won't require more work than necessary 
to use it.
msg97013 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2009-12-29 23:36
I agree that splitting with a tab character is good, but I think it
should be split by a tab character in every case. 

If the separator is different based on what data is provided, then it
complicates parsing and would have to be explained in documentation that
we provide different header formats.
msg97018 - (view) Author: anatoly techtonik (techtonik) Date: 2009-12-30 00:57
This patch makes sure filename and date split by tab in every case when 
date is provided.
msg97164 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-01-03 15:00
Is there any reason why you changed some of the examples in the docstrings (e.g. 'Sat Jan 26 23:30:50 1991' -> '1991-01-26 23:30:50')?
msg97166 - (view) Author: anatoly techtonik (techtonik) Date: 2010-01-03 15:10
It is the same reason as for removing recommendation from docstring to generate timestamps in the format returned by time.ctime(). See issue #7582
msg99167 - (view) Author: anatoly techtonik (techtonik) Date: 2010-02-10 15:40
The reason is to provide a good usage example.
msg102046 - (view) Author: anatoly techtonik (techtonik) Date: 2010-03-31 23:38
depends on issue #7583
msg102154 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-02 10:49
Refreshed patches - feel free to apply in any order you want.

Convenience link for reviews:
http://codereview.appspot.com/809043/show
msg102160 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-02 11:41
These patches are incrementally add features. Feel free to apply one that suits most. Codereview provides more convenient way to review differences between these.

attaching optional patch that removes recommendation to use ctime format for timestamps
msg102318 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-04 03:36
Your last patch looks the best to me.  I agree both that a tab should not be emitted if there is no date (which is what git, for example, does), and that ISO 8601 timestamps should be promoted as the preferred format.

As you pointed out, issue 7583 needs to be resolved before this can be applied.

In the meantime it would be nice to add an additional test for the no-tab-if-no-date case.
msg102323 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-04 07:06
We could avoid the 7583 problem by making the doctests use NORMALIZE_WHITESPACE and moving the real *tests* into the unittests for the module.  I think that would be a good thing to do anyway.
msg102530 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-07 10:14
Added unittest for tab with and without set filedate.
Removed #7583 dependency with NORMALIZE_WHITESPACE.
msg102551 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-07 17:53
Great, thanks.  I'll check this in when the branch is unfrozen.
msg102958 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-12 16:59
Applied to trunk in r8004 and py3k in r8006.
msg103011 - (view) Author: anatoly techtonik (techtonik) Date: 2010-04-13 05:20
r80004 and r80006 to be exact. 

Great! Thanks! =)
msg103041 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-13 12:15
Yes, thanks for the typo correction.
msg105641 - (view) Author: anatoly techtonik (techtonik) Date: 2010-05-13 16:24
tag:difflib
History
Date User Action Args
2010-05-14 04:59:29r.david.murraylinkissue4898 superseder
2010-05-13 16:24:26techtoniksetmessages: + msg105641
2010-04-13 12:15:02r.david.murraysetmessages: + msg103041
2010-04-13 05:20:20techtoniksetmessages: + msg103011
2010-04-12 16:59:46r.david.murraysetstatus: open -> closed
resolution: accepted -> fixed
messages: + msg102958

stage: test needed -> resolved
2010-04-07 17:53:27r.david.murraysetdependencies: - Improve explanation of tab expansion in doctests
messages: + msg102551
2010-04-07 10:14:38techtoniksetfiles: + issue7585.difflib-tab-unittest.diff

messages: + msg102530
2010-04-04 07:06:44r.david.murraysetmessages: + msg102323
2010-04-04 03:36:57r.david.murraysetnosy: + r.david.murray
messages: + msg102318

assignee: r.david.murray
resolution: accepted
stage: patch review -> test needed
2010-04-02 11:43:18techtoniksetfiles: + issue7585.difflib-tab-updoc-iso8601.diff
2010-04-02 11:41:24techtoniksetfiles: + issue7585.difflib-tab-update-docs.diff

messages: + msg102160
2010-04-02 11:35:26techtoniksetfiles: - issue7585.difflib-tab-update-docs.diff
2010-04-02 11:18:27techtoniksetfiles: + issue7585.difflib-tab-update-docs.diff
2010-04-02 10:49:11techtoniksetmessages: + msg102154
2010-04-01 20:22:22techtoniksetfiles: + issue7585.difflib-tab-separator-no-trail.diff
2010-04-01 20:05:45techtoniksetfiles: + issue7585.difflib-tab-separator.diff
2010-04-01 14:57:16r.david.murraysetdependencies: + Improve explanation of tab expansion in doctests
title: [patch] difflib should separate filename from timestamp with tab -> difflib should separate filename from timestamp with tab
2010-03-31 23:38:02techtoniksetmessages: + msg102046
2010-02-10 15:40:40techtoniksetmessages: + msg99167
2010-01-03 15:10:26techtoniksetmessages: + msg97166
2010-01-03 15:01:57pitrousetpriority: normal
nosy: + tim.peters
versions: - Python 2.6, Python 3.1

stage: patch review
2010-01-03 15:00:31pitrousetnosy: + pitrou
messages: + msg97164
2009-12-30 00:57:28techtoniksetmessages: + msg97018
2009-12-29 23:36:14brian.curtinsetmessages: + msg97013
2009-12-29 10:28:51techtoniksetmessages: + msg96989
2009-12-29 04:33:13brian.curtinsetfiles: + example.py

messages: + msg96986
2009-12-29 00:34:03techtoniksetmessages: + msg96980
2009-12-28 17:38:55brian.curtinsetnosy: + brian.curtin
messages: + msg96961
2009-12-27 20:57:36techtonikcreate