Issue2142
Created on 2008-02-18 20:16 by trentm, last changed 2009-05-26 16:50 by trentm.
|
msg62543 |
Author: Trent Mick (trentm) |
Date: 2008-02-18 20:16 |
|
When comparing content with difflib, if the resulting diff covers the
last line of one or both of the inputs and that line doesn't end with
an end-of-line character(s), then the generated diff lines don't include
an EOL. Fair enough.
Naive (and I suspect typical) usage of difflib.unified_diff(...) is:
diff = ''.join(difflib.unified_diff(...))
This results in an *incorrect* unified diff for the conditions described
above.
>>> from difflib import *
>>> gen = unified_diff("one\ntwo\nthree".splitlines(1),
... "one\ntwo\ntrois".splitlines(1))
>>> print ''.join(gen)
---
+++
@@ -1,3 +1,3 @@
 one
 two
-three+trois
The proper behaviour would be:
>>> gen = unified_diff("one\ntwo\nthree".splitlines(1),
... "one\ntwo\ntrois".splitlines(1))
>>> print ''.join(gen)
---
+++
@@ -1,3 +1,3 @@
 one
 two
-three
\ No newline at end of file
+trois
\ No newline at end of file
I *believe* that "\ No newline at end of file" lines are the appropriate
markers, ones that tools like "patch" will know how to use. At least this
is what "svn diff" generates.
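Until difflib emits such markers itself, the joined output can be post-processed. The following is only a sketch, written for Python 3 (the transcripts above are Python 2). The helper name `diff_with_eol_markers` is hypothetical and not part of difflib, and the sketch assumes the diff was generated with the default lineterm, so any diff line lacking a trailing newline must come from an input line without one.

```python
import difflib

NO_EOL_MARKER = "\\ No newline at end of file\n"


def diff_with_eol_markers(diff_lines):
    """Yield diff lines, terminating any line that lacks an EOL.

    Hypothetical helper, not part of difflib. Assumes the diff was
    produced with the default lineterm (a newline).
    """
    for line in diff_lines:
        if line.endswith("\n"):
            yield line
        else:
            # This line came from an input whose last line had no EOL:
            # terminate it and follow it with the marker line.
            yield line + "\n"
            yield NO_EOL_MARKER


old = "one\ntwo\nthree".splitlines(True)
new = "one\ntwo\ntrois".splitlines(True)
patch_text = "".join(diff_with_eol_markers(difflib.unified_diff(old, new)))
print(patch_text, end="")
```

Because the wrapper only touches lines that lack a trailing newline, it should pass any already well-formed diff through unchanged.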
I'll try to whip up a patch.
Do others concur that this should be fixed?
|
|
msg62544 |
Author: Trent Mick (trentm) |
Date: 2008-02-18 20:24 |
|
Attached is a patch against the Python 2.6 svn trunk for this.
|
|
msg62545 |
Author: Trent Mick (trentm) |
Date: 2008-02-18 20:25 |
|
At a glance I suspect this patch will work back to Python 2.3 (when
difflib.unified_diff() was added). I haven't looked at the Py3k tree yet.
Note: This *may* also apply to difflib.context_diff(), but I am not sure.
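One way to settle that is to look for emitted diff lines that lack a trailing newline. This is a rough check, written for Python 3; `lines_missing_eol` is a hypothetical helper, not a difflib function.

```python
import difflib


def lines_missing_eol(diff_lines):
    """Return the diff lines that do not end with a newline (hypothetical helper)."""
    return [line for line in diff_lines if not line.endswith("\n")]


old = "one\ntwo\nthree".splitlines(True)
new = "one\ntwo\ntrois".splitlines(True)

# A non-empty result means ''.join(...) would glue that line onto the next one.
print(lines_missing_eol(difflib.context_diff(old, new)))
print(lines_missing_eol(difflib.unified_diff(old, new)))
```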
|
|
msg88375 |
Author: Trent Mick (trentm) |
Date: 2009-05-26 16:50 |
|
Here is a new patch that also fixes the same issue in
difflib.context_diff() and adds a couple test cases.
|
|
| Date | User | Action | Args |
| 2009-05-26 16:50:14 | trentm | set | files: + python_difflib_no_eol.patch; title: naive use of ''.join(difflib.unified_diff(...)) results in bogus diffs with inputs that don't end with end-of-line char -> naive use of ''.join(difflib.unified_diff(...)) results in bogus diffs with inputs that don't end with end-of-line char (same with context_diff); messages: + msg88375; stage: test needed -> patch review |
| 2009-05-12 14:09:32 | ajaksu2 | set | stage: test needed; versions: + Python 3.1, - Python 2.5, Python 2.4, Python 2.3 |
| 2008-02-19 09:09:53 | christian.heimes | set | priority: normal; keywords: + patch |
| 2008-02-18 20:25:22 | trentm | set | messages: + msg62545 |
| 2008-02-18 20:24:08 | trentm | set | files: + python_difflib_unified_diff.patch; messages: + msg62544; versions: + Python 2.6, Python 2.4, Python 2.3 |
| 2008-02-18 20:16:55 | trentm | create | |