classification
Title: 2to3 does not preserve line endings
Type: behavior Stage: patch review
Components: 2to3 (2.x to 3.x conversion tool) Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: benjamin.peterson Nosy List: Aaron Ang, benjamin.peterson, bialix, eric.araujo, jason.coombs, lukasz.langa, miss-islington
Priority: normal Keywords: easy, patch

Created on 2011-03-18 16:27 by bialix, last changed 2018-04-17 21:58 by miss-islington.

Files
File name Uploaded Description
2to3-crlf-test.diff eric.araujo, 2011-06-09 15:13
Pull Requests
URL Status Linked
PR 6483 merged Aaron Ang, 2018-04-16 02:43
PR 6515 merged miss-islington, 2018-04-17 21:35
Messages (13)
msg131335 - Author: Alexander Belchenko (bialix) Date: 2011-03-18 16:27
I use LF-only line endings for development of my IntelHex library, and I work on Windows most of the time.

After the 2to3 tool was run on my library, it not only changed the Python syntax but also saved all files with CRLF line endings. As a result, every changed file is completely modified and the diff shows a change in every line.

2to3 should respect my line endings and must not use plain open(xxx, "wt") mode for writing modified files.
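
For illustration, the write-side effect described in this report can be reproduced with plain file I/O, independent of 2to3. This is a minimal sketch, not the tool's code: the helper name `write_text` and the file name are made up for the example, and `newline="\r\n"` is passed explicitly only to mimic on any platform what default text mode does on Windows.

```python
import os
import tempfile


def write_text(path, text, newline):
    """Write text with the given newline mode and return the raw bytes on disk."""
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        f.write(text)
    with open(path, "rb") as f:
        return f.read()


source = "print('hello')\nprint('world')\n"  # LF-only content

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")  # hypothetical file name
    # newline="\r\n" mimics what default text mode does on Windows:
    # every "\n" written is translated to "\r\n".
    translated = write_text(path, source, newline="\r\n")
    # newline="" disables translation, so the LF endings are written unchanged.
    preserved = write_text(path, source, newline="")

print(translated)
print(preserved)
```

With translation enabled, every line of the file changes on disk even though the text passed to `write()` is identical, which matches the "diff shows a change in every line" symptom.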
msg131349 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-18 18:19
Thanks for the report.  Can you run “python -m test.test_lib2to3”, if possible with a Python 3.x version?  I’ve seen that the tests use binary mode to compare file contents, so maybe you will get an error message that can get us started.
msg131626 - Author: Alexander Belchenko (bialix) Date: 2011-03-21 09:38
@Éric Araujo: I've run the tests with Python 3.2. All tests passed:

----------------------------------------------------------------------
Ran 540 tests in 37.688s

OK
msg131639 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-21 12:29
Thanks.  Would you like to work on a unit test or full patch?
msg131645 - Author: Alexander Belchenko (bialix) Date: 2011-03-21 13:16
Éric, thank you for the proposal, but I'm not familiar enough with the codebase to work on it.

A short scan over the tests reveals that there is at least one test that tries to test CRLF behavior, in the file test_refactor.py, but I don't understand what it is doing:

    def test_crlf_newlines(self):
        old_sep = os.linesep
        os.linesep = "\r\n"
        try:
            fn = os.path.join(TEST_DATA_DIR, "crlf.py")
            fixes = refactor.get_fixers_from_package("lib2to3.fixes")
            self.check_file_refactoring(fn, fixes)
        finally:
            os.linesep = old_sep

So, in theory, I could modify that test to check the case where the file originally has LF-only line endings but os.linesep is CRLF. However, I don't know what content I should create or how to run a fixer over it.
msg131652 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-03-21 13:53
I can fix it. I just need to find time. :)
msg137983 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-06-09 15:13
I was surprised to see that the crlf.py file was not using CRLF in the new Mercurial repo.  It is also not listed in the .hgeol file.  I changed it locally, but that doesn’t change anything: the tests pass before and after the change.
msg307865 - Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2017-12-08 19:25
This issue affected me today. I'm editing a codebase that has mixed line endings in different files. I'd like to patch for Python 3 support without changing line endings. Even invoking a single fixer (print), the line endings mutate. Since I'm running on macOS, the files with CRLF get LF line endings.

Answers [in this question](https://stackoverflow.com/questions/39028517/2to3-how-to-keep-newline-characters-from-input-file) suggest the mutation can be suppressed by altering the _to_system_newlines function, but the proposed fix has no effect on the Python 3.6 version of this routine.

I thought I'd encountered this issue before, but I realized after searching that I was thinking of issue10639, which may present a model for retaining the newlines when refactoring the code.

I found I was able to work around the issue by putting lib2to3-clean.py in my current directory:

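import functools
import runpy

import lib2to3.refactor


if __name__ == '__main__':
    # newline='' disables newline translation when files are opened
    lib2to3.refactor._open_with_encoding = functools.partial(open, newline='')
    # run the regular lib2to3 command line with the patched opener
    runpy.run_module('lib2to3')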


And invoking `python -m lib2to3-clean` instead of `-m lib2to3`. The addition of newline='' follows the [guidance in the docs](https://docs.python.org/release/3.2/library/functions.html#open) on how to avoid mutating newlines.

I've released this functionality in [jaraco.develop 4.0](https://pypi.org/project/jaraco.develop) so others can readily invoke it with `rwt jaraco.develop -- -m jaraco.develop.lib2to3`.
msg315313 - Author: Aaron Ang (Aaron Ang) * Date: 2018-04-15 07:01
I couldn't reproduce this issue. I tried reproducing this problem by extending the TestRefactoringTool class and creating two files: one file with LF line-endings and one file with CRLF line-endings.

The changes that I made can be found here: https://github.com/aaronang/cpython/commit/55e8bd317f37923e6e23780e6ae41858493e98d8.

The output of the tests:

Before: b'print("hi")\n\nprint("Like bad Windows newlines?")\n'
After:  b'print("hi")\n\nprint("Like bad Windows newlines?")\n'

Before: b'print("hi")\r\n\r\nprint("Like bad Windows newlines?")\r\n'
After:  b'print("hi")\r\n\r\nprint("Like bad Windows newlines?")\r\n'

Maybe this problem has been resolved?
msg315319 - Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2018-04-15 13:01
I do still see the issue on Python 3.7b3:

$ python ~/Dropbox/bin/scripts/which-line-ending onefile.py
Line ending is '\n'
$ python ~/Dropbox/bin/scripts/which-line-ending otherfile.py
Line ending is '\r\n'
$ python -V
Python 3.7.0b3
$ python -m lib2to3 . -w
RefactoringTool: Skipping optional fixer: buffer
RefactoringTool: Skipping optional fixer: idioms
RefactoringTool: Skipping optional fixer: set_literal
RefactoringTool: Skipping optional fixer: ws_comma
RefactoringTool: Refactored ./onefile.py
--- ./onefile.py        (original)
+++ ./onefile.py        (refactored)
@@ -1 +1 @@
-print 'hello world'
+print('hello world')
RefactoringTool: Refactored ./otherfile.py
--- ./otherfile.py      (original)
+++ ./otherfile.py      (refactored)
@@ -1 +1 @@
-print 'hello world'
+print('hello world')
RefactoringTool: Files that were modified:
RefactoringTool: ./onefile.py
RefactoringTool: ./otherfile.py
$ python ~/Dropbox/bin/scripts/which-line-ending onefile.py
Line ending is '\n'
$ python ~/Dropbox/bin/scripts/which-line-ending otherfile.py
Line ending is '\n'
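
The `which-line-ending` script used in the session above is not shown in this thread. A check of that kind could be sketched as follows; the function name `detect_line_ending` and its return values are assumptions made for this example, not the actual script.

```python
def detect_line_ending(data: bytes):
    """Return '\\r\\n', '\\n', '\\r', 'mixed', or None for raw file contents.

    Hypothetical helper: counts each newline style in the raw bytes so that
    no newline translation can hide the result.
    """
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # bare LF, not part of a CRLF pair
    cr = data.count(b"\r") - crlf  # bare CR, not part of a CRLF pair
    found = [name for name, n in (("\r\n", crlf), ("\n", lf), ("\r", cr)) if n]
    if not found:
        return None
    if len(found) > 1:
        return "mixed"
    return found[0]


for sample in (b"print('hello world')\n", b"print('hello world')\r\n"):
    print("Line ending is %r" % detect_line_ending(sample))
```

Reading the file in binary mode (`open(path, "rb").read()`) before calling the helper matters here, because text mode would already have normalized the endings.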
msg315349 - Author: Aaron Ang (Aaron Ang) * Date: 2018-04-16 02:45
@Jason R. Coombs
You are right. I managed to reproduce the problem with a test. It only occurs when a fix is applied.

Also, I figured out that the refactoring reads the file using `open(file, 'r')`, which transforms all line endings to LF regardless of the line endings used in the file. I think I fixed the problem and look forward to receiving feedback 😬
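
The read-side behavior described here can be shown without lib2to3. The sketch below is an illustration under stated assumptions, not the merged patch: `fake_fixer` and `refactor_file` are hypothetical stand-ins for a fixer and for the tool's read/rewrite cycle. It compares what default text mode returns for a CRLF file with what `newline=''` returns, then rewrites the file with `newline=''` on both sides.

```python
import os
import tempfile


def fake_fixer(text):
    """Hypothetical stand-in for a 2to3 fixer: rewrite one print statement."""
    return text.replace("print 'hello world'", "print('hello world')")


def refactor_file(path, newline):
    """Read, transform and rewrite a file using the given newline mode."""
    with open(path, "r", encoding="utf-8", newline=newline) as f:
        text = f.read()
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        f.write(fake_fixer(text))
    return text  # the text as the "fixer" saw it


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "otherfile.py")
    with open(path, "wb") as f:
        f.write(b"print 'hello world'\r\n")

    # Default mode: universal newlines turn "\r\n" into "\n" while reading.
    with open(path, "r", encoding="utf-8") as f:
        seen_default = f.read()

    # newline="": the text keeps its "\r\n" and is written back untranslated.
    seen_raw = refactor_file(path, newline="")
    with open(path, "rb") as f:
        result = f.read()

print(repr(seen_default))
print(repr(seen_raw))
print(result)
```

Once the original endings are lost at read time, the writer can only fall back to the platform default, which would explain why CRLF files end up with LF on macOS and LF files end up with CRLF on Windows.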
msg315423 - Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2018-04-17 21:34
New changeset c127a86e1862df88ec6f9d15b79c627fc616766e by Łukasz Langa (Aaron Ang) in branch 'master':
bpo-11594: Ensure line-endings are respected when using 2to3 (GH-6483)
https://github.com/python/cpython/commit/c127a86e1862df88ec6f9d15b79c627fc616766e
msg315426 - Author: miss-islington (miss-islington) Date: 2018-04-17 21:58
New changeset 3b3be1fe10f6c15e57360cac9d9dbc660666e655 by Miss Islington (bot) in branch '3.7':
bpo-11594: Ensure line-endings are respected when using 2to3 (GH-6483)
https://github.com/python/cpython/commit/3b3be1fe10f6c15e57360cac9d9dbc660666e655
History
Date User Action Args
2018-04-17 21:58:42 miss-islington set nosy: + miss-islington
messages: + msg315426
2018-04-17 21:35:27 miss-islington set pull_requests: + pull_request6208
2018-04-17 21:34:21 lukasz.langa set nosy: + lukasz.langa
messages: + msg315423
2018-04-16 02:45:47 Aaron Ang set messages: + msg315349
2018-04-16 02:43:46 Aaron Ang set keywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request6181
2018-04-15 13:01:47 jason.coombs set messages: + msg315319
versions: + Python 3.7, Python 3.8
2018-04-15 07:01:12 Aaron Ang set nosy: + Aaron Ang
messages: + msg315313
2018-03-13 09:13:53 lukasz.langa set keywords: + easy, - patch
2017-12-08 19:25:24 jason.coombs set nosy: + jason.coombs
messages: + msg307865
versions: + Python 3.6, - Python 2.7, Python 3.2, Python 3.3
2011-06-09 15:13:39 eric.araujo set files: + 2to3-crlf-test.diff
keywords: + patch
messages: + msg137983
versions: - Python 3.1
2011-03-21 13:53:32 benjamin.peterson set nosy: bialix, benjamin.peterson, eric.araujo
messages: + msg131652
2011-03-21 13:16:34 bialix set nosy: bialix, benjamin.peterson, eric.araujo
messages: + msg131645
2011-03-21 12:29:29 eric.araujo set nosy: bialix, benjamin.peterson, eric.araujo
title: 2to3 tool does not preserve line-endings -> 2to3 does not preserve line endings
messages: + msg131639
stage: test needed
2011-03-21 09:38:15 bialix set nosy: bialix, benjamin.peterson, eric.araujo
messages: + msg131626
2011-03-18 23:51:15 benjamin.peterson set assignee: benjamin.peterson
nosy: bialix, benjamin.peterson, eric.araujo
2011-03-18 18:19:16 eric.araujo set nosy: + eric.araujo
messages: + msg131349
versions: + Python 3.1, Python 3.3
2011-03-18 17:18:59 r.david.murray set nosy: + benjamin.peterson
2011-03-18 16:27:27 bialix create