Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reindent.py should not convert newlines #54848

Closed
jaraco opened this issue Dec 6, 2010 · 34 comments
Closed

reindent.py should not convert newlines #54848

jaraco opened this issue Dec 6, 2010 · 34 comments
Assignees
Labels
type-bug An unexpected behavior, bug, or error

Comments

@jaraco
Copy link
Member

jaraco commented Dec 6, 2010

BPO 10639
Nosy @jaraco, @orsenthil, @merwok, @bitdancer
Files
  • reindent-autonewline.patch
  • 900df5732f93.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/jaraco'
    closed_at = <Date 2011-07-29.13:38:51.706>
    created_at = <Date 2010-12-06.16:36:51.091>
    labels = ['type-bug']
    title = 'reindent.py should not convert newlines'
    updated_at = <Date 2011-08-02.12:20:52.879>
    user = 'https://github.com/jaraco'

    bugs.python.org fields:

    activity = <Date 2011-08-02.12:20:52.879>
    actor = 'python-dev'
    assignee = 'jaraco'
    closed = True
    closed_date = <Date 2011-07-29.13:38:51.706>
    closer = 'jaraco'
    components = ['Demos and Tools']
    creation = <Date 2010-12-06.16:36:51.091>
    creator = 'jaraco'
    dependencies = []
    files = ['19956', '22766']
    hgrepos = ['45']
    issue_num = 10639
    keywords = ['patch']
    message_count = 34.0
    messages = ['123476', '123477', '123478', '124369', '125818', '126915', '127449', '127481', '127482', '136030', '141092', '141151', '141153', '141154', '141159', '141160', '141163', '141164', '141165', '141171', '141233', '141239', '141261', '141325', '141328', '141361', '141370', '141371', '141373', '141375', '141418', '141547', '141558', '141563']
    nosy_count = 7.0
    nosy_names = ['jaraco', 'scott.dial', 'orsenthil', 'eric.araujo', 'r.david.murray', 'eli.bendersky', 'python-dev']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue10639'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 6, 2010

    When reindent.py runs, it will convert the line endings for each file it converts to the default line ending for the platform on which reindent.py runs. It would be better if reindent.py would retain line endings of the source file. Attached is a patch that addresses this issue.

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 6, 2010

    Merged the patch with the latest trunk.

    @merwok
    Copy link
    Member

    merwok commented Dec 6, 2010

    Looks good.

    @merwok merwok added the type-bug An unexpected behavior, bug, or error label Dec 6, 2010
    @merwok
    Copy link
    Member

    merwok commented Dec 19, 2010

    If nobody objects, I will commit this when py3k is unfrozen.

    @merwok merwok self-assigned this Dec 19, 2010
    @merwok
    Copy link
    Member

    merwok commented Jan 9, 2011

    I did a bit of testing on demo files. The first one I tested with had mixed EOLs, in which case the newlines attribute will be a tuple and open(..., newline=r.newlines) will fail. For reference: http://docs.python.org/dev/library/io#io.TextIOBase.newlines

    What would be the best behavior in such cases? Refuse the temptation to guess and exit?

    @orsenthil
    Copy link
    Member

    Why is reindent.py translating the whole file to platform default newline such a bad thing? Could not it be considered as helpful behavior, especially in the case where the user has mixed EOLs by mistake?

    @merwok
    Copy link
    Member

    merwok commented Jan 29, 2011

    I’m not opposed to reindent normalizing EOLs, given that our VCS hooks will translate them to the right thing (for example, making sure that files specific to the Windows build use CRLF, not LF, even if edited on POSIX).

    @jaraco
    Copy link
    Member Author

    jaraco commented Jan 29, 2011

    That's great that reindent works in your environment, but that's no help in my environment.

    Here's the use case: We have a software package developed in Unix and with Unix LF line endings. The code base may also contain files with CRLF line endings (I can't be certain).

    I'm developing in Windows (which includes reindent.py installed by default). I'd like to use that tool to reindent the code base without munging the line endings. I can't do this without first modifying reindent.py or using another tool to convert the line endings afterword (at which point I don't know if the Windows line endings were created by reindent.py or were there natively).

    If someone wants the line endings normalized on their code base, they should use another tool (or possibly an option to reindent.py), but there's definitely a case for using reindent.py without altering the line endings.

    @merwok
    Copy link
    Member

    merwok commented Jan 29, 2011

    That is a valid use case, so +1 on not changing behavior and adding a new option in 3.3 (in another bug report).

    What is your reply to msg125818?

    @jaraco
    Copy link
    Member Author

    jaraco commented May 15, 2011

    Re msg125818, I agree that the mixed EOLs is also a potential problem, though the proposed solution behaves better than the status quo, where the EOLs get converted without warning.

    It would be desirable for the patch to be more robust in that situation (providing a useful error message about why it fails). What might even be better is to retain newlines even if they're mixed. I'll explore a cross-platform technique to consistently retain newlines as encountered.

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 25, 2011

    I looked into the possibility of retaining newline characters even for files with mixed newlines, but I've decided that's too intrusive an approach. The current implementation specifically takes measures to strip whitespaces from the ends of lines, so it seems outside the scope of this bug to alter that behavior.

    So I've taken another stab at a more robust implementation that does newline detection but raises an error if the file contains mixed newlines.

    Furthermore, it adds an option (--newline) to specify which newline character to use, bypassing the mixed-newline error and allowing the user to override newline detection.

    Here's a demo run:

    PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py
    .\foo.py: mixed newlines detected; cannot continue without --newline
    PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline CRLF .\foo.py
    PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline LF .\foo.py
    PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py
    

    I've published this change as https://bitbucket.org/jaraco/cpython-issue10639/changeset/900df5732f93.

    Please review. If this changeset is acceptable, I will push the revisions to the master repo. Please advise if I may also backport to Python 3.2 and 2.7.

    @merwok
    Copy link
    Member

    merwok commented Jul 26, 2011

    Thanks for your work. The generated patch is for some reason unreadable, so I looked at the changeset on Bitbucket. It looks good, except that a new option would have to be 3.3-only, whereas the bug could be fixed in 2.7 and 3.2 too. If it’s too bothersome for you to split your current patch, then we can fix in 3.3 only.

    @merwok merwok changed the title reindent.py converts newlines to platform default reindent.py should not convert newlines Jul 26, 2011
    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 26, 2011

    No problem. I'll rebase and push the current patch as-is, and then backport to 3.2 and 2.7 without the option (just raising the error when mixed newlines are encountered).

    @merwok
    Copy link
    Member

    merwok commented Jul 26, 2011

    If this changeset is acceptable, I will push the revisions to the
    master repo.

    I'll rebase and push the current patch as-is,

    What repo are you talking about, BTW? The developers list in the devguide does not contain your name, so I assume you’re not talking about hg.python.org/cpython?

    and then backport to 3.2 and 2.7 without the option
    IMO it would be easier to make one bugfix changeset in 3.2, merge it in 3.3, and then add the new option.

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 26, 2011

    I assume you’re not talking about hg.python.org/cpython?

    My name appears as jason.coombs. But your note does remind me that I need to review the devguide, since it's been a while since I've pushed something. I may also look into linking/merging my two accounts.

    I plan to do as you suggest, splitting the patch into the bugfix and the new capability.

    @merwok
    Copy link
    Member

    merwok commented Jul 26, 2011

    Okay, I hadn’t seen you in http://docs.python.org/devguide/developers and I don’t recall the URI of the generated file with all names. I’ve asked that your Roundup profile be updated so that you get the Python icon and the possibility to be assigned bugs.

    If you already have clones, the workflow should not be hard: go to a 3.2 checkout, fix things, run patchcheck and tests, commit. Go to 3.3, pull, merge, test, commit, push. The only tricky things that can make push hooks reject changesets are trailing whitespace, maybe EOLs, and new named branches. We also prefer to push one changeset with all changes, not a series of changesets with gradual fixes.

    @merwok merwok assigned jaraco and unassigned merwok Jul 26, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 26, 2011

    New changeset 070dc6e359fb by Jason R. Coombs in branch '3.2':
    Fixes bpo-10639: reindent.py should not convert newlines
    http://hg.python.org/cpython/rev/070dc6e359fb

    New changeset 826a0208d143 by Jason R. Coombs in branch 'default':
    Merge with 3.2 Issue bpo-10639: reindent.py should not convert newlines.
    http://hg.python.org/cpython/rev/826a0208d143

    New changeset 4feb889d3bed by Jason R. Coombs in branch 'default':
    Issue bpo-10639: reindent.py tool now accepts a --newline option to specify the newline to be used in the output of converted files.
    http://hg.python.org/cpython/rev/4feb889d3bed

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 26, 2011

    I've stripped the undesirable revisions and updated the bitbucket repo so it now contains three changesets for Python 3.2 and 3.3 and suggested.

    I don't believe running the test suite is relevant, as I grepped the test suite for reindent, and there's no reference to the tool script.

    It took several rounds of rebasing and stripping and importing patches to get everything right, but I believe these changes meet the desired goals.

    @merwok
    Copy link
    Member

    merwok commented Jul 26, 2011

    I've stripped the undesirable revisions and updated the bitbucket repo
    so it now contains three changesets for Python 3.2 and 3.3 and
    suggested.

    hg diff combined with hg import can help you flatten a series of changeset into one.

    I don't believe running the test suite is relevant
    You’re right, I mentioned that step because I was talking about the workflow in general, but here it does not apply, we only have manual testing for most tools.

    I believe these changes meet the desired goals.
    Me too! Are you comfortable with how to backport this to 2.7?

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 26, 2011

    What do you think about this: https://bitbucket.org/jaraco/cpython-issue10639/changeset/ea63310dddce

    The patch is a little more intrusive than the Python 3 patch because Python 2.7 doesn't allow specifying the newline to use when writing a file (afaict), but it abstracts that intrusiveness in the NewlineAdapter class.

    I've tested the script on a file with mixed newlines, a file with non-native newlines, and a file with content but no newlines, and it performs as expected.

    @merwok
    Copy link
    Member

    merwok commented Jul 27, 2011

    The patch is a little more intrusive than the Python 3 patch because
    Python 2.7 doesn't allow specifying the newline to use when writing a
    file (afaict)

    Instead of writing a new class, what about using io.open instead of the builtin? Then you’ll be able to use the same patch than 3.2.

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 27, 2011

    Instead of writing a new class, what about using io.open instead of the
    builtin? Then you’ll be able to use the same patch than 3.2.

    Ooh. Excellent suggestion.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 27, 2011

    New changeset f9cf55bbe9b9 by Jason R. Coombs in branch '2.7':
    Fixes bpo-10639: reindent.py should not convert newlines
    http://hg.python.org/cpython/rev/f9cf55bbe9b9

    @jaraco jaraco closed this as completed Jul 27, 2011
    @elibendersky
    Copy link
    Mannequin

    elibendersky mannequin commented Jul 29, 2011

    It appears this breaks "make patchcheck" in trunk:

    ./python ./Tools/scripts/patchcheck.py
    Getting the list of files that have been added/changed ... 5 files
    Fixing whitespace ... Traceback (most recent call last):
      File "./Tools/scripts/patchcheck.py", line 147, in <module>
        main()
      File "./Tools/scripts/patchcheck.py", line 129, in main
        normalize_whitespace(python_files)
      File "./Tools/scripts/patchcheck.py", line 22, in call_fxn
        result = fxn(*args, **kwargs)
      File "./Tools/scripts/patchcheck.py", line 66, in normalize_whitespace
        fixed = [path for path in file_paths if path.endswith('.py') and
      File "./Tools/scripts/patchcheck.py", line 67, in <listcomp>
        reindent.check(path)]
      File "/home/eliben/python-src/33/Tools/scripts/reindent.py", line 129, in check
        newline = spec_newline if spec_newline else r.newlines
    NameError: global name 'spec_newline' is not defined
    make: *** [patchcheck] Error 1

    @elibendersky elibendersky mannequin reopened this Jul 29, 2011
    @elibendersky
    Copy link
    Mannequin

    elibendersky mannequin commented Jul 29, 2011

    This is a good example of why passing parameters into functions by means of globals sucks. In reindent.py, main() sets the spec_newline global which check() uses, but this was forgotten in patchcheck.py which also uses reindent.check()

    IMHO reindent.py should be rewritten to ditch all globals

    @merwok
    Copy link
    Member

    merwok commented Jul 29, 2011

    I agree that using globals instead function arguments sucks, but let’s fix this bug without rewriting the whole file (unless this affects only the code new in 3.3).

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 29, 2011

    The 'spec_newline' in particular is only in the trunk (not in the backports), as it's part of the new --newline option.

    @elibendersky
    Copy link
    Mannequin

    elibendersky mannequin commented Jul 29, 2011

    Jason, one way or another, a prompt fix for trunk is required, since make patchcheck is an important step for committing patches.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 29, 2011

    New changeset 2547f7965733 by Jason R. Coombs in branch 'default':
    Issue bpo-10639: spec_newline wasn't defined globally unless main() was called; now spec_newline is set at module import/execution
    http://hg.python.org/cpython/rev/2547f7965733

    @jaraco
    Copy link
    Member Author

    jaraco commented Jul 29, 2011

    I hadn't realized that the other global variables were defined at the module level or I would have implemented this originally. Please let me know if this patch doesn't correct the issue.

    @jaraco jaraco closed this as completed Jul 29, 2011
    @elibendersky
    Copy link
    Mannequin

    elibendersky mannequin commented Jul 30, 2011

    "make patchcheck" is working again.
    Thanks

    @scottdial
    Copy link
    Mannequin

    scottdial mannequin commented Aug 2, 2011

    I haven't seen anyone use a side-effect-less statement (a string) as a comment before, but I doubt that is an approved style for the CPython codebase. Please change the string preceeding the spec_line definition into a proper comment.

    @bitdancer
    Copy link
    Member

    Well, this is actually blessed by http://www.python.org/dev/peps/pep-0257/, but if that convention were actually followed the docstring would go *after* the assignment. But I agree that it is rarely used, and as far as I know is not used in the stdlib at all.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 2, 2011

    New changeset dc96af0e7f60 by Jason R. Coombs in branch 'default':
    Corrected attribute docstring per pep-257 (reference bpo-10639)
    http://hg.python.org/cpython/rev/dc96af0e7f60

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants