classification
Title: reindent.py inserts spaces in multiline literals
Type: behavior Stage:
Components: Demos and Tools Versions: Python 3.4, Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: 13447 Superseder:
Assigned To: Nosy List: Caio.Romao, Dima.Tisnek, Jonathan.Rogers, eric.araujo, roger.serwy, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2011-09-07 14:29 by Dima.Tisnek, last changed 2012-10-01 00:57 by roger.serwy.

Files
File name Uploaded Description Edit
caioromao-fix-12930-v4.patch Caio.Romao, 2011-09-19 23:03 review
bug12930-testfiles.tgz Caio.Romao, 2011-09-19 23:17 test
testfile-original.py Caio.Romao, 2011-10-23 01:26
testfile-expected.py Caio.Romao, 2011-10-23 01:26
testfile-issue.py Caio.Romao, 2011-10-23 01:26
tab_test.py roger.serwy, 2011-12-22 01:40
save_strings.patch Jonathan.Rogers, 2012-01-01 06:07 Patch which replaces tab characters in strings with '\t'
issue12930.patch roger.serwy, 2012-09-30 22:19 review
test_reindent.py roger.serwy, 2012-09-30 22:21
Messages (21)
msg143679 - (view) Author: Dima Tisnek (Dima.Tisnek) Date: 2011-09-07 14:29
Given this as input:
#!/usr/bin/python
def x():
  s = """line one
line two
line three"""
  return s

reindent.py changes it to:
#!/usr/bin/python
def x():
    s = """line one
  line two
  line three"""
    return s


Which means that I cannot use reindent.py on any source that includes multiline literals that are not docs.

Btw, it is generally weird that reindented file ends up with 2 spaces before "line two".
msg143715 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-09-08 00:23
This patch fixes the reported issue.

First time contributor here, feel free to bash.
msg143718 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-09-08 01:57
New patch, fixing issue pointed out by gutworth and some others that came up.

I've used the following as a test input:

-----8<----------8<--------
#!/usr/bin/python
def x():
  """
  This is a doc
  """
  '''
  Another doc.'''
  s = """line one
line two
      line three
      '''
line five
"""
  var = '''test'''
  """Third doc"""
  return s
-----8<----------8<--------

The patch got way bigger than the initial version and feels a little hackish, but I couldn't come up with something better.
msg143999 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-09-14 00:36
Third version, with slightly less code and my name added to the Misc/ACKS file.
msg144291 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-09-19 15:44
Follow the “review” link for some comments.

Do you have a test file that I could use to reproduce the bug and make sure it’s fixed?
msg144313 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-09-19 23:03
New patch version ack-ing Éric's suggestion.

Note: I'm now confused as to whether I should add my name to the ACKS file or not, heh. This patch doesn't include the change.
msg144314 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-09-19 23:17
Attaching files for testing in a gzipped tarball:

 - testfile-original.py: file to be reindented with reindent.py
 - testfile-issue.py: resulting file after using the current Tools/scripts/reindent.py
 - testfile-expected.py: expected output file; the result after applying the patch in this thread
msg144321 - (view) Author: Dima Tisnek (Dima.Tisnek) Date: 2011-09-20 07:51
Thanks Caio, your test case covers my issue; seeing these spelt out got me thinking, there are perhaps 3~4 different cases:

def f0():
  s = """select
some sql
from
somewhere;
-- cannot be reindented"""

def f1():
  """ Multiline 
  text docstring
  should be reindented """

def f2():
  """ should example be reindented inside docstring?
  if f2():
    print "great"
  """

def f3():
  """ # should doctest be reindented inside docstring?
  >>> if f3():
  ...   print "yes"
  ... else:
  ...   print "no"
  ...
  no
  """
msg146112 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-10-21 18:42
It's been a while since this got any activity. Was the provided testfile not enough or any issue found? Just let me know and I'll make adjustments asap.
msg146169 - (view) Author: Dima Tisnek (Dima.Tisnek) Date: 2011-10-22 12:05
good enough for me :)
msg146209 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-23 00:56
I want to look at this but lack time right now.  Could someone make one up-to-date patch with all changes, code and tests?  It will be much easier to review and test than an archive.
msg146212 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-23 01:09
Forget that, there are no automated tests for tools.  So, a text version of the files would be nice.
msg146213 - (view) Author: Caio Romão (Caio.Romao) Date: 2011-10-23 01:26
Attaching files from tarball as requested. See http://bugs.python.org/issue12930#msg144314 for explanation
msg148063 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-21 17:10
Thanks, I’ll make time to review the change this week.
msg150064 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2011-12-22 01:40
I applied the patch and it failed against the attached "tab_test.py" file. For reference, every '\t' in the file is followed by "Tab".
msg150425 - (view) Author: Jonathan Rogers (Jonathan.Rogers) Date: 2012-01-01 04:09
I don't think reindent.py should change any bytes inside string literals since it can't know anything about what those strings mean or how they'll be used by the program at run time. Unfortunately, it starts out by unconditionally calling the .expandtabs() method on each input line, so tab characters are lost. The only change to a string literal I can imagine that would be safe is to replace tab characters with '\t'.

I am trying to use reindent.py on Python source files which include triple-quoted, multi-line string literals containing makefile and Python snippets. In both cases, running reindent.py changes the meaning of of that contained in the literal.
msg150426 - (view) Author: Jonathan Rogers (Jonathan.Rogers) Date: 2012-01-01 06:07
Rather than expanding tab characters inside string literals, it's safer to replace them with '\t'.
msg171678 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-09-30 22:19
This issue also affects reindent.py in Python 3.x.

The attached patch adds a "-s" switch to reindent.py to disallow changes to the contents of strings.

The rstrip_and_expand_tabs function tokenizes the input to identify all strings and then expands and rstrips all parts of the input not within a string.
msg171679 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-09-30 22:21
Attached is a simple test of the "-s" functionality for reindent.py. It works on Linux but I'm not sure about Windows.
msg171682 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-30 23:07
Definitely a bugfix should not add any new switches. Patch would simply lead to preserving of all string literals and nothing more. I will look at Caio's patches a little later.

Now Python contains automated tests for some tools.
msg171685 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-10-01 00:57
>
> Definitely a bugfix should not add any new switches. Patch would simply lead to preserving of all string literals and nothing more.
I added the switch so that the existing behavior of reindent would stay 
the same in case some other scripts rely on that behavior. If you'd 
rather not include the switch, then I'm ok with removing it.

The patch causes reindent to no longer removes trailing whitespace on 
multi-line doc strings. Is that side-effect acceptable?
History
Date User Action Args
2012-10-01 00:57:27roger.serwysetmessages: + msg171685
2012-09-30 23:07:32serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg171682
2012-09-30 22:21:26roger.serwysetfiles: + test_reindent.py

messages: + msg171679
2012-09-30 22:19:55roger.serwysetfiles: + issue12930.patch

messages: + msg171678
versions: + Python 3.2, Python 3.3, Python 3.4
2012-01-01 06:07:50Jonathan.Rogerssetfiles: + save_strings.patch

messages: + msg150426
2012-01-01 04:09:38Jonathan.Rogerssetnosy: + Jonathan.Rogers
messages: + msg150425
2011-12-22 01:40:06roger.serwysetfiles: + tab_test.py
nosy: + roger.serwy
messages: + msg150064

2011-11-21 17:10:55eric.araujosetdependencies: + Add tests for some scripts in Tools/scripts
messages: + msg148063
2011-10-23 01:26:38Caio.Romaosetfiles: + testfile-issue.py
2011-10-23 01:26:25Caio.Romaosetfiles: + testfile-expected.py
2011-10-23 01:26:11Caio.Romaosetfiles: + testfile-original.py

messages: + msg146213
2011-10-23 01:09:49eric.araujosetmessages: + msg146212
2011-10-23 00:56:34eric.araujosetmessages: + msg146209
2011-10-22 12:05:33Dima.Tisneksetmessages: + msg146169
2011-10-21 18:42:31Caio.Romaosetmessages: + msg146112
2011-09-20 07:51:05Dima.Tisneksetmessages: + msg144321
2011-09-19 23:17:15Caio.Romaosetfiles: + bug12930-testfiles.tgz

messages: + msg144314
2011-09-19 23:03:56Caio.Romaosetfiles: + caioromao-fix-12930-v4.patch

messages: + msg144313
2011-09-19 23:02:02Caio.Romaosetfiles: - caioromao-fix-12930-v3.patch
2011-09-19 15:44:15eric.araujosetnosy: + eric.araujo
messages: + msg144291
2011-09-14 00:36:41Caio.Romaosetfiles: + caioromao-fix-12930-v3.patch

messages: + msg143999
2011-09-14 00:34:38Caio.Romaosetfiles: - caioromao-fix-12930-v2.patch
2011-09-08 01:57:33Caio.Romaosetfiles: + caioromao-fix-12930-v2.patch

messages: + msg143718
2011-09-08 00:45:00Caio.Romaosetfiles: - caioromao-fix-12930.patch
2011-09-08 00:23:31Caio.Romaosetfiles: + caioromao-fix-12930.patch

nosy: + Caio.Romao
messages: + msg143715

keywords: + patch
2011-09-07 14:29:37Dima.Tisnekcreate