
classification
Title: meaningful whitespace can be lost in rfc822_escape
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To: tarek
Nosy List: christian.heimes, hodgestar, stephenemslie, tarek
Priority: low
Keywords: easy

Created on 2008-01-24 15:43 by stephenemslie, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
distutils_metadata_whitespace.diff stephenemslie, 2008-01-28 11:09
Messages (7)
msg61633 - (view) Author: Stephen Emslie (stephenemslie) Date: 2008-01-24 15:43
distutils.util.rfc822_escape strips each line of its whitespace before
indenting, but this can mean losing meaningful whitespace, such as in
reStructuredText.


distutils uses rfc822_escape to escape fields in metadata, such as
PKG-INFO. This unfortunately means that you can't use reStructuredText
formatting in your long description (suggested in PEP 345), or are
limited to a subset that doesn't require indentation (no block quotes, etc.).

For example:

>>> import distutils.util
>>> rest = """
... a literal python block::
...     >>> import this
... """
>>> print distutils.util.rfc822_escape(rest)

       a literal python block::
       >>> import this

I would be expecting this to look something like:

       a literal python block::
           >>> import this


It looks like this behavior was intentionally added in rev 20099, but
that was about 7 years ago - before reStructuredText and eggs. I
wonder if it makes sense to re-think that implementation with this
sort of metadata in mind, assuming this behavior isn't required to be
rfc822 compliant. I think it would certainly be a shame to miss out on
a good thing like proper (renderable) reST in our metadata.
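
The difference between the output shown above and the expected output can be sketched as follows. This is only an illustration under assumptions, not the distutils source: `escape_stripping` and `escape_preserving` are hypothetical names, the 8-space continuation indent is assumed from the discussion in this issue, and the code is written for Python 3.

```python
INDENT = 8 * " "  # assumed continuation indent for folded header lines


def escape_stripping(header):
    """Strip every line before indenting (the behaviour reported here)."""
    lines = [line.strip() for line in header.split("\n")]
    return ("\n" + INDENT).join(lines)


def escape_preserving(header):
    """Indent continuation lines but keep each line's own whitespace."""
    lines = header.split("\n")
    return ("\n" + INDENT).join(lines)


rest = "a literal python block::\n    >>> import this"

# The reST indentation of the second line is lost in the first result
# and kept (on top of the continuation indent) in the second.
print(escape_stripping(rest))
print(escape_preserving(rest))
```

With the preserving variant the literal block stays indented relative to the line that introduces it, so the reST remains renderable once the continuation indent is removed again.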

Is distutils being over-cautious in flattening out all whitespace? A
W3C discussion on multiple lines in rfc822 [1] seems to suggest that
whitespace can be 'unfolded' safely, so it seems a shame to be
throwing it away when it can have important meaning.

[1] http://www.w3.org/Protocols/rfc822/3_Lexical.html
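
As a rough check of the unfolding argument, the sketch below parses a folded header with the standard `email` package from Python 3 (used here as a stand-in; it is not the reader discussed in this issue). The `Name: example` header, the sample field content and the 8-space fold indent are assumptions made for illustration only.

```python
import email

FOLD = 8 * " "  # assumed continuation indent used when the field was written

# A PKG-INFO-style fragment: the continuation line carries the 8-space
# fold indent plus 4 spaces of reST indentation.
raw = (
    "Name: example\n"
    "Description: a literal python block::\n"
    "            >>> import this\n"
    "\n"
)

msg = email.message_from_string(raw)
value = msg["Description"]  # the continuation line keeps its whitespace

# Remove only the fold indent from continuation lines.
lines = value.split("\n")
unfolded = "\n".join(
    [lines[0]]
    + [line[len(FOLD):] if line.startswith(FOLD) else line for line in lines[1:]]
)
print(unfolded)
```

Under these assumptions the leading whitespace of the continuation line survives parsing, so the original indentation can be recovered by dropping the fold indent.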
msg61653 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-24 20:37
Can you provide a patch with doc updates and a unit test?
msg61775 - (view) Author: Stephen Emslie (stephenemslie) Date: 2008-01-28 11:09
Here's a patch that keeps the whitespace intact, along with a simple test. This
doesn't patch the docs, as the existing documentation_ already describes the
long string as multiple lines of "plain text in reStructuredText
format", which is what this fixes.

.. _documentation: http://docs.python.org/dev/distutils/setupscript.html#additional-meta-data
msg72104 - (view) Author: Simon Cross (hodgestar) Date: 2008-08-28 18:38
I've just checked: the patch still applies cleanly to 2.6 and the
tests still pass. It looks like the patch has already been applied
to 3.0, but without the test. The test part of the patch applies
cleanly to 3.0 too.
msg72419 - (view) Author: Simon Cross (hodgestar) Date: 2008-09-03 21:22
Poking the issue.
msg95979 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-12-05 02:22
Note that we are also losing something else that can mean a lot
in reST: empty lines. They also need to be escaped.

But we can't do it properly unless we encode empty lines with something
other than an 8-space line, because rfc822.Message removes such a line
when it reads the field.
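
To illustrate why empty lines need some encoding at all, here is a small sketch using the `email` package from Python 3. It does not reproduce the rfc822.Message behaviour described above (that module is not available in Python 3); it only shows what happens to a completely unescaped empty line inside a field. The sample field content is made up for the example.

```python
import email

# An unescaped empty line inside the field: the parser treats it as the
# end of the header block, so the rest of the text becomes the body.
raw = (
    "Description: first paragraph\n"
    "\n"
    "        second paragraph\n"
)

msg = email.message_from_string(raw)
description = msg["Description"]
body = msg.get_payload()

print(repr(description))
print(repr(body))
```

The second paragraph is no longer part of the Description field, which is why an empty line has to be written out as something other than a truly empty line.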
msg96022 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-12-06 09:31
I will treat the empty line problem in another issue because I won't
apply it in 2.6/3.1.

This one is fixed in r76684, r76685, r76686, r76687.

Thanks!
History
Date User Action Args
2022-04-11 14:56:30 admin set github: 46217
2009-12-06 09:31:42 tarek set status: open -> closed; messages: + msg96022; versions: + Python 2.6, Python 3.1, Python 2.7, Python 3.2, - Python 2.5
2009-12-05 02:22:10 tarek set messages: + msg95979
2009-12-04 13:26:24 pitrou set assignee: tarek; nosy: + tarek
2008-09-03 21:22:59 hodgestar set messages: + msg72419
2008-08-28 18:38:20 hodgestar set nosy: + hodgestar; messages: + msg72104
2008-01-28 11:09:38 stephenemslie set files: + distutils_metadata_whitespace.diff; messages: + msg61775
2008-01-24 20:37:59 christian.heimes set priority: low; keywords: + easy; messages: + msg61653; nosy: + christian.heimes
2008-01-24 15:43:59 stephenemslie create