Message 26980 - Python tracker


Author	nherring
Date	2005-12-04.10:53:10

Content

The Header class is documented as prepending the value of the 
continuation_ws parameter to continuation lines. A side effect is that 
pre-existing folding whitespace (FWS) in a header can be converted, a 
problem that mailman 2.1.6 exposes.

If the Header class is passed pre-existing header lines (in the mailman 
case, taken from the original message) and is then asked for the encoded 
version without any further manipulation, it replaces the original 
folding whitespace with the continuation_ws characters.

Given that RFC 2822 unfolding consists only of removing CRLFs, 
replacing the FWS characters changes the logical content of a 
header value. Standard folding of us-ascii text should consist only of 
introducing line breaks in front of FWS already present in the header 
line. When the encoding of the source string requires multiple 
adjacent RFC 2047 encoded-words (because of disparate encodings 
or text length), FWS can be freely inserted between them and is treated 
as an artifact of the encoding. It is ignored on reading, so it does not 
affect the logical content of the header value. Only in this latter case 
should the continuation_ws parameter be used.
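
To illustrate why swapping the folding whitespace matters, here is a minimal sketch of unfolding as described above. The unfold helper is hypothetical (it is not part of the email package), and it accepts a bare LF as well as CRLF for convenience.

```python
import re

def unfold(value):
    """Remove each line break that is immediately followed by a space or tab.

    Hypothetical helper for illustration; the whitespace itself is kept.
    """
    return re.sub(r'\r?\n(?=[ \t])', '', value)

# Header value folded in front of an existing space.
original = ("Use of tabs when folding header lines -- increasing subject\n"
            " length as a test")
# The same value after the folding space has been swapped for a tab.
refolded = original.replace("\n ", "\n\t")

print(repr(unfold(original)))
print(repr(unfold(refolded)))
```

The two unfolded values differ by one character: the first keeps the space before "length", the second carries a tab in its place.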

For example, the following script reproduces the problem:

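#!/usr/bin/python -d
# Python 2 reproduction: the value is already folded in front of " length".
from email.Header import Header
s = "Thread-Topic: Use of tabs when folding header lines -- increasing subject\n length as a test\n"
# Positional arguments: s, charset, maxlinelen, header_name, continuation_ws.
print Header(s, 'us-ascii', None, None, '\t')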

This script replaces the space in front of the word "length" 
with a tab. It should retain that space rather than convert it to the 
continuation_ws character.
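
As a rough sketch of the requested behavior for plain us-ascii text, folding could insert line breaks only in front of whitespace that is already present. The function name fold_preserving_fws and the maxlen default are assumptions made for illustration; this is not the email package's implementation, and it does not handle encoded-words or every edge case.

```python
def fold_preserving_fws(value, maxlen=76):
    """Fold by inserting a line break in front of existing spaces or tabs.

    Hypothetical sketch: no whitespace is added, removed, or replaced.
    """
    lines = []
    current = value
    while len(current) > maxlen:
        # Last space or tab that keeps this line within maxlen (never index 0).
        cut = max(current.rfind(' ', 1, maxlen + 1),
                  current.rfind('\t', 1, maxlen + 1))
        if cut <= 0:
            break  # nowhere to fold without altering the content
        lines.append(current[:cut])
        current = current[cut:]  # the original whitespace now leads the next line
    lines.append(current)
    return '\n'.join(lines)

header = ("Thread-Topic: Use of tabs when folding header lines -- "
          "increasing subject length as a test")
folded = fold_preserving_fws(header, maxlen=60)
print(folded)
```

Because every break sits directly in front of original whitespace, removing the line breaks from the folded text gives back the original header value unchanged.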

History
Date	User	Action	Args
2007-08-23 14:36:33	admin	link	issue1372770 messages
2007-08-23 14:36:33	admin	create