Message91397
Unicode defines two additional line-ending characters: Line Separator
(U+2028) and Paragraph Separator (U+2029). The readlines method of the
file object returned by the built-in open does not treat these characters
as line ends, although the object returned by
codecs.open(..., encoding='utf-8') does.
The attached program creates a UTF-8 file containing three lines with
the second line ended with a Paragraph Separator. The program then reads
this file back in as a text file. Only two lines are seen when reading
the file back in.
The desired behaviour is for the file to be read in as three lines.
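
The attachment itself is not reproduced here, so the following is only a minimal sketch of the behaviour described above, assuming Python 3. The sample text, the temporary file and the helper name read_both_ways are hypothetical and are not taken from the attached program.

```python
import codecs
import os
import tempfile
import warnings

# Three logical lines; the second ends with Paragraph Separator (U+2029).
# The sample text is an assumption made for this sketch.
TEXT = "First line\nSecond line\u2029Third line\n"


def read_both_ways(text=TEXT):
    """Write text as UTF-8, then read it back with open and codecs.open."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # newline="\n" keeps the written bytes identical on every platform.
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write(text)

        # Built-in open: only \n, \r and \r\n count as line ends.
        with open(path, encoding="utf-8") as f:
            builtin_lines = f.readlines()

        with warnings.catch_warnings():
            # Newer Python releases may flag codecs.open as deprecated.
            warnings.simplefilter("ignore", DeprecationWarning)
            # codecs.open: the reader also splits at U+2028 and U+2029.
            with codecs.open(path, encoding="utf-8") as f:
                codecs_lines = f.readlines()
    finally:
        os.remove(path)
    return builtin_lines, codecs_lines


if __name__ == "__main__":
    builtin_lines, codecs_lines = read_both_ways()
    print(len(builtin_lines))  # 2
    print(len(codecs_lines))   # 3
```

With the built-in open, the second and third lines come back joined as a single string that still contains U+2029; with codecs.open they are returned separately.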

Date | User | Action | Args
2009-08-07 09:14:15 | nyamatongwe | set | recipients: + nyamatongwe
2009-08-07 09:14:15 | nyamatongwe | set | messageid: <1249636455.06.0.336549280075.issue6664@psf.upfronthosting.co.za>
2009-08-07 09:14:13 | nyamatongwe | link | issue6664 messages
2009-08-07 09:14:12 | nyamatongwe | create |