Message91999
Ryan McGuire wrote:
>
> New submission from Ryan McGuire <python.org@enigmacurry.com>:
>
> Opening a UTF-8 encoded file with unix newlines ("\n") on Win32:
>
> codecs.open("whatever.txt","r","utf-8").read()
>
> replaces the newlines ("\n") with CR+LF ("\r\n").
>
> The docs specifically say that:
>
> "Files are always opened in binary mode, even if no binary mode was
> specified. This is done to avoid data loss due to encodings using 8-bit
> values. This means that no automatic conversion of '\n' is done on
> reading and writing."
>
> And yet, opening the file with an explicit binary mode resolves the
> situation:
>
> codecs.open("whatever.txt","rb","utf-8").read()
>
> This reads the file with the original newlines unmodified.
>
> The implementation of codecs.open and the documentation are out of sync.

The implementation looks like this:

    if encoding is not None and \
       'b' not in mode:
        # Force opening of the file in binary mode
        mode = mode + 'b'

in both Python 2 and 3, so I'm not sure what could be causing
this.
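
One way to narrow this down might be a small round-trip check like the sketch below. It writes Unix newlines as raw bytes and reads them back through codecs.open with both "r" and "rb". The names force_binary, read_with_codecs and newline_roundtrip are hypothetical helpers made up for this illustration; force_binary only mirrors the mode check quoted above and is not part of the codecs module.

```python
import codecs
import os
import tempfile
import warnings


def force_binary(mode, encoding):
    """Hypothetical helper mirroring the mode check quoted above."""
    if encoding is not None and 'b' not in mode:
        # Same rule as the quoted snippet: append 'b' when an encoding is given
        mode = mode + 'b'
    return mode


def read_with_codecs(path, mode):
    """Read the whole file through codecs.open using the given mode."""
    with warnings.catch_warnings():
        # Newer Python versions may warn that codecs.open is deprecated.
        warnings.simplefilter("ignore", DeprecationWarning)
        with codecs.open(path, mode, "utf-8") as f:
            return f.read()


def newline_roundtrip():
    """Write Unix newlines as raw bytes, then read them back both ways."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"line one\nline two\n")
        return read_with_codecs(path, "r"), read_with_codecs(path, "rb")
    finally:
        os.remove(path)


if __name__ == "__main__":
    text_mode, binary_mode = newline_roundtrip()
    print(repr(text_mode))
    print(repr(binary_mode))
    print("identical:", text_mode == binary_mode)
```

If the two results differ on the affected Win32 setup, the "b" flag is presumably not reaching the underlying open() call there. If they are identical, the conversion is more likely happening somewhere outside codecs.open.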

Date                | User    | Action | Args
2009-08-27 09:36:45 | lemburg | set    | recipients: + lemburg, georg.brandl, EnigmaCurry
2009-08-27 09:36:43 | lemburg | link   | issue6788 messages
2009-08-27 09:36:43 | lemburg | create |