classification
Title: bytes.decode changes/destroys line endings on windows
Type: behavior
Stage: resolved
Components: Unicode, Windows
Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Matthias Naegler, ezio.melotti, paul.moore, steve.dower, steven.daprano, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2020-06-04 14:07 by Matthias Naegler, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name: test.png
Uploaded: Matthias Naegler, 2020-06-04 18:01
Messages (5)
msg370711 - Author: Matthias Naegler (Matthias Naegler) Date: 2020-06-04 14:07
```
# 0x0D, 13 = \r
# 0x0A, 10 = \n

print('test "\\r\\n"')
print('-------------')
b = bytes([0x41, 0x0D, 0x0A])
print("bytes: %s" % b)
print("string: %s" % b.decode('utf8'), end='')
# expected string: "A\r\n"
# result string: "A\r\r\n"

print('test "\\n"')
print('----------')
b = bytes([0x41, 0x0A])
print("bytes: %s" % b)
print("string: %s" % b.decode('utf8'), end='')
# expected string: "A\n"
# result string: "A\r\n"
```

It seems like bytes.decode always replaces "\n" with "\r\n".

Tested on Windows 10 with Python:
- 3.6.0
- 3.7.7
- 3.8.3-rc1
- 3.8.3
- 2.7.18 (works as expected)
msg370714 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-06-04 15:35
You don't need `b = bytes([0x41, 0x0D, 0x0A])`; this will work just as well:

    b = b'\x41\x0D\x0A'

and this is even better:

    b = b'A\r\n'


> It seems like bytes.decode always replaces "\n" with "\r\n".

What makes you say that? You are passing the output through print, which does things with newlines and carriage returns. Try this, for example:

    py> print('ABCD\rZ\n', end='')
    ZBCD


The best way to see what your string actually contains is to print the repr:

    print(repr(b.decode('utf-8')))

which I'm pretty sure will print A\r\n as you expect. But I don't have a Windows box to test it on.
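
As a minimal sketch of the repr check suggested above (the variable name `code_points` is only illustrative and is not from the thread), the decoded string can be inspected without letting the terminal interpret the control characters:

```python
b = b"A\r\n"
s = b.decode("utf-8")

# repr() escapes control characters instead of sending them to the terminal
print(repr(s))        # 'A\r\n'

# Code points of each character, independent of how the terminal renders them
code_points = [hex(ord(c)) for c in s]
print(code_points)    # ['0x41', '0xd', '0xa']

# Round trip: encoding the decoded string gives back the original bytes
assert s.encode("utf-8") == b
```

If the round trip holds, decode has not added or removed anything, and any extra \r must come from a later output step.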
msg370719 - Author: Matthias Naegler (Matthias Naegler) Date: 2020-06-04 18:01
Thanks Steven for your fast response.

> The best way to see what your string actually contains is to print the repr:
You are right. bytes.decode is correct.
I'm not a Python expert, so thanks for the note about "repr". With repr(...) everything looks fine.

Nevertheless, I get an additional \r in my output. I'm not sure whether it is a problem with Python, Windows, or just me.

I get the following output with the Python interpreter:

Python 3.8.3rc1 (tags/v3.8.3rc1:802eb67, Apr 29 2020, 21:39:14) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'utf-8'
>>> b =  b'A\r\nA'
>>> s = b.decode('utf-8')
>>> print(b)
b'A\r\nA'
>>> print(repr(s))
'A\r\nA'
>>> print(s)
A
A
>>> sys.stdout.write(s)
A
A4
>>> with open("./test.txt", "a") as myfile:
...     myfile.write(s)

This all looks right, but the file doesn't (see attached screenshot).

I also get an additional \r in the output file if I run the script through "python test.py > piped.txt".
msg370720 - Author: Matthias Naegler (Matthias Naegler) Date: 2020-06-04 18:10
I forgot something important. Using open with 'ab' works.

>>> # ...above code...
>>> with open("./test_binary.txt", "ab") as myfile:
...     myfile.write(s.encode('utf-8'))
msg370721 - Author: Paul Moore (paul.moore) * (Python committer) Date: 2020-06-04 18:18
That's because when you open a file in text mode (without "b" in the mode), Python writes each \n (newline) character as \r\n (carriage return, line feed), which is the Windows text-file representation of a newline. Your string already contains \r\n, so the \n in it is expanded again and the file ends up with \r\r\n.

From the documentation of the built-in open() function:

"When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep."
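
The translation quoted above can be reproduced on any platform with a small sketch. Here `written_bytes` is a hypothetical helper, not something from the thread, and passing newline="\r\n" explicitly is an assumption used to stand in for what newline=None does on Windows, where os.linesep is "\r\n":

```python
import os
import tempfile


def written_bytes(text, newline):
    """Write text in text mode with the given newline setting
    and return the raw bytes that ended up on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


s = b"A\r\nA".decode("utf-8")

# newline="\r\n": every "\n" written becomes "\r\n" (the Windows default behaviour)
translated = written_bytes(s, "\r\n")

# newline="": no translation, the string is written as-is
untranslated = written_bytes(s, "")

print(translated)    # b'A\r\r\nA'
print(untranslated)  # b'A\r\nA'
```

So a string that already holds \r\n can be written unchanged either in binary mode, as in the earlier message, or in text mode with newline="" passed to open().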
History
| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:59:32 | admin | set | github: 85040 |
| 2020-06-04 18:29:46 | eryksun | set | status: open -> closed; stage: resolved |
| 2020-06-04 18:18:23 | paul.moore | set | resolution: not a bug; messages: + msg370721 |
| 2020-06-04 18:10:38 | Matthias Naegler | set | messages: + msg370720 |
| 2020-06-04 18:01:00 | Matthias Naegler | set | files: + test.png; messages: + msg370719 |
| 2020-06-04 16:14:48 | vstinner | set | nosy: - vstinner |
| 2020-06-04 15:35:05 | steven.daprano | set | nosy: + steven.daprano; messages: + msg370714 |
| 2020-06-04 14:07:03 | Matthias Naegler | create | |