Issue34437
Created on 2018-08-19 23:24 by Nathan Benson, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (2) | |||
---|---|---|---|
msg323773 - (view) | Author: Nathan Benson (Nathan Benson) | Date: 2018-08-19 23:24 | |
While writing some shellcode I uncovered an unusual bug where Python 3 seems to print out incorrect (and extra) hex bytes using the print statement with \x. Needless to say I was pulling my hair out trying to figure out why my shellcode wasn’t working. Python 2 behaves as expected.

I haven't tested the latest version of Python 3, but all the versions prior to that seem to have the bug. I’ve been able to reproduce the bug in Ubuntu Linux and on my Mac.

An example printing "\xfd\x84\x04\x08" I expect to get back "fd 84 04 08", but Python 3 seems to add bytes beginning with c2 and c3 and tosses in random bytes.

For the purpose of these demonstrations:

Akame:~ jfa$ python2 --version
Python 2.7.15
Akame:~ jfa$ python3 --version
Python 3.7.0

Here is Python 2 operating as expected:

Akame:~ jfa$ python2 -c 'print("\xfd\x84\x04\x08")' | hexdump -C
00000000 fd 84 04 08 0a |.....|
00000005

Here is Python 3 with the exact same print statement:

Akame:~ jfa$ python3 -c 'print("\xfd\x84\x04\x08")' | hexdump -C
00000000 c3 bd c2 84 04 08 0a |.......|
00000007

There are 6 bytes not 4 and where did the c3, bd, and c2 come from?
Playing around with it a little bit more, it seems like the problem arises when you are printing bytes that start with a-f or 8 or 9.

Here is a-f:

Akame:~ jfa$ for b in {a..f}; do echo "\x${b}0"; python3 -c "print(\"\x${b}0\")" | hexdump -C; done
\xa0
00000000 c2 a0 0a |...|
00000003
\xb0
00000000 c2 b0 0a |...|
00000003
\xc0
00000000 c3 80 0a |...|
00000003
\xd0
00000000 c3 90 0a |...|
00000003
\xe0
00000000 c3 a0 0a |...|
00000003
\xf0
00000000 c3 b0 0a |...|
00000003

Here is 0-9 (notice everything is fine until 8):

Akame:~ jfa$ for b in {0..9}; do echo "\x${b}0"; python3 -c "print(\"\x${b}0\")" | hexdump -C; done
\x00
00000000 00 0a |..|
00000002
\x10
00000000 10 0a |..|
00000002
\x20
00000000 20 0a | .|
00000002
\x30
00000000 30 0a |0.|
00000002
\x40
00000000 40 0a |@.|
00000002
\x50
00000000 50 0a |P.|
00000002
\x60
00000000 60 0a |`.|
00000002
\x70
00000000 70 0a |p.|
00000002
\x80
00000000 c2 80 0a |...|
00000003
\x90
00000000 c2 90 0a |...|
00000003

Here are the same tests with Python 2:

Akame:~ jfa$ for b in {a..f}; do echo "\x${b}0"; python2 -c "print(\"\x${b}0\")" | hexdump -C; done
\xa0
00000000 a0 0a |..|
00000002
\xb0
00000000 b0 0a |..|
00000002
\xc0
00000000 c0 0a |..|
00000002
\xd0
00000000 d0 0a |..|
00000002
\xe0
00000000 e0 0a |..|
00000002
\xf0
00000000 f0 0a |..|
00000002

Akame:~ jfa$ for b in {0..9}; do echo "\x${b}0"; python2 -c "print(\"\x${b}0\")" | hexdump -C; done
\x00
00000000 00 0a |..|
00000002
\x10
00000000 10 0a |..|
00000002
\x20
00000000 20 0a | .|
00000002
\x30
00000000 30 0a |0.|
00000002
\x40
00000000 40 0a |@.|
00000002
\x50
00000000 50 0a |P.|
00000002
\x60
00000000 60 0a |`.|
00000002
\x70
00000000 70 0a |p.|
00000002
\x80
00000000 80 0a |..|
00000002
\x90
00000000 90 0a |..|
00000002

As you can see, Python 2 works as expected, while Python 3, when printing using \x[89a-f], seems to cause the byte to be replaced with a c2 or c3 and another byte of data. |
|||
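The c2/c3 pattern in the dumps above is what the UTF-8 encoding of the code points U+0080 through U+00FF looks like, while code points below U+0080 encode as a single, unchanged byte. A minimal sketch that reproduces the same table from inside Python 3 (the helper name `utf8_hex` is hypothetical, introduced here only for illustration):

```python
def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 encoding of one code point as space-separated hex."""
    return " ".join(f"{b:02x}" for b in chr(code_point).encode("utf-8"))


# Walk the same values as the shell loops: \x00, \x10, ..., \xf0.
for high_nibble in range(16):
    cp = high_nibble << 4
    print(f"\\x{cp:02x} -> {utf8_hex(cp)}")
```

Values up to \x70 come back as one byte; from \x80 upward each value becomes two bytes, starting with c2 for \x80 through \xbf and c3 for \xc0 through \xff.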
msg323774 - (view) | Author: Steven D'Aprano (steven.daprano) | Date: 2018-08-20 00:20 | |
You wrote:

> There are 6 bytes not 4 and where did the c3, bd, and c2 come from?

In Python 2, strings are byte strings; in Python 3, strings by default are Unicode text strings. You are seeing the UTF-8 representation of the text string.

py> "\xfd\x84\x04\x08".encode('utf-8')
b'\xc3\xbd\xc2\x84\x04\x08'

So the behaviour in Python 3 is correct and not a bug; it has just changed (intentionally) from Python 2.

Googling may help you find more about this:

https://duckduckgo.com/?q=python3+write+bytes+to+stdout |
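One common way to get the exact bytes out of Python 3 is to write a `bytes` object to the binary layer under stdout, `sys.stdout.buffer`, instead of printing a `str`. The following is a minimal sketch, assuming stdout is a regular text stream that exposes `.buffer`; the names `PAYLOAD` and `write_raw` are illustrative and do not come from the thread:

```python
import io
import sys

# The four bytes from the report, as a bytes literal rather than a str.
PAYLOAD = b"\xfd\x84\x04\x08"


def write_raw(data: bytes, stream=None) -> int:
    """Write bytes unchanged to a binary stream.

    Defaults to sys.stdout.buffer, the binary layer beneath the text stdout
    (an assumption: replaced or captured stdout objects may not have it).
    """
    if stream is None:
        stream = sys.stdout.buffer
    written = stream.write(data)
    stream.flush()
    return written


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated hex, similar to a hexdump row."""
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    # Demonstrate against an in-memory binary stream instead of the terminal.
    sink = io.BytesIO()
    write_raw(PAYLOAD, sink)
    print(hex_bytes(sink.getvalue()))                      # fd 84 04 08
    # For comparison: the same escapes as a str, encoded the way print() would
    # on a UTF-8 stdout.
    print(hex_bytes("\xfd\x84\x04\x08".encode("utf-8")))   # c3 bd c2 84 04 08
```

Calling `write_raw(PAYLOAD + b"\n")` with no stream argument sends the bytes to stdout itself, so piping the script into `hexdump -C` should show `fd 84 04 08 0a`, matching the Python 2 output in the report.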
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:04 | admin | set | github: 78618 |
2018-08-20 00:20:07 | steven.daprano | set | status: open -> closed nosy: + steven.daprano messages: + msg323774 resolution: not a bug stage: resolved |
2018-08-19 23:24:40 | Nathan Benson | create |