Issue 24551: byte conversion


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/68739

classification

Title:	byte conversion
Type:	behavior	Stage:	resolved
Components:		Versions:	Python 3.4

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	Padmanabhan.Tr, steven.daprano
Priority:	normal	Keywords:

Created on 2015-07-02 16:05 by Padmanabhan.Tr, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)
msg246086 - (view)	Author: Padmanabhan Tr (Padmanabhan.Tr) *	Date: 2015-07-02 16:08
I have copied below an execution sequence. What is the problem?
>>> x = 8240
>>> x.to_bytes(4,byteorder='big')
b'\x00\x00 0'
>>> int.from_bytes(b'\x00\x00 0',byteorder='big')
8240
>>> int.from_bytes(b'\x20\x30',byteorder='big')
8240
msg246088 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2015-07-02 16:50
I don't know, what is the problem? What behaviour did you expect? The code sample you show seems to be working exactly as it is supposed to. b'\x00\x00 0' is the same as b'\x00\x00\x20\x30', and that is the same as b'\x20\x30' with NUL padding on the left. Written as integers, that is like 0x00002030 == 0x2030 == 8240. I don't think this demonstrates a bug or problem. If you still believe it does, please re-open the issue with a detailed description of what behaviour you expected and why you think this is a bug.
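The equivalence described in this reply can be checked directly. The following is a minimal sketch using the values from the report; the variable name `padded` is introduced here for illustration only.

```python
# Values from the report; `padded` is a hypothetical name for illustration.
x = 8240
padded = x.to_bytes(4, byteorder='big')

# The literal containing a space and '0' is byte-for-byte equal to the
# fully hex-escaped spelling: space is 0x20 and '0' is 0x30 in ASCII.
assert padded == b'\x00\x00 0' == b'\x00\x00\x20\x30'

# Leading NUL bytes do not change the big-endian integer value.
assert int.from_bytes(padded, byteorder='big') == 0x00002030
assert int.from_bytes(b'\x20\x30', byteorder='big') == 0x2030 == 8240

# Viewing the bytes as integers removes any ASCII display ambiguity.
print(list(padded))  # [0, 0, 32, 48]
```

Converting to a list of integers is one way to see the stored values without relying on how the bytes literal is displayed.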
msg246355 - (view)	Author: Padmanabhan Tr (Padmanabhan.Tr) *	Date: 2015-07-06 11:21
Dear Mr Steven D'Aprano,
Thanks for your prompt response. I guess that b'\x00\x00 0' is the same as b'\x00\x00\x20\x30' if we take (space) as 20 and 0 as 30, as in the ASCII / UTF-8 representation. But going by the 'Python Library Reference, Release 3.4.2' (Section 4.4.2), there is no room for ASCII / UTF-8 representation here; direct byte conversion is used. Please confirm whether I am right.
Regards,
Padmanabhan
msg246421 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2015-07-07 15:52
Bytes in Python 3 do use ASCII representation:
py> b'\x41' == b'A'  # ASCII
True
If you think the documentation is unclear, please tell us which part of the docs you read (provide a URL) and we will see if it can be improved.
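As a small illustration of the point above, the sketch below builds the same one-byte object four different ways. The variable names are hypothetical and not taken from the thread.

```python
# Four spellings of the same single byte (names are illustrative only).
literal_ascii = b'A'
literal_hex = b'\x41'
from_int = bytes([0x41])
from_text = 'A'.encode('ascii')

assert literal_ascii == literal_hex == from_int == from_text

# Indexing a bytes object yields the stored integer, not a character.
print(literal_ascii[0])   # 65

# repr() shows the ASCII character even though the literal used \x41.
print(repr(literal_hex))  # b'A'
```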
msg246457 - (view)	Author: Padmanabhan Tr (Padmanabhan.Tr) *	Date: 2015-07-08 14:50
Dear Mr Steven D'Aprano,
I have not gone through the relevant source code. Purely based on my work with Python 3 (version 3.4.2) and 'The Python Library Reference, Release 3.4.2', I suggest the following additions to the manual:
- Insert the following under 'codecs.decode(obj [,encoding[,errors]])' (Section 7.2, Page 142): When a bytes object is decoded with 'hex' decoding, the corresponding returned array has ASCII characters for byte pairs wherever possible; other byte pairs appear as such. The reverse holds good for encoding.
>>> import codecs
>>> codecs.encode(b'\x1d\x1e\x1f !"','hex')
b'1d1e1f202122'
>>> codecs.encode(b'\x1d\x1e\x1f\x20\x21\x22','hex')
b'1d1e1f202122'
>>> codecs.decode(b'1d1e1f202122','hex')
b'\x1d\x1e\x1f !"'
>>> codecs.encode(_,'hex')
b'1d1e1f202122'
>>> codecs.decode(b'3031323334','hex')
b'01234'
>>> codecs.encode(_,'hex')
b'3031323334'
>>> codecs.decode(b'797a7b7c7d7e7f8081','hex')
b'yz{|}~\x7f\x80\x81'
>>> codecs.encode(_,'hex')
b'797a7b7c7d7e7f8081'
>>> codecs.encode(b'\x79\x7a\x7b\x7c\x7d\7e\x7f\x80\x81','hex')
b'797a7b7c7d07657f8081'
- Under 'int.to_bytes() - classmethod int.to_bytes()' (Section 4.4.2, Page 31) insert: 'See codecs.decode() also'
- Under 'int.to_bytes() - classmethod int.from_bytes()' (Section 4.4.2, Page 31) insert: 'See codecs.decode() also'
- Under 'classmethod bytes.fromhex(string)' (Section 7.2, Page 142) insert: 'See codecs.decode() also'
Padmanabhan
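The session in this message can be condensed into a short sketch. It assumes only the standard `codecs` module with the 'hex' codec used above; the variable names are hypothetical. It also shows why the last input in the session, which contains `\7e` rather than `\x7e`, encodes to `0765`: `\7` is an octal escape for byte 7, followed by the letter `e`.

```python
import codecs

# Two spellings of the same six bytes (names are illustrative only).
spelled_ascii = b'\x1d\x1e\x1f !"'
spelled_hex = b'\x1d\x1e\x1f\x20\x21\x22'
assert spelled_ascii == spelled_hex

# The hex codec sees only the byte values, so both spellings encode alike
# and decoding returns an equal bytes object.
encoded = codecs.encode(spelled_ascii, 'hex')
assert encoded == b'1d1e1f202122'
assert codecs.decode(encoded, 'hex') == spelled_ascii

# b'\7e' is the octal escape \7 (byte 7) followed by the letter 'e' (0x65),
# not the single byte 0x7e.
typo = b'\x7d\7e\x7f'
print(list(typo))                  # [125, 7, 101, 127]
print(codecs.encode(typo, 'hex'))  # b'7d07657f'
```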
msg246476 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2015-07-09 00:43
I'm sorry, but I believe that you have misunderstood what happens here. This has nothing to do with the hex codec, or int.to_bytes() etc. This is the standard property of byte strings in Python, that they are displayed using ASCII as much as possible.
The byte string b'\x41' is just the hex escape form for the byte string b'A' (ASCII capital A). It doesn't matter whether you use ASCII, decimal, octal or hexadecimal, you get an equal byte string with the same internal byte value:
py> b'A' == bytes([65]) == b'\101' == b'\x41'
True
When you print a byte string, Python prefers to display it using ASCII characters where possible, regardless of whether it was constructed from backslash escapes or not. So the byte string b'\x41 A \xEF' prints as b'A A \xef' because \x41 is the ASCII character A, but \xEF has no ASCII representation so it remains in hex escape form. It doesn't matter where that byte string comes from: the hex codec, int.to_bytes, or somewhere else.
I don't think it is appropriate to continue discussing this on the bug tracker. If you would like to continue the discussion, please join the python-list@python.org mailing list, or the comp.lang.python newsgroup, and I'll be happy to discuss it further there.
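To see the stored byte values without the ASCII-preferring display, one option is a small formatting helper. The function `hex_escaped` below is a hypothetical helper written for this illustration; it is not part of the standard library.

```python
def hex_escaped(data):
    """Hypothetical helper: spell every byte of `data` as a \\xNN escape."""
    return "b'" + ''.join('\\x{:02x}'.format(b) for b in data) + "'"


sample = b'\x41 A \xEF'

# The default repr uses ASCII characters wherever one exists.
print(repr(sample))         # b'A A \xef'

# The helper shows the same bytes uniformly in hex escape form.
print(hex_escaped(sample))  # b'\x41\x20\x41\x20\xef'

# All of these spellings produce equal byte strings.
assert b'A' == bytes([65]) == b'\101' == b'\x41'
```

Both printed forms describe the same five bytes; only the display differs.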

History
Date	User	Action	Args
2022-04-11 14:58:18	admin	set	github: 68739
2015-07-09 02:47:27	zach.ware	set	components: - Demos and Tools
2015-07-09 02:47:09	zach.ware	set	stage: resolved
2015-07-09 00:43:42	steven.daprano	set	messages: + msg246476
2015-07-08 14:50:57	Padmanabhan.Tr	set	messages: + msg246457
2015-07-07 15:52:34	steven.daprano	set	messages: + msg246421
2015-07-06 11:21:24	Padmanabhan.Tr	set	messages: + msg246355
2015-07-02 16:50:40	steven.daprano	set	status: open -> closed nosy: + steven.daprano messages: + msg246088 resolution: not a bug
2015-07-02 16:08:07	Padmanabhan.Tr	set	messages: + msg246086 components: + Demos and Tools
2015-07-02 16:05:38	Padmanabhan.Tr	create