This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: byte conversion
Type: behavior
Stage: resolved
Components:
Versions: Python 3.4

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Padmanabhan.Tr, steven.daprano
Priority: normal
Keywords:

Created on 2015-07-02 16:05 by Padmanabhan.Tr, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)
msg246086 - Author: Padmanabhan Tr (Padmanabhan.Tr) Date: 2015-07-02 16:08
I have copied below an execution sequence. What is the problem?

>>> x = 8240
>>> x.to_bytes(4,byteorder='big')
b'\x00\x00 0'
>>> int.from_bytes(b'\x00\x00 0',byteorder='big')
8240
>>> int.from_bytes(b'\x20\x30',byteorder='big')
8240
>>>
msg246088 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2015-07-02 16:50
I don't know, what *is* the problem? What behaviour did you expect? The code sample you show seems to be working exactly as it is supposed to.

b'\x00\x00 0' is the same as b'\x00\x00\x20\x30', and that is the same as b'\x20\x30' with NUL padding on the left. Written as integers, that is like 0x00002030 == 0x2030 == 8240.
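
The equalities in the paragraph above can be checked directly. This is a minimal sketch; the names `padded` and `compact` are illustrative and do not appear in the original report.

```python
# Sketch: the four-byte and two-byte forms describe the same integer.
x = 8240

padded = x.to_bytes(4, byteorder='big')    # two leading NUL bytes of padding
compact = x.to_bytes(2, byteorder='big')   # smallest width that holds 8240

# The space is byte 0x20 and the digit '0' is byte 0x30.
assert padded == b'\x00\x00\x20\x30' == b'\x00\x00 0'
assert compact == b'\x20\x30' == b' 0'

# Leading zero bytes do not change a big-endian value.
assert int.from_bytes(padded, byteorder='big') == 0x00002030 == 8240
assert int.from_bytes(compact, byteorder='big') == 0x2030 == 8240

print(list(padded))   # [0, 0, 32, 48]
```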

I don't think this demonstrates a bug or problem. If you still believe it does, please re-open the issue with a detailed description of what behaviour you expected and why you think this is a bug.
msg246355 - Author: Padmanabhan Tr (Padmanabhan.Tr) Date: 2015-07-06 11:21
Dear Mr Steven D'Aprano,

Thanks for your prompt response. I guess that b'\x00\x00 0' is the same as b'\x00\x00\x20\x30' if we take the space character as 20 and 0 as 30 (in hex), as in the ASCII / UTF-8 representation. But if I go by the 'Python Library Reference, Release 3.4.2' (Section 4.4.2), there is no room for an ASCII / UTF-8 representation here; direct byte conversion is used. Please confirm whether I am right.

Regards,
Padmanabhan

msg246421 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2015-07-07 15:52
Bytes in Python 3 do use ASCII representation:

py> b'\x41' == b'A'  # ASCII
True

If you think the documentation is unclear, please tell us which part of the docs you read (provide a URL) and we will see if it can be improved.
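
As a rough illustration of the point above, the sketch below treats a bytes object as a sequence of integers and shows that the ASCII spelling is only a display choice. The variable names are illustrative.

```python
# Sketch: a bytes object is a sequence of integers in range(256).
data = b' 0'

values = list(data)                 # the integer value of each byte
assert values == [0x20, 0x30]       # space is 0x20, the digit '0' is 0x30

# The literal spelling does not matter; only the byte values do.
assert data == b'\x20\x30' == bytes([32, 48])

# repr() picks the printable ASCII spelling when one exists.
shown = repr(b'\x20\x30')
print(shown)   # b' 0'
```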
msg246457 - Author: Padmanabhan Tr (Padmanabhan.Tr) Date: 2015-07-08 14:50
Dear Mr Steven D'Aprano,

I have not gone through the relevant source code. Purely based on my work with Python 3 (version 3.4.2) and 'The Python Library Reference, Release 3.4.2', I suggest the following additions to the manual:

   - Insert the following under 'codecs.decode(obj [,encoding[,errors]])' - Section 7.2, Page 142:

When a bytes object is decoded with 'hex' decoding, the corresponding returned array has ASCII characters for byte pairs wherever possible; other byte pairs appear as such. The reverse holds good for encoding.
>>> import codecs
>>> codecs.encode(b'\x1d\x1e\x1f !"','hex')
b'1d1e1f202122'
>>> codecs.encode(b'\x1d\x1e\x1f\x20\x21\x22','hex')
b'1d1e1f202122'
>>> codecs.decode(b'1d1e1f202122','hex')
b'\x1d\x1e\x1f !"'
>>> codecs.encode(_,'hex')
b'1d1e1f202122'
>>> codecs.decode(b'3031323334','hex')                   
b'01234'
>>> codecs.encode(_,'hex')
b'3031323334'
>>> codecs.decode(b'797a7b7c7d7e7f8081','hex')
b'yz{|}~\x7f\x80\x81'
>>> codecs.encode(_,'hex') 
b'797a7b7c7d7e7f8081'
>>> codecs.encode(b'\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81','hex')
b'797a7b7c7d7e7f8081'
>>> 

   - Under 'int.to_bytes()' - Section 4.4.2, Page 31 insert: 'See codecs.decode() also'
   - Under 'classmethod int.from_bytes()' - Section 4.4.2, Page 31 insert: 'See codecs.decode() also'
   - Under 'classmethod bytes.fromhex(string)' - Section 7.2, Page 142 insert: 'See codecs.decode() also'

Padmanabhan
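
For comparison with the session above, here is a small sketch suggesting that the 'hex' codec returns the same bytes as other standard-library hex decoders, so the ASCII characters seen in the output come from how bytes are displayed and not from the codec. The variable names are illustrative.

```python
import binascii
import codecs

hex_text = b'1d1e1f202122'

# Three standard-library ways to turn hex digits into raw bytes.
via_codec = codecs.decode(hex_text, 'hex')
via_binascii = binascii.unhexlify(hex_text)
via_fromhex = bytes.fromhex(hex_text.decode('ascii'))

# All three produce identical byte values.
assert via_codec == via_binascii == via_fromhex
assert list(via_codec) == [0x1d, 0x1e, 0x1f, 0x20, 0x21, 0x22]

# Encoding back to hex round-trips to the original digits.
assert codecs.encode(via_codec, 'hex') == hex_text

print(via_codec)   # b'\x1d\x1e\x1f !"'
```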

msg246476 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2015-07-09 00:43
I'm sorry, but I believe that you have misunderstood what happens here. 
This has nothing to do with the hex codec, or int.to_bytes() etc. This 
is the standard property of byte strings in Python, that they are 
displayed using ASCII as much as possible.

The byte string b'\x41' is just the hex escape form for the byte string 
b'A' (ASCII capital A). It doesn't matter whether you use ASCII, 
decimal, octal or hexadecimal, you get an equal byte string:

    py> b'A' == bytes([65]) == b'\101' == b'\x41'
    True

with the same internal byte value. When you print a byte string, Python 
prefers to display it using ASCII characters where possible regardless 
of whether it was constructed from backslash escapes or not. So the byte 
string b'\x41 A \xEF' prints as b'A A \xef' because \x41 is the ASCII 
character A, but \xEF has no ASCII representation so it remains in hex 
escape form. It doesn't matter where that byte string comes from: the 
hex codec, int.to_bytes, or somewhere else.
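
To see the display rule in action, the sketch below compares repr() with a hypothetical helper, `hex_escaped`, that spells every byte as a hex escape. The helper is an illustration only and is not part of the standard library.

```python
def hex_escaped(data):
    """Hypothetical helper: spell every byte as a \\xNN escape."""
    return "b'" + ''.join('\\x{:02x}'.format(byte) for byte in data) + "'"


sample = b'\x41 A \xEF'

# Same byte values, however the literal was written.
assert sample == b'A A \xef'

print(repr(sample))         # b'A A \xef'
print(hex_escaped(sample))  # b'\x41\x20\x41\x20\xef'
```

Both lines describe the same five bytes; only the spelling differs.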

I don't think it is appropriate to continue discussing this on the bug 
tracker. If you would like to continue the discussion, please join the 
python-list@python.org mailing list, or the comp.lang.python newsgroup, and 
I'll be happy to discuss it further there.
History
Date                 User            Action  Args
2022-04-11 14:58:18  admin           set     github: 68739
2015-07-09 02:47:27  zach.ware       set     components: - Demos and Tools
2015-07-09 02:47:09  zach.ware       set     stage: resolved
2015-07-09 00:43:42  steven.daprano  set     messages: + msg246476
2015-07-08 14:50:57  Padmanabhan.Tr  set     messages: + msg246457
2015-07-07 15:52:34  steven.daprano  set     messages: + msg246421
2015-07-06 11:21:24  Padmanabhan.Tr  set     messages: + msg246355
2015-07-02 16:50:40  steven.daprano  set     status: open -> closed; nosy: + steven.daprano; messages: + msg246088; resolution: not a bug
2015-07-02 16:08:07  Padmanabhan.Tr  set     messages: + msg246086; components: + Demos and Tools
2015-07-02 16:05:38  Padmanabhan.Tr  create