Author nneonneo
Recipients martin.panter, ncoghlan, nneonneo, terry.reedy
Date 2016-12-17.18:24:09
Message-id <1481999049.76.0.0901294904638.issue28927@psf.upfronthosting.co.za>
In-reply-to
Content
I see your point, Nick. Can I offer a counterpoint?

Most of the string parsers operate only on relatively short inputs, like numbers. Numbers in particular are rarely written with inner spaces, so it makes sense not to ignore internal whitespace there.

On the other hand, hexadecimal data can be very long, and is often formatted with spaces and newlines. For example, the default output of `xxd -p`, a format quite suitable for copy-paste, looks like this:

cffaedfe07000001030000800200000015000000d8080000858021000000
000019000000480000005f5f504147455a45524f00000000000000000000
000000000000000001000000000000000000000000000000000000000000
000000000000000000000000000019000000180300005f5f544558540000
0000000000000000000000000100000000909d0100000000000000000000

It would be desirable to write something like

blob = bytes.fromhex('''
cffaedfe07000001030000800200000015000000d8080000858021000000
000019000000480000005f5f504147455a45524f00000000000000000000
000000000000000001000000000000000000000000000000000000000000
000000000000000000000000000019000000180300005f5f544558540000
0000000000000000000000000100000000909d0100000000000000000000
''')

and not have to worry about sticking in some whitespace remover, like this:

blob = bytes.fromhex(''.join('''
cffaedfe07000001030000800200000015000000d8080000858021000000
000019000000480000005f5f504147455a45524f00000000000000000000
000000000000000001000000000000000000000000000000000000000000
000000000000000000000000000019000000180300005f5f544558540000
0000000000000000000000000100000000909d0100000000000000000000
'''.split()))

or removing the newlines in the source code, which impacts readability. 

Similar kinds of whitespace-separated output (sometimes with spaces between octets, words or dwords, sometimes with tabs between 8-16 byte groups, sometimes with newlines between groups, etc.) can be found in the wild and in the "hex" clipboard output of various applications.
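
To make those layouts concrete, here is a rough sketch of the workaround applied to a few of them. The helper name `unhex_ws` is hypothetical (it is not part of the standard library), and the sample strings are made up for illustration; they all spell the same four bytes.

```python
def unhex_ws(text):
    """Hypothetical helper: decode hex after dropping all whitespace."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines), so joining the pieces removes all of it.
    return bytes.fromhex(''.join(text.split()))

# The same four bytes in a few layouts of the kind described above.
samples = [
    'cffaedfe',        # no separators
    'cf fa ed fe',     # spaces between octets
    'cffa\tedfe',      # tab between groups
    'cffa\nedfe\n',    # newlines between groups
]
decoded = [unhex_ws(s) for s in samples]
```

Every entry in `decoded` should be the same bytes object, which is the behaviour I would like `bytes.fromhex` to provide directly.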

We can already have newlines and other whitespace with base64, which is in principle quite similar:

blob = base64.b64decode('''
z/rt/gcAAAEDAACAAgAAABUAAADYCAAAhYAhAAAAAAAZAAAASAAAAF9fUEFHRVpFUk8AAAAAAAAAAAAA
AAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZAAAAGAMAAF9fVEVYVAAA
AAAAAAAAAAAAAAAAAQAAAACQnQEAAAAAAAAAAAAAAAAAkJ0BAAAAAAcAAAAFAAAACQAAAAAAAABfX3Rl
eHQAAAAAAAAAAAAAX19URVhUAAAAAAAAAAAAAAALAAABAAAARCF5AQAAAAAACwAACAAAAAAAAAAAAAAA
''')

so I think it makes sense to support other whitespace characters in fromhex. I'm happy to reconsider if there's a strong argument against adding this convenience.
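
As a small check of the base64 comparison, the sketch below decodes the first twelve characters of the blob above, once compact and once with newlines inserted. It relies on `base64.b64decode` discarding characters outside the base64 alphabet when `validate` is left at its default of False; the variable names are only illustrative.

```python
import base64

compact = 'z/rt/gcAAAED'
wrapped = '\nz/rt/g\ncAAAED\n'   # same data, split across lines

# With validate=False (the default), characters outside the base64
# alphabet, including newlines, are discarded before decoding.
a = base64.b64decode(compact)
b = base64.b64decode(wrapped)
```

Both calls should give the same nine bytes, which are also the first nine bytes of the hex dump shown earlier.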