msg261308 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-07 17:47 |
I regularly see Python code using hex(value)[2:], whereas "%x" % value does the same thing. We should mention "%x" % value in the hex() doc. Maybe also mention "%#X" % value to format in upper case?
|
msg261312 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2016-03-07 18:58 |
For 3.5 and 2.7, I'd suggest:
format(value, 'x')
or:
format(value, 'X')
Although you might disagree because of the verbosity. But at least you're not parsing a string at runtime.
And for 3.6 with PEP-498:
f'{value:x}'
There are of course options for padding and adding the '0x', as well.
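A minimal sketch of those padding and prefix options, assuming a small non-negative integer (the name `value` and the widths are only illustrative):

```python
value = 255  # illustrative value, not taken from the thread

plain = format(value, 'x')        # lowercase digits, no prefix
prefixed = format(value, '#x')    # '#' adds the 0x prefix, like hex()
padded = format(value, '08x')     # zero-padded to a total width of 8
both = format(value, '#010x')     # the width includes the 2-character prefix

print(plain, prefixed, padded, both)
```

The same format specs should work unchanged inside str.format() fields and, from 3.6 on, inside f-strings.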
|
msg261478 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2016-03-09 21:15 |
Your two suggestions prompted me to do a speed comparison between them and the result surprised me.
I tried:
import random
nums = [random.randint(0, 255) for n in range(10000000)]
then timed the simple:
for n in nums:
    hx = '%X' % n # or hx = format(n, 'X')
I also tested a number of more complex formats like:
hx = '%{:02X}'.format(n) vs hx = '%%%02X' % n
In all cases, the old vs new formatting styles are rather similar in speed in my system Python 2.7.6 (with maybe a slight advantage for the format-based formatting).
In Python 3.5.0, however, old-style %-formatting is much speedier than under Python 2, while new-style formatting doesn't appear to have changed much, with the result that %-formatting is now between 30-50% faster than format-based formatting.
So I guess my questions are:
- are my timings wrong?
and if not:
- how did %-formatting get improved (generally? or for %X specifically?)
- can this speed up be transferred to format-based formatting somehow?
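For anyone who wants to repeat the comparison, here is a rough sketch using the timeit module. The sample size, the repeat counts and the helper names `percent_style` and `format_style` are assumptions chosen to keep the run short; they are not the exact setup described above, and the numbers will vary by machine and Python version.

```python
import random
import timeit

# Much smaller sample than in the message above, so the sketch runs quickly.
nums = [random.randint(0, 255) for _ in range(10000)]


def percent_style():
    # Old-style %-interpolation.
    return ['%X' % n for n in nums]


def format_style():
    # The format() builtin with the same type code.
    return [format(n, 'X') for n in nums]


# Both approaches must produce identical strings before timing means anything.
assert percent_style() == format_style()

# Take the best of three runs of 20 calls each to reduce noise.
percent_time = min(timeit.repeat(percent_style, number=20, repeat=3))
format_time = min(timeit.repeat(format_style, number=20, repeat=3))
print('%-style: {:.4f}s  format(): {:.4f}s'.format(percent_time, format_time))
```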
|
msg261479 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2016-03-09 21:43 |
Without lots of analysis (and disassembly), I can't speak to how valid your tests are, but they don't seem unreasonable.
format() will always be slower, because it's more general (primarily in that it can be extended to new types). Plus, it involves at least a name lookup that %-formatting can skip. The usual ways to optimize this lookup hold here, too, if speed is really that critical (which I'm skeptical of).
For example, say you had a custom type which implemented __format__ to understand the "X" format code. Using format(), this type could format itself as hex. %-formatting can't do that.
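A small sketch of that idea. The class `Register` is hypothetical and exists only to illustrate __format__; it simply delegates to the wrapped int.

```python
class Register:
    """Hypothetical wrapper type, used only to illustrate __format__."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # Delegate to int formatting, so 'x', 'X', '#x', '04x', ... all work.
        return format(self.value, spec)


reg = Register(123)
as_hex = format(reg, 'X')  # calls Register.__format__

try:
    '%X' % reg  # %-formatting never calls __format__
except TypeError as exc:
    percent_error = str(exc)
else:
    percent_error = None

print(as_hex, percent_error)
```

On Python 3 the %-formatting attempt is expected to fail with a TypeError, because this class defines neither __index__ nor __int__.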
In any event, I don't think we want to promulgate the fastest way to do a hex conversion, just the clearest.
I can't say why format() in 3.5 is slower. There are many changes and tracking it down would be quite time consuming.
|
msg261480 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2016-03-09 21:47 |
Ah, but it's not that format() is slower in 3.5, but that %-formatting got faster.
It looks as if it got optimized and I was wondering whether the same optimization could be applied to format().
|
msg261481 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-09 23:16 |
> Ah, but it's not that format() is slower in 3.5, but that %-formatting got faster.
Hum, python3 looks faster on this dummy microbenchmark yeah. Who said that Python 3 is slower? :-)
$ python2 -m timeit -s 'import random; nums = [random.randint(0, 255) for n in range(10**5)]' '["%x" % x for x in nums]'
10 loops, best of 3: 43.7 msec per loop
$ python3 -m timeit -s 'import random; nums = [random.randint(0, 255) for n in range(10**5)]' '["%x" % x for x in nums]'
10 loops, best of 3: 19.2 msec per loop
I spent a lot of time micro-optimizing str%args, str.format(args), and operations on str in general in Python 3. I wrote a first article to explain my work on optimization:
https://haypo.github.io/pybyteswriter.html
I have a draft article explaining other kinds of optimizations related to PEP 393.
> It looks as if it got optimized and I was wondering whether the same optimization could be applied to format().
str.format(args) was also optimized, but it's still faster than str%args.
On Python 3, "%x" % 0x1234abc takes 17 nanoseconds according to timeit. It's super fast! Any extra work can have a non-negligible overhead. For example, it's known that operators are faster than functions in Python. One reason is that a call requires looking up the function in namespaces (local, global or builtin namespaces). It can be even worse (slower) to look up a method (especially with a custom __getattr__ method).
--
Hum, I don't recall why you started to talk about performance :-D
Why not document "%x" % value *and* format(value, 'x')?
I prefer "%x" % value. I never use format(value, ...) but sometimes I use "{0:x}".format(value).
f'{value:x}' looks too magical for me.
|
msg261494 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-10 11:21 |
> I regularly see Python code using hex(value)[2:]
In fact, it's even worse, I also saw Python 2 code stripping trailing "L", since hex(long) adds a L suffix...
$ python2
Python 2.7.10 (default, Sep 8 2015, 17:20:17)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
>>> hex(123L)
'0x7bL'
>>> "%x" % 123L
'7b'
>>> format(123L, "x")
'7b'
>>> "%#x" % 123L
'0x7b'
>>> format(123L, "#x")
'0x7b'
|
msg261516 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2016-03-10 17:25 |
> Hum, python3 looks faster on this dummy microbenchmark yeah. Who said that Python 3 is slower? :-)
If you're alluding to that seemingly endless thread over on python-list, let me say that it is not my motivation to start anything like that here. Sorry also if I sort of hijacked your documentation issue with my performance question.
I really only wondered whether there would be any argument for or against either of the two versions (%-interpolation, format-based) other than stylistic ones.
That's why I ran the micro-benchmark and, in fact, I was expecting %-interpolation to be faster exactly because it is less flexible.
What I am surprised by is not the fact that %-interpolation got faster in Python3, but the fact that format didn't.
I was wondering whether %-interpolation maybe takes some fast path in Python3 that simply wasn't implemented for format. If that was the case it could have been rewarding to just optimize format the same way.
As I know Victor is working on performance stuff I thought I'd just ask here, but from your answer I gather that things are rather not so simple and that's ok.
> I wrote a first article to explain my work on optimization:
https://haypo.github.io/pybyteswriter.html
Thanks for the link.
> str.format(args) was also optimized, but it's still faster than str%args.
You mean slower I assume ?
> Hum, I don't recall why you started to talk about performance :-D
See above.
> Why not document "%x" % value *and* format(value, 'x')?
> I prefer "%x" % value. I never use format(value, ...) but sometimes I use "{0:x}".format(value).
I prefer the last version, use the first sometimes, but documenting several ways seems reasonable.
|
msg261518 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-03-10 17:47 |
> That's why I ran the micro-benchmark and, in fact, I was expecting %-interpolation to be faster exactly because it is less flexible.
Actually %-interpolation is more flexible.
>>> '%x' % 123
'7b'
>>> '%0X' % 123
'7B'
>>> '%#x' % 123
'0x7b'
>>> '%04x' % 123
'007b'
If we document alternatives for hex(), we should also document formatting alternatives for bin(), oct(), repr(), ascii(), str(), chr(), str.ljust(), str.rjust(), str.center(), str.zfill().
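To make a few of those pairings concrete, here is a short sketch assuming Python 3. The sample values are arbitrary, and only some of the listed functions are covered.

```python
value = 10
text = 'ab'

# Each pair holds the function/method result and a formatting alternative.
pairs = [
    (bin(value), '{:#b}'.format(value)),
    (oct(value), '{:#o}'.format(value)),
    (hex(value), '%#x' % value),
    (text.ljust(5), '%-5s' % text),
    (text.rjust(5), '%5s' % text),
    ('42'.zfill(5), '%05d' % 42),
    (repr(text), '%r' % text),
    (chr(65), '%c' % 65),
]

for function_result, formatted in pairs:
    # Every alternative yields exactly the same string as the function call.
    assert function_result == formatted
```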
|
msg261542 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-11 06:38 |
Serhiy Storchaka added the comment:
> If we document alternatives for hex(), we should also document formatting
alternatives for bin(), oct(),
Ok for these two since they also add a prefix. But I don't see the point of
documenting alternatives for the other listed functions. The matter here is
the 0x prefix.
|
msg261544 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-03-11 06:56 |
There is no harm in using hex(value)[2:]. It's a matter of taste.
We can mention "%x" % value in case the user just doesn't know about this alternative. The same goes for value.ljust(5) and '%-5s' % value.
|
msg261549 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-11 08:37 |
I opened the issue when I read this change:
https://review.openstack.org/#/c/288224/2/neutron/common/utils.py
rndstr = hex(...)[2:]
# Whether there is a trailing 'L' is a py2/3 incompatibility
rndstr = rndstr.rstrip('L')
return rndstr.zfill(length)
can be simply written
return "{0:0{1}x}".format(..., length)
It's less readable, but it's more efficient.
|
msg261551 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-03-11 08:51 |
I agree with you and always prefer formatting strings.
Your example shows that at least an alternative to str.zfill() should be mentioned for educational purposes.
With C-style formatting your example can be written more laconically:
return "%0*x" % (length, ...)
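A quick check that the variants discussed above agree, assuming a non-negative integer. The helper name `padded_hex` and the sample inputs are made up for illustration and do not come from the neutron code.

```python
def padded_hex(value, length):
    """Hypothetical helper: lowercase hex digits, zero-padded to `length`."""
    return '{0:0{1}x}'.format(value, length)


value, length = 0xBEEF, 8  # illustrative inputs

# The pattern being replaced: strip the prefix, strip a Python 2 'L', then pad.
via_hex = hex(value)[2:].rstrip('L').zfill(length)
# Nested replacement field: {1} supplies the width.
via_format = padded_hex(value, length)
# C-style: '*' takes the width from the argument tuple.
via_percent = '%0*x' % (length, value)

print(via_hex, via_format, via_percent)
```

For negative numbers the slicing version no longer matches, since hex() puts the minus sign in front of the prefix.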
|
msg262068 - (view) |
Author: Manvi B (Manvi B) * |
Date: 2016-03-20 11:28 |
Modified documentation for the functions bin(), hex() and oct() as mentioned in the comments. Submitted the patch.
|
msg262084 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-20 17:14 |
You misunderstood the whole purpose of my issue! You must not write
hex()[2:] (it's inefficient)! Please remove it from your patch.
|
msg262085 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-03-20 17:23 |
The documentation for hex() doesn't look like the best place for examples of using string formatting. I think it is enough to add short references to corresponding formatting codes.
|
msg262106 - (view) |
Author: Manvi B (Manvi B) * |
Date: 2016-03-21 08:33 |
Removed hex()[2:] from the patch.
|
msg262107 - (view) |
Author: Manvi B (Manvi B) * |
Date: 2016-03-21 08:42 |
Modified the patch with '%x' % value.
|
msg262108 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-21 09:00 |
Serhiy Storchaka:
> The documentation for hex() doesn't look like the best place for examples of using string formatting. I think it is enough to add short references to corresponding formatting codes.
I like Manvi B's patch with many examples. It's hard to read formatting strings, it's hard to compute the result, so full examples are just more obvious.
I don't think that it hurts to add many formatting examples. I expect that most users will combine the result of bin/hex/oct with another string, so suggesting using formatting functions will probably help them to simplify the code.
For example,
print("x=", hex(x), "y=", hex(y))
can be written:
print("x=%#x y=%#x" % (x, y))
or
print("x={:#x} y={:#x}".format(x, y))
or
print(f"x={x:#x} y={y:#x}")
The first expression using hex() adds spaces after "=", but well, it's just to give a simple example. IMHO formatting strings are more readable.
|
msg262109 - (view) |
Author: Ezio Melotti (ezio.melotti) *  |
Date: 2016-03-21 09:02 |
> The documentation for hex() doesn't look like the best place for examples
> of using string formatting. I think it is enough to add short
> references to corresponding formatting codes.
I think those examples take too much space compared to the actual docs of the functions.
I can think of 3 possible solutions:
1) keep the examples but condense them so that they don't take so much space:
>>> n = 255
>>> f'{n:#x}', format(n, '#x'), '%#x' % n
('0xff', '0xff', '0xff')
>>> f'{n:x}', format(n, 'x'), '%x' % n
('ff', 'ff', 'ff')
>>> f'{n:X}', format(n, 'X'), '%X' % n
('FF', 'FF', 'FF')
or
>>> '%#x' % 255, '%x' % 255, '%X' % 255
('0xff', 'ff', 'FF')
>>> format(255, '#x'), format(255, 'x'), format(255, 'X')
('0xff', 'ff', 'FF')
>>> f'{255:#x}', f'{255:x}', f'{255:X}'
('0xff', 'ff', 'FF')
(the latter should only go in 3.6 though)
2) add a direct link to https://docs.python.org/3/library/string.html#format-examples where there are already some examples (more can be added if needed);
3) add a single footnote for all 3 functions that includes examples using old/new string formatting and f-strings, mentions the fact that # adds the prefix (leaving it out omits the prefix) and the fact that x and X can be used for lowercase and uppercase hex output (b and o have no uppercase variant).
FWIW I don't think that performance matters too much in this case, but I also dislike hex(value)[2:] and agree it should not be mentioned.
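As a rough illustration of how the # flag and the type codes behave for the three bases in Python 3 (the sample value is arbitrary):

```python
value = 255  # illustrative value

for code in ('b', 'o', 'x', 'X'):
    # Without '#': digits only. With '#': the base prefix is added.
    print(code, format(value, code), format(value, '#' + code))

# Only hexadecimal has an uppercase type code; 'B' is rejected for ints.
try:
    format(value, 'B')
except ValueError:
    upper_b_supported = False
else:
    upper_b_supported = True
```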
|
msg262110 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-21 09:23 |
Ezio Melotti added the comment:
> I can think of 3 possible solutions:
>
> 1) keep the examples but condense them so that they don't take so much space:
> >>> n = 255
> >>> f'{n:#x}', format(n, '#x'), '%#x' % n
> ('0xff', '0xff', '0xff')
> >>> f'{n:x}', format(n, 'x'), '%x' % n
> ('ff', 'ff', 'ff')
> >>> f'{n:X}', format(n, 'X'), '%X' % n
> ('FF', 'FF', 'FF')
Hum. It's not easy to read these complex formatting strings when they are written like that.
> or
>
> >>> '%#x' % 255, '%x' % 255, '%X' % 255
> ('0xff', 'ff', 'FF')
> >>> format(255, '#x'), format(255, 'x'), format(255, 'X')
> ('0xff', 'ff', 'FF')
> >>> f'{255:#x}', f'{255:x}', f'{255:X}'
> ('0xff', 'ff', 'FF')
I really prefer it when the same kind of formatting strings are written on the same line. I really like this example. Short, obvious, easy to read.
I have a preference for an example using a variable name rather than a number literal. It's more common to manipulate variables than number literals.
If you use a variable, please use a variable name longer than "n" to get a more readable example. Otherwise, it's not obvious what the variable is in "{n:x}": is "n" the variable? is "x" the variable?
In short, I suggest this example:
>>> value = 255
>>> '%#x' % value, '%x' % value, '%X' % value
('0xff', 'ff', 'FF')
>>> format(value, '#x'), format(value, 'x'), format(value, 'X')
('0xff', 'ff', 'FF')
>>> f'{value:#x}', f'{value:x}', f'{value:X}'
('0xff', 'ff', 'FF')
Note: Ezio, do you prefer format(value, 'x') for '{:x}'.format(value)?
> 2) add a direct link to https://docs.python.org/3/library/string.html#format-examples where there are already some examples (more can be added if needed);
IMHO it's ok to add formatting examples to bin/hex/oct. Using your compact example, it's not going to make the doc too long.
|
msg262119 - (view) |
Author: Ezio Melotti (ezio.melotti) *  |
Date: 2016-03-21 11:11 |
> Note: Ezio, do you prefer format(value, 'x') for '{:x}'.format(value)?
While formatting a single value the former is better/shorter, but the latter is perhaps more common since you usually have something else in the string too.
The latter can also be used to do something like:
>>> '{num:x} {num:X} {num:#x} {num:#X}'.format(num=255)
'ff FF 0xff 0XFF'
|
msg262168 - (view) |
Author: Manvi B (Manvi B) * |
Date: 2016-03-22 07:46 |
Considering the reviews from STINNER Victor (haypo) and the comments, the patch has been modified.
|
msg297104 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-06-28 01:10 |
Can someone pick up the last patch and convert it to a pull request? CPython moved to GitHub in the meantime! See http://docs.python.org/devguide/ ;-)
|
msg297195 - (view) |
Author: Sharan Yalburgi (Sharan Yalburgi) * |
Date: 2017-06-28 16:31 |
Hey, I am new to Open Source, can I work on this?
|
msg297196 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-06-28 16:34 |
> Hey, I am new to Open Source, can I work on this?
Hi, did you read http://docs.python.org/devguide/ ? IMHO it's a good start. You can also join the https://www.python.org/dev/core-mentorship/ group to get help!
|
msg297198 - (view) |
Author: Mariatta (Mariatta) *  |
Date: 2017-06-28 17:01 |
When uploading a patch from another person, please include "Original patch by <original author>" in the PR, and the commit message.
Thanks.
|
msg297200 - (view) |
Author: Sharan Yalburgi (Sharan Yalburgi) * |
Date: 2017-06-28 17:05 |
> Hi, did you read http://docs.python.org/devguide/ ? IMHO it's a good start. You can also join the https://www.python.org/dev/core-mentorship/ group to get help!
Yes I did. Thank you. I have made a PR. It says I haven't signed the CLA yet. I am doing that right now.
> When uploading a patch from another person, please include "Original patch by <original author>" in the PR, and the commit message.
Will do that, thank you.
|
msg297836 - (view) |
Author: Mariatta (Mariatta) *  |
Date: 2017-07-06 19:31 |
New changeset 67ba4fa467ffff825d6a0c0a21cc54ff1df2ed1b by Mariatta (Manvisha Kodali) in branch 'master':
bpo-26506: hex() documentation: mention %x % int (GH-2525)
https://github.com/python/cpython/commit/67ba4fa467ffff825d6a0c0a21cc54ff1df2ed1b
|
msg299126 - (view) |
Author: Mariatta (Mariatta) *  |
Date: 2017-07-25 18:04 |
New changeset 59e6ab15e47d496ac4e5f9d53aac0fae0c708da4 by Mariatta in branch '3.6':
bpo-26506: hex() documentation: mention %x % int (GH-2525) (GH-2870)
https://github.com/python/cpython/commit/59e6ab15e47d496ac4e5f9d53aac0fae0c708da4
|
msg299127 - (view) |
Author: Mariatta (Mariatta) *  |
Date: 2017-07-25 18:05 |
Thanks for the PRs Manvi.
|
|
Date | User | Action | Args |
2022-04-11 14:58:28 | admin | set | github: 70693 |
2017-07-25 18:05:01 | Mariatta | set | status: open -> closed; versions: + Python 3.7; messages: + msg299127; resolution: fixed; stage: resolved |
2017-07-25 18:04:12 | Mariatta | set | messages: + msg299126 |
2017-07-25 17:41:07 | Mariatta | set | pull_requests: + pull_request2921 |
2017-07-06 19:31:00 | Mariatta | set | messages: + msg297836 |
2017-07-01 10:17:19 | Manvi B | set | pull_requests: + pull_request2590 |
2017-06-28 17:05:24 | Sharan Yalburgi | set | messages: + msg297200 |
2017-06-28 17:01:56 | Mariatta | set | nosy: + Mariatta; messages: + msg297198 |
2017-06-28 16:54:36 | Sharan Yalburgi | set | pull_requests: + pull_request2533 |
2017-06-28 16:34:29 | vstinner | set | messages: + msg297196 |
2017-06-28 16:31:08 | Sharan Yalburgi | set | nosy: + Sharan Yalburgi; messages: + msg297195 |
2017-06-28 01:10:55 | vstinner | set | keywords: + easy; title: hex() documentation: mention "%x" % int -> [EASY] hex() documentation: mention "%x" % int |
2017-06-28 01:10:40 | vstinner | set | messages: + msg297104 |
2016-03-22 07:46:42 | Manvi B | set | files: + issue26506.diff; messages: + msg262168 |
2016-03-21 11:11:35 | ezio.melotti | set | messages: + msg262119 |
2016-03-21 09:23:57 | vstinner | set | messages: + msg262110 |
2016-03-21 09:02:17 | ezio.melotti | set | nosy: + ezio.melotti; messages: + msg262109 |
2016-03-21 09:00:55 | vstinner | set | messages: + msg262108 |
2016-03-21 08:42:38 | Manvi B | set | files: + issue26506.diff; messages: + msg262107 |
2016-03-21 08:33:31 | Manvi B | set | files: + issue26506.diff; messages: + msg262106 |
2016-03-20 17:23:44 | serhiy.storchaka | set | messages: + msg262085 |
2016-03-20 17:14:57 | vstinner | set | messages: + msg262084 |
2016-03-20 11:28:45 | Manvi B | set | files: + issue26506.diff; nosy: + Manvi B; messages: + msg262068; keywords: + patch |
2016-03-11 08:51:36 | serhiy.storchaka | set | messages: + msg261551 |
2016-03-11 08:37:32 | vstinner | set | messages: + msg261549 |
2016-03-11 06:56:51 | serhiy.storchaka | set | messages: + msg261544 |
2016-03-11 06:38:43 | vstinner | set | messages: + msg261542 |
2016-03-10 17:47:35 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg261518 |
2016-03-10 17:25:59 | wolma | set | messages: + msg261516 |
2016-03-10 11:21:10 | vstinner | set | messages: + msg261494 |
2016-03-09 23:16:57 | vstinner | set | messages: + msg261481 |
2016-03-09 21:47:42 | wolma | set | messages: + msg261480 |
2016-03-09 21:43:11 | eric.smith | set | messages: + msg261479 |
2016-03-09 21:15:46 | wolma | set | nosy: + wolma; messages: + msg261478 |
2016-03-07 18:58:59 | eric.smith | set | nosy: + eric.smith; messages: + msg261312 |
2016-03-07 17:47:03 | vstinner | create | |