classification
Title: PyUnicode_FromFormat integer format handling different from printf about zeropad
Type: behavior Stage: needs patch
Components: Documentation, Interpreter Core Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, eric.smith, haypo, serhiy.storchaka, terry.reedy, xiang.zhang, ztane
Priority: normal Keywords: easy

Created on 2016-10-11 08:56 by xiang.zhang, last changed 2017-03-17 03:26 by xiang.zhang.

Messages (3)
msg278467 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-10-11 08:56
Although the doc declares it *exactly equivalent* to printf, PyUnicode_FromFormat can produce a different result from printf when given the same format.

For example:

>>> from ctypes import pythonapi, py_object, c_int
>>> f = getattr(pythonapi, 'PyUnicode_FromFormat')
>>> f.restype = py_object
>>> f(b'%010.5d', c_int(100))
'0000000100'

while printf outputs:

printf("%010.5d\n", 100);
     00100

I compiled with both gcc and clang and got the same result. gcc gives me a warning:

warning: '0' flag ignored with precision and ‘%d’ gnu_printf format

I am not sure this should be fixed. It seems the change could break backwards compatibility.
msg278528 - (view) Author: Antti Haapala (ztane) * Date: 2016-10-12 12:36
To be more precise, C90, C99, C11 all say that ~"For d, i, o, u, x and X conversions, if a precision is specified, the 0 flag will be ignored."
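To make the rule concrete, here is a rough Python sketch of how the standard's wording plays out for %d. The helper name c_printf_d and its parameters are made up for illustration; it only models width, precision and the 0 flag (no '-', '+' or ' ' flags) and is not how any libc actually implements it.

```python
def c_printf_d(value, width=0, precision=None, zero_flag=False):
    """Hypothetical helper: format an int the way C specifies printf("%d")."""
    sign = '-' if value < 0 else ''
    digits = str(abs(value))
    if precision is not None:
        # Precision is the minimum number of digits; the 0 flag is ignored.
        if precision == 0 and value == 0:
            digits = ''  # zero with precision 0 converts to no characters
        else:
            digits = digits.zfill(precision)
        return (sign + digits).rjust(width)
    if zero_flag:
        # No precision: zeros go between the sign and the digits, up to width.
        return sign + digits.zfill(max(width - len(sign), 0))
    return (sign + digits).rjust(width)

print(repr(c_printf_d(100, 10, 5, zero_flag=True)))  # '     00100' (C rule)
print(repr('%010.5d' % 100))                         # '0000000100' (Python)
```

With the precision dropped, the two agree: c_printf_d(100, 10, zero_flag=True) gives '0000000100', the same as '%010d' % 100.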
msg278666 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-10-14 21:21
I presume that PyUnicode_FromFormat is responsible for the first of the following:
>>> '%010.5d' % 100
'0000000100'
>>> b'%010.5d' % 100
b'0000000100'

I am strongly of the opinion that the behavior should be left alone and the C-API doc changed by either 1) replacing 'exactly' with 'nearly' or 2) adding the following: "except that a 0 conversion flag is not ignored when a precision is given for d, i, o, u, x and X conversion types" (and other exceptions as discovered).

I took the terms 'conversion flag' and 'conversion type' from
https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
https://docs.python.org/3/library/stdtypes.html#printf-style-bytes-formatting

I consider the Python behavior to be superior.  The '0' conversion flag, the '.' precision indicator, and the int conversion types are literal characters.  If one does not want the '0' conversion, one should omit it and not write it to be ignored.
>>> '%10.5d' % 100
'     00100'

And I consider the abolition of int 'precision' in {} formatting even better.
>>> '{:010.5d}'.format(100)
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    '{:010.5d}'.format(100)
ValueError: Precision not allowed in integer format specifier

It has always been a source of confusion, and there is hardly any real-world use case for a partial 0 fill.
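If someone does need a partial 0 fill with {} formatting, one possible approach is to do it in two explicit steps: zero-fill the digits, then right-align the result. The helper below (partial_zero_fill is a made-up name, not an existing API) is only a sketch of that idea.

```python
def partial_zero_fill(value, width, digits):
    """Hypothetical helper: at least `digits` digits, right-aligned in `width` columns."""
    sign = '-' if value < 0 else ''
    # Step 1: zero-fill the magnitude to the requested number of digits.
    body = '{:0{d}d}'.format(abs(value), d=digits)
    # Step 2: right-align the signed result with spaces.
    return '{:>{w}}'.format(sign + body, w=width)

print(repr(partial_zero_fill(100, 10, 5)))   # '     00100'
print(repr(partial_zero_fill(-100, 10, 5)))  # '    -00100'
```

The sign is handled separately so that the digit count does not include it, which matches what '%10.5d' % 100 produces for the positive case shown earlier.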
History
Date User Action Args
2017-03-17 03:26:42 xiang.zhang set keywords: + easy
stage: needs patch
2016-10-19 02:57:10 josh.r set title: PyUnicode_FromFromat interger format handling different from printf about zeropad -> PyUnicode_FromFormat integer format handling different from printf about zeropad
2016-10-14 21:21:57 terry.reedy set nosy: + eric.smith, terry.reedy, docs@python
messages: + msg278666
assignee: docs@python
components: + Documentation
2016-10-12 12:36:08 ztane set nosy: + ztane
messages: + msg278528
2016-10-11 08:56:20 xiang.zhang create