center, ljust and rjust are inconsistent with unicode parameters #47696
Not all combinations of unicode/non-unicode parameters work for center, ljust and rjust. This doctest fails in 3 places, though I would expect it to pass:
def doctest_strings():
    """
    >>> uni.center(5, ascii)
    u'aaaaa'
    >>> uni.center(5, uni)
    u'aaaaa'
    >>> ascii.center(5, ascii)
    'aaaaa'
    >>> ascii.center(5, uni)
    u'aaaaa'
    >>> uni.ljust(5, ascii)
    u'aaaaa'
    >>> uni.ljust(5, uni)
    u'aaaaa'
    >>> ascii.ljust(5, ascii)
    'aaaaa'
    >>> ascii.ljust(5, uni)
    u'aaaaa'
    >>> uni.rjust(5, ascii)
    u'aaaaa'
    >>> uni.rjust(5, uni)
    u'aaaaa'
    >>> ascii.rjust(5, ascii)
    'aaaaa'
    >>> ascii.rjust(5, uni)
    u'aaaaa'
    """
|
Indeed, this behavior doesn't seem to be documented. When the string is unicode and the fillchar is non-unicode, Python
implicitly tries to decode the fillchar (and may raise a
TypeError if it's not in range(0,128)):
>>> u'x'.center(5, 'y') # unicode string, non-unicode (str) fillchar
u'yyxyy' # the fillchar is decoded
When the string is non-unicode it only accepts a non-unicode fillchar
(e.g. 'x'.center(5, 'y')) and it raises a TypeError if the fillchar is
unicode:
>>> 'x'.center(5, u'y') # non-unicode (str) string, unicode fillchar
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: center() argument 2 must be char, not unicode
If it tries to decode the fillchar when the string is unicode, it could [...]
Py3, instead, seems to have the opposite behavior. It implicitly encodes [...]
>>> b'x'.center(5, 'y') # byte string, unicode fillchar
b'yyxyy' # the fillchar is encoded
>>> 'x'.center(5, b'y') # unicode string, byte fillchar
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: The fill character cannot be converted to Unicode
The documentation [1] says that "The methods on bytes and bytearray [...]"
|
In Py2.x, I think the desired behavior should match str.join(). If [...]
For Py3.x, I think the goal was to have str.join() enforce that both [...]
|
About Python3, bytes.center accepts unicode as second argument, which [...]
>>> b"x".center(5, b"\xe9")
b'\xe9\xe9x\xe9\xe9'
>>> b"x".center(5, "\xe9")
b'\xe9\xe9x\xe9\xe9'
The second example must fail with a TypeError. str.center has the right behaviour:
>>> "x".center(5, "\xe9")
'ééxéé'
>>> "x".center(5, b"\xe9")
TypeError: The fill character cannot be converted to Unicode
|
haypo> About Python3, bytes.center accepts unicode as second argument, [...]
Ok, it's fixed by r71013 (issue bpo-5499).
|
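The Python 3 behaviour described in this thread (matching types accepted, mixed str/bytes rejected with a TypeError) can be checked with a short probe. This is only a sketch, assuming a current Python 3 interpreter that includes the fix mentioned above; the helper names `accepts` and `probe` are made up for illustration and are not part of any patch.

```python
# Sketch: probe which fillchar types each padding method accepts on Python 3.
# accepts() and probe() are hypothetical helper names used only in this example.
METHODS = ("center", "ljust", "rjust")


def accepts(value, method, fillchar):
    """Return True if value.<method>(5, fillchar) succeeds, False on TypeError."""
    try:
        getattr(value, method)(5, fillchar)
    except TypeError:
        return False
    return True


def probe():
    """Map (receiver type, fillchar type, method) to whether the call works."""
    samples = (("str", "x"), ("bytes", b"x"))
    fills = (("str", "y"), ("bytes", b"y"))
    results = {}
    for recv_name, recv in samples:
        for fill_name, fill in fills:
            for method in METHODS:
                results[(recv_name, fill_name, method)] = accepts(recv, method, fill)
    return results


if __name__ == "__main__":
    for key, ok in sorted(probe().items()):
        print(key, "ok" if ok else "TypeError")
```

On an interpreter with the fix, only the combinations where the receiver and the fillchar have the same type should be reported as accepted.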
This issue only concerns Python 2.x; Python 3.x has the right [...]
|
The question is why str.{ljust,rjust,center} doesn't accept unicode [...]
To be consistent with other string methods, str.{ljust,rjust,center} [...]
Can you write such a patch?
--
str.{ljust,rjust,center} use PyArg_ParseTuple(args, "n|c:...", ...)
unicode.{ljust,rjust,center} use PyArg_ParseTuple(args, "n| [...]
def convert_uc(o):
    # Convert the fill character to unicode (Python 2 code).
    try:
        u = unicode(o)
    except Exception:
        raise TypeError("The fill character cannot be converted to Unicode")
    # The fill character must be a single character.
    if len(u) != 1:
        raise TypeError("The fill character must be exactly one character long")
    return u[0]
convert_uc() accepts a byte string of one ASCII character.
string_count() uses PyArg_ParseTuple(args, "O...", ...) and then test [...]
|
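The convert_uc() logic quoted above is Python 2 code. A rough Python 3 analogue might look like the sketch below. The name `convert_fillchar` is hypothetical, decoding bytes as ASCII is an assumption modelled on Python 2's default codec, and unlike `unicode(o)` this version rejects objects that are neither text nor bytes.

```python
def convert_fillchar(o):
    """Hypothetical Python 3 analogue of convert_uc(): return a 1-character str.

    Assumption: byte strings are decoded as ASCII, mirroring Python 2's
    default codec; anything that is not str/bytes/bytearray is rejected.
    """
    if isinstance(o, (bytes, bytearray)):
        try:
            u = o.decode("ascii")
        except UnicodeDecodeError:
            raise TypeError(
                "The fill character cannot be converted to Unicode"
            ) from None
    elif isinstance(o, str):
        u = o
    else:
        raise TypeError("The fill character cannot be converted to Unicode")
    if len(u) != 1:
        raise TypeError("The fill character must be exactly one character long")
    return u


if __name__ == "__main__":
    # An ASCII byte fillchar is converted before being handed to str.center().
    print("x".center(5, convert_fillchar(b"y")))
```

Python 3 itself does not perform this implicit conversion; the sketch only illustrates the conversion step that the Python 2 unicode methods apply to their fillchar argument.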
As a feature request for 2.x, I think this should be rejected. Any objections? The "behavior" part seems to have been fixed.
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.