Author vstinner
Recipients ezio.melotti, ignas, rhettinger, vstinner
Date 2009-05-04.12:58:26
SpamBayes Score 7.03219e-07
Marked as misclassified No
Message-id <1241441908.38.0.732899725763.issue3446@psf.upfronthosting.co.za>
In-reply-to
Content
The question is why str.{ljust,rjust,center} don't accept a unicode 
argument, whereas unicode.{ljust,rjust,center} do accept an ASCII byte 
string. Other string methods accept a unicode argument, e.g. 
str.count() (which encodes the unicode string to bytes using the utf8 
charset).
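As a pure-Python sketch of that behavior (the function name and the utf8 assumption are illustrative only, not CPython's actual C code):

```python
def bytes_count(haystack, needle):
    # Illustrative re-creation, in Python 3 terms, of Python 2's
    # str.count() accepting a unicode needle: implicitly encode the
    # text needle to bytes (utf8 assumed here), then search.
    if isinstance(needle, str):
        needle = needle.encode("utf8")
    return haystack.count(needle)
```

So bytes_count(b"spam spam", "spam") and bytes_count(b"spam spam", b"spam") give the same result.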

To be consistent with the other string methods, str.{ljust,rjust,center} 
should accept unicode strings and convert them to byte strings using 
utf8, as str.count() does. But I hate such implicit conversions (I 
prefer the Python3 way: disallow mixing bytes and characters), so I 
will not contribute such a patch.
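For comparison, Python 3 enforces that separation outright: the padding methods on bytes and str each reject a fill character of the other type. A quick check, runnable on any Python 3:

```python
def mixing_rejected():
    # Python 3 refuses to mix bytes and text fill characters in
    # ljust/rjust/center: both directions raise TypeError.
    results = []
    try:
        b"abc".center(7, "*")   # str fill char for a bytes object
    except TypeError:
        results.append("bytes+str rejected")
    try:
        "abc".center(7, b"*")   # bytes fill char for a str object
    except TypeError:
        results.append("str+bytes rejected")
    return results
```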

Can you write such patch?

--

str.{ljust,rjust,center} use PyArg_ParseTuple(args, "n|c:...", ...), 
and the getarg('c') converter only accepts a string of exactly 1 byte.

unicode.{ljust,rjust,center} use 
PyArg_ParseTuple(args, "n|O&:...", ..., convert_uc, ...), where 
convert_uc looks something like:

  def convert_uc(o):
      try:
          u = unicode(o)
      except Exception:
          raise TypeError("The fill character cannot be converted to Unicode")
      if len(u) != 1:
          raise TypeError("The fill character must be exactly one character long")
      return u[0]

convert_uc() therefore also accepts a byte string containing a single 
ASCII character.
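A Python-level sketch of a converter with that rule (an illustration of the behavior described above, written for Python 3; the name convert_fillchar is hypothetical, not the actual C converter):

```python
def convert_fillchar(o):
    # Accept a 1-character text string, or a 1-byte ASCII byte
    # string (decoded to text), mirroring convert_uc's behavior.
    if isinstance(o, bytes):
        try:
            o = o.decode("ascii")
        except UnicodeDecodeError:
            raise TypeError("The fill character cannot be converted to Unicode")
    if not isinstance(o, str):
        raise TypeError("The fill character cannot be converted to Unicode")
    if len(o) != 1:
        raise TypeError("The fill character must be exactly one character long")
    return o
```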

string_count() uses PyArg_ParseTuple(args, "O...", ...) and then tests 
the type of the substring itself.
History
Date User Action Args
2009-05-04 12:58:28vstinnersetrecipients: + vstinner, rhettinger, ezio.melotti, ignas
2009-05-04 12:58:28vstinnersetmessageid: <1241441908.38.0.732899725763.issue3446@psf.upfronthosting.co.za>
2009-05-04 12:58:26vstinnerlinkissue3446 messages
2009-05-04 12:58:26vstinnercreate