classification
Title: string.zfill and unicode
Type: Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: doerwalter Nosy List: akuchling, doerwalter, gvanrossum, loewis, mwh
Priority: normal Keywords: patch

Created on 2002-03-28 13:26 by doerwalter, last changed 2002-04-22 12:03 by doerwalter. This issue is now closed.

Files
File name Uploaded Description
diff.txt doerwalter, 2002-03-28 13:26
diff.txt doerwalter, 2002-04-12 18:37 Implements zfill as str and unicode methods
diff3.txt doerwalter, 2002-04-17 18:55 Checks that str/unicode methods for str/unicode subinstances don't return subinstances
branch-diff.txt doerwalter, 2002-04-19 16:30
Messages (22)
msg39372 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-03-28 13:26
This patch makes the function string.zfill work with 
unicode instances (and instances of str and unicode 
subclasses). Currently string.zfill(u"123", 10) 
results in "0000u'123'". With this patch the result is 
u'0000000123'.

Should zfill be made a real str and unicode method? I 
noticed that a zfill implementation is available in 
unicodeobject.c, but commented out.
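
For readers on a current Python, the reported behavior can be approximated as follows. This is only a sketch, not the original Lib/string.py source: `old_zfill`, `patched_zfill` and `_pad` are hypothetical names, the exact-type check is an assumption about how the old function decided when to call repr(), and a str subclass stands in for the unicode instance because Python 3 has no separate unicode type.

```python
class MyStr(str):
    """Stand-in for a string type that is not exactly str."""


def _pad(s, width):
    # Insert zeros after a leading sign, if there is one.
    n = len(s)
    if n >= width:
        return s
    sign = ""
    if s[:1] in ("-", "+"):
        sign, s = s[0], s[1:]
    return sign + "0" * (width - n) + s


def old_zfill(x, width):
    # Assumed pre-patch logic: anything that is not exactly a str goes
    # through repr(), so the quotes end up inside the result.
    s = x if type(x) is str else repr(x)
    return _pad(s, width)


def patched_zfill(x, width):
    # Sketch of the patched logic: every string type is used as is.
    s = x if isinstance(x, str) else repr(x)
    return _pad(s, width)


print(old_zfill(MyStr("123"), 10))      # 00000'123'
print(patched_zfill(MyStr("123"), 10))  # 0000000123
```

The first call pads the repr, quotes included, which mirrors the "0000u'123'" result described above.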
msg39373 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2002-03-29 16:24
Thanks for your patch!  I've checked it into CVS, with two 
modifications.  First, I removed the code to handle the case 
where Python doesn't have a unicode() built-in; there's no 
expectation that you can take the standard library for Python 
version N and use it with version N-1, so this code isn't needed.

Second, I changed string.zfill() to take the str() and not the repr()
when it gets a non-string object because that seems to make 
more sense.


msg39374 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-03-30 11:16
But Python could be compiled without unicode support (by
undefining Py_USING_UNICODE), and string.zfill should work
even in this case.

What about making zfill a real str and unicode method?
msg39375 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-03-30 11:25
Hah, I was going to say that but was distracted by IE 
wiping out the machine I'm sitting at.

Re-opening.
msg39376 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-04-12 14:51
Re: optional Unicode: Walter is correct; configuring with
--disable-unicode currently breaks the string module. One
might consider using types.StringTypes; OTOH, pulling in
types might not be desirable.

As for str vs. repr: Python was always using repr in zfill,
so changing it may break things.

So I recommend that Walter reverts Andrew's check-in and
applies his change.
msg39377 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-12 18:37
Now that test_userstring.py works and fails (rev 1.6), 
should we add zfill as str and unicode methods or change 
UserString.zfill to use string.zfill?

I've made a patch (attached) that implements zfill as 
methods (i.e. activates the version in unicodeobject.c that 
was commented out and implements the same in stringobject.c).

(And it adds the test for unicode support back in.)
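
As a method, zfill is called directly on the string. A short usage sketch on a current Python 3 interpreter, where str is the Unicode type:

```python
print("123".zfill(10))   # 0000000123
print("-42".zfill(6))    # -00042
print("+7".zfill(4))     # +007
print("12345".zfill(3))  # 12345 (already wide enough, nothing is padded)
```

A leading sign stays in front and the zeros are inserted after it.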
msg39378 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-13 01:00
I'm for making them methods. Walter, just check it in!
msg39379 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-15 13:41
Checked in as:
Doc/lib/libstdtypes.tex 1.88
Lib/UserString.py 1.12
Lib/string.py 1.63
test/string_tests.py 1.13
test/test_unicode.py 1.54
Misc/NEWS 1.388
Objects/stringobject.c 2.157
Objects/unicodeobject.c 2.138
msg39380 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-15 13:53
Thanks, Walter! Some nits:

The string_zfill() code you checked in caused two warnings
about modifying data pointed to by a const pointer. I've
removed the const, but I'd like to understand how come you
didn't catch this. Does your compiler not warn you? Or did
you ignore warnings? (The latter's a sin in Python-land :-).

I've also folded some long lines that weren't your fault --
but I noticed that elsewhere you checked in some long lines;
please try to limit line length to 78.
msg39381 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-15 14:43
> Does your compiler not warn you? Or did
> you ignore warnings? 
> (The latter's a sin in Python-land :-).

The warning was just lost in the long list of outputs.

Now that you mention it, there are still a few warnings 
(gcc 2.96 on Linux):
Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format':
Objects/unicodeobject.c:5574: warning: int format, long int 
arg (arg 3)
Objects/unicodeobject.c:5574: warning: unsigned int format, 
long unsigned int arg (arg 4)

libpython2.3.a(posixmodule.o): In function `posix_tmpnam':
Modules/posixmodule.c:5150: the use of `tmpnam_r' is 
dangerous, better use `mkstemp'
libpython2.3.a(posixmodule.o): In function `posix_tempnam':
Modules/posixmodule.c:5100: the use of `tempnam' is 
dangerous, better use `mkstemp'

Modules/pwdmodule.c: In function `initpwd':
Modules/pwdmodule.c:161: warning: unused variable `d'

Modules/readline.c: In function `set_completer_delims':
Modules/readline.c:273: warning: passing arg 1 of `free' 
discards qualifiers from pointer target type

Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not 
used

Should I open a separate bug report for that?

> I've also folded some long lines that weren't 
> your fault -- but I noticed that elsewhere you 
> checked in some long lines;
> please try to limit line length to 78.

I noticed your descrobject.c checkin message.
msg39382 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-15 14:47
Yes, please open a separate bug report for those (I'd open a
separate report for each file with warnings, unless you have
an obvious fix).
msg39383 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-15 18:23
Currently zfill returns the original if nothing has to be 
done. Should I change this to do so only if it's a real 
str or unicode instance (as was done for lots of methods 
for bug http://www.python.org/sf/460020)?
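
The question concerns the case where no padding is needed. A sketch of the proposed behavior, which is how current CPython 3 behaves; `MyStr` is a hypothetical subclass used only for illustration:

```python
class MyStr(str):
    pass


s = MyStr("12345")
unchanged = s.zfill(3)  # width is smaller than len(s), nothing to pad

print(unchanged == s)                 # True
print(type(unchanged) is str)         # True: a plain str, not a MyStr
print(type("12345".zfill(3)) is str)  # True
```

For a subclass instance the method hands back a plain str with the same value instead of the original object.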
msg39384 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-15 18:29
Yes, that's the right thing.  Reopened this for now.
msg39385 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-15 18:47
Checked in as:
Objects/stringobject.c 2.159
Objects/unicodeobject.c 2.139

Maybe we could add a test to Lib/test/test_unicode.py and 
Lib/test/test_string.py that makes sure that no method 
returns a str/unicode subinstance even when called for a 
str/unicode subinstance?
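
A test along these lines could look roughly like the following. It is a sketch written for current Python 3, not the contents of diff3.txt; `StrSubclass`, `CASES` and `methods_returning_subinstances` are hypothetical names, and the list of methods is only a sample.

```python
class StrSubclass(str):
    pass


# (input, method name, args), chosen so that the result equals the input.
CASES = [
    ("abc", "lower", ()),
    ("ABC", "upper", ()),
    ("Abc", "capitalize", ()),
    ("abc", "strip", ()),
    ("abc", "ljust", (1,)),
    ("abc", "replace", ("x", "y")),
    ("123", "zfill", (1,)),
]


def methods_returning_subinstances(cases):
    """Return the names of methods whose result is not an exact str."""
    offenders = []
    for value, name, args in cases:
        result = getattr(StrSubclass(value), name)(*args)
        if type(result) is not str:
            offenders.append(name)
    return offenders


print(methods_returning_subinstances(CASES))  # [] on current CPython 3
```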
msg39386 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-15 18:48
If you want to be thorough, yes, that's a good test to add!
msg39387 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-17 18:55
diff3.txt adds these tests to Lib/test/test_unicode.py and 
Lib/test/test_string.py. All tests pass (except that 
currently test_unicode.py fails the unicode_internal 
roundtripping test with --enable-unicode=ucs4) and when I 
change zfill back to always return self they properly fail.

I don't know whether the fail message should be made 
better, and how this would interact with "make test" and 
whether the "Prefer string methods over string module 
functions" part in test_string.py might pose problems.

And maybe the code could be simplified to always use the 
subclasses without first trying str and unicode?
msg39388 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-17 20:50
The test seems fine, and a good addition.  Don't worry too
much about how to report the failure (though perhaps
including the key word "subtype" in the error output might
help).

I noticed that when I change the Unicode function fixup() to
not do a check for subclasses, I only get very few failures:
one for capitalize, two for lower, one for upper. I think
this is because the test suite doesn't have enough sample
cases where the output is the same as the input. Maybe some
could be added.

But go ahead and check in diff3.txt.
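
The point about sample cases where the output equals the input can be illustrated with a deliberately broken method. This is a sketch for current Python 3; `MyStr` and `buggy_upper` are hypothetical and exist only to show why such inputs are needed.

```python
class MyStr(str):
    pass


def buggy_upper(s):
    # Deliberately wrong: returns the original object when nothing changes,
    # which leaks the subclass instance to the caller.
    result = str.upper(s)
    return s if result == s else result


print(type(buggy_upper(MyStr("abc"))) is str)  # True: the bug stays hidden
print(type(buggy_upper(MyStr("ABC"))) is str)  # False: the bug shows up
```

Only the second input, which is already upper case, exposes the missing subclass check.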
msg39389 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-17 21:35
Checked in as:
Lib/test/test_string.py 1.16
Lib/test/test_unicode.py 1.56
msg39390 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-04-18 09:09
Walter, do you feel like sorting out the release22-maint
branch too?

It's probably best to activate the new string methods there
too.  I can't see how it could possibly break anything.
msg39391 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-19 16:30
OK, I have a branch version that has the methods (attached 
as branch-diff.txt). In addition to the zfill changes it 
has Guido's change to test_userstring.py and the 
subinstance checks in the test.

Does this look ok?
msg39392 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-04-22 10:17
Not sure if you were asking me, but it looks fine from here.

Do you want me to check it in?
msg39393 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2002-04-22 12:03
Checked in as:
Doc/lib/libstdtypes.tex 1.80.6.5
Lib/UserString.py 1.10.18.2
Lib/string.py 1.60.16.2
test/string_tests.py 1.10.16.2
test/test_string.py 1.15.6.1
test/test_unicode.py 1.47.6.2
test/test_userstring.py 1.5.24.1
Misc/NEWS 1.337.2.4.2.25
Objects/stringobject.c 2.147.6.2
Objects/unicodeobject.c 2.124.6.7
History
Date User Action Args
2002-03-28 13:26:29 doerwalter create