Message 107579 - Python tracker

Author	vstinner
Recipients	lemburg, loewis, vstinner
Date	2010-06-11.20:12:28
Message-id	<1276287150.27.0.398158320639.issue8949@psf.upfronthosting.co.za>

Content
Some examples of functions using "s" format:
 * str.encode(encoding, errors), bytes.decode(encoding, errors): both arguments have to be unicode strings
 * compile(source, filename, mode, ...): filename and mode have to be unicode strings
 * crypt.crypt(word, salt): both arguments have to be unicode strings

I think that crypt() should also accept bytes, but str.encode() and bytes.decode() should not.
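As a rough illustration of the str-only behaviour described above, the following sketch probes str.encode() and bytes.decode() from Python. The helper name accepts_args is hypothetical, and the results reflect a current CPython 3 interpreter, which may not match the 2010 development build discussed in this message.

```python
def accepts_args(func, *args):
    """Return True if func(*args) succeeds, False if it raises TypeError.

    Hypothetical helper, used only for this illustration.
    """
    try:
        func(*args)
    except TypeError:
        return False
    return True


# str values for encoding and errors are accepted by both methods.
encode_with_str = accepts_args("unicode".encode, "utf-8", "strict")
decode_with_str = accepts_args(b"unicode".decode, "utf-8", "strict")

# A bytes value for the encoding name is rejected with TypeError.
encode_with_bytes = accepts_args("unicode".encode, b"utf-8")
decode_with_bytes = accepts_args(b"unicode".decode, b"utf-8")

print(encode_with_str, decode_with_str, encode_with_bytes, decode_with_bytes)
```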

Some examples of functions using "z" format:
 * _locale.bindtextdomain(domain, dirname): dirname uses the "z" format and so accepts str, bytes or any buffer-compatible object. It should use PyUnicode_FSConverter() instead, but I agree that accepting bytes is welcome here.
 * readline.(write_history_file|read_init_file|read_history_file) functions use "z" to parse a filename. PyUnicode_FSConverter() would be better here too, but in this case "z" is better than "s" :-)
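To show what accepting both str and bytes for a filename looks like, here is a minimal Python-level sketch using os.fsencode(). It is only a rough analogue of PyUnicode_FSConverter(), which is a C-level converter; os.fsencode() was added in a later Python release than the one discussed here. The function name to_fs_bytes and the filename "history.txt" are made up for the example.

```python
import os


def to_fs_bytes(filename):
    """Return filename as bytes, accepting either str or bytes.

    Hypothetical helper: str is encoded with the filesystem encoding,
    bytes is returned unchanged.
    """
    return os.fsencode(filename)


# Both spellings of the same ASCII filename give the same bytes object value.
from_str = to_fs_bytes("history.txt")
from_bytes = to_fs_bytes(b"history.txt")

print(from_str == from_bytes)
```

A converter of this kind lets a function take a filename in either form without the caller having to know which encoding the filesystem uses.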

I don't know why "s" and "z" treat bytes differently, but it will be difficult to change that without changing a lot of code (all functions using these formats). I tried to reject types other than str for "z": most tests of the test suite fail. I tried to accept bytes for the "s" format: "unicode".encode(b'abc') segfaults.
