
Author tarek
Recipients tarek
Date 2008-04-06.09:47:15
SpamBayes Score 0.04822157
Marked as misclassified No
Message-id <1207475239.5.0.0696358951433.issue2562@psf.upfronthosting.co.za>
In-reply-to
Content
If I put my name, which contains a non-ASCII character, in the Author
field as a byte string (str), it will break, because distutils assumes
that the fields are ASCII-encoded strings: it decodes each one into
unicode, then encodes it in UTF-8 to send the data.

See distutils.command.register.post_to_server:

value = unicode(value).encode("utf-8")
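
The failure mode can be reproduced in a minimal sketch. Python 3 is used here so the snippet runs on a current interpreter; the implicit ASCII decode that Python 2 performs in unicode(value) on a byte string is emulated with an explicit decode. The function name post_value and the sample names are assumptions for illustration, not distutils code.

```python
def post_value(value):
    """Emulate the register step for a Python 2 byte string.

    In Python 2, unicode(value) on a str decodes with the default
    'ascii' codec; here that step is spelled out explicitly.
    """
    if isinstance(value, bytes):
        value = value.decode("ascii")  # implicit in Python 2's unicode()
    return value.encode("utf-8")


# A plain ASCII byte string survives the round trip.
ascii_result = post_value(b"Tarek")

# A UTF-8 encoded name with a non-ASCII character does not.
try:
    post_value("Ziad\u00e9".encode("utf-8"))
    failed = False
except UnicodeDecodeError:
    failed = True
```

Passing text instead of bytes skips the ASCII decode entirely, which is why supplying unicode fields avoids this particular error.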


One way to avoid this error is to provide unicode for all fields,
but that fails further along if setuptools is used, because
this other package assumes that the fields *are* strings::

self.run_command('egg_info')
...
distutils/dist.py", line 1047, in write_pkg_info
    pkg_info.write('Author: %s\n' % self.get_contact() )
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
position 18: ordinal not in range(128)
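
The second failure can be sketched the same way: a unicode value is formatted into the line and then encoded as ASCII, which is what Python 2 does implicitly when a unicode object is written to a regular file. This is a Python 3 emulation under that assumption; write_author_line and the sample contact are hypothetical and are not taken from distutils.

```python
import io


def write_author_line(stream, contact):
    """Write an Author line to a byte stream, encoding it as ASCII.

    Mirrors the implicit encoding step Python 2 applies when a
    unicode object is written to a byte-oriented file.
    """
    line = "Author: %s\n" % contact
    stream.write(line.encode("ascii"))


buffer = io.BytesIO()
try:
    write_author_line(buffer, "Tarek Ziad\u00e9")
    error_position = None
except UnicodeEncodeError as exc:
    # exc.start is the index of the first character that cannot be encoded.
    error_position = exc.start
```

Nothing is written to the buffer when the encode fails, so the metadata file would be left incomplete at that line.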

So distutils shouldn't assume that it receives ASCII strings
and make a raw unicode() call; it should instead assume that
it receives unicode fields only.


Since many packages out there use strings, I have left a unicode()
call in my patch, together with a warning. 
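
One way such a compatibility step could look is sketched below. It is not the attached patch, which is not reproduced here; ensure_text is a hypothetical helper, written for Python 3, where bytes stands in for the Python 2 str type. The warning category and message are also assumptions.

```python
import warnings


def ensure_text(value, field="field"):
    """Return value as text, warning when a byte string is coerced."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        warnings.warn(
            "%s should be given as unicode; decoding it as ASCII" % field,
            DeprecationWarning,
            stacklevel=2,
        )
        return value.decode("ascii")
    return str(value)


# Record warnings so the coercion can be observed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = ensure_text(b"Tarek", field="author")
```

A non-ASCII byte string would still raise UnicodeDecodeError in this sketch; the warning only steers package authors toward unicode fields.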

A test is provided.
History
Date                 User   Action  Args
2008-04-06 09:47:19  tarek  set     spambayes_score: 0.0482216 -> 0.04822157; recipients: + tarek
2008-04-06 09:47:19  tarek  set     spambayes_score: 0.0482216 -> 0.0482216; messageid: <1207475239.5.0.0696358951433.issue2562@psf.upfronthosting.co.za>
2008-04-06 09:47:18  tarek  link    issue2562 messages
2008-04-06 09:47:17  tarek  create