Author merwok
Recipients haypo, merwok, tarek
Date 2010-08-11.03:21:25
Message-id <1281496888.52.0.395441307665.issue9561@psf.upfronthosting.co.za>
In-reply-to
Content
There are different kinds of files created by write_file:

- PKG-INFO (METADATA in distutils2), which already uses a trick to support Unicode, but your change would be a better replacement for it;

- MANIFEST, which with your fix would gain the ability to handle non-ASCII paths, which is a feature or a bugfix depending on your point of view;

- .def files, used by the compilers for the C linking step; I don’t know if it’s appropriate to allow UTF-8 there;

- RPM spec files, which use ASCII or UTF-8 according to http://en.opensuse.org/openSUSE:Specfile_guidelines#Specfile_Encoding but it’s not confirmed in http://www.rpm.org/max-rpm/s1-rpm-build-creating-spec-file.html (linked from the LSB site), so there’s no guarantee this works for all RPM platforms. This sort of platform-specific thing is the reason why RPM support has been removed in distutils2.

- record and .pth files created by the install command.

I agree that there is something to be fixed, but I don’t know if it can be fixed in distutils. Unicode in PKG-INFO is unrelated to file names, whereas MANIFEST, spec, record and .pth files contain file or directory paths. If this is going to be fixed, write_file should not use UTF-8 unconditionally but grow a keyword argument IMO, so that use cases requiring ASCII continue to work.
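To make the keyword-argument idea concrete, here is a minimal sketch. It is not the actual distutils code: the `encoding` parameter and its default of None are assumptions made for illustration, with None simply deferring to whatever open() does by default.

```python
def write_file(filename, contents, encoding=None):
    """Write a sequence of strings to filename, one string per line.

    The encoding keyword is hypothetical (not part of distutils).
    None leaves the choice to open(), i.e. the locale default.
    """
    with open(filename, "w", encoding=encoding) as f:
        for line in contents:
            f.write(line + "\n")
```

With something like this, a caller writing MANIFEST could pass encoding="utf-8", while a caller that needs plain ASCII (a spec or .def file, say) could pass encoding="ascii" and get a UnicodeEncodeError on a non-ASCII path instead of a silently changed file format.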

When you say “patch *all* functions reading files”, I guess you mean all functions that read distutils files, i.e. MANIFEST and PKG-INFO.

Tarek, is this a bug fix or a feature? Could it break third-party tools?
History
Date User Action Args
2010-08-11 03:21:29 merwok set recipients: + merwok, haypo, tarek
2010-08-11 03:21:28 merwok set messageid: <1281496888.52.0.395441307665.issue9561@psf.upfronthosting.co.za>
2010-08-11 03:21:27 merwok link issue9561 messages
2010-08-11 03:21:26 merwok create