Author eric.araujo
Recipients eric.araujo, tarek, vstinner
Date 2010-08-11.03:21:25
There are different kinds of files created by write_file:

- PKG-INFO (METADATA in distutils2), which already uses a trick to support Unicode, but your change would replace it in a better way;

- MANIFEST, which with your fix would gain the ability to handle non-ASCII paths, which is a feature or a bugfix depending on your point of view;

- .def files, used by the compilers for the C linking step; I don’t know if it’s appropriate to allow UTF-8 there.

- RPM spec files, which use ASCII or UTF-8 according to one reference, though this is not confirmed in another (linked from the LSB site), so there’s no guarantee this works for all RPM platforms. This sort of platform-specific thing is the reason why RPM support has been removed in distutils2.

- record and .pth files created by the install command.

I agree that there is something to be fixed, but I don’t know if it can be fixed in distutils. Unicode in PKG-INFO is unrelated to file names, whereas MANIFEST, spec, record and .pth files list files or directories. If this is going to be fixed, write_file should not use UTF-8 unconditionally but grow a keyword argument IMO, so that use cases requiring ASCII continue to work.
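
To make the keyword-argument idea concrete, here is a minimal sketch of what I have in mind. It is not the current distutils code: the parameter name `encoding` and its ASCII default are assumptions for illustration, and the real signature would be up for discussion.

```python
def write_file(filename, contents, encoding="ascii"):
    """Write each string in 'contents' to 'filename', one per line.

    Sketch only: 'encoding' is a hypothetical keyword argument, and the
    ASCII default is an assumption chosen so that existing callers
    (e.g. those writing .def files) keep their current behavior.
    """
    with open(filename, "w", encoding=encoding) as f:
        for line in contents:
            f.write(line + "\n")
```

With this shape, a caller that knows its file may contain non-ASCII paths (MANIFEST, for example) would pass encoding="utf-8" explicitly, while a caller that leaves the default gets a UnicodeEncodeError on non-ASCII input instead of silently writing UTF-8.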

When you say “patch *all* functions reading files”, I guess you mean all functions that read distutils files, i.e. MANIFEST and PKG-INFO.

Tarek, is this a bug fix or a feature? Could it break third-party tools?
Date User Action Args
2010-08-11 03:21:29 eric.araujo set recipients: + eric.araujo, vstinner, tarek
2010-08-11 03:21:28 eric.araujo set messageid: <>
2010-08-11 03:21:27 eric.araujo link issue9561 messages
2010-08-11 03:21:26 eric.araujo create