This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: distutils sdist add_defaults does not add data_files
Type: behavior Stage: resolved
Components: Distutils Versions: Python 3.0, Python 3.1, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: tarek Nosy List: dripton, gsakkis, tarek
Priority: high Keywords:

Created on 2008-03-12 14:53 by dripton, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (14)
msg63475 - (view) Author: David Ripton (dripton) Date: 2008-03-12 14:53
distutils.sdist.add_defaults adds the Python modules and scripts and C
extensions found in setup.py to the MANIFEST.  It does *not* add
data_files mentioned in setup.py to the MANIFEST.

This is non-orthogonal and confusing, because it means that a
MANIFEST.in is required if you have data_files, optional if you do not.
 If you have data_files and do not have MANIFEST.in, you get a broken
package but no error.
msg81650 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-11 16:46
Agreed; package_data are also ignored by sdist. Unfortunately, neither
setuptools seems to work as expected on sdist but at least bdist_egg does.

MANIFEST.in is an ugly hack and should be deprecated; you shouldn't have
to repeat yourself in two files.
msg81888 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-13 08:58
I'll change add_defaults() so it includes package_data.

I don't think though, that MANIFEST.in is an ugly hack. Imho the best
practice is an explicit MANIFEST.in (and not use the .svn browsing
setuptools do)
msg81980 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-14 00:06
I didn't mean to imply that automagic discovery based on external
version control software is better than MANIFEST.in; I favor
explicitness here as well. It's just that this information can (and
often has to) be duplicated in setup.py as 'package_data' or
'data_files'. For cases that package_data/data_files are not enough,
setup.py should be extended to handle them, instead of requiring to
write and keep in sync a separate file with its own mini syntax.
msg82004 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-14 10:24
Right, but if MANIFEST.in is removed, I can see cases where you would
need the same kind of mini-syntax in your setup.py.

For instance, how would you tell sdist to recursively add files located
in a directory (like the current recursive-include feature of MANIFEST.in) ?
msg82213 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-16 03:18
By an equivalent option in setup() of course. I'm not against the
*functionality* of MANIFEST.in but on that (a) it's a second file you
have to remember to write and maintain in addition to setup.py (b) has
its own ad-hoc syntax instead of python and (c) overlaps in scope with
the package_data and data_files of setup.py.

FWIW I wrote a module that overrides the default build_py and sdist
commands with versions that allow specifying package_data recursively
(while preserving file permissions, unlike the - buggy IMO - behavior of
distutils) so that I can get rid of the MANIFEST.in.
msg82269 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-16 20:36
Ok I have finished the patch, I'll commit it during the week

> FWIW I wrote a module that overrides the default build_py and sdist
> commands with versions that allow specifying package_data recursively

Maybe that could be a new feature ? 


> (while preserving file permissions, unlike the - buggy IMO - behavior of
> distutils) so that I can get rid of the MANIFEST.in.

Sounds like a bug to me, could you fill an issue on that ?
msg82277 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-16 21:59
done in r69692 and r69696.
msg82297 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-17 03:31
> > FWIW I wrote a module that overrides the default build_py and sdist
> > commands with versions that allow specifying package_data recursively
> 
> Maybe that could be a new feature ? 

That would be nice, especially if we want to reimplement MANIFEST.in as
setup() option at some point. My current implementation doesn't extend
the API, so there's no way to specify a subset of files under a
directory like recursive-include; every directory matched by a glob is
copied in whole (recursively):

import os
from distutils.command.build_py import build_py as _build_py

class build_py(_build_py):
    def find_data_files(self, package, src_dir):
        files = []
        for p in _build_py.find_data_files(self, package, src_dir):
            if os.path.isdir(p):
                files.extend(os.path.join(par,f)
                             for par,dirs,files in os.walk(p) 
                             for f in files)
            else:
                files.append(p)
        return files


> > (while preserving file permissions, unlike the - buggy IMO - behavior of
> > distutils) so that I can get rid of the MANIFEST.in.
> 
> Sounds like a bug to me, could you fill an issue on that ?

If it's a bug, it's certainly not accidental; there's a big XXX comment
justifying this choice but I'm not convinced. I posted about it at
http://mail.python.org/pipermail/python-list/2009-January/524263.html;
if you think it's a bug I'll fill an issue.
msg82298 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-17 03:52
> done in r69692 and r69696.

Great, thanks. The data_files part though seems incorrect; for one thing
each item in data_files can be either a (dir,files) tuple or a plain
string, and for two 'dir' is the output (installation) directory, not
the root of 'files'. Here's the relevant part from my module:

if self.distribution.has_data_files():
    for item in self.distribution.data_files:
        if isinstance(item, basestring): # plain file
            self.filelist.append(convert_path(item))
        else: # an (outdir, files) tuple
            outdir,data_files = item
            self.filelist.extend(convert_path(f) for f in data_files)
msg82309 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-17 09:44
added in r69710. Thx
msg82313 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-17 10:08
2009/2/17 George Sakkis <report@bugs.python.org>:
>> Maybe that could be a new feature ?
>
> That would be nice, especially if we want to reimplement MANIFEST.in as
> setup() option at some point. My current implementation doesn't extend
> the API, so there's no way to specify a subset of files under a
> directory like recursive-include; every directory matched by a glob is
> copied in whole (recursively):

Please could you add a feature request ?

We will need to discuss it there.

>
> import os
> from distutils.command.build_py import build_py as _build_py
>
> class build_py(_build_py):
>    def find_data_files(self, package, src_dir):
>        files = []
>        for p in _build_py.find_data_files(self, package, src_dir):
>            if os.path.isdir(p):
>                files.extend(os.path.join(par,f)
>                             for par,dirs,files in os.walk(p)
>                             for f in files)
>            else:
>                files.append(p)
>        return files

> If it's a bug, it's certainly not accidental; there's a big XXX comment
> justifying this choice but I'm not convinced. I posted about it at
> http://mail.python.org/pipermail/python-list/2009-January/524263.html;
> if you think it's a bug I'll fill an issue.

Please do so, I am focusing on the Distutils-SIG ML , so I missed it

I don't know yet what is a proper way to adress this, but the bug tracker seem
apprioriate for this.

>
> ----------
> versions: +Python 2.6, Python 3.0
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue2279>
> _______________________________________
> _______________________________________________
> Python-bugs-list mailing list
> Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/ziade.tarek%40gmail.com
>
>
msg82383 - (view) Author: George Sakkis (gsakkis) Date: 2009-02-17 22:45
Opened #5300 and #5302 for the mentioned issues. 

Btw in your patch, I believe `os.path.join(dirname, f)` should be
replaced by `f` alone; `dirname` refers to the dir under the
installation directory, not the source.
msg82388 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-02-17 23:08
> Opened #5300 and #5302 for the mentioned issues. 

ok thanks

> Btw in your patch, I believe `os.path.join(dirname, f)` should be
> replaced by `f` alone; `dirname` refers to the dir under the
> installation directory, not the source.

Right, thanks George. I'v also added you in the ACKS for your contribution
on this (r69724)
History
Date User Action Args
2022-04-11 14:56:31adminsetgithub: 46532
2010-06-03 14:53:13eric.araujosetresolution: accepted -> fixed
stage: resolved
2009-02-17 23:08:21tareksetmessages: + msg82388
2009-02-17 22:45:59gsakkissetmessages: + msg82383
2009-02-17 10:08:36tareksetmessages: + msg82313
2009-02-17 09:44:20tareksetmessages: + msg82309
2009-02-17 03:52:33gsakkissetmessages: + msg82298
2009-02-17 03:31:40gsakkissetmessages: + msg82297
versions: + Python 2.6, Python 3.0
2009-02-16 22:03:38tareksetversions: - Python 2.6, Python 3.0
2009-02-16 21:59:11tareksetstatus: open -> closed
messages: + msg82277
2009-02-16 20:36:25tareksetmessages: + msg82269
2009-02-16 03:18:41gsakkissetmessages: + msg82213
2009-02-14 10:36:37tareksetmessages: - msg82007
2009-02-14 10:35:52tareksetmessages: + msg82007
2009-02-14 10:30:50tareksetmessages: + msg82004
2009-02-14 00:06:05gsakkissetmessages: + msg81980
2009-02-13 08:58:38tareksetpriority: high
resolution: accepted
messages: + msg81888
2009-02-11 16:46:34gsakkissetnosy: + gsakkis
messages: + msg81650
2009-02-06 09:13:35tareksetassignee: tarek
nosy: + tarek
versions: + Python 3.1, Python 2.7, - Python 2.5, Python 2.4
2008-03-12 14:53:31driptoncreate