
classification
Title: os.listdir-alike that includes file type
Type: enhancement Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: donut, draghuram, loewis
Priority: normal Keywords:

Created on 2002-10-06 12:22 by donut, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (4)
msg61100 - (view) Author: Matthew Mueller (donut) Date: 2002-10-06 12:22
I propose to add two new functions, say os.listdirtypes
and os.llistdirtypes.  These would be similar to
os.listdir except they would return a list of tuples
(filename, filetype).  This would have the advantage
that on OSes that support the d_type entry in the
dirent struct the type could be determined without
extra calls and hard drive reads.  Even on
non-supporting os/filesystems, it could emulate it with
a call to stat/lstat in the func, still saving some
work of calling stat and interpreting its result in
python code or using os.path.isX.

Filetype would indicate whether the entry was a file,
directory, link, fifo, etc.  This could either be a
char (like ls -l gives) ('-', 'd', 'l', 'p', etc), or
some sort of constant (os.DT_REG, os.DT_DIR, os.DT_LNK,
os.DT_FIFO, etc).  Personally I think the string method
is simpler and easier, though some (non-*ix) people may
be confused by '-' being file rather than 'f'.  (Of
course, you could change that, but then *ix users would
be confused ;)

listdirtypes would be equivalent to using stat, ie.
symlinks would be followed when determining types, and
llistdirtypes would be like lstat so symlinks would be
indicated as 'l'.
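
A minimal sketch of the emulated behaviour described above, written in pure Python. The names listdirtypes and llistdirtypes come from the proposal and do not exist in the os module; the helper names and the '?' fallback for unreadable or unknown entries are assumptions made for illustration. This version always calls stat or lstat, so it shows only the proposed semantics, not the d_type speedup.

```python
import os
import stat

# Type characters in the style of `ls -l`, as suggested in the proposal.
_TYPE_CHARS = [
    (stat.S_ISREG, '-'),
    (stat.S_ISDIR, 'd'),
    (stat.S_ISLNK, 'l'),
    (stat.S_ISFIFO, 'p'),
    (stat.S_ISSOCK, 's'),
    (stat.S_ISCHR, 'c'),
    (stat.S_ISBLK, 'b'),
]


def _type_char(mode):
    """Map an st_mode value to a one-character file type."""
    for predicate, char in _TYPE_CHARS:
        if predicate(mode):
            return char
    return '?'  # assumption: unknown types are reported as '?'


def _list_with(stat_func, path):
    """Return [(filename, filetype)] using the given stat function."""
    result = []
    for name in os.listdir(path):
        try:
            mode = stat_func(os.path.join(path, name)).st_mode
        except OSError:
            # e.g. a dangling symlink when following links
            result.append((name, '?'))
        else:
            result.append((name, _type_char(mode)))
    return result


def listdirtypes(path):
    """Hypothetical: like os.stat, symlinks are followed."""
    return _list_with(os.stat, path)


def llistdirtypes(path):
    """Hypothetical: like os.lstat, symlinks are reported as 'l'."""
    return _list_with(os.lstat, path)
```

For a directory containing a symlink to a subdirectory, listdirtypes would report the link as 'd' while llistdirtypes would report it as 'l'.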

An app I'm working on right now that reads in a
directory tree on startup got about a 2.2x speedup when
I implemented this as an extension module, and about
1.6x speedup when I tested it without d_type support. 
(The module was written using Pyrex, so it's not a
candidate for inclusion itself, but I would be willing
to work on a C implementation if this idea is accepted.)
msg61101 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-10-13 14:08
I'm in favour of exposing more information received from
readdir. I'm not sure whether adding new functions is the
right API, perhaps adding a flag to the existing listdir is
sufficient.

I don't think listdir should perform stat calls itself; if
the system has some information available, fine, if it
doesn't, return nothing.

What is the proposed difference between listdirtypes and
llistdirtypes?

On the return type of the "verbose" listdir, I think it
should return structs with named fields, such as d_ino,
d_name, and d_type. Callers can then find out themselves
what information they got, and augment this with information
from stat that they also need. In particular, d_type should
be returned as presented in the system, since it might have
slight semantic difference to what os.stat would tell about
the file.
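
A rough sketch of what records with named fields could look like from the caller's side. It is built on os.scandir, which was added to Python long after this discussion and does not expose the raw d_type value, so d_type is approximated here from the DirEntry type checks. The names DirEnt and listdir_verbose are hypothetical, and the DT_* numbers are the Linux <dirent.h> values, assumed for illustration only.

```python
import os
from collections import namedtuple

# Assumed Linux <dirent.h> values; Python's os module does not define these.
DT_UNKNOWN, DT_DIR, DT_REG, DT_LNK = 0, 4, 8, 10

# Hypothetical record type with the field names suggested in the message.
DirEnt = namedtuple('DirEnt', ['d_ino', 'd_name', 'd_type'])


def listdir_verbose(path):
    """Return a list of DirEnt records for the entries in path."""
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            # Approximation: the type is derived without following symlinks.
            if entry.is_symlink():
                d_type = DT_LNK
            elif entry.is_dir(follow_symlinks=False):
                d_type = DT_DIR
            elif entry.is_file(follow_symlinks=False):
                d_type = DT_REG
            else:
                d_type = DT_UNKNOWN
            entries.append(DirEnt(entry.inode(), entry.name, d_type))
    return entries
```

A caller can then test e.d_type == DT_UNKNOWN and call os.stat or os.lstat only for the entries that need it.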

This should extend to other systems as well. E.g. on
Windows, it is possible to learn the modification times from
listdir, with no extra overhead.

There should also be a way to use this with os.path.walk.

So, in short, I'm in favour of this idea. Would you
volunteer to write a PEP, and provide the Unix implementation?
msg61102 - (view) Author: Matthew Mueller (donut) Date: 2002-10-13 19:57
Adding a flag to the existing listdir as opposed to adding
more functions would be fine I think.

There are two reasons I suggest adding the stat calls in
listdir.  The first is purely practical, and that is even
without a filesystem that supports the d_type field, you can
still get a decent speed up merely by performing the stat
call in C rather than python.

The second is from a usability point of view.  If listdir
would not do the stat for you, your code would always have
to have a separate case to handle the non-d_type using
filesystems, so it would not really make listdir any easier
to use, whereas if listdir did the stat itself, you could
simplify a huge amount of code out there that always follows
an os.listdir by os.stat or os.path.isX.

Perhaps the d_type field could be returned verbatim, and a
separate field could be added that would be set from d_type
when d_type is something useful, and otherwise by a call to
stat.  That way you could still see, if you really wanted
to, whether the filesystem actually gave you the d_type.
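
One way to picture the two-field idea is the sketch below. The names TypedEntry and resolve_entry are hypothetical, and the sketch assumes the Linux/glibc convention that DT_UNKNOWN is 0 and that the other DT_* values equal the file-type bits of st_mode shifted right by 12 (the IFTODT macro). It uses lstat for the fallback, so symlinks are not followed.

```python
import os
import stat
from collections import namedtuple

DT_UNKNOWN = 0  # assumed Linux <dirent.h> value

# d_type is kept verbatim; resolved_type is always filled in.
TypedEntry = namedtuple('TypedEntry', ['d_name', 'd_type', 'resolved_type'])


def resolve_entry(dirpath, d_name, d_type):
    """Build a TypedEntry, calling lstat only when d_type is unknown."""
    if d_type != DT_UNKNOWN:
        resolved = d_type
    else:
        mode = os.lstat(os.path.join(dirpath, d_name)).st_mode
        # glibc's IFTODT: file-type bits of st_mode, shifted down to DT_* range
        resolved = stat.S_IFMT(mode) >> 12
    return TypedEntry(d_name, d_type, resolved)
```

Comparing d_type with DT_UNKNOWN afterwards tells the caller whether the filesystem supplied the type or whether it came from the stat fallback.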

The difference between listdirtypes and llistdirtypes is
just like the difference between os.stat and os.lstat, that
is in the case of symlinks the first will return the data of
the linked-to file while the second will return the data of
the symlink itself.  Again, this is mostly for user
convenience.

As for os.path.walk, a flag could be added to that which
would replace the "names" argument with the same return type
as the new verbose-listdir.

Sure, I'll volunteer.  I'll start reading up on the PEP process.
msg62072 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-02-05 17:45
No activity for a long time.
History
Date                 User       Action  Args
2022-04-10 16:05:43  admin      set     github: 37269
2008-02-05 17:45:52  draghuram  set     status: open -> closed; nosy: + draghuram; messages: + msg62072
2002-10-06 12:22:50  donut      create