This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author gvanrossum
Recipients gvanrossum, pitrou
Date 2016-01-06.22:41:28
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
There are concerns that pathlib is inefficient because it doesn't cache stat() operations. Thus, for example this code calls stat() for each result twice (once internal to the glob, a second time to answer the is_symlink() question):

  p = pathlib.Path('/usr')
  links = [x for x in p.rglob('*') if x.is_symlink()]

I have a tentative patch (without tests). On my Mac it only gives modest speedups (between 5 and 20 percent) but things may be different on other platforms or for applications that make a lot of inquiries about the same path.

The API I am proposing is that by default nothing changes; to benefit from caching you must instantiate a StatCache() object and pass it to Path() constructor calls, e.g. Path('/usr', stat_cache=cache_object). All Path objects derived from this path object will share the cache. To force an uncached Path object you can use Path(p).

The patch is incomplete; there are no tests for the new functionality (though existing tests pass) and __eq__ should be adjusted so that Path objects using different caches always compare unequal.

Question for Antoine: Did you perhaps anticipate a design like this? Each Path instance has an _accessor slot, but there is only one accessor instance defined that is used everywhere (the global _normal_accessor). So you could have avoided a bunch of complexity in the code around setting the proper _accessor unless you were planning to use multiple accessors.
Date User Action Args
2016-01-06 22:41:29gvanrossumsetrecipients: + gvanrossum, pitrou
2016-01-06 22:41:29gvanrossumsetmessageid: <>
2016-01-06 22:41:29gvanrossumlinkissue26031 messages
2016-01-06 22:41:29gvanrossumcreate