Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MLSD command support to ftplib #55281

Closed
giampaolo opened this issue Jan 30, 2011 · 19 comments
Closed

Add MLSD command support to ftplib #55281

giampaolo opened this issue Jan 30, 2011 · 19 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@giampaolo
Copy link
Contributor

BPO 11072
Nosy @pitrou, @ericvsmith, @giampaolo
Files
  • ftplib_mlsd.patch
  • issue11072.diff
  • ftplib_mlsd_v2.patch
  • ftplib_mlsd_v3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/giampaolo'
    closed_at = <Date 2011-05-06.17:51:01.612>
    created_at = <Date 2011-01-30.23:37:29.512>
    labels = ['type-feature', 'library']
    title = 'Add MLSD command support to ftplib'
    updated_at = <Date 2015-12-14.06:11:07.865>
    user = 'https://github.com/giampaolo'

    bugs.python.org fields:

    activity = <Date 2015-12-14.06:11:07.865>
    actor = 'python-dev'
    assignee = 'giampaolo.rodola'
    closed = True
    closed_date = <Date 2011-05-06.17:51:01.612>
    closer = 'giampaolo.rodola'
    components = ['Library (Lib)']
    creation = <Date 2011-01-30.23:37:29.512>
    creator = 'giampaolo.rodola'
    dependencies = []
    files = ['20623', '21047', '21064', '21066']
    hgrepos = []
    issue_num = 11072
    keywords = ['patch']
    message_count = 19.0
    messages = ['127562', '130300', '130302', '130327', '130471', '130474', '130476', '130500', '130501', '130502', '135145', '135346', '135348', '135350', '135351', '135356', '135358', '135463', '256369']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'eric.smith', 'giampaolo.rodola', 'SilentGhost', 'George.Dhoore', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue11072'
    versions = ['Python 3.3']

    @giampaolo
    Copy link
    Contributor Author

    From RFC-3659:

    The MLST and MLSD commands are intended to standardize the file and
    directory information returned by the server-FTP process. These
    commands differ from the LIST command in that the format of the
    replies is strictly defined although extensible.

    The patch in attachment adds support for MLSD command.
    This should ease the development of ftp clients which are forced to parse un-standardized LIST responses via dir() or retrlines() methods to obtain meaningful data for the directory listing.

    Example:

    >>> import ftplib
    >>> from pprint import pprint as pp
    >>> f = ftplib.FTP()
    >>> f.connect("localhost")
    >>> f.login("anonymous")
    >>> ls = f.mlsd()
    >>> pp(ls)
     {'modify': 20100814164724,
      'name': 'svnmerge.py',
      'perm': 'r',
      'size': 90850,
      'type': 'file',
      'unique': '80718b568'},
     {'modify': 20101207185033,
      'name': 'README',
      'perm': 'r',
      'size': 53731,
      'type': 'file',
      'unique': '80718aafe'},
     {'modify': 20100417183215,
      'name': 'install-sh',
      'perm': 'r',
      'size': 7122,
      'type': 'file',
      'unique': '80718b2d2'},
     {'modify': 20110129210053,
      'name': 'Include',
      'perm': 'el',
      'size': 4096,
      'type': 'dir',
      'unique': '8071a2bc4'}]
    >>>

    @pitrou
    Copy link
    Member

    pitrou commented Mar 8, 2011

    Why the callback option?
    Also, the tests don't appear to check the return value.

    @ericvsmith
    Copy link
    Member

    I agree that the callback isn't needed, and it reflects the older coding style of much of the library (such as in retrlines). Instead, I'd make this a generator, yielding each of the dicts. (Actually in some ideal rewrite of ftplib, the whole callback "feature" used by retrlines would go away except as a compatibility feature, and internally everything would use generators instead of callbacks.)

    Aren't you modifying the state on the server (via "OPTS MLST"), and then if you make a subsequent call without specifying "facts" you'll be using the value of "facts" from the previous call to MLSD? I don't think that's desirable, but short of calling "OPTS MLST" every time (possibly with the results of an initial FEAT) there's no way around it.

    Just today I saw a filename with "; " in the name. I think you want to use .partition(' ') isolate the facts from the filename, and add a test with such a filename.

    I don't like the "isdigit" test. Shouldn't this decision be left to the caller? What if a fact happens to look like an integer some of the time, but not always? Maybe "unique" is a hex string. I don't think you'd want it converted to an int on the occasions where it was all decimal digits, but a string otherwise. Also look at your "modify" facts. Those are not very useful as ints.

    I don't think you should invent a pseudo-fact "name", in case the standard ever uses that. I'd prefer if you yielded tuples of (filename, fact-dict).

    MLSD_DATA has newlines in it, so you get "\r\n\n" sequences in your test data.

    If a fact=value string does not have an '=', you silently ignore it. I'd rather this raise an exception and not pass silently. It's not compliant. You should have a test for this.

    It's possible for a value to have an '=' in it, so you don't want to use .split('='), you should use .partition('='). You should add such a fact to the test data.

    Thanks for working on this. It will be a great addition to ftplib.

    @SilentGhost
    Copy link
    Mannequin

    SilentGhost mannequin commented Mar 8, 2011

    Here is a patch incorporating some of the changes proposed by Eric:

    • drop callback, return generator of (filename, fact-dict)

    Aren't you modifying the state on the server (via "OPTS MLST"), and then if you make a subsequent call without specifying "facts" you'll be using the value of "facts" from the previous call to MLSD? I don't think that's desirable, but short of calling "OPTS MLST" every time (possibly with the results of an initial FEAT) there's no way around it.

    This is the behaviour according to RFC. MLSD will return the set of facts, until a new OPTS MLST issued.

    I don't like the "isdigit" test. Shouldn't this decision be left to the caller? What if a fact happens to look like an integer some of the time, but not always? Maybe "unique" is a hex string. I don't think you'd want it converted to an int on the occasions where it was all decimal digits, but a string otherwise. Also look at your "modify" facts. Those are not very useful as ints.

    Drop the isdigit test: some value such as timestamps ('modify') might be "digits" for one files and would remain strings for the others.

    If a fact=value string does not have an '=', you silently ignore it. I'd rather this raise an exception and not pass silently. It's not compliant. You should have a test for this.

    I don't think it should raise an error, i'd rather just pass it to the caller. There is a wide variety of possible non-compliant responses. For example there is a rigid format for time stamps, do we have to check for that?

    One thing, that my patch doesn't do at the moment, but I think would be useful is to normalise all fact names to lower case. What do you think?

    I hope that MLSD_DATA now covers wider range of cases (it's from http://tools.ietf.org/html/rfc3659#section-7.7.4). Note that server MUST NOT return facts that weren't requested.

    What would be the return values check here?

    @giampaolo
    Copy link
    Contributor Author

    Thanks for the great review Eric.
    Patch in attachment provides the following changes:

    • return a generator object
    • remove callback parameter
    • each yielded entry is a (name, {...}) tuple
    • fix for ";" in file name
    • fix for " " in file name
    • fix for "=" in fact value
    • raise on missing "=" in fact
    • no longer converts numbers in integers
    • all dictionary keys (fact names) are lower-cased
    • extended test suite covering a lot of cases
    • no new lines in MLSD_DATA
    • updated doc with deprecation warnings for dir() and nslt() methods

    Aren't you modifying the state on the server (via "OPTS MLST"),
    and then if you make a subsequent call without specifying "facts"
    you'll be using the value of "facts" from the previous call to MLSD?

    Correct. This is how OPTS/MLSD are supposed to work.
    Not sure whether we can or *should* reset initial default facts.
    I think we shouldn't.

    @SilentGhost
    Copy link
    Mannequin

    SilentGhost mannequin commented Mar 9, 2011

    facts_found.strip(" ").rstrip(";")

    strip is redundant since facts_found is a first element of partitioning by the same string. rstrip is wrong since you're potentially deleting more than one character (there is no test for that).

    In your test you're checking whether returned facts contain every requested fact, this is not guaranteed by RFC. Also, not sure what the last for statement is supposed to mean. Is it a typo?

    @giampaolo
    Copy link
    Contributor Author

    You're right about r/strip(), thanks (new patch in attachment).

    In your test you're checking whether returned facts
    contain every requested fact, this is not guaranteed by RFC.

    Yes, but we're using a dummy FTP server returning static MLSD_DATA, so this is not important.

    Also, not sure what the last for statement
    is supposed to mean. Is it a typo?

    It makes sure whether in case of directory empty (see: no data returned) we don't enter in the "for" block.
    Probably it's not even necessary because this is already tested by:
    self.assertRaises(StopIteration, next, self.client.mlsd())
    ...but... whathever.

    @ericvsmith
    Copy link
    Member

    I'll give this a proper review in the next day or so (busy at PyCon).

    Despite the fact that I typically hate changing values returned by the server, I agree on case-folding the fact names to lowercase upon reading them. The RFC clearly states they're case insensitive: "Fact names are case-insensitive. Size, size, SIZE, and SiZe are the same fact." You should have the facts in MLSD_DATA be mixed case, to ensure they're being lowercased correctly.

    I agree that there's nothing that can be done in the case where no facts are specified: the spec clearly says just use the values we've previously set. Caller beware.

    @giampaolo
    Copy link
    Contributor Author

    You should have the facts in MLSD_DATA be mixed case, to ensure
    they're being lowercased correctly.

    This is already tested by:

    + # case sensitiveness
    + set_data('Type=type;TyPe=perm;UNIQUE=unique; name\r\n')
    + _name, facts = next(self.client.mlsd())
    + [self.assertTrue(x.islower()) for x in facts.keys()]

    @ericvsmith
    Copy link
    Member

    So it is. I told you I hadn't done a proper review! I was mainly trying to say I agree with lowercasing.

    I'll shut up until I can read the whole patch.

    @giampaolo
    Copy link
    Contributor Author

    Eric, any further comments about the patch?
    Can we go on and commit it?

    @ericvsmith
    Copy link
    Member

    Yes, I think this should be committed. I think the API is reasonable, which is the primary concern. If there are implementation bugs, they can be addressed as they're found.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 6, 2011

    New changeset 56bce38b274f by Giampaolo Rodola' in branch 'default':
    Issue bpo-11072: added MLSD command (RFC-3659) support to ftplib.
    http://hg.python.org/cpython/rev/56bce38b274f

    @giampaolo giampaolo added the stdlib Python modules in the Lib dir label May 6, 2011
    @giampaolo giampaolo self-assigned this May 6, 2011
    @giampaolo giampaolo added the type-feature A feature request or enhancement label May 6, 2011
    @SilentGhost
    Copy link
    Mannequin

    SilentGhost mannequin commented May 6, 2011

    Has something prevent you from implementing suggestion provided in my review?

    @giampaolo
    Copy link
    Contributor Author

    Which ones in particular? msg130474?

    @SilentGhost
    Copy link
    Mannequin

    SilentGhost mannequin commented May 6, 2011

    a review on rietveld http://bugs.python.org/review/11072/show
    which as I confirmed with you specifically at #python-dev you received a notification of.

    @giampaolo
    Copy link
    Contributor Author

    In my view, aside from the documentation note they're all minor/styling-related changes but feel free to provide a patch if you want to and I'll commit it.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 7, 2011

    New changeset 153bd8fc22c7 by Giampaolo Rodola' in branch 'default':
    bpo-11072- applying http://bugs.python.org/review/11072/show suggestions
    http://hg.python.org/cpython/rev/153bd8fc22c7

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 14, 2015

    New changeset ecef6b3f6639 by Gregory P. Smith in branch '3.5':
    Issue bpo-11072: change the incorrect "deprecation" of ftplib dir() and nlst()
    https://hg.python.org/cpython/rev/ecef6b3f6639

    New changeset 287bb82768a7 by Gregory P. Smith in branch 'default':
    Issue bpo-11072: change the incorrect "deprecation" of ftplib dir() and nlst()
    https://hg.python.org/cpython/rev/287bb82768a7

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants