msg127562 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-01-30 23:37 |
From RFC-3659:
The MLST and MLSD commands are intended to standardize the file and
directory information returned by the server-FTP process. These
commands differ from the LIST command in that the format of the
replies is strictly defined although extensible.
The patch in attachment adds support for MLSD command.
This should ease the development of ftp clients which are forced to parse un-standardized LIST responses via dir() or retrlines() methods to obtain meaningful data for the directory listing.
Example:
>>> import ftplib
>>> from pprint import pprint as pp
>>> f = ftplib.FTP()
>>> f.connect("localhost")
>>> f.login("anonymous")
>>> ls = f.mlsd()
>>> pp(ls)
{'modify': 20100814164724,
'name': 'svnmerge.py',
'perm': 'r',
'size': 90850,
'type': 'file',
'unique': '80718b568'},
{'modify': 20101207185033,
'name': 'README',
'perm': 'r',
'size': 53731,
'type': 'file',
'unique': '80718aafe'},
{'modify': 20100417183215,
'name': 'install-sh',
'perm': 'r',
'size': 7122,
'type': 'file',
'unique': '80718b2d2'},
{'modify': 20110129210053,
'name': 'Include',
'perm': 'el',
'size': 4096,
'type': 'dir',
'unique': '8071a2bc4'}]
>>>
|
msg130300 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-03-08 00:13 |
Why the callback option?
Also, the tests don't appear to check the return value.
|
msg130302 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2011-03-08 00:54 |
I agree that the callback isn't needed, and it reflects the older coding style of much of the library (such as in retrlines). Instead, I'd make this a generator, yielding each of the dicts. (Actually in some ideal rewrite of ftplib, the whole callback "feature" used by retrlines would go away except as a compatibility feature, and internally everything would use generators instead of callbacks.)
Aren't you modifying the state on the server (via "OPTS MLST"), and then if you make a subsequent call without specifying "facts" you'll be using the value of "facts" from the previous call to MLSD? I don't think that's desirable, but short of calling "OPTS MLST" every time (possibly with the results of an initial FEAT) there's no way around it.
Just today I saw a filename with "; " in the name. I think you want to use .partition(' ') isolate the facts from the filename, and add a test with such a filename.
I don't like the "isdigit" test. Shouldn't this decision be left to the caller? What if a fact happens to look like an integer some of the time, but not always? Maybe "unique" is a hex string. I don't think you'd want it converted to an int on the occasions where it was all decimal digits, but a string otherwise. Also look at your "modify" facts. Those are not very useful as ints.
I don't think you should invent a pseudo-fact "name", in case the standard ever uses that. I'd prefer if you yielded tuples of (filename, fact-dict).
MLSD_DATA has newlines in it, so you get "\r\n\n" sequences in your test data.
If a fact=value string does not have an '=', you silently ignore it. I'd rather this raise an exception and not pass silently. It's not compliant. You should have a test for this.
It's possible for a value to have an '=' in it, so you don't want to use .split('='), you should use .partition('='). You should add such a fact to the test data.
Thanks for working on this. It will be a great addition to ftplib.
|
msg130327 - (view) |
Author: SilentGhost (SilentGhost) *  |
Date: 2011-03-08 12:17 |
Here is a patch incorporating some of the changes proposed by Eric:
* drop callback, return generator of (filename, fact-dict)
> Aren't you modifying the state on the server (via "OPTS MLST"), and then if you make a subsequent call without specifying "facts" you'll be using the value of "facts" from the previous call to MLSD? I don't think that's desirable, but short of calling "OPTS MLST" every time (possibly with the results of an initial FEAT) there's no way around it.
This is the behaviour according to RFC. MLSD will return the set of facts, until a new OPTS MLST issued.
> I don't like the "isdigit" test. Shouldn't this decision be left to the caller? What if a fact happens to look like an integer some of the time, but not always? Maybe "unique" is a hex string. I don't think you'd want it converted to an int on the occasions where it was all decimal digits, but a string otherwise. Also look at your "modify" facts. Those are not very useful as ints.
Drop the isdigit test: some value such as timestamps ('modify') might be "digits" for one files and would remain strings for the others.
> If a fact=value string does not have an '=', you silently ignore it. I'd rather this raise an exception and not pass silently. It's not compliant. You should have a test for this.
I don't think it should raise an error, i'd rather just pass it to the caller. There is a wide variety of possible non-compliant responses. For example there is a rigid format for time stamps, do we have to check for that?
One thing, that my patch doesn't do at the moment, but I think would be useful is to normalise all fact names to lower case. What do you think?
I hope that MLSD_DATA now covers wider range of cases (it's from http://tools.ietf.org/html/rfc3659#section-7.7.4). Note that server MUST NOT return facts that weren't requested.
What would be the return values check here?
|
msg130471 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-03-09 22:35 |
Thanks for the great review Eric.
Patch in attachment provides the following changes:
- return a generator object
- remove callback parameter
- each yielded entry is a (name, {...}) tuple
- fix for ";" in file name
- fix for " " in file name
- fix for "=" in fact value
- raise on missing "=" in fact
- no longer converts numbers in integers
- all dictionary keys (fact names) are lower-cased
- extended test suite covering a lot of cases
- no new lines in MLSD_DATA
- updated doc with deprecation warnings for dir() and nslt() methods
> Aren't you modifying the state on the server (via "OPTS MLST"),
> and then if you make a subsequent call without specifying "facts"
> you'll be using the value of "facts" from the previous call to MLSD?
Correct. This is how OPTS/MLSD are supposed to work.
Not sure whether we can or *should* reset initial default facts.
I think we shouldn't.
|
msg130474 - (view) |
Author: SilentGhost (SilentGhost) *  |
Date: 2011-03-09 22:58 |
facts_found.strip(" ").rstrip(";")
strip is redundant since facts_found is a first element of partitioning by the same string. rstrip is wrong since you're potentially deleting more than one character (there is no test for that).
In your test you're checking whether returned facts contain every requested fact, this is not guaranteed by RFC. Also, not sure what the last for statement is supposed to mean. Is it a typo?
|
msg130476 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-03-09 23:21 |
You're right about r/strip(), thanks (new patch in attachment).
> In your test you're checking whether returned facts
> contain every requested fact, this is not guaranteed by RFC.
Yes, but we're using a dummy FTP server returning static MLSD_DATA, so this is not important.
> Also, not sure what the last for statement
> is supposed to mean. Is it a typo?
It makes sure whether in case of directory empty (see: no data returned) we don't enter in the "for" block.
Probably it's not even necessary because this is already tested by:
self.assertRaises(StopIteration, next, self.client.mlsd())
...but... whathever.
|
msg130500 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2011-03-10 11:17 |
I'll give this a proper review in the next day or so (busy at PyCon).
Despite the fact that I typically hate changing values returned by the server, I agree on case-folding the fact names to lowercase upon reading them. The RFC clearly states they're case insensitive: "Fact names are case-insensitive. Size, size, SIZE, and SiZe are the same fact." You should have the facts in MLSD_DATA be mixed case, to ensure they're being lowercased correctly.
I agree that there's nothing that can be done in the case where no facts are specified: the spec clearly says just use the values we've previously set. Caller beware.
|
msg130501 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-03-10 11:26 |
> You should have the facts in MLSD_DATA be mixed case, to ensure
> they're being lowercased correctly.
This is already tested by:
+ # case sensitiveness
+ set_data('Type=type;TyPe=perm;UNIQUE=unique; name\r\n')
+ _name, facts = next(self.client.mlsd())
+ [self.assertTrue(x.islower()) for x in facts.keys()]
|
msg130502 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2011-03-10 11:31 |
So it is. I told you I hadn't done a proper review! I was mainly trying to say I agree with lowercasing.
I'll shut up until I can read the whole patch.
|
msg135145 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-05-04 18:05 |
Eric, any further comments about the patch?
Can we go on and commit it?
|
msg135346 - (view) |
Author: Eric V. Smith (eric.smith) *  |
Date: 2011-05-06 17:41 |
Yes, I think this should be committed. I think the API is reasonable, which is the primary concern. If there are implementation bugs, they can be addressed as they're found.
|
msg135348 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-05-06 17:49 |
New changeset 56bce38b274f by Giampaolo Rodola' in branch 'default':
Issue #11072: added MLSD command (RFC-3659) support to ftplib.
http://hg.python.org/cpython/rev/56bce38b274f
|
msg135350 - (view) |
Author: SilentGhost (SilentGhost) *  |
Date: 2011-05-06 17:55 |
Has something prevent you from implementing suggestion provided in my review?
|
msg135351 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-05-06 18:11 |
Which ones in particular? msg130474?
|
msg135356 - (view) |
Author: SilentGhost (SilentGhost) *  |
Date: 2011-05-06 18:24 |
a review on rietveld http://bugs.python.org/review/11072/show
which as I confirmed with you specifically at #python-dev you received a notification of.
|
msg135358 - (view) |
Author: Giampaolo Rodola' (giampaolo.rodola) *  |
Date: 2011-05-06 18:48 |
In my view, aside from the documentation note they're all minor/styling-related changes but feel free to provide a patch if you want to and I'll commit it.
|
msg135463 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-05-07 14:07 |
New changeset 153bd8fc22c7 by Giampaolo Rodola' in branch 'default':
#11072- applying http://bugs.python.org/review/11072/show suggestions
http://hg.python.org/cpython/rev/153bd8fc22c7
|
msg256369 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2015-12-14 06:11 |
New changeset ecef6b3f6639 by Gregory P. Smith in branch '3.5':
Issue #11072: change the incorrect "deprecation" of ftplib dir() and nlst()
https://hg.python.org/cpython/rev/ecef6b3f6639
New changeset 287bb82768a7 by Gregory P. Smith in branch 'default':
Issue #11072: change the incorrect "deprecation" of ftplib dir() and nlst()
https://hg.python.org/cpython/rev/287bb82768a7
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:12 | admin | set | github: 55281 |
2015-12-14 06:11:07 | python-dev | set | messages:
+ msg256369 |
2011-05-07 14:07:04 | python-dev | set | messages:
+ msg135463 |
2011-05-06 18:48:07 | giampaolo.rodola | set | messages:
+ msg135358 |
2011-05-06 18:24:42 | SilentGhost | set | messages:
+ msg135356 |
2011-05-06 18:11:13 | giampaolo.rodola | set | messages:
+ msg135351 |
2011-05-06 17:55:35 | SilentGhost | set | messages:
+ msg135350 |
2011-05-06 17:51:01 | giampaolo.rodola | set | status: open -> closed resolution: fixed assignee: giampaolo.rodola components:
+ Library (Lib) type: enhancement stage: resolved |
2011-05-06 17:49:14 | python-dev | set | nosy:
+ python-dev messages:
+ msg135348
|
2011-05-06 17:41:56 | eric.smith | set | messages:
+ msg135346 |
2011-05-04 18:05:38 | giampaolo.rodola | set | messages:
+ msg135145 |
2011-03-10 11:31:27 | eric.smith | set | nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130502 |
2011-03-10 11:26:30 | giampaolo.rodola | set | nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130501 |
2011-03-10 11:17:28 | eric.smith | set | nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130500 |
2011-03-09 23:21:13 | giampaolo.rodola | set | files:
+ ftplib_mlsd_v3.patch nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130476
|
2011-03-09 22:58:30 | SilentGhost | set | nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130474 |
2011-03-09 22:35:45 | giampaolo.rodola | set | files:
+ ftplib_mlsd_v2.patch nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130471
|
2011-03-08 12:17:20 | SilentGhost | set | files:
+ issue11072.diff nosy:
pitrou, eric.smith, giampaolo.rodola, SilentGhost, George.Dhoore messages:
+ msg130327
|
2011-03-08 01:09:58 | SilentGhost | set | nosy:
+ SilentGhost
|
2011-03-08 00:54:30 | eric.smith | set | nosy:
pitrou, eric.smith, giampaolo.rodola, George.Dhoore messages:
+ msg130302 |
2011-03-08 00:38:01 | George.Dhoore | set | nosy:
+ George.Dhoore
|
2011-03-08 00:13:45 | pitrou | set | nosy:
pitrou, eric.smith, giampaolo.rodola messages:
+ msg130300 |
2011-01-31 02:18:05 | eric.smith | set | nosy:
+ eric.smith
|
2011-01-30 23:37:29 | giampaolo.rodola | create | |