New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tarfile doesn't overwrite symlink by directory #64173
Comments
tarfile.py compared to GNU tar doesn't overwrite existing symlink with directory of same name if such directory exists in the extracted tarball. |
Here is the preliminary path. It works and tested on Linux. I'll check the behaviour on Windows later. |
Here is the patch that works both on Windows and Linux. |
Testing passed on OSX Mavericks. |
Ah, thanks for the review, Serhiy. My bad. There is no underlying bug of tar. I was confused by the behaviour of tar which is converting the absolute path to relative path. So, adding '/home/user/dir/file' to tar when you are in '/home/user/dir' then extracting it in the same place, you'll get: I thought it was a bug. But this is what it is supposed to be. I see that tarfile module is mimicking GNU tar behaviour. This is the updated patch. |
Serhiy commented, "I think we should remove targetpath in all cases. Not only when softlink is I already did that in my latest patch but I am a little bit wary of this behaviour. |
Yeah, you are right, Serhiy. I check the behaviour of GNU tar command line. It always replaces the target no matter what kind of file source and target are. |
There are several issues with last patch.
|
Here is the preliminary patch to address Serhiy's concern. I added some regression tests as well. Give me a time to think how to refactor the code (especially the test). |
Martin: you review a patch written 4 years ago at https://bugs.python.org/review/19974/diff/11106/Lib/tarfile.py Oh wow, I didn't know that Rietveld was still working :-D It's maybe time to convert the old patch to a proper pull request on GitHub, no? :-) |
Martin Panter, thank you for reviewing my patch. Let me rework it. It has been a while (4 years!!!). |
Martin Panter, I have modernized the patch. About your suggestion:
|
"I could not find a case where these methods return different result." -> This is wrong. My mistake. os.lexists returns True for non symbolic link file while os.islink returns False. |
Sorry, Martin. I just understood what you suggested. I thought you were saying lexists to replace islink, but it is to replace the complex if condition. Let me work on it. |
About “lexists”, I meant using it instead of “os.path.exits” (not “islink”). On Linux: >>> targetpath = 'target'
>>> os.symlink('nonexistant', dst=targetpath) # Make a broken symlink
>>> os.system('ls -l')
total 0
lrwxrwxrwx 1 vadmium vadmium 11 Jan 6 09:28 target -> nonexistant
0
>>> os.path.exists(targetpath) # Doesn't register broken symlinks
False
>>> os.path.lexists(targetpath) # Does register it
True Did you try extracting a tar file over a broken link? (I haven’t tried your code; I’m just going on theory.) |
After experimenting with lexists, I don't think I can simplify it with lexists. |
I have added test case for broken symlinks. I am not sure whether this is necessary or not. But anyway I have added it. |
Martin, thank you for your advice. I have added additional two test cases for broken symlink case.
I hope that is enough. Let me know if you have another concern. |
Correction:
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: