New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The tarfile module crashes when tarfile contains a symlink and unpack directory contain it too #67417
Comments
The tarfile.makelink() code crashes with a maximum recursion error when it unpacks a tarfile that contains a symlink into a directory that already contains this symlink. Attached is a standalone testcase (that probably better explains whats going on :) and a possible fix. |
A possible fix that works with the previous testcase for this bug. It does not break a tarfile tests. |
This patch contains a regression test as well. |
Perhaps I'm missing something here, but it doesn't seem to be a problem with valid links. Only invalid symlinks are causing this issue. |
Thanks SilentGhost for your feedback and sorry for my slow reply. I looked into this some more and attached a updated patch with a more complete test. It also covers a crash now that happens when there is a symlink cycle in the tar and on disk. My fix is to remove existing links before unpack and replace them with the link in the tar. This is what gnu tar is also doing (according to http://www.gnu.org/software/tar/manual/html_node/Dealing-with-Old-Files.html). However this seems to be a behavior change of what the tarfile module has traditionally been done so feel free to reject it, I'm open to alternative ideas of course (even though I feel like having the same behavior as gnu tar is desirable). Thanks, |
Anything I can do to help moving this issue forward? |
TarFile.makelink() has a fallback mode in case the platform does not support links. Instead of a symlink or a hardlink it extracts the file it points to as long as it exists in the current archive. More precisely, makelink() calls os.symlink() and if one of the exceptions in the symlink_exception tuple is raised, it goes into fallback mode. r80944 introduced a regression because it replaced the WindowsError in symlink_exception with an OSError which is much less specific than a WindowsError. Since that change, the fallback is used everytime an OSError occurs, in Michael's case it is a FileExistsError, because the symlink is already there. The attached patch restores the old behavior. This might not be what you wanted, Michael, but at least, tarfile no longer crashes. |
I suck :-) It is hg revision bb94f6222fef. |
The problem with WindowsError should only exist in 3.4+. 2.7 doesn’t support creating symlinks on Windows. Michael’s fix is the same as already done in 2.7 for bpo-10761 and (part of) bpo-12088. However I’m not sure that is the best approach for a bug fix. Also see bpo-19974 proposing to replace existing directory entries in all cases, including replacing empty subdirectories, and not just when extracting symlinks. I suspect Michael has only fixed the recursive loop on Unix. What happens if an exception is raised because symlinks are not supported (e.g. Windows)? Possible test case: data = BytesIO()
writer = tarfile.TarFile(fileobj=data, mode='w')
selflink = tarfile.TarInfo('self')
selflink.size = 0
selflink.type = tarfile.SYMTYPE
selflink.linkname = selflink.name
writer.addfile(selflink)
writer.close()
data.seek(0)
tarfile.TarFile(fileobj=data).extractall() |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: