Title: The tarfile module crashes when the tarfile contains a symlink and the unpack directory contains it too
Messages (9)
msg233907 - (view) Author: Michael Vogt (mvo) * Date: 2015-01-13 08:55
The TarFile.makelink() code crashes with a maximum recursion depth error when it unpacks a tarfile that contains a symlink into a directory that already contains this symlink.

Attached is a standalone testcase (that probably better explains what's going on :) and a possible fix.
msg233910 - (view) Author: Michael Vogt (mvo) * Date: 2015-01-13 09:03
A possible fix that works with the previous testcase for this bug. It does not break any of the tarfile tests.
msg233914 - (view) Author: Michael Vogt (mvo) * Date: 2015-01-13 09:10
This patch contains a regression test as well.
msg233918 - (view) Author: SilentGhost (SilentGhost) * (Python triager) Date: 2015-01-13 09:36
Perhaps I'm missing something here, but it doesn't seem to be a problem with valid links. Only invalid symlinks are causing this issue.
msg234309 - (view) Author: Michael Vogt (mvo) * Date: 2015-01-19 09:27
Thanks SilentGhost for your feedback and sorry for my slow reply.

I looked into this some more and attached an updated patch with a more complete test. It now also covers a crash that happens when there is a symlink cycle in the tar and on disk.

My fix is to remove existing links before unpacking and replace them with the link in the tar. This is also what GNU tar does (according to However this seems to be a change from what the tarfile module has traditionally done, so feel free to reject it. I'm open to alternative ideas of course (even though I feel that having the same behavior as GNU tar is desirable).
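
The idea of replacing an existing link can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the helper name replace_symlink() and the demo() wrapper are hypothetical, and the sketch only removes an existing symlink (never a regular file or directory) before creating the new one. It assumes a platform where os.symlink() is available.

```python
import os
import tempfile


def replace_symlink(linkname, targetpath):
    """Create a symlink at targetpath, replacing a symlink already there.

    Hypothetical helper for illustration; not part of the tarfile module.
    """
    # islink() is true even for a dangling symlink, so a broken link on
    # disk is removed as well. Regular files and directories are left alone.
    if os.path.islink(targetpath):
        os.unlink(targetpath)
    os.symlink(linkname, targetpath)


def demo():
    """Create the same link twice and return where it finally points."""
    with tempfile.TemporaryDirectory() as dest:
        path = os.path.join(dest, "link")
        replace_symlink("old-target", path)
        # Without the unlink above, this second call would raise
        # FileExistsError.
        replace_symlink("new-target", path)
        return os.readlink(path)
```

Calling demo() creates the link twice in a temporary directory; the second call succeeds because the first link is removed beforehand.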

msg264808 - (view) Author: Michael Vogt (mvo) * Date: 2016-05-04 12:39
Is there anything I can do to help move this issue forward?
msg265146 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2016-05-08 15:16
TarFile.makelink() has a fallback mode in case the platform does not support links. Instead of a symlink or a hardlink it extracts the file it points to as long as it exists in the current archive.

More precisely, makelink() calls os.symlink() and, if one of the exceptions in the symlink_exception tuple is raised, it goes into fallback mode. r80944 introduced a regression because it replaced the WindowsError in symlink_exception with an OSError, which is much less specific than a WindowsError. Since that change, the fallback is used every time an OSError occurs. In Michael's case it is a FileExistsError, because the symlink is already there.
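
The effect of the broader exception tuple can be illustrated with a small stand-alone sketch. It does not use tarfile itself: extract_link(), already_exists(), and the BROAD and NARROW tuples are hypothetical names, and the tuple contents are only an assumption about what symlink_exception looked like. The point is just that FileExistsError is a subclass of OSError, so catching OSError also catches the "link already exists" case.

```python
def extract_link(create_link, fallback, handled):
    """Try to create a link; run the fallback if a handled exception occurs.

    Hypothetical stand-in for the link-or-fallback logic described above.
    """
    try:
        create_link()
        return "linked"
    except handled:
        fallback()
        return "fallback"


def already_exists():
    """Simulate os.symlink() failing because the link is already on disk."""
    raise FileExistsError("link already exists")


# Assumed shapes of the exception tuple; labels are illustrative only.
BROAD = (AttributeError, NotImplementedError, OSError)
NARROW = (AttributeError, NotImplementedError)

# FileExistsError derives from OSError, so the broad tuple swallows it.
assert issubclass(FileExistsError, OSError)
```

With BROAD, extract_link(already_exists, lambda: None, BROAD) takes the fallback path even though links are supported; with NARROW, the FileExistsError propagates to the caller instead.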

The attached patch restores the old behavior. This might not be what you wanted, Michael, but at least, tarfile no longer crashes.
msg265147 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2016-05-08 15:27
I suck :-) It is hg revision bb94f6222fef.
msg331911 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-12-16 01:34
The problem with WindowsError should only exist in 3.4+. 2.7 doesn’t support creating symlinks on Windows.

Michael’s fix is the same as already done in 2.7 for Issue 10761 and (part of) Issue 12088. However I’m not sure that is the best approach for a bug fix. Also see Issue 19974 proposing to replace existing directory entries in all cases, including replacing empty subdirectories, and not just when extracting symlinks.

I suspect Michael has only fixed the recursive loop on Unix. What happens if an exception is raised because symlinks are not supported (e.g. Windows)? Possible test case:

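import tarfile
from io import BytesIO

# Build an in-memory archive and describe a symlink member named 'self'.
data = BytesIO()
writer = tarfile.TarFile(fileobj=data, mode='w')
selflink = tarfile.TarInfo('self')
selflink.size = 0
selflink.type = tarfile.SYMTYPE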
selflink.linkname =
