tarfile cannot extract from stdin #84230
Hi, I have the following code:
then doing the following on a debian 10 system:
The second extraction tries to seek, although the mode is 'r|*'. For reference, if I remove ".buffer" from the code above, I can run
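For readers unfamiliar with stream mode, here is a minimal sketch of what 'r|*' is meant to allow: reading a tar archive from a source that cannot seek, such as a pipe on stdin. This is not the reporter's code. The `NonSeekable` wrapper, `make_archive` and `list_stream` are hypothetical names used only for illustration, and the archive is built in memory so the example is self-contained.

```python
import io
import tarfile


class NonSeekable(io.RawIOBase):
    """Hypothetical stand-in for a pipe: readable, but seeking is refused."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        # Fill the caller's buffer from the in-memory bytes.
        return self._buf.readinto(b)


def make_archive():
    """Build a tiny uncompressed tar archive in memory."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        payload = b"hello\n"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return out.getvalue()


def list_stream(stream):
    """Read member names sequentially; 'r|*' should never need to seek."""
    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        return [member.name for member in tar]
```

Calling `list_stream(NonSeekable(make_archive()))` walks the archive front to back. With a real pipe, the equivalent would presumably pass `sys.stdin.buffer` as `fileobj`.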
Hello, I can't reproduce this issue on my laptop with 3.8.1 through 3.9.0a4. I think it may depend on the file you use. Would you mind uploading the file that shows the problem?
Hi, well, the upload fails with "entity too large". I've attached a smaller file that throws a similar but slightly different error. (Note: the error occurs only on the _second_ extraction; it looks like a problem with symlinks.) You can find larger ones here: https://data.rbfh.de/issue40049/ The typescript*.txt files show a shell session with two different Python versions (3.4.2 and 3.8.2).
This is caused when tarfile tries to write a symlink that already exists. Any exception raised by os.symlink() is handled as if the platform doesn't support symlinks, so tarfile scans the entire tar to try to find the linked files. When it resumes extraction, it needs to do a negative seek to pick up where it left off, which causes the exception. I've reproduced the error on both Windows 10 and Ubuntu running on WSL. Python 2.7 handled this situation by checking whether the symlink exists, but it looks like the entire tarfile library was replaced with an alternate implementation that doesn't perform that check. I've created a pull request to address this issue.
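The check-before-link idea described above can be sketched roughly as follows. This is not the actual patch from the pull request and not tarfile's real code; `make_symlink` and `demo` are hypothetical helpers that only illustrate why removing an existing entry first keeps os.symlink() from raising on a second extraction.

```python
import os
import tempfile


def make_symlink(target, link_path):
    # Hypothetical helper, not tarfile's real code: remove any existing
    # file or symlink at link_path so os.symlink() cannot fail with
    # FileExistsError. Directories are deliberately not handled here.
    if os.path.lexists(link_path):
        os.unlink(link_path)
    os.symlink(target, link_path)


def demo():
    # Simulate extracting the same symlink member twice into one directory.
    with tempfile.TemporaryDirectory() as dest:
        link_path = os.path.join(dest, "current")
        make_symlink("release-1.0", link_path)
        make_symlink("release-1.0", link_path)  # second "extraction"
        return os.readlink(link_path)
```

`os.path.lexists()` is used rather than `os.path.exists()` because the latter follows the link and reports False for a dangling symlink, which would leave the stale link in place.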
For me, this patch solves my problems. Thank you.
GNU tar (v1.30, Ubuntu 20.04) does indeed overwrite files with symlinks upon extracting, while both |