From: | Johannes Nieß |
Subject: | Re: Feature request: make --multivolume and --update seeking between headers |
Date: | Sun, 24 Oct 2021 22:22:34 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0 Thunderbird/78.13.0 |
I have to correct myself regarding --compare doing seeks. It looks like it is doing checksums on both tar file contents and sometimes also on the file system. Of course that means reading the complete archive. My testing was flawed in that I --compared a single non-archived file. As tar seeks through all the archive headers unsuccessfully, I misread the error message.
I want to turn that into a feature request for a --no-checksum
option, with which tar uses only file dates and sizes to
decide on file differences and skips the checksums.
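The check proposed for such a --no-checksum mode could be sketched roughly as follows. The option does not exist in tar, `struct member_info` and `differs_by_metadata` are hypothetical names, and the sketch assumes that size and modification time have already been decoded from the member header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical: values already decoded from an archive member header. */
struct member_info {
    off_t  size;   /* member size in bytes */
    time_t mtime;  /* member modification time, whole seconds */
};

/* Report a difference based on size and mtime only; file contents are
   never read. This is an illustration, not how tar compares members. */
static bool differs_by_metadata(const struct member_info *m,
                                const struct stat *st)
{
    assert(m != NULL && st != NULL);
    return m->size != st->st_size || m->mtime != st->st_mtime;
}
```

The caller would fill `st` with stat() or lstat() on the file system path. A file whose contents changed while size and mtime stayed the same would go unnoticed, which is the trade-off of such an option.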
Hi,
While working on a backup script that --updates the tar file (on a disk) with --multi-volume, I discovered that tar does not seek through the archive and speed is much lower than expected. Are there any technical reasons for that, other than outdated silent assumptions?
While trying to read the code and documentation, I stumbled upon this code in buffer.c
    if (!multi_volume_option
        && !use_compress_program_option
        && fstat (archive, &st) == 0)
      seekable_archive = S_ISREG (st.st_mode);
    else
      seekable_archive = false;
Nothing in the documentation says that --multi-volume makes the archive non-seekable (see below). Is this just a silent and incorrect assumption that --multi-volume always implies non-seekable tapes?
From the man page:
    -n, --seek
        Assume the archive is seekable. Normally tar determines automatically whether the archive
        can be seeked or not. This option is intended for use in cases when such recognition fails.
        It takes effect only if the archive is open for reading (e.g. with --list or --extract
        options).
As far as I understand it, --update is mostly a combination of --compare (which is a seeking read operation) and --append in case file size and date differ. According to my tests, the --compare part of --update does not seek between headers (even without --multi-volume and for an uncompressed .tar file). Can we please get a huge performance boost in --update by making it jump from header to header (=seek) in the compare phase? The streamed file contents seem not to be needed for anything and slow down the process.
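A minimal sketch in C of the header-to-header seeking requested above. It is not tar's actual implementation. It assumes a plain uncompressed ustar-style archive in which the size is stored as an octal string in the 12 bytes at offset 124 of each 512-byte header. The function names are hypothetical. GNU extensions such as base-256 sizes, sparse members and multi-volume continuation headers are ignored, and extension headers (long names, pax) are simply counted as members.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Decode an octal header field, skipping leading spaces and stopping at
   the first character that is not an octal digit. */
static long parse_octal(const char *field, size_t len)
{
    long value = 0;
    size_t i = 0;

    assert(field != NULL);
    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (field[i] - '0');
    return value;
}

/* Count archive members by reading only the headers and seeking over the
   data blocks. Returns -1 if a seek fails. Simplification: stops at the
   first block whose name field is empty (end-of-archive zero block). */
static long count_members_by_seeking(FILE *fp)
{
    unsigned char block[BLOCK_SIZE];
    long members = 0;

    assert(fp != NULL);
    while (fread(block, 1, BLOCK_SIZE, fp) == BLOCK_SIZE) {
        if (block[0] == '\0')
            break;

        long size = parse_octal((const char *)block + 124, 12);
        /* Member data is padded to a multiple of the block size. */
        long padded = (size + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;

        /* Skip the data instead of reading it. A real tool would use
           fseeko() or lseek() with off_t for archives beyond LONG_MAX. */
        if (fseek(fp, padded, SEEK_CUR) != 0)
            return -1;
        members++;
    }
    return members;
}
```

Called on a FILE * opened with fopen(path, "rb"), this touches one 512-byte block per member, however large the members are. That is the speed-up asked for in the compare phase.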
Best regards,
Johannes Nieß