Re: [Quilt-dev] Another shell re-write of backup-files.


From: Jean Delvare
Subject: Re: [Quilt-dev] Another shell re-write of backup-files.
Date: Fri, 18 Mar 2011 10:18:27 +0100
User-agent: KMail/1.12.4 (Linux/2.6.32.29-0.3-pae; KDE/4.3.5; i686; ; )

Hi Kaz,

On Friday 18 March 2011 02:26:27 am Kaz Kylheku wrote:
> Hey everyone,
> 
> Recently I became interested in a quilt that consists only of shell
> scripts.

You're not the only one. I'm happy to see momentum grow in this 
direction.

> 
> I did find Martin Quinson's script from 2007 in the list archives,
> as well as Jean Delvare's followup.
> 
> Noting the performance problems inherent in launching numerous
> utilities for each file that is processed, and also noting
> remarks about the -z option being deprecated, I took a few

Dropping -z was indeed the key to start solving performance issues.

> liberties and followed a different approach, leading to the following:

Too bad you did not contact me before starting to work on this, because 
it seems that you duplicated a lot of the work I have already done. I 
had publicly announced that I was reviving the idea here:

http://lists.nongnu.org/archive/html/quilt-dev/2011-01/msg00000.html

Since then I have updated https://features.opensuse.org/311072 with 
progress information on a regular basis. Raphael Hertzog has been 
reviewing my work and I was about to post it here for public review, now 
that it has been sufficiently tested.

The good news is that you have an interest in this, so I guess you'll be 
interested in reviewing and/or testing my own code :) Stay tuned, I'll 
try to find some time today to post it.


> 
> #!/bin/bash
> 
> #
> # Shell script replacement for quilt's backup-files utility
> # Proof-of-concept alpha code
> # Mar 17, 2011
> # Kaz Kylheku <address@hidden>
> #
> 
> #
> # Main concepts:
> #
> # - Goal is to avoid invoking a process for each file name

I noticed the problem as well, and my implementation batches things as 
much as possible.

> # - We use the CPIO utility for creating hard-linked backups; CPIO
> #   pass-through mode can take a list of names and create hard links.

This adds a dependency on one more external utility, which isn't part of 
the common utility packages. This needs to be evaluated, depending on 
the availability of cpio on various systems and how compatible it is 
across systems. My own implementation still relies on ln and cp, 
optionally using GNU cp options to improve performance where available. 
We may check later if using cpio would be better, thanks for the hint.
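
To make the ln/cp approach concrete, here is a minimal sketch of a batched backup step. It is an illustration only, not the implementation announced above: the function name backup_existing and its arguments are hypothetical, the prefix is assumed to be a directory, and the file list is assumed to hold one existing file name per line. Where GNU cp is detected, a single cp process hard-links the whole batch; otherwise it falls back to one ln (or cp) per file.

```shell
# Hypothetical helper (illustration only).
# backup_existing PREFIX LIST: hard-link every file named in LIST
# (one name per line, all existing) into the backup tree under PREFIX.
backup_existing()
{
	local prefix=$1 list=$2 file

	# Nothing to do for an empty list
	[ -s "$list" ] || return 0
	mkdir -p "$prefix"

	if cp --help 2>/dev/null | grep -q -e '--parents'; then
		# GNU cp: -l creates hard links, --parents recreates the
		# directory part of each name below the target directory.
		tr '\n' '\0' < "$list" | xargs -0 cp -l --parents -t "$prefix"
	else
		# Portable fallback: several processes per file.
		while IFS= read -r file; do
			mkdir -p "$prefix/$(dirname "$file")"
			ln "$file" "$prefix/$file" 2>/dev/null ||
				cp -p "$file" "$prefix/$file"
		done < "$list"
	fi
}
```

Note that the GNU branch fails rather than copies when the backup tree is on a different filesystem, since hard links cannot cross filesystems; a real implementation would have to handle that case.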

> # - Use tree-to-tree recursive cp to restore a backup (taking everything
> #   in the backup, ignoring the file list).

This doesn't work with the current internal backup representation. The 
backup tree contains empty files, each of which means that the original 
file must be removed on restore. A recursive copy would create empty 
files instead.
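
The difference can be sketched for a single file as follows. This is a minimal illustration of the "empty backup means the file did not exist" rule described above, not code from either implementation; the name restore_one is hypothetical and the prefix is assumed to be a directory.

```shell
# Hypothetical helper (illustration only).
# restore_one PREFIX FILE: restore FILE from the backup tree under PREFIX.
restore_one()
{
	local prefix=$1 file=$2
	local backup=$prefix/$file

	# No backup at all: nothing we can restore
	if [ ! -e "$backup" ]; then
		return 1
	fi

	rm -f "$file"
	if [ -s "$backup" ]; then
		# Non-empty backup: bring the original content back
		mkdir -p "$(dirname "$file")"
		ln "$backup" "$file" 2>/dev/null || cp -p "$backup" "$file"
	fi
	# Empty backup: the file did not exist before, so it stays removed
}
```

A plain recursive copy has no equivalent of the last case, which is why it would leave an empty file behind where there should be none.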

> # - To touch files on restore, if requested, we can do a find + xargs +
> #   touch over the backup instead, prior to restoring it.

No, we can't, at least not unconditionally. Restore has option -k, which 
means keeping the backup. In that case you obviously can't touch the 
backup files; otherwise restore -k -t followed later by restore without 
options would do the wrong thing.

But we _can_ batch the touch on the restored files, that's what I did.
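
As a rough sketch of what such batching could look like (the function name touch_restored is hypothetical, and the whole list is assumed to fit on one command line), the restored names are collected first and touch is run once, on the working files only:

```shell
# Hypothetical helper (illustration only).
# touch_restored LIST: update the mtime of every restored file named in
# LIST (one name per line) with a single touch process. The backup
# copies are never touched, so restore -k keeps them intact.
touch_restored()
{
	local list=$1 name

	# Reuse the function's positional parameters as the batch
	set --
	while IFS= read -r name; do
		# Names whose restore removed the file are skipped
		if [ -e "$name" ]; then
			set -- "$@" "$name"
		fi
	done < "$list"

	if [ $# -gt 0 ]; then
		touch -- "$@"
	fi
}
```

One caveat: if the restored file is still a hard link to its backup, touching it also changes the backup's mtime, because both names share one inode. With -k the link therefore has to be broken (the file copied) before touching.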

> # - Added files (i.e. backups of nonexistent files) are represented as a
> #   specially named file containing an explicit list, and not as
> #   zero-length files. This eases the implementation, and lets us back
> #   up/restore zero-length files!

As you found out yourself, this doesn't work, at least not easily. And 
even if you managed to get there, this would mean an incompatible change 
to the internal representation, for which you would have to provide a 
conversion path (see quilt upgrade). This should be considered with a 
lot of care, as this is putting A LOT of burden and pain on the user's 
shoulders. Please keep in mind that:
* Users should be able to do "quilt push" with one version of quilt, 
upgrade to the next version of quilt, and do "quilt pop".
* Users occasionally operate on the same quilt data set from different 
machines running different versions of quilt (think NFS-mounted home on 
a company network). A change of the internal representation breaks this 
possibility for a long period of time.

Anyway I doubt you can change the backup format. It is dictated by what 
GNU patch does (the empty backup files for files created by a patch, in 
particular), and we can't change that.

> #
> 
> set -eu # bail on any errors, and unbound variable uses
> 
> opt_prefix=
> opt_suffix=
> opt_file=
> opt_backup=
> opt_restore=
> opt_remove_backup=
> opt_keep_backups=
> opt_silent=
> opt_touch=
> opt_nolink=
> 
> usage()
> {
> cat <<!
> Usage: $0 [-B prefix] [-f {file|-}] [-sktL] [-b|-r|-x] {file|-} ...
> 
>       Create hard linked backup copies of a list of files
>       read from a file (or standard input), or from the
>       argument list.
> 
>       -b      Create backup
>       -r      Restore the backup
>       -x      Remove backup files and empty parent directories
>       -k      When doing a restore, keep the backup files
>       -B      Path name prefix for backup files
>       -z      Unsupported, obsolete option
>       -s      Silent operation; only print error messages
>       -f      Read the filenames to process from file (- = standard input)
>       -t      Touch original files after restore (update their mtimes)
> 
>       -L      Ensure that when finished, the source file has a link count of 1
> !
> }
> 
> if ! options=`getopt -o B:f:brxkstLh -- "$@"` ; then
>       usage
>       exit 1
> fi
> 
> eval set -- "$options"
> 
> while true
> do
>       case "$1" in
>       -B)
>               opt_prefix="$2"
>               shift 2 ;;
>         -f)
>                 opt_file="$2"
>                 shift 2 ;;
>         -b)
>                 opt_backup=B
>                 shift ;;
>         -r)
>                 opt_restore=R
>                 shift ;;
>         -x)
>                 opt_remove_backup=D
>                 shift ;;
>         -k)
>                 opt_keep_backups=y
>                 shift ;;
>         -s)
>                 opt_silent=y
>                 shift ;;
>         -t)
>                 opt_touch=y
>                 shift ;;
>         -L)
>                 opt_nolink=y
>                 shift ;;
>       -h)
>               usage
>               exit 0 ;;
>       --)
>               shift
>               break ;;
>       esac
> done
> 
> if [ $# -eq 0 -a -z "$opt_file" ] ; then
>       echo "Error: specify input file names as arguments or via -f option"
>       echo
>       usage
>       exit 1
> fi
> 
> if [ $# -ge 1 -a -n "$opt_file" ] ; then
>       echo "Error: conflict: both -f and file name argument given"
>       echo
>       usage
>       exit 1
> fi
> 
> if [ $# -gt 1 -a "${1:-}" == "-" ] ; then
>       echo "Error: if - is specified, then no other arguments can be added"
>       echo
>       usage
>       exit 1
> fi
> 
> 
> if [ -z "$opt_prefix" ] ; then
>       echo "Error: specify backup/restore directory with -B"
>       echo
>       usage
>       exit 1
> fi
> 
> temp_list=$(mktemp "${TMP_DIR:-/tmp}/backup-files-tl-XXXXXX")
> file_list=$(mktemp "${TMP_DIR:-/tmp}/backup-files-fl-XXXXXX")
> noex_list=$(mktemp "${TMP_DIR:-/tmp}/backup-files-ne-XXXXXX")
> 
> cleanup()
> {
>       rm -f $temp_list $file_list $noex_list
> }
> 
> trap cleanup exit
> 
> #
> # capture the file list into the $temp_list file
> #
> 
> if [ -n "$opt_file" -a "$opt_file" == "-" -o "${1:-}" == "-" ] ; then
>       cat > $temp_list
> elif [ -n "$opt_file" ] ; then
>       cat "$opt_file" > $temp_list
> else
>       # IFS trick. The string literal here contains a newline
>       ( IFS="
> "

I prefer IFS=$'\n', it's more readable.
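
For comparison, a small bash-only sketch of that spelling (the function name join_lines is hypothetical). Declaring IFS local to the function also removes the need for the subshell:

```shell
#!/bin/bash
# Hypothetical helper (illustration only).
# join_lines ARG...: print the arguments one per line.
join_lines()
{
	# $'\n' is bash ANSI-C quoting for a single newline character
	local IFS=$'\n'

	# "$*" joins the arguments with the first character of IFS
	printf '%s\n' "$*"
}
```

For example, join_lines "a b" c prints "a b" and "c" on separate lines. When at least one argument is given, printf '%s\n' "$@" produces the same output without changing IFS at all.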

>         echo "$*" > $temp_list )
> fi
> 
> #
> # separate name list into existing and nonexisting
> #
> 
> > $file_list
> > $noex_list
> 
> while read name ; do
>       if [ -e "$name" ] ; then
>               echo "$name" >> $file_list
>       else
>               echo "$name" >> $noex_list
>       fi
> done < $temp_list
> 
> if [ -z "$opt_silent" ] ; then
>       echo "$0: operating on these files:"
>       cat $file_list
> fi
> 
> case $opt_backup$opt_restore$opt_remove_backup in
> B )
>       if [ $opt_nolink ] ; then
>               cpio --quiet -pd "$opt_prefix" < $file_list
>       else
>               cpio --quiet -pdl "$opt_prefix" < $file_list
>       fi
> 
>       mv $noex_list "$opt_prefix/.#new-files#"
> 
>       ;;
> R )
>       if [ $opt_nolink ] ; then
>               cp -a "$opt_prefix"/. .
>       else
>               if [ $opt_touch ] ; then
>                       find "$opt_prefix" -type f | xargs touch
>               fi
> 
>               cp -rlf "$opt_prefix"/. .
>       fi
> 
>       if [ -z "$opt_keep_backups" ] ; then
>               rm -rf "$opt_prefix"
>       fi
> 
>       # Trick! .#new-files# list gets copied along with the files from
>       # the backup directory to the working directory: we process
>       # it and remove it.
> 
>       xargs rm -f < .#new-files#
>       rm -f .#new-files#
> 
>       ;;
> D )
>       rm -rf "$opt_prefix"
>       ;;
> * )
>       # either no operation was specified, or multiple operations
>       # were specified, like -b and -r.
>       usage
>       exit 1
>       ;;
> esac

-- 
Jean Delvare
Suse L3


