duplicity-talk

Re: [Duplicity-talk] Create full backup from incremental


From: Eric O'Connor
Subject: Re: [Duplicity-talk] Create full backup from incremental
Date: Thu, 16 Apr 2015 18:16:18 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

Ya, I think it would take some work to make this happen, but I don't
think duplicity's approach is incompatible. The trickiest part would be
allowing incremental backups based on a syn-full, as you mention in your
second bullet point.

I wouldn't consider the current backup chain (full + incrementals) to
have similar properties to a "full" backup, synthetic or otherwise.
Recovering the most recent state takes a bunch of processing time and
extra storage/bandwidth, both of which grow with the length of the chain.

Would you be interested in patches to implement this, or is it too far
off the roadmap?

Eric

On 04/16/2015 02:59 AM, address@hidden wrote:
thx Eric, unfortunately the current duplicity design is such that
 - it bundles changes to different files in a volume until max size
   and then continues in a next volume
 - the changes are rsync diffs that have to be applied in a row, e.g.
   first state will be restored, first rsync diff will be applied,
   second rsync.. etc. until the latest state is restored

if i understood your explanation correctly then this would mean that
currently our "synthetic full" is essentially our complete backup
chain.

..ede/duply.net

On 15.04.2015 23:24, Eric O'Connor wrote:
For this feature, the remote doesn't really need to have access to
the data, or be very smart at all (dumb file servers work just
fine). It is true that Duplicity does not support it yet.

Doing a synthetic full backup requires only that you be able to
(locally) keep track of where on the dumb server the latest
version of each file is stored, and which files are recently
modified. Then a full backup is the set of archive files containing
unmodified files (likely a large percentage) + new archives
containing files modified since the last syn-full. So you upload
the new archives, and an index pointing to all the relevant data
chunks.
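
A rough sketch of that bookkeeping in Python, in case it helps make the idea concrete. Everything here is made up for illustration (the function name, the "index" as a plain dict from path to archive name); it is not duplicity code and ignores encryption and signatures entirely.

```python
def plan_synthetic_full(last_index, current_files, modified, new_archive):
    """Plan a synthetic full backup (illustrative only).

    last_index:    dict mapping path -> archive name, from the previous syn-full
    current_files: set of paths that exist locally right now
    modified:      set of paths changed since the previous syn-full
    new_archive:   name of the archive that will hold the uploaded files
    """
    # Anything modified, or missing from the old index, has to be uploaded.
    to_upload = {p for p in current_files
                 if p in modified or p not in last_index}
    # Unmodified files keep pointing at the archive that already holds them.
    new_index = {p: last_index[p] for p in current_files if p not in to_upload}
    # Uploaded files point at the new archive.
    for p in to_upload:
        new_index[p] = new_archive
    return sorted(to_upload), new_index
```

The new index plus the new archive is all that goes over the wire; files deleted locally simply drop out of the index.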

It's a true "full backup" because it directly contains every file
needed to do a restore.

When an archive file contains 1 file, there is no additional data
storage overhead to this -- you just upload a new index and all the
modified files. If archive files contain more than 1 file, a full
backup will have some storage overhead -- some files in the
archives will be irrelevant older copies. The backup program can
pick some overhead maximum and upload enough new data to reduce
the overhead to acceptable levels.
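
One way the overhead check could look, again only as a sketch. I'm assuming overhead means "stale bytes divided by all stored bytes" and that the program greedily repacks the archives holding the most stale data; both are my choices here, and all names are hypothetical.

```python
def stale_bytes(archives, index):
    """Return {archive name: bytes of superseded copies it still holds}.

    archives: dict archive name -> {path: size in bytes}
    index:    dict path -> archive name holding the latest copy
    """
    return {
        name: sum(size for path, size in members.items()
                  if index.get(path) != name)
        for name, members in archives.items()
    }


def overhead(archives, index):
    """Fraction of stored bytes that are stale copies (0.0 if nothing stored)."""
    total = sum(sum(m.values()) for m in archives.values())
    stale = sum(stale_bytes(archives, index).values())
    return stale / total if total else 0.0


def archives_to_repack(archives, index, max_overhead):
    """Greedily pick archives to re-upload until overhead <= max_overhead."""
    total = sum(sum(m.values()) for m in archives.values())
    per_archive = stale_bytes(archives, index)
    stale = sum(per_archive.values())
    chosen = []
    for name in sorted(per_archive, key=per_archive.get, reverse=True):
        if total == 0 or stale / total <= max_overhead:
            break
        # Repacking re-uploads only the live files, so the stale bytes vanish.
        stale -= per_archive[name]
        total -= per_archive[name]
        chosen.append(name)
    return chosen
```

With one file per archive, stale_bytes is zero for every archive that is still referenced, which matches the "no additional overhead" case.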

This can also be spread out over the course of the full backup
period -- i.e. every day upload an incremental backup along with
5% of the modified files. You could also occasionally re-upload
unmodified files such that it's more likely a single archive
corruption is recoverable. It may even be possible to ditch the
full/incr schedule entirely if the length of an incremental chain
for a file has an upper bound.

Anyway, sorry for being pedantic, unhelpful? I've been thinking
about building something like this for a while but haven't gotten
around to it yet. Also, duplicity works well enough -- so thanks
for that :)

Eric

On 2015-04-15 13:48, address@hidden wrote:
good point.. why would you need encrypted backup if you trust the
backend?

thx Scott.. ede/duply.net

On 15.04.2015 19:30, Scott Hannahs wrote:
Note that to do this, you need to be able to decrypt locally on
the server.  Duplicity assumes an insecure server model. To
collapse incremental backups onto a full backup means that all
your data is only as secure as the remote
server.

The duplicity model assumes that once the data goes out over
the wire it is subject to unknown security.

For any commercial remote storage, you might as well just use a
commercial backup system without encryption.

-Scott

On Apr 15, 2015, at 07:21, address@hidden wrote:

On 15.04.2015 12:56, Ulrik Rasmussen wrote:
On Wed, 15 Apr 2015 12:00:00 +0200 address@hidden
wrote:

On 15.04.2015 09:54, Ulrik Rasmussen wrote:
Hi,

I just started using duplicity for backing up my work
to a VPS. It is my understanding that it is wise to do
a full backup about once a month, to enable deletion of
old backups and faster restoration. However, when doing
a full backup, duplicity seems to transfer everything
over the wire again, which takes a long time if I'm on
a slow connection and also costs me bandwidth. Since
the server already has all my data, this really
shouldn't be necessary.

Is there a way to do a full backup on the server side?
More precisely, can I tell duplicity to create a new
backup chain based on the contents of the current
chain?


no.

duplicity deals with "dumb" backends and solely uses them
for file storage. with this design, creating a synthetic
full would mean transferring all the data over the
line again anyway.

however, it'd be possible to implement that for the rare
cases where users have shell access to their backends
and can run a duplicity instance locally there.

see also
https://answers.launchpad.net/duplicity/+question/257348

..ede/duply.net

I see, thanks for clarifying. That makes sense, considering
most backends don't imply shell access. Since I _do_ have
shell access to the server and plenty of disk storage, I
guess I can accomplish the task by just restoring the
incremental backup on the server and doing a full backup
from that using the file system backend.
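
Roughly, that would be two duplicity runs on the server: a restore of the existing chain into a scratch directory, then a full backup of that directory to a file:// target. A small Python sketch that only builds the two command lines; the helper name and all paths are placeholders, and the server would also need whatever key or passphrase the chain was encrypted with.

```python
def server_side_full_commands(chain_url, scratch_dir, new_chain_url):
    """Build the two duplicity invocations (placeholder helper, runs nothing).

    chain_url:     URL of the existing chain, e.g. a file:// URL on the server
    scratch_dir:   local directory to restore the latest state into
    new_chain_url: URL where the fresh full backup should be written
    """
    # Step 1: restore the most recent state from the existing chain.
    restore = ["duplicity", "restore", chain_url, scratch_dir]
    # Step 2: take a new full backup of the restored tree.
    full = ["duplicity", "full", scratch_dir, new_chain_url]
    return [restore, full]


# Example with made-up paths; pass each list to subprocess.run(cmd, check=True)
# to actually execute it.
commands = server_side_full_commands(
    "file:///srv/backup/work", "/srv/scratch/work", "file:///srv/backup/work-new")
```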


right you are.. make sure to have identical users/numeric ids
and restore as root, if you want to keep those.

alternatively you can hackishly "reuse" the old full by
copying it and updating the filenames with a proper newer
timestamp. depending on your data's turnover you might be
doing that for a while until your first incremental grows
too big.
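
For the renaming part, a hedged Python sketch. It assumes the full-backup files are named like duplicity-full.<timestamp>.vol1.difftar.gpg and duplicity-full-signatures.<timestamp>.sigtar.gpg with timestamps of the form 20150415T100000Z; check what is actually on your backend before relying on that. The helper only computes the new name and touches no files.

```python
import re

# Assumed layout: <prefix>.<YYYYMMDD>T<HHMMSS>Z.<rest>, full-backup files only.
_FULL_NAME = re.compile(
    r"^(duplicity-full(?:-signatures)?)\.(\d{8}T\d{6}Z)\.(.+)$")


def retimestamp(filename, new_ts):
    """Return filename with its timestamp replaced by new_ts.

    Raises ValueError for names that do not look like full-backup files
    (for example incrementals), so they are never renamed by accident.
    """
    m = _FULL_NAME.match(filename)
    if m is None:
        raise ValueError("not a full-backup file name: %r" % filename)
    prefix, _old_ts, rest = m.groups()
    return "%s.%s.%s" % (prefix, new_ts, rest)
```

Copy each file of the old full under the name this returns, leaving the originals in place until the new chain verifies.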

..ede/duply.net


_______________________________________________
Duplicity-talk mailing list address@hidden
https://lists.nongnu.org/mailman/listinfo/duplicity-talk

