From: Kenneth Loafman
Subject: Re: [Duplicity-talk] Incremental backup of files with changed data but unchanged timestamp
Date: Mon, 4 Aug 2014 09:15:51 -0500

On 03.08.2014 17:57, Kenneth Loafman wrote:

do you think it is wise to place the switch there? who knows how often file objects are compared. essentially we want _only_ the decision whether to back up to be modified.
> ROPath.__eq__ is used any time two ROPath objects are compared for
> equality. Any change there will mean that the files will compare unequal.
> The places I'd put code to enforce data comparison are around lines 323-326
> where it checks perms and times. Even then, I'd qualify that with a check
> for self.isreg() so we limit the option to regular files only.
wouldn't it be the other way around with this mod? because we don't check attributes we would always create a delta, which in turn will create a backup with all files changed but zero sized deltas (except for the mtime files really modified with proper deltas)?
>
> Yes, if the delta is created, the file has changed.
..ede
>
> ...Ken
>
>
>
> On Sun, Aug 3, 2014 at 8:05 AM, <address@hidden> wrote:
>
>> Ken,
>>
>> can you as well point me to where the path.py __eq__() method (L325) is
>> called during an incremental backup? that would be the place, where i'd put
>> a --compare-data switch to enforce data comparison.
>>
>> but, even if we ignore mtime and _always_ create deltas, wouldn't the
>> current design (afaics) assume the file has changed, create a new signature
>> and save this to tar - for every file - always?
>> i didn't see a check for when delta is zero. instead duplicity seems to
>> assume that when a delta is created the file has to have changed.
>>
>> ..ede
>>
>> On 03.08.2014 14:31, Kenneth Loafman wrote:
>>> We don't need to do the librsync create delta, we just need to ignore the
>>> timestamp and duplicity will do that for us.
>>>
>>> duplicity.py, dup_time.py, path.py, and tarfile.py are the ones that
>>> actually reference mtime. path.py would be the place to look for
>>> comparison, well before we call librsync. Take a look in path.py, line 325
>>> for backup and line 372 for verify. Replace with 'return 0' and you will
>>> always go through the rdiff process. Very expensive.
>>>
>>> ...Ken
>>>
>>>
>>> On Sun, Aug 3, 2014 at 6:39 AM, <address@hidden> wrote:
>>>
>>>> hmm, some more searching didn't reveal any options for the librsync create
>>>> delta call. it simply seems to create signatures for the whole file only.
>>>>
>>>> http://librsync.sourcefrog.net/doc/librsync.html#processing-whole-files
>>>>
>>>> that suggests that the mtime is compared somewhere else, probably in
>>>>
>>>> http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/path.py
>>>> although i am absolutely clueless as to where in the code path this is
>>>> supposed to happen.
>>>>
>>>> @Ken, Mike: any (more) input?
>>>>
>>>> after all this (identical mtime) comes up from time to time on the list
>>>> e.g.
>>>>
>>>> https://lists.nongnu.org/archive/html/duplicity-talk/2013-07/msg00015.html
>>>>
>>>> rsync allows enforcing checksum checking with '-c' as well, so people
>>>> probably will expect this from duplicity.
>>>>
>>>> ..ede
>>>>
>>>>
>>>> On 03.08.2014 13:05, Kenneth Loafman wrote:
>>>>> I've seen packages that have the timestamp reflect the version number, so
>>>>> he's probably right, it would be the packager doing the dirty trick.
>>>>>
>>>>> I'm fairly sure you are right that DeltaFile is the first place. I could
>>>>> not find anything else. Mod that and he should be good to go. It will be
>>>>> a lot slower, so save the original for the next backup.
>>>>>
>>>>> ...Ken
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 3, 2014 at 5:11 AM, <address@hidden> wrote:
>>>>>
>>>>>> On 03.08.2014 02:03, Nate Eldredge wrote:
>>>>>>> I am using duplicity to make incremental backups of my system. I have
>>>>>>> some files whose data has changed since the last backup, but whose mtime
>>>>>>> stayed the same. It looks like `duplicity incremental' ignores files
>>>>>>> whose timestamp has not changed, so it doesn't back up the new data. Is
>>>>>>> there a way to force duplicity to compare the file with a stored
>>>>>>> checksum, or even to use rdiff unconditionally? I'd prefer not to have
>>>>>>> to do a new full backup.
>>>>>>>
>>>>>>> I'd consider hacking duplicity myself but it would be helpful to know
>>>>>>> where in the code I should look.
>>>>>>>
>>>>>>> (Before you accuse me of abusing timestamps: it isn't my fault! I
>>>>>>> crossgraded this Ubuntu system from 32-bit to 64-bit. It appears that
>>>>>>> some Ubuntu packages have the same timestamps on corresponding files in
>>>>>>> the 32-bit and 64-bit versions. Presumably the packages were generated
>>>>>>> at the same time, and coincidentally those files were compiled during
>>>>>>> the same second. So when I replaced the 32-bit package with the 64-bit
>>>>>>> package, I get a different file with the same timestamp.)
>>>>>>>
>>>>>>> I'm using duplicity 0.6.23 (latest from the PPA) on Ubuntu 14.04.
>>>>>>>
>>>>>>
>>>>>> i like the "(Before you accuse me of abusing timestamps: it isn't my
>>>>>> fault!" bit .. hehe as long as the time stamps were old enough you will
>>>>>> get off scot-free i guess..
>>>>>>
>>>>>> but seriously - this was obviously not on the horizon when duplicity
>>>>>> was developed. i searched a bit but couldn't find anything apart from
>>>>>> the librsync call 'librsync.DeltaFile(old_sigfp, newfp)' in
>>>>>>
>>>>>> http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/diffdir.py#L136
>>>>>>
>>>>>> i cannot seem to find a routine that checks time stamps before that.
>>>>>>
>>>>>> @Ken, Mike: can you hint where this magic happens?
>>>>>>
>>>>>> ..ede
>>>>>>
>>>>>
>>>>
>>>
>>
>