Re: [Nmh-workers] sync'ing an mh mailstore between two machines?
From: bergman
Subject: Re: [Nmh-workers] sync'ing an mh mailstore between two machines?
Date: Wed, 21 May 2008 13:24:17 -0400
In the message dated: Wed, 21 May 2008 16:52:33 BST,
The pithy ruminations from Peter Maydell on
<Re: [Nmh-workers] sync'ing an mh mailstore between two machines?> were:
=> Paul Fox wrote:
=> >but then i have the synchronization problem -- i'd like, at the
=> >end of the day, to be able to merge what i've inc'ed/read/deleted/saved
=> >on the laptop back into my home mh mail tree.
I've been toying with the same idea...still at a conceptual stage--I haven't
written any code yet.
=>
=> Just use rsync to copy the laptop's idea of the situation
=> onto the desktop again? (And vice-versa in the morning.)
Ick.
=> This basically relies on there being at any time one "right"
=> copy of the tree, with the other one being dead and not to
=> be modified, though, and woe betide you if you forget to do
=> the resync... (Anything where both trees could change
=> and you need to merge the changes starts to get tricky.)
Right. This idea relies on everything being "perfect"--particularly the
user procedures, and discounts the use of more intelligent software tools.
I'm thinking of something on the order of:
[1] use something to determine what needs to be synchronized in
the naive case. This could be "diff -r" or "rsync --dry-run".
[2] parse the output of [1] skipping unchanged directories, because
most of my 288 mail directories do not change (yes,
I have a very wide & deep folder tree, but most of it
is archived mail...there are probably only 20 folders
that change on a daily basis)
[3] for each folder that has changes:
collect a list of filenames & MessageIds on each machine. This
can be done manually, or using tools like "formail".
(think of a hashed list, keyed by MessageId)
As I see it, there are 4 cases for the
Client:Message-Id:Folder:Filename tuple:
[a] duplicate messages on each machine
If a MessageId exists in the same directory on both client
& server (regardless of file name [message number]),
then ignore it. This probably means that "folder -pack" or
"sortm" was run on one or both machines. Let mh reorder
the files, as long as files with the same content (Message-Id)
exist in the same folders on each machine. The folder ordering
will be synchronized in steps [4] & [5].
[b] refiled message
A slightly more difficult case is if a MessageId exists
on both machines, but in different directories. This
implies that a refile was done on one machine but not
the other. If you trust the clocks on both machines,
then use the newer message (by mtime, not by the Date:
header) as definitive, remove the older message with rmm
and then treat the newer message as if it only exists
on one machine (case [c]).
[c] new message on one machine
If a MessageId exists on only one machine, and it is newer
than the last synchronization date, then assume it is a new
message that was only received on one machine. Copy it to
a temp directory on the "other" machine and use rcvstore
to incorporate the file into the correct MH folder.
[d] removed message on one machine
If a MessageId exists on only one machine, and it is older
than the last synchronization date, then assume that the
file with the same Message-Id was removed on the other
machine. In this case, we want to duplicate that action on the
machine where the message still exists. Remove the message.
[4] at the conclusion of synchronizing, run "folder -pack" and
"sortm" on each changed folder
[5] synchronize the current state back to the other machine by any
means (rsync, unison, tar+scp, etc.)
[6] create a timestamp file recording the last synchronization date
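The four cases in step [3] can be sketched as a small classifier. This is only an illustration, not tested against a real mail tree: the index shape (Message-Id mapped to a folder name and an mtime), the function names, and the case labels are all made up here, and it assumes each Message-Id appears in at most one folder per machine.

```python
from typing import Dict, Optional, Tuple

# Hypothetical index shape: Message-Id -> (folder, mtime) for one machine.
# Assumes each Message-Id appears in at most one folder per machine.
Index = Dict[str, Tuple[str, float]]


def classify(msgid: str, client: Index, server: Index, last_sync: float) -> str:
    """Return which of cases [a]-[d] applies to one Message-Id."""
    c: Optional[Tuple[str, float]] = client.get(msgid)
    s: Optional[Tuple[str, float]] = server.get(msgid)
    if c is None and s is None:
        raise KeyError(msgid)
    if c is not None and s is not None:
        # Present on both machines: same folder is [a], different is [b].
        return "a:duplicate" if c[0] == s[0] else "b:refiled"
    _folder, mtime = c if c is not None else s
    # Present on one machine only: newer than the last sync is [c],
    # otherwise assume it was removed on the other machine, [d].
    return "c:new" if mtime > last_sync else "d:removed"


def plan(client: Index, server: Index, last_sync: float) -> Dict[str, str]:
    """Classify every Message-Id seen on either machine."""
    return {m: classify(m, client, server, last_sync)
            for m in sorted(set(client) | set(server))}
```

For case [b] the caller would still have to compare the two mtimes to pick the definitive copy, and the mtime test for [c] versus [d] is only as good as the clocks on both machines.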
Note that there doesn't need to be anything fancy running on _both_
machines...once the differing files are determined, they could all be
copied from the client to the server (to a temp directory, preserving
the MH tree structure), then the server could do the rcvstore/pack/sortm
procedure, and finally the server could use rsync to update the client
with the new MH tree structure. I'll probably go this route, as my
"client" is a Nokia Internet Tablet handheld. It runs Linux, and has
the CLAWS mail program, which uses an MH folder structure.
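A rough sketch of the server-side half, assuming the client's files have already been copied into a staging directory that mirrors the MH tree. The function names and the staging layout are hypothetical. rcvstore reads one message on stdin and files it in the folder given as "+folder"; it usually lives in nmh's library directory rather than on the PATH, so its location is passed in as a parameter here.

```python
import os
import re
import subprocess
from typing import List, Tuple

Step = Tuple[List[str], str]  # (command argv, message file fed on stdin)


def incorporate_plan(staging_root: str, rcvstore: str = "rcvstore") -> List[Step]:
    """List one rcvstore invocation per staged message, without running it."""
    steps: List[Step] = []
    for dirpath, dirnames, filenames in os.walk(staging_root):
        dirnames.sort()  # walk folders in a stable order
        rel = os.path.relpath(dirpath, staging_root)
        if rel == os.curdir:
            continue  # files directly in the staging root belong to no folder
        folder = "+" + rel.replace(os.sep, "/")
        # MH messages are files whose names are plain numbers.
        numbered = [f for f in filenames if re.fullmatch(r"[1-9][0-9]*", f)]
        for name in sorted(numbered, key=int):
            steps.append(([rcvstore, folder], os.path.join(dirpath, name)))
    return steps


def run_plan(steps: List[Step]) -> None:
    """Feed each staged message to rcvstore on stdin."""
    for argv, path in steps:
        with open(path, "rb") as msg:
            subprocess.run(argv, stdin=msg, check=True)
```

Keeping the planning separate from the execution means the list of commands can be printed and checked before anything touches the real mail tree.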
This scheme assumes that:
each message has a Message-Id header (if that is not true,
the procedure could be extended to do an md5sum of the file and
create a replacement for the missing header)
each Message-Id per-folder is unique
the changes on the client/server are as a result of receiving,
refiling, or removing mail messages--not as a result of editing
the contents of individual messages
the client & server don't both receive the same messages (i.e.,
there's no external mail alias or forwarding system that would
send one message to both machines...if this is happening, there
would be messages with the same Message-Id on each machine,
but with different "Received", "Delivered-to" and other
headers...which you might consider important)
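As a sketch of the first assumption, the per-folder list keyed by Message-Id could be built as below, with an md5 of the whole file standing in when the header is missing. The function names and the "md5:" prefix are invented for illustration; Python's standard email parser is used here in place of formail.

```python
import hashlib
import os
import re
from email.parser import BytesParser
from typing import Dict


def message_key(path: str) -> str:
    """Return the Message-Id of a message file, or an md5-based stand-in."""
    with open(path, "rb") as fh:
        data = fh.read()
    # Header lookup is case-insensitive, so Message-ID and Message-Id both match.
    msgid = BytesParser().parsebytes(data, headersonly=True).get("Message-Id")
    if msgid is not None and str(msgid).strip():
        return str(msgid).strip()
    # No usable header: fall back to a digest of the file contents.
    return "md5:" + hashlib.md5(data).hexdigest()


def index_folder(folder_path: str) -> Dict[str, str]:
    """Map message key -> file name for the numbered files in one MH folder."""
    index: Dict[str, str] = {}
    for name in sorted(os.listdir(folder_path)):
        full = os.path.join(folder_path, name)
        # Skip .mh_sequences, subfolders, and anything not named as a number.
        if re.fullmatch(r"[1-9][0-9]*", name) and os.path.isfile(full):
            index[message_key(full)] = name
    return index
```

If two files in a folder share a Message-Id, the later one silently replaces the earlier in the index, which is exactly the situation the second assumption rules out.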
Mark "the implementation details are left up to the reader" Bergman
=>
=> -- PMM
-----
Mark Bergman Biker, Rock Climber, Unix mechanic, IATSE #1 Stagehand
http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=bergman%40merctech.com
I want a newsgroup with a infinite S/N ratio! Now taking CFV on:
rec.motorcycles.stagehands.pet-bird-owners.pinballers.unix-supporters
15+ So Far--Want to join? Check out: http://www.panix.com/~bergman