Howdy,
I've dug into this further and think I now know what's going on. Just
FYI this is more than an academic question for me because I have
several VMs that I would like to take snapshots of, and this first one
is by far the smallest.
These VMs are LXC containers, and when I started out a long time ago,
I would just manually create the filesystems and use some CLI tools to
install a fresh distro. Eventually the Linux kernel started supporting
namespaces to improve security, and they were adopted by the
virtualization ecosystem.
I'm not sure when it happened because I only just noticed it. Maybe it
was when I switched to letting Proxmox spin up the new VMs, but now the
UIDs and GIDs in the filesystems for the unprivileged containers have
all been shifted by adding 100000 to them. This is why rdiff-backup
updated all that metadata.
This is not just a mapping in RAM; it's actually in the filesystem
image on disk. There are several ways of dealing with this. Some tools
will update the UID/GIDs for you when you reboot the VM. Other tools
act like a layer over a bind mount, mostly duplicating a filesystem
somewhere else while rewriting the UID/GIDs on the fly. Some
utilities like rdiff-backup and rsync have some ability to rewrite or
map the UID/GIDs as they copy. The last two seem most attractive to me.
rsync has --usermap and --groupmap, and rdiff-backup has
--user-mapping-file and --group-mapping-file. In the filesystem mount
utility area there are shiftfs, idmapped mounts, and bindfs.
Shiftfs is deprecated in favor of idmapped mounts, though some of my
kernels don't have that yet. Bindfs is a FUSE-based solution and so
might be slower. However, it might be the only one that is really
workable for me at the moment, because it has the --uid-offset and
--gid-offset options. By the way, you can put in negative offsets
too, good thing. :-)
It would be great if rdiff-backup would allow offsets like this or,
even better, the ability to specify a range like:
100000-165535:0-65535
Or you could just have the starting UID after the colon.
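To make the idea concrete, here is a rough Python sketch of how such a
range spec could be interpreted. This is only an illustration of the
proposal, not anything rdiff-backup does today. The function names
(parse_range_spec, map_id, mapping_lines) are made up for this example,
and the "old:new" line format written by mapping_lines is an assumption
based on the mapping-file description in the man page, untested against
rdiff-backup itself.

```python
def parse_range_spec(spec):
    """Parse 'SRC_START-SRC_END:DST_START' or
    'SRC_START-SRC_END:DST_START-DST_END' (hypothetical syntax).

    Returns (src_start, src_end, dst_start). Only non-negative ids
    are handled in this sketch.
    """
    src, dst = spec.split(":")
    src_start, src_end = (int(part) for part in src.split("-"))
    dst_parts = [int(part) for part in dst.split("-")]
    dst_start = dst_parts[0]
    if src_end < src_start:
        raise ValueError("source range is empty")
    # If a destination end is given, both ranges must be the same size.
    if len(dst_parts) == 2 and dst_parts[1] - dst_start != src_end - src_start:
        raise ValueError("source and destination ranges differ in length")
    return src_start, src_end, dst_start


def map_id(numeric_id, spec):
    """Shift an id that falls inside the source range; leave others alone."""
    src_start, src_end, dst_start = parse_range_spec(spec)
    if src_start <= numeric_id <= src_end:
        return numeric_id - src_start + dst_start
    return numeric_id


def mapping_lines(spec):
    """Expand the range into explicit 'old:new' lines, one per id
    (assumed mapping-file format)."""
    src_start, src_end, dst_start = parse_range_spec(spec)
    for offset in range(src_end - src_start + 1):
        yield f"{src_start + offset}:{dst_start + offset}"


if __name__ == "__main__":
    spec = "100000-165535:0-65535"
    print(map_id(100000, spec))   # first id of the shifted range
    print(map_id(101000, spec))   # an ordinary user inside the container
    print(map_id(500, spec))      # outside the range, left unchanged
```

Until something like this exists, the output of mapping_lines could in
principle be written to a file as a 65536-line workaround, but a real
range syntax would obviously be much nicer.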
In the man page under USERS AND GROUPS, it says:
"If you specify both --preserve-numerical-ids and one of the mapping
options, the behavior is undefined."
I think it would be better to allow both, with the user-mapping-file
overriding the preserve-numerical-ids behavior when necessary, since
in my use case I never want user name mapping.
What do you think? I appreciate the discussion, and everyone's help.
Thanks,
Clif