Memory locking protects a process from kcompactd page compaction and, more
importantly, from migration (that is, taking a PTE and replacing it with
one that is closer in memory, to reduce fragmentation), as long as
/proc/sys/vm/compact_unevictable_allowed is 0.
For this use case we don't mind page faults, as they take more or less
constant time, and we could avoid them too if we wanted by preallocating
guest memory. We do, however, want PTEs to be untouched by kcompactd, which
MCL_ONFAULT accomplishes just fine without the extra memory overhead that
comes from various anonymous mappings getting write-faulted with the
currently available mem-lock=on option.
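
For illustration, on-fault locking boils down to roughly the following.
This is only a minimal sketch under my own assumptions, not the actual QEMU
code: memlock_flags() and lock_process_memory() are hypothetical helper
names, and it assumes Linux with a libc that exposes MCL_ONFAULT in
<sys/mman.h>.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: build the mlockall() flags.
 * With on_fault == 0 every current and future mapping is locked and
 * populated eagerly. With on_fault != 0 pages are locked only once they
 * are first touched, so mappings that are never touched cost nothing.
 * Returns -1 if on-fault locking was requested but the headers do not
 * provide MCL_ONFAULT.
 */
static int memlock_flags(int on_fault)
{
    int flags = MCL_CURRENT | MCL_FUTURE;

    if (on_fault) {
#ifdef MCL_ONFAULT
        flags |= MCL_ONFAULT;
#else
        return -1;
#endif
    }
    return flags;
}

/*
 * Hypothetical helper: lock the whole address space of the calling
 * process. Returns 0 on success or a negative errno value, for example
 * -ENOMEM or -EPERM when RLIMIT_MEMLOCK is too low and the process
 * lacks CAP_IPC_LOCK.
 */
static int lock_process_memory(int on_fault)
{
    int flags = memlock_flags(on_fault);

    if (flags < 0) {
        return -ENOTSUP;
    }
    if (mlockall(flags) != 0) {
        return -errno;
    }
    return 0;
}
```

Calling lock_process_memory(1) early during startup corresponds to the
on-fault behaviour, and lock_process_memory(0) to what mem-lock=on does
today, as far as the mlockall() flags are concerned.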
In our case we use KVM, of course; TCG was just an experiment where I noticed
anonymous memory jump way too much.
I don't think it's feasible in our case to look for the origin of every
anonymous mapping that grew compared to the no mem-lock case (of which there
are about 30 with default Q35 + KVM, without any extra devices), and try to
optimize each one to map anonymous memory less eagerly.