qemu-devel

Re: regression: insmod module failed in VM with nvdimm on


From: chenxiang (M)
Subject: Re: regression: insmod module failed in VM with nvdimm on
Date: Fri, 2 Dec 2022 10:48:06 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Hi Ard,


On 2022/12/1 19:07, Ard Biesheuvel wrote:
On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel <ardb@kernel.org> wrote:
On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxiang66@hisilicon.com> wrote:
Hi Ard,


On 2022/11/30 16:18, Ard Biesheuvel wrote:
On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <maz@kernel.org> wrote:
On Wed, 30 Nov 2022 02:52:35 +0000,
"chenxiang (M)" <chenxiang66@hisilicon.com> wrote:
Hi,

We boot the VM using the following command (with nvdimm on) (qemu
version 6.1.50, kernel 6.0-rc4):
How relevant is the presence of the nvdimm? Do you observe the failure
without this?

qemu-system-aarch64 \
    -machine virt,kernel_irqchip=on,gic-version=3,nvdimm=on \
    -kernel /home/kernel/Image \
    -initrd /home/mini-rootfs/rootfs.cpio.gz \
    -bios /root/QEMU_EFI.FD \
    -cpu host -enable-kvm -net none -nographic \
    -m 2G,maxmem=64G,slots=3 -smp 4 \
    -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1' \
    -object memory-backend-ram,id=ram1,size=10G \
    -device nvdimm,id=dimm1,memdev=ram1 \
    -device ioh3420,id=root_port1,chassis=1 \
    -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1

Then in the VM we insmod a module and a vmalloc error occurs as follows
(kernel 5.19-rc4 is fine, and the issue is still present on kernel 6.1-rc4):

estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
[    8.186563] vmap allocation for size 20480 failed: use
vmalloc=<size> to increase size
Have you tried increasing the vmalloc size to check that this is
indeed the problem?

[...]

We git bisected the code and found the patch c5a89f75d2a ("arm64: kaslr:
defer initialization to initcall where permitted").
I guess you mean commit fc5a89f75d2a instead, right?

Do you have any idea about the issue?
I sort of suspect that the nvdimm gets vmap-ed and consumes a large
portion of the vmalloc space, but you give very little information
that could help here...
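
One quick way to check that suspicion from inside the guest, assuming
procfs is enabled there, is to look at the vmalloc statistics and
per-region mappings, e.g.:

    grep -i vmalloc /proc/meminfo
    cat /proc/vmallocinfo

The exact output depends on the guest kernel configuration.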

Ouch. I suspect I know what's going on here: that patch defers the
randomization of the module region, so that we can decouple it from
the very early init code.

Obviously, it is happening too late now, and the randomized module
region is overlapping with a vmalloc region that is in use by the time
the randomization occurs.

Does the below fix the issue?
The issue still occurs, but the change seems to decrease the probability:
before, it occurred almost every time, while after the change I tried 2-3
times and it still occurred. But if I change "subsys_initcall" back to
"core_initcall", I have tested more than 20 times and it is still ok.
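
For clarity, the change described amounts to registering kaslr_init() at an
earlier initcall level in arch/arm64/kernel/kaslr.c, i.e. roughly the
following one-liner (illustrative only, the exact line may differ):

    -subsys_initcall(kaslr_init)
    +core_initcall(kaslr_init)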

Thank you for confirming. I will send out a patch today.

...but before I do that, could you please check whether the change
below fixes your issue as well?

diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index 6ccc7ef600e7c1e1..c8c205b630da1951 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -20,7 +20,11 @@
  #include <asm/sections.h>
  #include <asm/setup.h>

-u64 __ro_after_init module_alloc_base;
+/*
+ * Set a reasonable default for module_alloc_base in case
+ * we end up running with module randomization disabled.
+ */
+u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE;
  u16 __initdata memstart_offset_seed;

  struct arm64_ftr_override kaslr_feature_override __initdata;
@@ -30,12 +34,6 @@ static int __init kaslr_init(void)
         u64 module_range;
         u32 seed;

-       /*
-        * Set a reasonable default for module_alloc_base in case
-        * we end up running with module randomization disabled.
-        */
-       module_alloc_base = (u64)_etext - MODULES_VSIZE;
-
         if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) {
                 pr_info("KASLR disabled on command line\n");
                 return 0;
.

We have tested this change; the issue is still there, so it does not fix the issue.
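
For context on why a badly placed module region shows up as a vmap failure:
arm64 confines module allocations to a MODULES_VSIZE-sized window starting at
module_alloc_base, so if the randomized window overlaps vmalloc space that is
already in use (e.g. by the nvdimm mapping suspected above), even a ~20 KiB
allocation has nowhere to go. A simplified sketch of that shape, paraphrased
rather than copied from arch/arm64/kernel/module.c (the real module_alloc()
has additional fallback logic):

    /* Simplified sketch, not verbatim kernel code. */
    void *module_alloc(unsigned long size)
    {
            u64 module_alloc_end = module_alloc_base + MODULES_VSIZE;

            /*
             * The allocation has to land inside the (possibly randomized)
             * module window; if that window collides with vmalloc space
             * that is already mapped, the request fails with
             * "vmap allocation for size ... failed".
             */
            return __vmalloc_node_range(size, MODULE_ALIGN,
                                        module_alloc_base, module_alloc_end,
                                        GFP_KERNEL, PAGE_KERNEL, 0,
                                        NUMA_NO_NODE,
                                        __builtin_return_address(0));
    }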



