
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v2] pc: memhp: enforce minimal 128Mb alignment for pc-dimm
Date: Mon, 26 Oct 2015 12:28:21 +0200

On Mon, Oct 26, 2015 at 10:46:55AM +0100, Igor Mammedov wrote:
> commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> regressed memory hot-unplug for Linux guests, triggering
> the following BUG_ON:
>  =====
>  kernel BUG at mm/memory_hotplug.c:703!
>  ...
>  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
>  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
>  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
>  ===
>     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
>  ===
> 
> The reason is that an x86-64 Linux guest supports memory
> hotplug in chunks of 128Mb, and a memory section must also
> be 128Mb aligned.
> However, the gaps forced between 128Mb DIMMs, combined with
> the backend's natural alignment of 2Mb, leave the 2nd and
> following DIMMs no longer aligned on a 128Mb boundary as they
> originally were. To fix the regression, enforce a minimal
> 128Mb alignment like it was done for PPC.
> 
> Signed-off-by: Igor Mammedov <address@hidden>

So our temporary workaround is creating more trouble.  I'm inclined to just
revert aa8580cd and df0acded19 with it.
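
To make the failure mode concrete, here is a minimal user-space sketch of the check behind the BUG_ON quoted above. It is an illustration under assumptions, not the kernel or QEMU code: the macro definitions model the kernel's section arithmetic for 4 KiB pages, while pfn_misaligned(), next_dimm_addr(), and the gap and base address used in the usage note are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT        12   /* 4 KiB pages, as on x86 */
#define SECTION_SIZE_BITS 27   /* x86-64 value quoted in this thread */
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION (1ULL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))

/* Hypothetical helper: same condition as the BUG_ON, true when the
 * start PFN is not aligned to a memory section. */
static bool pfn_misaligned(uint64_t phys_addr)
{
    uint64_t phys_start_pfn = phys_addr >> PAGE_SHIFT;

    return (phys_start_pfn & ~PAGE_SECTION_MASK) != 0;
}

/* Hypothetical helper: address of the next DIMM when a gap is added
 * after the previous one and the result is rounded up to 'align'
 * (which must be a power of two). */
static uint64_t next_dimm_addr(uint64_t prev_addr, uint64_t prev_size,
                               uint64_t gap, uint64_t align)
{
    uint64_t addr = prev_addr + prev_size + gap;

    return (addr + align - 1) & ~(align - 1);
}
```

With an assumed first DIMM of 128Mb at 4 GiB and an assumed gap of one byte, next_dimm_addr() with a 2Mb alignment yields 0x108200000, which pfn_misaligned() flags; with a 128Mb alignment it yields 0x110000000, which passes.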

> ---
> PS:
>   PAGE_SECTION_MASK is derived from SECTION_SIZE_BITS which
>   is arch dependent so this is fix for x86-64 target only.
>   If anyone cares about 32bit guests, it should also be fine
>   for x86-32 which has 64Mb memory sections/alignment.

Like 32 bit guests are unheard of?  This does not inspire confidence at all.


So I dug in linux guest code:

#ifdef CONFIG_X86_32
# ifdef CONFIG_X86_PAE
#  define SECTION_SIZE_BITS     29
#  define MAX_PHYSADDR_BITS     36
#  define MAX_PHYSMEM_BITS      36
# else
#  define SECTION_SIZE_BITS     26
#  define MAX_PHYSADDR_BITS     32
#  define MAX_PHYSMEM_BITS      32
# endif
#else /* CONFIG_X86_32 */
# define SECTION_SIZE_BITS      27 /* matt - 128 is convenient right now */
# define MAX_PHYSADDR_BITS      44
# define MAX_PHYSMEM_BITS       46
#endif

Looks like PAE needs more alignment.
And it looks like 128 is arbitrary here.
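
For reference, the section sizes implied by those SECTION_SIZE_BITS values can be worked out with a small sketch. The enum and function names below are hypothetical and exist only for this illustration; the bit counts are the ones from the header quoted above.

```c
#include <stdint.h>

/* Hypothetical labels for the three configurations in the quoted header. */
enum x86_config { X86_32_NOPAE, X86_32_PAE, X86_64 };

/* SECTION_SIZE_BITS per configuration, copied from the quoted header. */
static unsigned section_size_bits(enum x86_config cfg)
{
    switch (cfg) {
    case X86_32_PAE:
        return 29;
    case X86_32_NOPAE:
        return 26;
    default:
        return 27;
    }
}

/* Section size in MiB: 2^SECTION_SIZE_BITS bytes divided by 2^20. */
static uint64_t section_size_mib(enum x86_config cfg)
{
    return (1ULL << section_size_bits(cfg)) >> 20;
}
```

That gives 64 for 32-bit without PAE, 512 for 32-bit with PAE, and 128 for 64-bit, so a fixed 128Mb alignment is smaller than the PAE section size.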

So we are tying ourselves to specific guest quirks.
All this just looks wrong to me.


> ---
>  hw/i386/pc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 3d958ba..0f7cf7c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
>      }
>  }
>  
> +#define PC_MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> +

This kind of comment doesn't really help.

>  static void pc_dimm_plug(HotplugHandler *hotplug_dev,
>                           DeviceState *dev, Error **errp)
>  {
> @@ -1624,6 +1626,16 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
>  
>      if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
>          align = memory_region_get_alignment(mr);
> +        /*
> +         * Linux x64 guests expect 128Mb aligned DIMM,

This implies no other guest cares, which isn't true.

> +         * but this change

which change?

> causes memory layout change

change compared to what?

> so
> +         * for compatibility

compatibility with what?

> apply 128Mb alignment only
> +         * when forced gaps are enabled since it is the cause
> +         * of misalignment.

Which makes no sense, sorry.

Can it be misaligned for some other reason?

If not, why limit to this case?

> +         */
> +        if (pcmc->inter_dimm_gap && align < PC_MIN_DIMM_ALIGNMENT) {
> +            align = PC_MIN_DIMM_ALIGNMENT;
> +        }
>      }
>  
>      if (!pcms->acpi_dev) {

All this sounds pretty fragile. How about we revert the inter dimm gap
thing for 2.4? It's just a workaround, and this is piling workarounds on
top of workarounds.
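
To illustrate the fragility, here is a rough model of what the patch does and of why one constant cannot cover every guest. It is a simplification, not the QEMU code: effective_align() and section_aligned() are hypothetical helpers, and only the inner condition of the patch is modelled (the outer check on memory_region_get_alignment() and enforce_aligned_dimm is left out).

```c
#include <stdbool.h>
#include <stdint.h>

#define PC_MIN_DIMM_ALIGNMENT (1ULL << 27) /* value taken from the patch */

/* Hypothetical helper: raise the backend alignment to the minimum only
 * when inter-DIMM gaps are enabled, as the patch's condition does. */
static uint64_t effective_align(uint64_t backend_align, bool inter_dimm_gap)
{
    if (inter_dimm_gap && backend_align < PC_MIN_DIMM_ALIGNMENT) {
        return PC_MIN_DIMM_ALIGNMENT;
    }
    return backend_align;
}

/* Hypothetical helper: true when addr is aligned to a guest memory
 * section of 2^section_size_bits bytes. */
static bool section_aligned(uint64_t addr, unsigned section_size_bits)
{
    return (addr & ((1ULL << section_size_bits) - 1)) == 0;
}
```

An address such as 0x108000000 satisfies section_aligned() for 27 bits but not for the 29 bits a PAE guest uses, so the clamp in effective_align() only helps guests whose section size is at most 128Mb.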

> -- 
> 1.8.3.1
