From: Richard Henderson
Subject: Re: [PATCH v4 3/7] Add an "info pg" command that prints the current page tables
Date: Wed, 24 Jul 2024 13:33:13 +1000
User-agent: Mozilla Thunderbird
On 7/23/24 11:05, Don Porter wrote:
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index d946161717..c70d31433d 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -605,10 +605,11 @@ extern bool mttcg_enabled;
/**
* cpu_paging_enabled:
* @cpu: The CPU whose state is to be inspected.
+ * @mmu_idx: 0 == traditional paging, 1 == nested paging
*
* Returns: %true if paging is enabled, %false otherwise.
*/
-bool cpu_paging_enabled(const CPUState *cpu);
+bool cpu_paging_enabled(const CPUState *cpu, int mmu_idx);
mmu_idx already means something very different to TCG.
You will only confuse matters by using this name for a different concept here.
@@ -671,9 +672,82 @@ int cpu_write_elf32_qemunote(WriteCoreDumpFunction f,
CPUState *cpu,
* Caller is responsible for freeing the data.
*/
GuestPanicInformation *cpu_get_crash_info(CPUState *cpu);
-
#endif /* !CONFIG_USER_ONLY */
+/* Maximum supported page table height - currently x86 at 5 */
+#define MAX_HEIGHT 5
+
+typedef struct PageTableLayout {
+ int height; /* Height of the page table */
+ int entries_per_node[MAX_HEIGHT + 1];
+} PageTableLayout;
+
+typedef struct DecodedPTE {
+ int prot; /* Always populated, arch-specific, decoded flags */
+ bool present;
+ bool leaf; /* Only valid if present */
+ bool reserved_bits_ok;
+ bool user_read_ok;
+ bool user_write_ok;
+ bool user_exec_ok;
+ bool super_read_ok;
+ bool super_write_ok;
+ bool super_exec_ok;
+ bool dirty;
+ hwaddr child; /* Only valid if present and !leaf */
+ uint64_t leaf_page_size; /* Only valid if present and leaf */
+ uint64_t nested_page_size; /*
+ * If nested paging, the page size of the host
+ * page storing the data, versus the size of the
+ * guest page frame in leaf_page_size
+ */
+ vaddr bits_translated; /*
+ * The virtual address bits translated in walking
+ * the page table to node[i].
+ */
+ hwaddr pte_addr; /* (guest) physical address of the PTE */
+ hwaddr pte_host_addr; /* (host) physical address of the PTE */
+ uint64_t pte_contents; /* Raw contents of the pte */
+} DecodedPTE;
+
+typedef int (*qemu_page_walker_for_each)(CPUState *cs, void *data,
+ DecodedPTE *pte,
+ int height, int offset, int mmu_idx,
+ const PageTableLayout *layout);
+
+/**
+ * for_each_pte - iterate over a page table, and
+ * call fn on each entry
+ *
+ * @cs - CPU state
+ * @fn(cs, data, pte, height, offset, layout) - User-provided function to call
+ * on each pte.
+ * * @cs - pass through cs
+ * * @data - user-provided, opaque pointer
+ * * @pte - current pte, decoded
+ * * @height - height in the tree of pte
+ * * @offset - offset within the page table node
+ * * @layout - pointer to a PageTableLayout for this tree
+ * @data - opaque pointer; passed through to fn
+ * @visit_interior_nodes - if true, call fn() on interior entries in
+ * page table; if false, visit only leaf entries.
+ * @visit_not_present - if true, call fn() on entries that are not present.
+ * if false, visit only present entries.
+ * @mmu_idx - Which level of the mmu we are interested in:
+ * 0 == user mode, 1 == nested page table
+ * Note that MMU_*_IDX macros are not consistent across
+ * architectures.
+ *
+ * Returns true on success, false on error.
+ *
+ * We assume all callers of this function are in debug mode, and do not
+ * want to synthesize, say, a user-mode load, on each page in the address
+ * space.
+ */
+bool for_each_pte(CPUState *cs, qemu_page_walker_for_each fn, void *data,
+ bool visit_interior_nodes, bool visit_not_present,
+ bool visit_malformed, int mmu_idx);
+
None of this should be in hw/core/cpu.h.
Isolate it somewhere else, not included by 80% of QEMU.
diff --git a/include/hw/core/sysemu-cpu-ops.h b/include/hw/core/sysemu-cpu-ops.h
index 4c94e51267..d0e939def8 100644
--- a/include/hw/core/sysemu-cpu-ops.h
+++ b/include/hw/core/sysemu-cpu-ops.h
@@ -12,6 +12,43 @@
#include "hw/core/cpu.h"
+/*
+ * struct mem_print_state: Used by qmp in walking page tables.
+ */
+struct mem_print_state {
+ GString *buf;
+ CPUArchState *env;
+ int vaw, paw; /* VA and PA width in characters */
+ int max_height;
+ int mmu_idx; /* 0 == user mode, 1 == nested page table */
+ bool (*flusher)(CPUState *cs, struct mem_print_state *state);
+ bool flush_interior; /* If false, only call flusher() on leaves */
+ bool require_physical_contiguity;
+ /*
+ * The height at which we started accumulating ranges, i.e., the
+ * next height we need to print once we hit the end of a
+ * contiguous range.
+ */
+ int start_height;
+ int leaf_height; /* The height at which we found a leaf, or -1 */
+ /*
+ * For compressing contiguous ranges, track the
+ * start and end of the range
+ */
+ hwaddr vstart[MAX_HEIGHT + 1]; /* Starting virt. addr. of open pte range */
+ hwaddr vend[MAX_HEIGHT + 1]; /* Ending virtual address of open pte range */
+ hwaddr pstart; /* Starting physical address of open pte range */
+ hwaddr pend; /* Ending physical address of open pte range */
+
+ /* PTE protection flags current root->leaf path */
+ uint64_t prot[MAX_HEIGHT + 1];
+
+ /* Page size (leaf) or address range covered (non-leaf). */
+ uint64_t pg_size[MAX_HEIGHT + 1];
+ int offset[MAX_HEIGHT + 1]; /* PTE range starting offsets */
+ int last_offset[MAX_HEIGHT + 1]; /* PTE range ending offsets */
+};
Likewise.
Also, CamelCase for typedefs, per coding style.
+ if (env->hflags & HF_GUEST_MASK) {
+
+ /* Extract the EPTP value from vmcs12 structure, store in arch state */
+ if (env->nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) {
+ struct vmcs12 *vmcs =
+ (struct vmcs12 *) env->nested_state->data.vmx->vmcs12;
This is not required. You appear to be confused by nested paging.
First: nested paging is how hardware virtualization works. When we are *using* hardware
virtualization, all of that is the kernel's job. Our job as hypervisor is to give a bag
of pages to the kernel and have it map them into the guest intermediate address space.
When we are *using* hardware virtualization, we are only ever concerned with one level of
paging: from the guest to the intermediate address space. From there we use QEMU data
structures to map to QEMU virtual address space (address_space_ld/st, etc).
This is all we will ever see from KVM, HVF etc.
With TCG, we can *emulate* hardware virtualization. It is at this point where we are
concerned about two levels of paging, because QEMU is handling both.
r~