[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimizat
From: |
Jan Kiszka |
Subject: |
Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests |
Date: |
Mon, 13 Feb 2012 20:22:21 +0100 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666 |
On 2012-02-13 19:50, Blue Swirl wrote:
> On Mon, Feb 13, 2012 at 10:16, Jan Kiszka <address@hidden> wrote:
>> On 2012-02-11 16:25, Blue Swirl wrote:
>>> On Fri, Feb 10, 2012 at 18:31, Jan Kiszka <address@hidden> wrote:
>>>> This enables acceleration for MMIO-based TPR registers accesses of
>>>> 32-bit Windows guest systems. It is mostly useful with KVM enabled,
>>>> either on older Intel CPUs (without flexpriority feature, can also be
>>>> manually disabled for testing) or any current AMD processor.
>>>>
>>>> The approach introduced here is derived from the original version of
>>>> qemu-kvm. It was refactored, documented, and extended by support for
>>>> user space APIC emulation, both with and without KVM acceleration. The
>>>> VMState format was kept compatible, so was the ABI to the option ROM
>>>> that implements the guest-side para-virtualized driver service. This
>>>> enables seamless migration from qemu-kvm to upstream or, one day,
>>>> between KVM and TCG mode.
>>>>
>>>> The basic concept goes like this:
>>>> - VAPIC PV interface consisting of I/O port 0x7e and (for KVM in-kernel
>>>> irqchip) a vmcall hypercall is registered
>>>> - VAPIC option ROM is loaded into guest
>>>> - option ROM activates TPR MMIO access reporting via port 0x7e
>>>> - TPR accesses are trapped and patched in the guest to call into option
>>>> ROM instead, VAPIC support is enabled
>>>> - option ROM TPR helpers track state in memory and invoke hypercall to
>>>> poll for pending IRQs if required
>>>>
>>>> Signed-off-by: Jan Kiszka <address@hidden>
>>>
>>> I must say that I find the approach horrible, patching guests and ROMs
>>> and looking up Windows internals. Taking the same approach to extreme,
>>> we could for example patch Xen guest to become a KVM guest. Not that I
>>> object merging.
>>
>> Yes, this is horrible. But there is no real better way in the absence of
>> hardware assisted virtualization of the TPR. I think MS is recommending
>> this patching approach as well.
>
> Maybe instead of routing via ROM and the hypercall, the TPR accesses
> could be handled directly with guest invisible breakpoints (like GDB
> breakpoints, but for QEMU internal use), much like other
> instrumentation could be handled.
Gleb answered it already.
>>>> @@ -238,6 +275,7 @@ static int apic_init_common(SysBusDevice *dev)
>>>> {
>>>> APICCommonState *s = APIC_COMMON(dev);
>>>> APICCommonClass *info;
>>>> + static DeviceState *vapic;
>>>> static int apic_no;
>>>>
>>>> if (apic_no >= MAX_APICS) {
>>>> @@ -248,10 +286,29 @@ static int apic_init_common(SysBusDevice *dev)
>>>> info = APIC_COMMON_GET_CLASS(s);
>>>> info->init(s);
>>>>
>>>> - sysbus_init_mmio(&s->busdev, &s->io_memory);
>>>> + sysbus_init_mmio(dev, &s->io_memory);
>>>> +
>>>> + if (!vapic && s->vapic_control & VAPIC_ENABLE_MASK) {
>>>> + vapic = sysbus_create_simple("kvmvapic", -1, NULL);
>>>> + }
>>>> + s->vapic = vapic;
>>>> + if (apic_report_tpr_access && info->enable_tpr_reporting) {
>>>
>>> I think you should not rely on apic_report_tpr_access being in sane
>>> condition during class init.
>>
>> It is mandatory, e.g. for CPU hotplug, as reporting needs to be
>> consistent accross all VCPUs. Therefore it is a static global, set to
>> false initially. However, you are right, we lack proper clearing of the
>> access report feature on reset, not only in this variable.
>
> I'd also set it to false initially.
It's a global variable, thus initialized to false by definition.
>>>> +
>>>> +#define VAPIC_CPU_SHIFT 7
>>>> +
>>>> +#define ROM_BLOCK_SIZE 512
>>>> +#define ROM_BLOCK_MASK (~(ROM_BLOCK_SIZE - 1))
>>>> +
>>>> +typedef struct VAPICHandlers {
>>>> + uint32_t set_tpr;
>>>> + uint32_t set_tpr_eax;
>>>> + uint32_t get_tpr[8];
>>>> + uint32_t get_tpr_stack;
>>>> +} QEMU_PACKED VAPICHandlers;
>>>> +
>>>> +typedef struct GuestROMState {
>>>> + char signature[8];
>>>> + uint32_t vaddr;
>>>
>>> This does not look 64 bit clean.
>>
>> It's packed.
>
> I meant "virtual address could be 64 bits on a 64 bit host", not
> structure packing.
This is for 32-bit guests only. 64-bit Windows doesn't access the TPR
via MMIO, thus is not activating the VAPIC.
>>>> + uint32_t state;
>>>> + uint32_t rom_state_paddr;
>>>> + uint32_t rom_state_vaddr;
>>>> + uint32_t vapic_paddr;
>>>> + uint32_t real_tpr_addr;
>>>> + GuestROMState rom_state;
>>>> + size_t rom_size;
>>>> +} VAPICROMState;
>>>> +
>>>> +#define TPR_INSTR_IS_WRITE 0x1
>>>> +#define TPR_INSTR_ABS_MODRM 0x2
>>>> +#define TPR_INSTR_MATCH_MODRM_REG 0x4
>>>> +
>>>> +typedef struct TPRInstruction {
>>>> + uint8_t opcode;
>>>> + uint8_t modrm_reg;
>>>> + unsigned int flags;
>>>> + size_t length;
>>>> + off_t addr_offset;
>>>> +} TPRInstruction;
>>>
>>> Also here the order is pessimized.
>>
>> Don't see the gain here, though.
>
> There are two bytes' hole between modrm_reg and flags, maybe also 4
> bytes between length and addr_offset (if size_t is 32 bits but off_t
> 64 bits). I'd reverse the order so that members with largest alignment
> needs come first.
Well, but this won't make the struct smaller. I prefer to keep the
ordering in which we also initialize it.
>
>>>> +static int find_real_tpr_addr(VAPICROMState *s, CPUState *env)
>>>> +{
>>>> + target_phys_addr_t paddr;
>>>> + target_ulong addr;
>>>> +
>>>> + if (s->state == VAPIC_ACTIVE) {
>>>> + return 0;
>>>> + }
>>>> + for (addr = 0xfffff000; addr >= 0x80000000; addr -= TARGET_PAGE_SIZE)
>>>> {
>>>> + paddr = cpu_get_phys_page_debug(env, addr);
>>>> + if (paddr != APIC_DEFAULT_ADDRESS) {
>>>> + continue;
>>>> + }
>>>> + s->real_tpr_addr = addr + 0x80;
>>>> + update_guest_rom_state(s);
>>>> + return 0;
>>>> + }
>>>
>>> This loop looks odd, what should it do, probe for unused address?
>>
>> Seems to deserve a comment: We have to scan for the guest's mapping of
>> the APIC as we enter here without a hint from an TPR accessing
>> instruction. So we probe the potential range, trying to find the page
>> that maps to that known physical address (known in the sense that
>> Windows does not remap the APIC physically - nor does QEMU support that
>> so far).
>
> Yes, more comments would be nice, especially on theory of operation.
>
>>>> +static int evaluate_tpr_instruction(VAPICROMState *s, CPUState *env,
>>>> + target_ulong *pip, int access)
>>>> +{
>>>> + const TPRInstruction *instr;
>>>> + target_ulong ip = *pip;
>>>> + uint8_t opcode[2];
>>>> + uint32_t real_tpr_addr;
>>>> + int i;
>>>> +
>>>> + if ((ip & 0xf0000000) != 0x80000000 && (ip & 0xf0000000) !=
>>>> 0xe0000000) {
>>>
>>> The constants should be using ULL suffix because target_ulong could be
>>> 64 bit, though maybe this is more optimal.
>>
>> target_ulong is 64-bit unconditionally on x86. I'll add this.
>
> No, target_phys_addr_t is now 64 bits, but target_ulong (register
> size) is 32 bits for i386-softmmu.
Ah, right.
>
>>>> +
>>>> +/*
>>>> + * Tries to read the unique processor number from the Kernel Processor
>>>> Control
>>>> + * Region (KPCR) of 32-bit Windows. Returns -1 if the KPCR cannot be
>>>> accessed
>>>> + * or is considered invalid.
>>>> + */
>>>
>>> Horrible hack. Is guest OS type or version checked somewhere?
>>
>> This is all about hacking Windows 32-bit. And this check encodes that
>> even stronger. The other important binding is the expected virtual
>> address of the ROM mapping under Windows. I would have preferred
>> checking the version directly, but no one has a complete list of
>> supported guests and their codes.
>
> Then it would be nice to only enable this with a command line switch,
> so that some random poor non-Windows OS is not patched incorrectly.
I had the same concern, but there is no need to worry, we have
sufficient checks that no other guest is affected. And we do have a
switch as well, the APIC property. But this is better left on by default
to please our guests with optimal performance.
>
>>>
>>>> +static int get_kpcr_number(CPUState *env)
>>>> +{
>>>> + struct kpcr {
>>>> + uint8_t fill1[0x1c];
>>>> + uint32_t self;
>>>> + uint8_t fill2[0x31];
>>>> + uint8_t number;
>>>> + } QEMU_PACKED kpcr;
>>>
>>> KPCR. Pointers to Windows documentation would be nice.
>>
>> Oops, yes.
>>
>> Unfortunately, this is only an internal structure, not officially
>> documented by MS. However, all supported OS versions a legacy by now, no
>> longer changing its structure.
>
> This and a note about the supported OS versions could be added as comment.
OK.
For the folks that developed it in qemu-kvm: This targets Windows XP,
Vista and Server 2003, all 32-bit, right?
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
- Re: [Qemu-devel] [PATCH v2 1/8] kvm: Set cpu_single_env only once, (continued)
- [Qemu-devel] [PATCH v2 6/8] kvmvapic: Simplify mp/up_set_tpr, Jan Kiszka, 2012/02/10
- [Qemu-devel] [PATCH v2 2/8] Allow to use pause_all_vcpus from VCPU context, Jan Kiszka, 2012/02/10
- [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Jan Kiszka, 2012/02/10
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Blue Swirl, 2012/02/11
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Jan Kiszka, 2012/02/13
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Blue Swirl, 2012/02/13
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Gleb Natapov, 2012/02/13
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests,
Jan Kiszka <=
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Gleb Natapov, 2012/02/14
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Jan Kiszka, 2012/02/14
- Re: [Qemu-devel] [PATCH v2 5/8] kvmvapic: Introduce TPR access optimization for Windows guests, Gleb Natapov, 2012/02/14
- [Qemu-devel] [PATCH v2 4/8] kvmvapic: Add option ROM, Jan Kiszka, 2012/02/10
- [Qemu-devel] [PATCH v2 8/8] kvmvapic: Use optionrom helpers, Jan Kiszka, 2012/02/10