ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
Joao Martins	3b06f29b24	i386/xen: implement HYPERVISOR_event_channel_op Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Ditch event_channel_op_compat which was never available to HVM guests] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Joao Martins	5092db87e4	i386/xen: handle VCPUOP_register_runstate_memory_area Allow guest to setup the vcpu runstates which is used as steal clock. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Joao Martins	f068930277	i386/xen: handle VCPUOP_register_vcpu_time_info In order to support Linux vdso in Xen. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	c345104cd1	i386/xen: handle VCPUOP_register_vcpu_info Handle the hypercall to set a per vcpu info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM a vCPU thread must set its own vcpu_info rather than it being set from the context in which the hypercall is invoked. Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	d70bd6a485	i386/xen: implement HYPERVISOR_vcpu_op This is simply when guest tries to register a vcpu_info and since vcpu_info placement is optional in the minimum ABI therefore we can just fail with -ENOSYS Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	671bfdcd47	i386/xen: implement HYPERVISOR_hvm_op This is when guest queries for support for HVMOP_pagetable_dying. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	782a79601e	i386/xen: implement XENMEM_add_to_physmap_batch Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	fb0fd2ce38	i386/xen: implement HYPERVISOR_memory_op Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to allow the guest to set its shared_info page. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Use the xen_overlay device, add compat support] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	110a0ea59f	i386/xen: manage and save/restore Xen guest long_mode setting Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when the guest writes the MSR to fill in the hypercall page, or when the guest sets the event channel callback in HVM_PARAM_CALLBACK_IRQ. KVM handles the former and sets the kernel's long_mode flag accordingly. The latter will be handled in userspace. Keep them in sync by noticing when a hypercall is made in a mode that doesn't match qemu's idea of the guest mode, and resyncing from the kernel. Do that same sync right before serialization too, in case the guest has set the hypercall page but hasn't yet made a system call. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	c789b9ef5f	i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield They both do the same thing and just call sched_yield. This is enough to stop the Linux guest panicking when running on a host kernel which doesn't intercept SCHEDOP_poll and lets it reach userspace. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	79b7067dc6	i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown It allows to shutdown itself via hypercall with any of the 3 reasons: 1) self-reboot 2) shutdown 3) crash Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather than leading to triple faults if it remains unimplemented. In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset Xen shared pages and other enlightenments and leave a clean slate for the new kernel without the hypervisor helpfully writing information at unexpected addresses. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Ditch sched_op_compat which was never available for HVM guests, Add SCHEDOP_soft_reset] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	bedcc13924	i386/xen: implement HYPERVISOR_xen_version This is just meant to serve as an example on how we can implement hypercalls. xen_version specifically since Qemu does all kind of feature controllability. So handling that here seems appropriate. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Implement kvm_gva_rw() safely] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	55a3f666b4	i386/xen: handle guest hypercalls This means handling the new exit reason for Xen but still crashing on purpose. As we implement each of the hypercalls we will then return the right return code. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	5e691a955a	i386/kvm: Set Xen vCPU ID in KVM There are (at least) three different vCPU ID number spaces. One is the internal KVM vCPU index, based purely on which vCPU was chronologically created in the kernel first. If userspace threads are all spawned and create their KVM vCPUs in essentially random order, then the KVM indices are basically random too. The second number space is the APIC ID space, which is consistent and useful for referencing vCPUs. MSIs will specify the target vCPU using the APIC ID, for example, and the KVM Xen APIs also take an APIC ID from userspace whenever a vCPU needs to be specified (as opposed to just using the appropriate vCPU fd). The third number space is not normally relevant to the kernel, and is the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But Xen timer hypercalls use it, and Xen timer hypercalls really want to be accelerated in the kernel rather than handled in userspace, so the kernel needs to be told. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	f66b8a83c5	i386/kvm: handle Xen HVM cpuid leaves Introduce support for emulating CPUID for Xen HVM guests. It doesn't make sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally when the xen-version machine property is set. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	61491cf441	i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Paolo Bonzini	3023c9b4d1	target/i386: KVM: allow fast string operations if host supports them These are just a flag that documents the performance characteristic of an instruction; it needs no hypervisor support. So include them even if KVM does not show them. In particular, FZRM/FSRS/FSRC have only been added very recently, but they are available on Sapphire Rapids processors. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-02-27 18:44:57 +01:00
Markus Armbruster	e2c1c34f13	include/block: Untangle inclusion loops We have two inclusion loops: block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit `d7e2fe4aac`. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221221133551.3967339-2-armbru@redhat.com>	2023-01-20 07:24:28 +01:00
Markus Armbruster	d1c81c3496	qapi: Use returned bool to check for failure (again) Commit `012d4c96e2` changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit `012d4c96e2` applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-12-14 16:19:35 +01:00
Zeng Guang	19e2a9fb9d	target/i386: Set maximum APIC ID to KVM prior to vCPU creation Specify maximum possible APIC ID assigned for current VM session to KVM prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor pointer table to support Intel IPI virtualization, with the most optimal memory footprint. It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID capability once KVM has enabled it. Ignoring the return error if KVM doesn't support this capability yet. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220825025246.26618-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Markus Armbruster	0a553c12c7	Drop useless casts from g_malloc() & friends to pointer These memory allocation functions return void *, and casting to another pointer type is useless clutter. Drop these casts. If you really want another pointer type, consider g_new(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220923120025.448759-3-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-22 23:15:40 +02:00
Maciej S. Szmigiero	ec19444a53	hyperv: fix SynIC SINT assertion failure on guest reset Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure: hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed. This happens both on normal guest reboot or when using "system_reset" HMP command. The failing assertion was introduced by commit `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset. The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have chance to clean up these routes. Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in specific order let's use a similar method that is already used to reset another interrupt controller (APIC) after devices have been reset - by invoking the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too. Fixes: `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") # exposed the bug Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:04 +02:00
Alexander Graf	37656470f6	KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number of threads and cores for a given package. This is information that QEMU has readily available and can provide through the new user space MSR deflection interface. This patch propagates the existing hvf logic from patch `027ac0cb51` ("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-4-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Alexander Graf	860054d8ce	i386: kvm: Add support for MSR filtering KVM has grown support to deflect arbitrary MSRs to user space since Linux 5.10. For now we don't expect to make a lot of use of this feature, so let's expose it the easiest way possible: With up to 16 individually maskable MSRs. This patch adds a kvm_filter_msr() function that other code can call to install a hook on KVM MSR reads or writes. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-3-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Chenyi Qiang	e2e69f6bb9	i386: add notify VM exit support There are cases that malicious virtual machine can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. Notify VM exit is introduced to mitigate such kind of attacks, which will generate a VM exit if no event window occurs in VM non-root mode for a specified amount of time (notify window). A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space so that the user can query the capability and set the expected notify window when creating VMs. The format of the argument when enabling this capability is as follows: Bit 63:32 - notify window specified in qemu command Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to enable the feature.) Users can configure the feature by a new (x86 only) accel property: qemu -accel kvm,notify-vmexit=run\|internal-error\|disable,notify-window=n The default option of notify-vmexit is run, which will enable the capability and do nothing if the exit happens. The internal-error option raises a KVM internal error if it happens. The disable option does not enable the capability. The default value of notify-window is 0. It is valid only when notify-vmexit is not disabled. The valid range of notify-window is non-negative. It is even safe to set it to zero since there's an internal hardware threshold to be added to ensure no false positive. Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit qualification (no cases are anticipated that would set this bit), which means VM context is corrupted. It would be reflected in the flags of KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM internal error unconditionally. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:00 +02:00
Paolo Bonzini	3dba0a335c	kvm: allow target-specific accelerator properties Several hypervisor capabilities in KVM are target-specific. When exposed to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they should not be available for all targets. Add a hook for targets to add their own properties to -accel kvm, for now no such property is defined. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Chenyi Qiang	12f89a39cf	i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault For the direct triple faults, i.e. hardware detected and KVM morphed to VM-Exit, KVM will never lose them. But for triple faults sythesized by KVM, e.g. the RSM path, if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the event.triple_fault_pending field contains a valid state if the KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Stefan Hajnoczi	fafd35a6da	Pull request trivial patches branch 20220930-v2 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 +dyJR6TfbQWn =aK29 -----END PGP SIGNATURE----- Merge tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging Pull request trivial patches branch 20220930-v2 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA # dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF # omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY # 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T # 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp # dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe # VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ # qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael # bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K # EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s # KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW # nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 # +dyJR6TfbQWn # =aK29 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 03 Oct 2022 18:13:22 EDT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu: docs: Update TPM documentation for usage of a TPM 2 Use g_new() & friends where that makes obvious sense Drop superfluous conditionals around g_free() block/qcow2-bitmap: Add missing cast to silent GCC error checkpatch: ignore target/hexagon/imported/* files mem/cxl_type3: fix GPF DVSEC .gitignore: add .cache/ to .gitignore hw/virtio/vhost-shadow-virtqueue: Silence GCC error "maybe-uninitialized" Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-10-04 14:04:18 -04:00
Markus Armbruster	76eb88b12b	Drop superfluous conditionals around g_free() There is no need to guard g_free(P) with if (P): g_free(NULL) is safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220923090428.93529-1-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-04 00:10:11 +02:00
Ray Zhang	c4ef867f29	target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d) ``` + #define KVM_CLOCK_VALID_FLAGS \ + (KVM_CLOCK_TSC_STABLE \| KVM_CLOCK_REALTIME \| KVM_CLOCK_HOST_TSC) case KVM_CAP_ADJUST_CLOCK: - r = KVM_CLOCK_TSC_STABLE; + r = KVM_CLOCK_VALID_FLAGS; ``` kvm_has_adjust_clock_stable needs to handle additional flags, so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called. Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-01 21:16:36 +02:00
Vitaly Kuznetsov	45ed68a1a3	i386: do kvm_put_msr_feature_control() first thing when vCPU is reset kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when it is in VMX root operation. Do kvm_put_msr_feature_control() before kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also seems logical to do kvm_put_msr_feature_control() before kvm_put_nested_state() and not after it, especially when 'real' nested state is set. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3cafdb6750	i386: reset KVM nested state upon CPU reset Make sure env->nested_state is cleaned up when a vCPU is reset, it may be stale after an incoming migration, kvm_arch_put_registers() may end up failing or putting vCPU in a weird state. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3aae0854b2	i386: Hyper-V Direct TLB flush hypercall Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	aa6bb5fad5	i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	9411e8b6fa	i386: Hyper-V XMM fast hypercall input feature Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. KVM supports the feature since v5.14. Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to comply with KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	869840d26c	i386: Hyper-V Enlightened MSR bitmap feature The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	7110fe56c1	i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently as it was only used to encode the supported eVMCS version range. In fact, there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Yang Weijiang	3a7a27cffb	target/i386: Remove LBREn bit check when access Arch LBR MSRs Live migration can happen when Arch LBR LBREn bit is cleared, e.g., when migration happens after guest entered SMM mode. In this case, we still need to migrate Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220517155024.33270-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-23 10:56:01 +02:00
Richard Henderson	eec398119f	virtio,pc,pci: fixes,cleanups,features most of CXL support fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1 f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA= =xdSk -----END PGP SIGNATURE----- Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: fixes,cleanups,features most of CXL support fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi # MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv # SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77 # 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR # PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1 # f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA= # =xdSk # -----END PGP SIGNATURE----- # gpg: Signature made Mon 16 May 2022 01:48:50 PM PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (86 commits) vhost-user-scsi: avoid unlink(NULL) with fd passing virtio-net: don't handle mq request in userspace handler for vhost-vdpa vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request() vhost-vdpa: backend feature should set only once vhost-net: fix improper cleanup in vhost_net_start vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated hw/i386/amd_iommu: Fix IOMMU event log encoding errors hw/i386: Make pic a property of common x86 base machine type hw/i386: Make pit a property of common x86 base machine type include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG vhost-user: more master/slave things virtio: add vhost support for virtio devices virtio: drop name parameter for virtio_init() virtio/vhost-user: dynamically assign VhostUserHostNotifiers hw/virtio/vhost-user: don't suppress F_CONFIG when supported include/hw: start documenting the vhost API ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-16 16:31:01 -07:00
David Woodhouse	dc89f32d92	target/i386: Fix sanity check on max APIC ID / X2APIC enablement The check on x86ms->apic_id_limit in pc_machine_done() had two problems. Firstly, we need KVM to support the X2APIC API in order to allow IRQ delivery to APICs >= 255. So we need to call/check kvm_enable_x2apic(), which was done elsewhere in some cases but not all. Secondly, microvm needs the same check. So move it from pc_machine_done() to x86_cpus_init() where it will work for both. The check in kvm_cpu_instance_init() is now redundant and can be dropped. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220314142544.150555-1-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-16 04:38:39 -04:00
Yang Weijiang	12703d4e75	target/i386: Add MSR access interface for Arch LBR In the first generation of Arch LBR, the max support Arch LBR depth is 32, both host and guest use the value to set depth MSR. This can simplify the implementation of patch given the side-effect of mismatch of host/guest depth MSR: XRSTORS will reset all recording MSRs to 0s if the saved depth mismatches MSR_ARCH_LBR_DEPTH. In most of the cases Arch LBR is not in active status, so check the control bit before save/restore the big chunck of Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-7-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Yang Weijiang	5a778a5f82	target/i386: Add kvm_get_one_msr helper When try to get one msr from KVM, I found there's no such kind of existing interface while kvm_put_one_msr() is there. So here comes the patch. It'll remove redundant preparation code before finally call KVM_GET_MSRS IOCTL. No functional change intended. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-4-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Jon Doron	d8701185f4	hw: hyperv: Initial commit for Synthetic Debugging device Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-5-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	73d2407407	hyperv: Add support to process syndbg commands SynDbg commands can come from two different flows: 1. Hypercalls, in this mode the data being sent is fully encapsulated network packets. 2. SynDbg specific MSRs, in this mode only the data that needs to be transfered is passed. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-4-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	ccbdf5e81b	hyperv: Add definitions for syndbg Add all required definitions for hyperv synthetic debugger interface. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-3-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Paolo Bonzini	58f7db26f2	KVM: x86: workaround invalid CPUID[0xD,9] info on some AMD processors Some AMD processors expose the PKRU extended save state even if they do not have the related PKU feature in CPUID. Worse, when they do they report a size of 64, whereas the expected size of the PKRU extended save state is 8, therefore the esa->size == eax assertion does not hold. The state is already ignored by KVM_GET_SUPPORTED_CPUID because it was not enabled in the host XCR0. However, QEMU kvm_cpu_xsave_init() runs before QEMU invokes arch_prctl() to enable dynamically-enabled save states such as XTILEDATA, and KVM_GET_SUPPORTED_CPUID hides save states that have yet to be enabled. Therefore, kvm_cpu_xsave_init() needs to consult the host CPUID instead of KVM_GET_SUPPORTED_CPUID, and dies with an assertion failure. When setting up the ExtSaveArea array to match the host, ignore features that KVM does not report as supported. This will cause QEMU to skip the incorrect CPUID leaf instead of tripping the assertion. Closes: https://gitlab.com/qemu-project/qemu/-/issues/916 Reported-by: Daniel P. Berrangé <berrange@redhat.com> Analyzed-by: Yang Zhong <yang.zhong@intel.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 14:13:58 +01:00
luofei	cb48748af7	i386: Set MCG_STATUS_RIPV bit for mce SRAR error In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Philippe Mathieu-Daudé	dcebbb65b8	target/i386/kvm: Free xsave_buf when destroying vCPU Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Paolo Bonzini	3ec5ad4008	target/i386: kvm: do not access uninitialized variable on older kernels KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit `19db68ca68` ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-20 20:38:52 +01:00

1 2 3

123 Commits