In x86_cpu_filter_features(), if host doesn't support AVX10, the
configured avx10_version should be marked as filtered regardless of
whether prefix is NULL or not.
Check prefix before warn_report() instead of checking for
have_filtered_features.
Cc: qemu-stable@nongnu.org
Fixes: commit bccfb846fd ("target/i386: add AVX10 feature and AVX10 version property")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cf4c263551886964c5d58bd7b675b13fd497b402)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 644e3c5d81 ("missing vmx features for Skylake-Server and Cascadelake-Server")
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 93dcc9390e5ad0696ae7e9b7b3a5b08c2d1b6de6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
SHA512, SM3, SM4 (CPUID[EAX=7,ECX=1).EAX bits 0 to 2) is supported by
Clearwater Forest processor, add it to QEMU as it does not need any
specific enablement.
See https://lore.kernel.org/kvm/20241105054825.870939-1-tao1.su@linux.intel.com/
for reference.
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cache topology needs to be defined based on CPU topology levels. Thus,
define CPU topology enumeration in qapi/machine.json to make it generic
for all architectures.
To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
socket.
Also, enumerate additional topology levels for non-i386 arches, and add
a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
with compatibility requirement of arch-specific cache topology models.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241101083331.340178-3-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the follow-up change, the CPU topology enumeration will be moved to
QAPI. And considerring "invalid" should not be exposed to QAPI as an
unsettable item, so, as a preparation for future changes, remove
"invalid" level from the current CPU topology enumeration structure
and define it by a macro instead.
Due to the removal of the enumeration of "invalid", bit 0 of
CPUX86State.avail_cpu_topo bitmap will no longer correspond to "invalid"
level, but will start at the SMT level. Therefore, to honor this change,
update the encoding rule for CPUID[0x1F].
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-ID: <20241101083331.340178-2-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Set the NaN propagation rule explicitly for the float_status words
used in the x86 target.
This is a no-behaviour-change commit, so we retain the existing
behaviour of using the x87-style "prefer QNaN over SNaN, then prefer
the NaN with the larger significand" for MMX and SSE. This is
however not the documented hardware behaviour, so we leave a TODO
note about what we should be doing instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-16-peter.maydell@linaro.org
AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
are identical to AVX512 state regardless of the supported vector lengths.
Given that some E-cores will support AVX10 but not support AVX512, add
AVX512 state components to guest when AVX10 is enabled.
Based on a patch by Tao Su <tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-8-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the highest supported vector length for a processor implies that
all lesser vector lengths are also supported, add the dependencies of
the supported vector lengths. If all vector lengths aren't supported,
clear AVX10 enable bit as well.
Note that the order of AVX10 related dependencies should be kept as:
CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
CPUID_7_1_EDX_AVX10 -> CPUID_24_0_EBX,
so that prevent user from setting weird CPUID combinations, e.g. 256-bits
and 512-bits are supported but 128-bits is not, no vector lengths are
supported but AVX10 enable bit is still set.
Since AVX10_128 will be reserved as 1, adding these dependencies has the
bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
CPUID_24_0_EBX will be disabled automatically.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-5-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241031085233.425388-7-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.
Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user. The default can come from the device model or, for the max model,
from KVM's reported value.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare for filtering non-boolean features such as AVX10 version.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea
to query the accelerator for the support status of extended save areas.
This is a problem for AVX10, which attaches two feature bits (AVX512F
and AVX10) to the same extended save states.
To keep the AVX10 hacks to the minimum, limit usage of esa->features
and esa->bits. Instead, just query the accelerator for the 0xD leaf.
Do it in common code and clear esa->size if an extended save state is
unsupported.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-3-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer AMD CPUs support ERAPS (Enhanced Return Address Prediction Security)
feature that enables the auto-clear of RSB entries on a TLB flush, context
switches and VMEXITs. The number of default RSP entries is reflected in
RapSize.
Add the feature bit and feature word to support these features.
CPUID_Fn80000021_EAX
Bits Feature Description
24 ERAPS:
Indicates support for enhanced return address predictor security.
CPUID_Fn80000021_EBX
Bits Feature Description
31-24 Reserved
23:16 RapSize:
Return Address Predictor size. RapSize x 8 is the minimum number
of CALL instructions software needs to execute to flush the RAP.
15-00 MicrocodePatchSize. Read-only.
Reports the size of the Microcode patch in 16-byte multiples.
If 0, the size of the patch is at most 5568 (15C0h) bytes.
Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/7c62371fe60af1e9bbd853f5f8e949bf2d908bd0.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf 0x80000022, i.e. ExtPerfMonAndDbg, advertises new performance
monitoring features for AMD processors. Bit 0 of EAX indicates support
for Performance Monitoring Version 2 (PerfMonV2) features. If found to
be set during PMU initialization, the EBX bits can be used to determine
the number of available counters for different PMUs. It also denotes the
availability of global control and status registers.
Add the required CPUID feature word and feature bit to allow guests to
make use of the PerfMonV2 features.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/a96f00ee2637674c63c61e9fc4dee343ea818053.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The error message for a "stepping" value that is out of bounds is a
bit odd:
$ qemu-system-x86_64 -cpu qemu64,stepping=16
qemu-system-x86_64: can't apply global qemu64-x86_64-cpu.stepping=16: Property .stepping doesn't take value 16 (minimum: 0, maximum: 15)
The "can't apply global" part is an unfortunate artifact of -cpu's
implementation. Left for another day.
The remainder feels overly verbose. Change it to
qemu64-x86_64-cpu: can't apply global qemu64-x86_64-cpu.stepping=16: parameter 'stepping' can be at most 15
Likewise for "family", "model", and "tsc-frequency".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241010150144.986655-6-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Properties "family", "model", and "stepping" are visited as signed
integers. They are backed by bits in CPUX86State member
@cpuid_version. The code to extract and insert these bits mixes
signed and unsigned. Not actually wrong, but avoiding such mixing is
good practice.
Visit them as unsigned integers instead.
This adds a few mildly ugly cast in arguments of error_setg(). The
next commit will get rid of them again.
Property "tsc-frequency" is also visited as signed integer. The value
ultimately flows into the kernel, where it is 31 bits unsigned. The
QEMU code freely mixes int, uint32_t, int64_t. I elect not to attempt
draining this swamp today.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241010150144.986655-5-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.
To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in
'kvm_hyperv_properties' array, there's nothing wrong in having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior when the flag
is silently skipped in !CONFIG_SYNDBG builds.
Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.
Fixes: d8701185f4 ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240917160051.2637594-2-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Following 5 bits in CPUID.7.2.EDX are supported by KVM. Add their
supports in QEMU. Each of them indicates certain bits of IA32_SPEC_CTRL
are supported. Those bits can control CPU speculation behavior which can
be used to defend against side-channel attacks.
bit0: intel-psfd
if 1, indicates bit 7 of the IA32_SPEC_CTRL MSR is supported. Bit 7 of
this MSR disables Fast Store Forwarding Predictor without disabling
Speculative Store Bypass
bit1: ipred-ctrl
If 1, indicates bits 3 and 4 of the IA32_SPEC_CTRL MSR are supported.
Bit 3 of this MSR enables IPRED_DIS control for CPL3. Bit 4 of this
MSR enables IPRED_DIS control for CPL0/1/2
bit2: rrsba-ctrl
If 1, indicates bits 5 and 6 of the IA32_SPEC_CTRL MSR are supported.
Bit 5 of this MSR disables RRSBA behavior for CPL3. Bit 6 of this MSR
disables RRSBA behavior for CPL0/1/2
bit3: ddpd-u
If 1, indicates bit 8 of the IA32_SPEC_CTRL MSR is supported. Bit 8 of
this MSR disables Data Dependent Prefetcher.
bit4: bhi-ctrl
if 1, indicates bit 10 of the IA32_SPEC_CTRL MSR is supported. Bit 10
of this MSR enables BHI_DIS_S behavior.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/r/20240919051011.118309-1-chao.gao@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When user sets tsc-frequency explicitly, the invtsc feature is actually
migratable because the tsc-frequency is supposed to be fixed during the
migration.
See commit d99569d9d8 ("kvm: Allow invtsc migration if tsc-khz
is set explicitly") for referrence.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20240814075431.339209-10-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- CPUID.(EAX=07H,ECX=0H):EBX[bit 6]: x87 FPU Data Pointer updated only
on x87 exceptions if 1.
- CPUID.(EAX=07H,ECX=0H):EBX[bit 13]: Deprecates FPU CS and FPU DS
values if 1. i.e., X87 FCS and FDS are always zero.
Define names for them so that they can be exposed to guest with -cpu host.
Also define the bit field MACROs so that named cpu models can add it as
well in the future.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20240814075431.339209-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Copy XML files describing orig_ax from GDB and glue them with
CPUX86State.orig_ax.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240912093012.402366-5-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to AMD's Speculative Return Stack Overflow whitepaper (link
below), the hypervisor should synthesize the value of IBPB_BRTYPE and
SBPB CPUID bits to the guest.
Support for this is already present in the kernel with commit
e47d86083c66 ("KVM: x86: Add SBPB support") and commit 6f0f23ef76be
("KVM: x86: Add IBPB_BRTYPE support").
Add support in QEMU to expose the bits to the guest OS.
host:
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
before (guest):
$ cpuid -l 0x80000021 -1 -r
0x80000021 0x00: eax=0x00000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
^
$ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Vulnerable: Safe RET, no microcode
after (guest):
$ cpuid -l 0x80000021 -1 -r
0x80000021 0x00: eax=0x18000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
^
$ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
Reported-by: Fabian Vogt <fvogt@suse.de>
Link: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240805202041.5936-1-farosas@suse.de
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add definitions of
1) VM-exit activate secondary controls bit
2) VM-entry load FRED bit
which are required to enable nested FRED.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20240807081813.735158-3-xin@zytor.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As SDM stated, CPUID 0x12 leaves depend on CPUID_7_0_EBX_SGX (SGX
feature word).
Since FEAT_SGX_12_0_EAX, FEAT_SGX_12_0_EBX and FEAT_SGX_12_1_EAX define
multiple feature words, add the dependencies of those registers to
report the warning to user if SGX is absent.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At present, cpu_x86_cpuid() silently masks off SGX_LC if SGX is absent.
This is not proper because the user is not told about the dependency
between the two.
So explicitly define the dependency between SGX_LC and SGX feature
words, so that user could get a warning when SGX_LC is enabled but
SGX is absent.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID.0x7.0.ebx and CPUID.0x7.0.ecx leaves have been expressed as the
feature word lists, and the Host capability support has been checked
in x86_cpu_filter_features().
Therefore, such checks on SGX feature "words" are redundant, and
the follow-up adjustments to those feature "words" will not actually
take effect.
Remove unnecessary SGX feature words related checks.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The feature word 'r' is a u64, and "unavail" is a u32, the operation
'r &= ~unavail' clears the high 32 bits of 'r'. This causes many vmx cases
in kvm-unit-tests to fail. Changing 'unavail' from u32 to u64 fixes this
issue.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2442
Fixes: 0b2757412c ("target/i386: drop AMD machine check bits from Intel CPUID")
Signed-off-by: Xiong Zhang <xiong.y.zhang@linux.intel.com>
Link: https://lore.kernel.org/r/20240730082927.250180-1-xiong.y.zhang@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX-VNNI-INT16 (CPUID[EAX=7,ECX=1).EDX[10]) is supported by Clearwater
Forest processor, add it to QEMU as it does not need any specific
enablement.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit d7c72735f6 ("target/i386: Add new EPYC CPU versions with updated
cache_info", 2023-05-08) ensured that AMD-defined CPU models did not
have the 'complex_indexing' bit set, but left it set in "-cpu host"
which uses the default ("legacy") cache information.
Reimplement that commit using a CPU feature, so that it can be applied
to all guests using a new machine type, independent of the CPU model.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The recent addition of the SUCCOR bit to kvm_arch_get_supported_cpuid()
causes the bit to be visible when "-cpu host" VMs are started on Intel
processors.
While this should in principle be harmless, it's not tidy and we don't
even know for sure that it doesn't cause any guest OS to take unexpected
paths. Since x86_cpu_get_supported_feature_word() can return different
different values depending on the guest, adjust it to hide the SUCCOR
bit if the guest has non-AMD vendor.
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: John Allen <john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows modifying the bits in "-cpu max"/"-cpu host" depending on
the guest CPU vendor (which, at least by default, is the host vendor in
the case of KVM).
For example, machine check architecture differs between Intel and AMD,
and bits from AMD should be dropped when configuring the guest for
an Intel model.
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: John Allen <john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
X86CPU::kvm_no_smi_migration was only used by the
pc-i440fx-2.3 machine, which got removed. Remove it
and simplify kvm_put_vcpu_events().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-23-philmd@linaro.org>
When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].
When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.
Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.
Fixes: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.
By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.
----
v2:
- Add "succor" feature word.
- Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow VMX nested-exception support to be exposed in KVM guests, thus
nested KVM guests can enumerate it.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-6-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FRED, i.e., the Intel flexible return and event delivery architecture,
defines simple new transitions that change privilege level (ring
transitions).
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
In addition to these transitions, the FRED architecture defines a new
instruction (LKGS) for managing the state of the GS segment register.
The LKGS instruction can be used by 64-bit operating systems that do
not use the new FRED transitions.
WRMSRNS is an instruction that behaves exactly like WRMSR, with the
only difference being that it is not a serializing instruction by
default. Under certain conditions, WRMSRNS may replace WRMSR to improve
performance. FRED uses it to switch RSP0 in a faster manner.
Search for the latest FRED spec in most search engines with this search
pattern:
site:intel.com FRED (flexible return and event delivery) specification
The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.
Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.
Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
map.
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-2-xin3.li@intel.com>
[Fix order of dependencies, add dependencies from LM to FRED. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SNP guests will rely on this bit to determine certain feature support.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-12-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>