Commit Graph

13723 Commits

Author SHA1 Message Date
Daniel Henrique Barboza 68e7c86927 target/riscv: prioritize pmp errors in raise_mmu_exception()
raise_mmu_exception(), as is today, is prioritizing guest page faults by
checking first if virt_enabled && !first_stage, and then considering the
regular inst/load/store faults.

There's no mention in the spec about guest page fault being a higher
priority that PMP faults. In fact, privileged spec section 3.7.1 says:

"Attempting to fetch an instruction from a PMP region that does not have
execute permissions raises an instruction access-fault exception.
Attempting to execute a load or load-reserved instruction which accesses
a physical address within a PMP region without read permissions raises a
load access-fault exception. Attempting to execute a store,
store-conditional, or AMO instruction which accesses a physical address
within a PMP region without write permissions raises a store
access-fault exception."

So, in fact, we're doing it wrong - PMP faults should always be thrown,
regardless of also being a first or second stage fault.

The way riscv_cpu_tlb_fill() and get_physical_address() work is
adequate: a TRANSLATE_PMP_FAIL error is immediately reported and
reflected in the 'pmp_violation' flag. What we need is to change
raise_mmu_exception() to prioritize it.

Reported-by: Joseph Chan <jchan@ventanamicro.com>
Fixes: 82d53adfbb ("target/riscv/cpu_helper.c: Invalid exception on MMU translation stage")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240413105929.7030-1-alexei.filippov@syntacore.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Max Chou 93cb52b7a3 target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
If the checking functions check both the single and double width
operators at the same time, then the single width operator checking
functions (require_rvf[min]) will check whether the SEW is 8.

Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-5-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Max Chou 692f33a3ab target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
The opfv_narrow_check needs to check the single width float operator by
require_rvf.

Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-4-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Max Chou 7a999d4dd7 target/riscv: rvv: Check single width operator for vector fp widen instructions
The require_scale_rvf function only checks the double width operator for
the vector floating point widen instructions, so most of the widen
checking functions need to add require_rvf for single width operator.

The vfwcvt.f.x.v and vfwcvt.f.xu.v instructions convert single width
integer to double width float, so the opfxv_widen_check function doesn’t
need require_rvf for the single width operator(integer).

Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-3-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Max Chou 17b713c080 target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
According v spec 18.4, only the vfwcvt.f.f.v and vfncvt.f.f.w
instructions will be affected by Zvfhmin extension.
And the vfwcvt.f.f.v and vfncvt.f.f.w instructions only support the
conversions of

* From 1*SEW(16/32) to 2*SEW(32/64)
* From 2*SEW(32/64) to 1*SEW(16/32)

Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240322092600.1198921-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Christoph Müllner fd53ee268d riscv: thead: Add th.sxstatus CSR emulation
The th.sxstatus CSR can be used to identify available custom extension
on T-Head CPUs. The CSR is documented here:
  https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadsxstatus.adoc

An important property of this patch is, that the th.sxstatus MAEE field
is not set (indicating that XTheadMae is not available).
XTheadMae is a memory attribute extension (similar to Svpbmt) which is
implemented in many T-Head CPUs (C906, C910, etc.) and utilizes bits
in PTEs that are marked as reserved. QEMU maintainers prefer to not
implement XTheadMae, so we need give kernels a mechanism to identify
if XTheadMae is available in a system or not. And this patch introduces
this mechanism in QEMU in a way that's compatible with real HW
(i.e., probing the th.sxstatus.MAEE bit).

Further context can be found on the list:
https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00775.html

Reviewed-by: LIU Zhiwei <zhiwe_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Message-ID: <20240429073656.2486732-1-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Huang Tao 8c8a7cd647 target/riscv: Implement dynamic establishment of custom decoder
In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
   RISCVCPU, each cpu can have their own decoder, and the decoders can be
   different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
   can be added to the dynamic_decoder when building up the decoder. Therefore,
   there is no need to run the guard_func when decoding each instruction. It can
   improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
   functions to improve decoding efficiency, especially when vendor-defined
   instruction sets increase. Because of dynamic building up, it can skip the other
   decoder guard functions when decoding.
4. Pre patch for allowing adding a vendor decoder before decode_insn32() with minimal
   overhead for users that don't need this particular vendor decoder.

Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
Suggested-by: Christoph Muellner <christoph.muellner@vrull.eu>
Co-authored-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240506023607.29544-1-eric.huang@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Yangyu Chen ff33b7a969 target/riscv/cpu.c: fix Zvkb extension config
This code has a typo that writes zvkb to zvkg, causing users can't
enable zvkb through the config. This patch gets this fixed.

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Fixes: ea61ef7097 ("target/riscv: Move vector crypto extensions to riscv_cpu_extensions")
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Reviewed-by:  Weiwei Li <liwei1518@gmail.com>
Message-ID: <tencent_7E34EEF0F90B9A68BF38BEE09EC6D4877C0A@qq.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Huang Tao 75115d880c target/riscv: Fix the element agnostic function problem
In RVV and vcrypto instructions, the masked and tail elements are set to 1s
using vext_set_elems_1s function if the vma/vta bit is set. It is the element
agnostic policy.

However, this function can't deal the big endian situation. This patch fixes
the problem by adding handling of such case.

Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240325021654.6594-1-eric.huang@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Jason Chien 4a90991234 target/riscv: Relax vector register check in RISCV gdbstub
In current implementation, the gdbstub allows reading vector registers
only if V extension is supported. However, all vector extensions and
vector crypto extensions have the vector registers and they all depend
on Zve32x. The gdbstub should check for Zve32x instead.

Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Message-ID: <20240328022343.6871-4-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Jason Chien e7dc5e160f target/riscv: Add support for Zve64x extension
Add support for Zve64x extension. Enabling Zve64f enables Zve64x and
enabling Zve64x enables Zve32x according to their dependency.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2107
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240328022343.6871-3-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Jason Chien 9fb41a4418 target/riscv: Add support for Zve32x extension
Add support for Zve32x extension and replace some checks for Zve32f with
Zve32x, since Zve32f depends on Zve32x.

Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240328022343.6871-2-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Daniel Henrique Barboza f15af01740 trans_privileged.c.inc: set (m|s)tval on ebreak breakpoint
Privileged spec section 4.1.9 mentions:

"When a trap is taken into S-mode, stval is written with
exception-specific information to assist software in handling the trap.
(...)

If stval is written with a nonzero value when a breakpoint,
address-misaligned, access-fault, or page-fault exception occurs on an
instruction fetch, load, or store, then stval will contain the faulting
virtual address."

A similar text is found for mtval in section 3.1.16.

Setting mtval/stval in this scenario is optional, but some softwares read
these regs when handling ebreaks.

Write 'badaddr' in all ebreak breakpoints to write the appropriate
'tval' during riscv_do_cpu_interrrupt().

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240416230437.1869024-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Daniel Henrique Barboza 0099f60534 target/riscv/debug: set tval=pc in breakpoint exceptions
We're not setting (s/m)tval when triggering breakpoints of type 2
(mcontrol) and 6 (mcontrol6). According to the debug spec section
5.7.12, "Match Control Type 6":

"The Privileged Spec says that breakpoint exceptions that occur on
instruction fetches, loads, or stores update the tval CSR with either
zero or the faulting virtual address. The faulting virtual address for
an mcontrol6 trigger with action = 0 is the address being accessed and
which caused that trigger to fire."

A similar text is also found in the Debug spec section 5.7.11 w.r.t.
mcontrol.

Note that what we're doing ATM is not violating the spec, but it's
simple enough to set mtval/stval and it makes life easier for any
software that relies on this info.

Given that we always use action = 0, save the faulting address for the
mcontrol and mcontrol6 trigger breakpoints into env->badaddr, which is
used as as scratch area for traps with address information. 'tval' is
then set during riscv_cpu_do_interrupt().

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-ID: <20240416230437.1869024-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Daniel Henrique Barboza 1215d45b2a target/riscv/kvm: tolerate KVM disable ext errors
Running a KVM guest using a 6.9-rc3 kernel, in a 6.8 host that has zkr
enabled, will fail with a kernel oops SIGILL right at the start. The
reason is that we can't expose zkr without implementing the SEED CSR.
Disabling zkr in the guest would be a workaround, but if the KVM doesn't
allow it we'll error out and never boot.

In hindsight this is too strict. If we keep proceeding, despite not
disabling the extension in the KVM vcpu, we'll not add the extension in
the riscv,isa. The guest kernel will be unaware of the extension, i.e.
it doesn't matter if the KVM vcpu has it enabled underneath or not. So
it's ok to keep booting in this case.

Change our current logic to not error out if we fail to disable an
extension in kvm_set_one_reg(), but show a warning and keep booting. It
is important to throw a warning because we must make the user aware that
the extension is still available in the vcpu, meaning that an
ill-behaved guest can ignore the riscv,isa settings and  use the
extension.

The case we're handling happens with an EINVAL error code. If we fail to
disable the extension in KVM for any other reason, error out.

We'll also keep erroring out when we fail to enable an extension in KVM,
since adding the extension in riscv,isa at this point will cause a guest
malfunction because the extension isn't enabled in the vcpu.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422171425.333037-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:12 +10:00
Clément Léger ba7a1c5297 target/riscv: change RISCV_EXCP_SEMIHOST exception number to 63
The current semihost exception number (16) is a reserved number (range
[16-17]). The upcoming double trap specification uses that number for
the double trap exception. Since the privileged spec (Table 22) defines
ranges for custom uses change the semihosting exception number to 63
which belongs to the range [48-63] in order to avoid any future
collisions with reserved exception.

Signed-off-by: Clément Léger <cleger@rivosinc.com>

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240422135840.1959967-1-cleger@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:11 +10:00
Daniel Henrique Barboza a6b53378f5 target/riscv/kvm: implement SBI debug console (DBCN) calls
SBI defines a Debug Console extension "DBCN" that will, in time, replace
the legacy console putchar and getchar SBI extensions.

The appeal of the DBCN extension is that it allows multiple bytes to be
read/written in the SBI console in a single SBI call.

As far as KVM goes, the DBCN calls are forwarded by an in-kernel KVM
module to userspace. But this will only happens if the KVM module
actually supports this SBI extension and we activate it.

We'll check for DBCN support during init time, checking if get-reg-list
is advertising KVM_RISCV_SBI_EXT_DBCN. In that case, we'll enable it via
kvm_set_one_reg() during kvm_arch_init_vcpu().

Finally, change kvm_riscv_handle_sbi() to handle the incoming calls for
SBI_EXT_DBCN, reading and writing as required.

A simple KVM guest with 'earlycon=sbi', running in an emulated RISC-V
host, takes around 20 seconds to boot without using DBCN. With this
patch we're taking around 14 seconds to boot due to the speed-up in the
terminal output.  There's no change in boot time if the guest isn't
using earlycon.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20240425155012.581366-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:11 +10:00
Andrew Jones b62e0ce760 target/riscv: Raise exceptions on wrs.nto
Implementing wrs.nto to always just return is consistent with the
specification, as the instruction is permitted to terminate the
stall for any reason, but it's not useful for virtualization, where
we'd like the guest to trap to the hypervisor in order to allow
scheduling of the lock holding VCPU. Change to always immediately
raise exceptions when the appropriate conditions are present,
otherwise continue to just return. Note, immediately raising
exceptions is also consistent with the specification since the
time limit that should expire prior to the exception is
implementation-specific.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240424142808.62936-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:11 +10:00
Andrew Jones 86997772fa target/riscv/kvm: Fix exposure of Zkr
The Zkr extension may only be exposed to KVM guests if the VMM
implements the SEED CSR. Use the same implementation as TCG.

Without this patch, running with a KVM which does not forward the
SEED CSR access to QEMU will result in an ILL exception being
injected into the guest (this results in Linux guests crashing on
boot). And, when running with a KVM which does forward the access,
QEMU will crash, since QEMU doesn't know what to do with the exit.

Fixes: 3108e2f1c6 ("target/riscv/kvm: update KVM exts to Linux 6.8")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-ID: <20240422134605.534207-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-06-03 11:12:11 +10:00
Peter Maydell a96edb687e target/arm: Implement FEAT WFxT and enable for '-cpu max'
FEAT_WFxT introduces new instructions WFIT and WFET, which are like
the existing WFI and WFE but allow the guest to pass a timeout value
in a register.  The instructions will wait for an interrupt/event as
usual, but will also stop waiting when the value of CNTVCT_EL0 is
greater than or equal to the specified timeout value.

We implement WFIT by setting up a timer to expire at the right
point; when the timer expires it sets the EXITTB interrupt, which
will cause the CPU to leave the halted state. If we come out of
halt for some other reason, we unset the pending timer.

We implement WFET as a nop, which is architecturally permitted and
matches the way we currently make WFE a nop.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-3-peter.maydell@linaro.org
2024-05-30 16:35:17 +01:00
Peter Maydell 408b2b3d9d accel/tcg: Make TCGCPUOps::cpu_exec_halt return bool for whether to halt
The TCGCPUOps::cpu_exec_halt method is called from cpu_handle_halt()
when the CPU is halted, so that a target CPU emulation can do
anything target-specific it needs to do.  (At the moment we only use
this on i386.)

The current specification of the method doesn't allow the target
specific code to do something different if the CPU is about to come
out of the halt state, because cpu_handle_halt() only determines this
after the method has returned.  (If the method called cpu_has_work()
itself this would introduce a potential race if an interrupt arrived
between the target's method implementation checking and
cpu_handle_halt() repeating the check.)

Change the definition of the method so that it returns a bool to
tell cpu_handle_halt() whether to stay in halt or not.

We will want this for the Arm target, where FEAT_WFxT wants to do
some work only for the case where the CPU is in halt but about to
leave it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240430140035.3889879-2-peter.maydell@linaro.org
2024-05-30 16:13:48 +01:00
Marcin Juszkiewicz daf9748ac0 target/arm: Disable SVE extensions when SVE is disabled
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2304
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240526204551.553282-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:45:21 +01:00
Richard Henderson fa31b7e168 target/arm: Convert FCSEL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-34-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 44463b96d2 target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree
These are the only instructions in the 3 source scalar class.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240528203044.612851-33-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson f80701cb44 target/arm: Convert SQDMULH, SQRDMULH to decodetree
These are the last instructions within disas_simd_three_reg_same
and disas_simd_scalar_three_reg_same, so remove them.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240528203044.612851-32-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 8f81dced5d target/arm: Tidy SQDMULH, SQRDMULH (vector)
We already have a gvec helper for the operations, but we aren't
using it on the aa32 neon side.  Create a unified expander for
use by both aa32 and aa64 translators.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-31-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 8db93dcd3d target/arm: Convert MLA, MLS to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson b73c7b1740 target/arm: Convert MUL, PMUL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 5ea1b93ef7 target/arm: Convert SABA, SABD, UABA, UABD to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-28-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 41c34bccc2 target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 93b7b9057d target/arm: Convert SRHADD, URHADD to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson 8989b95e71 target/arm: Convert SRHADD, URHADD to gvec
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:41 +01:00
Richard Henderson fdaf45d852 target/arm: Convert SHSUB, UHSUB to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 34c0d865a3 target/arm: Convert SHSUB, UHSUB to gvec
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 6ef548ed4b target/arm: Convert SHADD, UHADD to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 203aca9125 target/arm: Convert SHADD, UHADD to gvec
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 2310eb0aca target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 013506e03f target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32, i64}
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 649005fd3a target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 6c9bccf52f target/arm: Convert ADD, SUB (vector) to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 4f92fd736d target/arm: Convert SQRSHL, UQRSHL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson cef9d54f6b target/arm: Convert SQRSHL and UQRSHL (register) to gvec
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson 19b58f3048 target/arm: Convert SQSHL, UQSHL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:40 +01:00
Richard Henderson e72a687815 target/arm: Convert SQSHL and UQSHL (register) to gvec
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240528203044.612851-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 2214c9d721 target/arm: Convert SRSHL, URSHL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 940392c834 target/arm: Convert SRSHL and URSHL (register) to gvec
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240528203044.612851-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson beaa7c41b0 target/arm: Convert SSHL, USHL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 1f4dd04c23 target/arm: Convert SUQADD, USQADD to decodetree
These are faux 2-operand instructions, reading from rd.
Sort them next to the other three-operand same insns for clarity.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson aaf03a399a target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson f4fa83d614 target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB
This eliminates the last uses of these neon helpers.
Incorporate the MO_64 expanders as an option to the vector expander.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 1217edace8 target/arm: Inline scalar SUQADD and USQADD
This eliminates the last uses of these neon helpers.
Incorporate the MO_64 expanders as an option to the vector expander.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 8f6343ae18 target/arm: Convert SUQADD and USQADD to gvec
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 01d5665bc3 target/arm: Assert oprsz in range when using vfp.qc
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240528203044.612851-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:39 +01:00
Richard Henderson 76f4a8aeca target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB
No need for a full comparison; xor produces non-zero bits
for QC just fine.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240528203044.612851-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-30 15:24:38 +01:00
Richard Henderson be0fcbc462 target/s390x: Adjust check of noreturn in translate_one
If help_op is not set, ret == DISAS_NEXT.
Shift the test up from surrounding help_wout, help_cout
to skipping to out, as we do elsewhere in the function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-14-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:56 +02:00
Richard Henderson a47d08ee0d target/s390x: Simplify per_ifetch, per_check_exception
Set per_address and ilen in per_ifetch; this is valid for
all PER exceptions and will last until the end of the
instruction.  Therefore we don't need to give the same
data to per_check_exception.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-13-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 67b765d3f3 target/s390x: Fix helper_per_ifetch flags
CPU state is read on the exception path.

Fixes: 83bb161299 ("target-s390x: PER instruction-fetch nullification event support")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240502054417.234340-12-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 31b2d4a1b3 target/s390x: Raise exception from per_store_real
At this point the instruction is complete and there's nothing
left to do but raise the exception.  With this change we need
not make two helper calls for this event.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-11-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 5331339651 target/s390x: Raise exception from helper_per_branch
Drop from argument, since gbea has always been updated with
this address.  Add ilen argument for setting int_pgm_ilen.
Use update_cc_op before calling per_branch.

By raising the exception here, we need not call
per_check_exception later, which means we can clean up the
normal non-exception branch path.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-10-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 619f6891ff target/s390x: Split per_breaking_event from per_branch_*
The breaking-event-address register is updated regardless
of PER being enabled.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-9-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson e640545523 target/s390x: Simplify help_branch
Always use a tcg branch, instead of movcond.  The movcond
was not a bad idea before PER was added, but since then
we have either 2 or 3 actions to perform on each leg of
the branch, and multiple movcond is inefficient.

Reorder the taken branch to be fallthrough of the tcg branch.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-8-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 9bbbcf5ddb target/s390x: Introduce help_goto_indirect
Add a small helper to handle unconditional indirect jumps.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-7-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson a90e319569 target/s390x: Disable conditional branch-to-next for PER
For PER, we require a conditional call to helper_per_branch
for the conditional branch.  Fold the remaining optimization
into a call to helper_goto_direct, which will take care of
the remaining gbea adjustment.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240502054417.234340-6-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 62613ca073 target/s390x: Record separate PER bits in TB flags
Record successful-branching, instruction-fetching, and
store-using-real-address.  The other PER bits are not used
during translation.  Having checked these at translation time,
we can remove runtime tests from the helpers.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-5-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 51a1718b14 target/s390x: Update CR9 bits
Update from the PoO 14th edition.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-4-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson 36db37af34 target/s390x: Move cpu_get_tb_cpu_state out of line
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-3-richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:41:15 +02:00
Richard Henderson a6a33760a3 target/s390x: Do not use unwind for per_check_exception
Using exception unwind via tcg_s390_program_interrupt,
we discard the current value of psw.addr, which discards
the result of a branch.

Pass in the address of the next instruction, which may
not be sequential.  Pass in ilen, which we would have
gotten from unwind and is passed to the exception handler.
Sync cc_op before the call, which we would have gotten
from unwind.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240502054417.234340-2-richard.henderson@linaro.org>
[thuth: Silence checkpatch.pl errors]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-05-29 12:40:49 +02:00
Richard Henderson f240df3c31 target/arm: Convert disas_simd_3same_logic to decodetree
This includes AND, ORR, EOR, BIC, ORN, BSF, BIT, BIF.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-37-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 641d823142 target/arm: Convert FMLAL, FMLSL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-36-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a9240f482c target/arm: Use gvec for neon pmax, pmin
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-35-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 28b5451bec target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
These are the last instructions within handle_simd_3same_pair
so remove it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-34-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a11e54ed29 target/arm: Use gvec for neon padd
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-33-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a7e4eec6fb target/arm: Convert ADDP to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-32-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson c43a23e1aa target/arm: Use gvec for neon faddp, fmaxp, fminp
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-31-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a13f9fb5bf target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
These are the last instructions within disas_simd_three_reg_same_fp16,
so remove it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 57801ca0ea target/arm: Convert FADDP to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 13db93bce5 target/arm: Convert FRECPS, FRSQRTS to decodetree
These are the last instructions within handle_3same_float
and disas_simd_scalar_three_reg_same_fp16 so remove them.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-28-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 43454734c4 target/arm: Convert FABD to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 4fe068fac0 target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 2d558efbf5 target/arm: Convert FMLA, FMLS to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 69cefabcac target/arm: Convert FNMUL to decodetree
This is the last instruction within disas_fp_2src,
so remove that and its subroutines.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 21e885aff4 target/arm: Expand vfp neg and abs inline
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 3938f94175 target/arm: Introduce vfp_load_reg16
Load and zero-extend float16 into a TCGv_i32 before
all scalar operations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a1e250fc71 target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson e0300a9a5e target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson cb1c77feef target/arm: Convert FMULX to decodetree
Convert all forms (scalar, vector, scalar indexed, vector indexed),
which allows us to remove switch table entries elsewhere.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson d6edf915c7 target/arm: Convert Advanced SIMD copy to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson d90a473363 target/arm: Convert XAR to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 376bb8a45d target/arm: Convert Cryptographic 3-register, imm2 to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 50941556ff target/arm: Convert Cryptographic 4-register to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 54b8230107 target/arm: Convert Cryptographic 2-register SHA512 to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 7010e36676 target/arm: Convert Cryptographic 3-register SHA512 to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 66d1e1a402 target/arm: Convert Cryptographic 2-register SHA to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson c5fb9b4fad target/arm: Convert Cryptographic 3-register SHA to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 8424801eb5 target/arm: Convert Cryptographic AES to decodetree
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson a11efe30b9 target/arm: Split out gengvec64.c
Split some routines out of translate-a64.c and translate-sve.c
that are used by both.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 09a52d854a target/arm: Split out gengvec.c
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson 5d874e5da2 target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
All of these insns have "if sz == '1' then UNDEFINED" in their pseudocode.
Fixes a RISU miscompare for invalid insn 0x5ef0c87a.

Fixes: 5c36d89567 ("arm/translate-a64: add all FP16 ops in simd_scalar_pairwise")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson c0ca7ed049 target/arm: Fix decode of FMOV (hp) vs MOVI
The decode of FMOV (vector, immediate, half-precision) vs
invalid cases of MOVI are incorrect.

Fixes RISU mismatch for invalid insn 0x2f01fd31.

Fixes: 70b4e6a445 ("arm/translate-a64: add FP16 FMOV to simd_mod_imm")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson fe84877ed4 target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
Fixes RISU mismatch for "fcvtzs h31, h0, #14".

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:29:01 +01:00
Richard Henderson db36e14501 target/arm: Use PLD, PLDW, PLI not NOP for t32
This fixes a bug in that neither PLI nor PLDW are present in ARMv6T2,
but are introduced with ARMv7 and ARMv7MP respectively.
For clarity, do not use NOP for PLD.

Note that there is no PLDW (literal). Architecturally in the
T1 encoding of "PLD (literal)" bit 5 is "(0)", which means
that it should be zero and if it is not then the behaviour
is CONSTRAINED UNPREDICTABLE (might UNDEF, NOP, or ignore the
value of the bit).

In our implementation we have patterns for both:

+    PLD          1111 1000 -001 1111 1111 ------------        # (literal)
+    PLD          1111 1000 -011 1111 1111 ------------        # (literal)

and so we effectively ignore the value of bit 5.  (This is a
permitted option for this CONSTRAINED UNPREDICTABLE.) This isn't a
behaviour change in this commit, since we previously had NOP lines
for both those patterns.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-3-richard.henderson@linaro.org
[PMM: adjusted commit message to note that PLD (lit) T1 bit 5
being 1 is an UNPREDICTABLE case.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:23:52 +01:00
Zenghui Yu 19ed42e8ad hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
We wrongly encoded ID_AA64PFR1_EL1 using {3,0,0,4,2} in hvf_sreg_match[] so
we fail to get the expected ARMCPRegInfo from cp_regs hash table with the
wrong key.

Fix it with the correct encoding {3,0,0,4,1}. With that fixed, the Linux
guest can properly detect FEAT_SSBS2 on my M1 HW.

All DBG{B,W}{V,C}R_EL1 registers are also wrongly encoded with op0 == 14.
It happens to work because HVF_SYSREG(CRn, CRm, 14, op1, op2) equals to
HVF_SYSREG(CRn, CRm, 2, op1, op2), by definition. But we shouldn't rely on
it.

Cc: qemu-stable@nongnu.org
Fixes: a1477da3dd ("hvf: Add Apple Silicon support")
Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Reviewed-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20240503153453.54389-1-zenghui.yu@linux.dev
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-05-28 14:20:48 +01:00
Richard Henderson 60b54b67c6 target/i386: Introduce X86Access and use for xsave and friends
linux-user/i386: Fix allocation and alignment of fp state in signal frame
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZT2GwdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV87pQf9F/cmrKQG1mVWKmJd
 MI7l63lbxejdgAADv1nmro+oapCsJSaQeUSrYp904ydqJjVfBJkaoXfknGsvxrNA
 oW7nEuYt0sBKdaBUKhYpMOJ3ivfw7lVVMJmjNv9ngZRhW+WOoJrBHoleUkVLiM7D
 rxkMLL+LQ7BR9i0Lv1unorOkqUPGNOnEd45qRn6k1g/Qnqi8SNMzxFwO8+232u8m
 EG9un/oh4mKPyb5vSg3Y4JLg+yDKCRScBqBU1wcKFe1u+umBkv2BNcU+k62AJh1q
 bv8i1n+X/dFAd1aj0NEupi04EOZIof5m3T4YIWg7M4I94NiFWNZ18vgskkmiO+Mo
 0KPd/A==
 =sYrE
 -----END PGP SIGNATURE-----

Merge tag 'pull-lu-20240526' of https://gitlab.com/rth7680/qemu into staging

target/i386: Introduce X86Access and use for xsave and friends
linux-user/i386: Fix allocation and alignment of fp state in signal frame

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZT2GwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV87pQf9F/cmrKQG1mVWKmJd
# MI7l63lbxejdgAADv1nmro+oapCsJSaQeUSrYp904ydqJjVfBJkaoXfknGsvxrNA
# oW7nEuYt0sBKdaBUKhYpMOJ3ivfw7lVVMJmjNv9ngZRhW+WOoJrBHoleUkVLiM7D
# rxkMLL+LQ7BR9i0Lv1unorOkqUPGNOnEd45qRn6k1g/Qnqi8SNMzxFwO8+232u8m
# EG9un/oh4mKPyb5vSg3Y4JLg+yDKCRScBqBU1wcKFe1u+umBkv2BNcU+k62AJh1q
# bv8i1n+X/dFAd1aj0NEupi04EOZIof5m3T4YIWg7M4I94NiFWNZ18vgskkmiO+Mo
# 0KPd/A==
# =sYrE
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 26 May 2024 05:48:44 PM PDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]

* tag 'pull-lu-20240526' of https://gitlab.com/rth7680/qemu: (28 commits)
  target/i386: Pass host pointer and size to cpu_x86_{xsave,xrstor}
  target/i386: Pass host pointer and size to cpu_x86_{fxsave,fxrstor}
  target/i386: Pass host pointer and size to cpu_x86_{fsave,frstor}
  target/i386: Convert do_xrstor to X86Access
  target/i386: Convert do_xsave to X86Access
  linux-user/i386: Honor xfeatures in xrstor_sigcontext
  linux-user/i386: Fix allocation and alignment of fp state
  linux-user/i386: Return boolean success from xrstor_sigcontext
  linux-user/i386: Return boolean success from restore_sigcontext
  linux-user/i386: Fix -mregparm=3 for signal delivery
  linux-user/i386: Split out struct target_fregs_state
  linux-user/i386: Replace target_fpstate_fxsave with X86LegacyXSaveArea
  linux-user/i386: Remove xfeatures from target_fpstate_fxsave
  linux-user/i386: Drop xfeatures_size from sigcontext arithmetic
  target/i386: Add {hw,sw}_reserved to X86LegacyXSaveArea
  target/i386: Add rbfm argument to cpu_x86_{xsave,xrstor}
  target/i386: Split out do_xsave_chk
  target/i386: Convert do_xrstor_* to X86Access
  target/i386: Convert do_xsave_* to X86Access
  tagret/i386: Convert do_fxsave, do_fxrstor to X86Access
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 17:51:00 -07:00
Richard Henderson 701890bdd0 target/i386: Pass host pointer and size to cpu_x86_{xsave,xrstor}
We have already validated the memory region in the course of
validating the signal frame.  No need to do it again within
the helper function.

In addition, return failure when the header contains invalid
xstate_bv.  The kernel handles this via exception handling
within XSTATE_OP within xrstor_from_user_sigframe.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 15:49:58 -07:00
Richard Henderson 9c2fb9e1d5 target/i386: Pass host pointer and size to cpu_x86_{fxsave,fxrstor}
We have already validated the memory region in the course of
validating the signal frame.  No need to do it again within
the helper function.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 15:45:27 -07:00
Richard Henderson 76d8d0f85c target/i386: Pass host pointer and size to cpu_x86_{fsave,frstor}
We have already validated the memory region in the course of
validating the signal frame.  No need to do it again within
the helper function.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 15:45:27 -07:00
Richard Henderson d5dc3a927a target/i386: Convert do_xrstor to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 15:45:27 -07:00
Richard Henderson c6e6d1508a target/i386: Convert do_xsave to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 15:45:27 -07:00
Richard Henderson 6dba8b471c target/i386: Add {hw,sw}_reserved to X86LegacyXSaveArea
This completes the 512 byte structure, allowing the union to
be removed.  Assert that the structure layout is as expected.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson a2d64d61c1 target/i386: Add rbfm argument to cpu_x86_{xsave,xrstor}
For now, continue to pass all 1's from signal.c.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson a8f68831c6 target/i386: Split out do_xsave_chk
This path is not required by user-only, and can in fact
be shared between xsave and xrstor.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 58955a96d9 target/i386: Convert do_xrstor_* to X86Access
The body of do_xrstor is now fully converted.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 6b1b736bae target/i386: Convert do_xsave_* to X86Access
The body of do_xsave is now fully converted.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 6d030aab29 tagret/i386: Convert do_fxsave, do_fxrstor to X86Access
Move the alignment fault from do_* to helper_*, as it need
not apply to usage from within user-only signal handling.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson e41d2eaf17 target/i386: Convert do_xrstor_{fpu,mxcr,sse} to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson b7e6d3ad30 target/i386: Convert do_xsave_{fpu,mxcr,sse} to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 94f60f8f1c target/i386: Convert do_fsave, do_frstor to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 505e2ef744 target/i386: Convert do_fstenv to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson bc13c2dd01 target/i386: Convert do_fldenv to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 4526f58a27 target/i386: Convert helper_{fbld,fbst}_ST0 to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson d3e8b648ab target/i386: Convert do_fldt, do_fstt to X86Access
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 24f6813924 target/i386: Add tcg/access.[ch]
Provide a method to amortize page lookup across large blocks.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson 78ef97c0aa Build system and target/i386/translate.c cleanups
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZRy1gUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMTtQf/ZQskuqZyTrDhB/uVUT8oT5JNKQNS
 GbFSgDK7jDdBeU3UmoYrlx9vfFR/mH5cA88MlusUy0SjQBNo4onD725o6Vvum/LW
 DPe5ZyE34wvOasM7KXqJsD+2SttjaVjCXN4ip+E9WL5By2TWJgrk6IgTtvAhT9cd
 LWb5OEIInaq7ZiWz3EpjmGvZd0M4mxqXi5OeDvmoFyf38xElfbWZWbfhJv+H5L1X
 stivPBtUbXOzh63NL491hUYQtiAWlow8Qcnn7CYRflb6Vdd4QPK+6W8FX5KyU2eC
 bXRXloW7wjEAC9pyiVky1SCvtNg7AVFL+9kxwiGreoZfo+/IMA+NP6pGOg==
 =hpWy
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

Build system and target/i386/translate.c cleanups

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZRy1gUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMTtQf/ZQskuqZyTrDhB/uVUT8oT5JNKQNS
# GbFSgDK7jDdBeU3UmoYrlx9vfFR/mH5cA88MlusUy0SjQBNo4onD725o6Vvum/LW
# DPe5ZyE34wvOasM7KXqJsD+2SttjaVjCXN4ip+E9WL5By2TWJgrk6IgTtvAhT9cd
# LWb5OEIInaq7ZiWz3EpjmGvZd0M4mxqXi5OeDvmoFyf38xElfbWZWbfhJv+H5L1X
# stivPBtUbXOzh63NL491hUYQtiAWlow8Qcnn7CYRflb6Vdd4QPK+6W8FX5KyU2eC
# bXRXloW7wjEAC9pyiVky1SCvtNg7AVFL+9kxwiGreoZfo+/IMA+NP6pGOg==
# =hpWy
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 25 May 2024 04:28:24 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
  migration: remove unnecessary zlib dependency
  meson: do not query modules before they are processed
  tcg: include dependencies in static_library()
  meson: remove unnecessary dependency
  meson: remove unnecessary reference to libm
  target/i386: remove aflag argument of gen_lea_v_seg
  target/i386: clean up repeated string operations
  target/i386: introduce gen_lea_ss_ofs
  target/i386: use mo_stacksize more
  target/i386: inline gen_add_A0_ds_seg
  target/i386: split gen_ldst_modrm for load and store
  target/i386: reg in gen_ldst_modrm is always OR_TMP0
  target/i386: raze the gen_eob* jungle
  target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop
  target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
  target/i386: avoid calling gen_eob_syscall before tb_stop
  target/i386: document and group DISAS_* constants
  target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
  target/i386: cpu_load_eflags already sets cc_op
  target/i386: remove unnecessary gen_update_cc_op before gen_eob*
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-25 06:53:34 -07:00
Paolo Bonzini 56da568f88 target/i386: remove aflag argument of gen_lea_v_seg
It is always s->aflag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini 6605817b1a target/i386: clean up repeated string operations
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini f0e754d3ce target/i386: introduce gen_lea_ss_ofs
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini 20237d4070 target/i386: use mo_stacksize more
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini d0e31d6d37 target/i386: inline gen_add_A0_ds_seg
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini 420d60caad target/i386: split gen_ldst_modrm for load and store
The is_store argument of gen_ldst_modrm has only ever been passed
a constant.  Just split the function in two.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini b3c49e654a target/i386: reg in gen_ldst_modrm is always OR_TMP0
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes.  Now that these have been converted to the new decoder,
remove the argument.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini f5bd6a48ee target/i386: raze the gen_eob* jungle
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini ad8f2ad77e target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini 2512f786bf target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini c8494cb8b1 target/i386: avoid calling gen_eob_syscall before tb_stop
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline.  It can be
deferred to tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini 9594b59331 target/i386: document and group DISAS_* constants
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini abdcc5c8ef target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:

* it results in translations that are a tiny bit smaller

* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten

* anyway the cost is probably dwarfed by that of computing flags.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini a0625efd4d target/i386: cpu_load_eflags already sets cc_op
No need to set it again at the end of the translation block, cc_op_dirty
can be set to false.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini f6ac77eab6 target/i386: remove unnecessary gen_update_cc_op before gen_eob*
This is already handled in gen_eob().  Before adding another DISAS_*
case, remove the double calls.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini 69d7281262 target/i386: cleanup eob handling of RSM
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.

It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper.  But
let's clean it up.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini f0f0136abb target/i386: no single-step exception after MOV or POP SS
Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction
loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:27:54 +02:00
Paolo Bonzini 8225bff7c5 target/i386: disable jmp_opt if EFLAGS.RF is 1
If EFLAGS.RF is 1, special processing in gen_eob_worker() is needed and
therefore goto_tb cannot be used.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 10:00:12 +02:00
BALATON Zoltan e48fb4c590 target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot()
The ppc_hash32_pp_prot() function in mmu-hash32.c is the same as
pp_check() in mmu_common.c, merge these to remove duplicated code.
Define the common function as static lnline otherwise exporting the
function from mmu-hash32.c would stop the compiler inlining it which
results in slightly lower performance.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
[np: move ppc_hash32_pp_prot inline without changing it]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:14 +10:00
BALATON Zoltan e7baac649b target/ppc: Move out BookE and related MMU functions from mmu_common.c
Add a new mmu-booke.c file for BookE and related MMU bits from
mmu_common.c.

Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:13 +10:00
BALATON Zoltan cd1038ec1d target/ppc: Add a function to check for page protection bit
Checking if a page protection bit is set for a given access type is a
common operation. Add a function to avoid repeating the same check at
multiple places. As this relies on access type and page protection bit
values having certain relation also add an assert to ensure that this
assumption holds.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:13 +10:00
BALATON Zoltan 950251ee7b target/ppc/mmu-radix64.c: Drop a local variable
The value is only used once so no need to introduce a local variable
for it.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:12 +10:00
BALATON Zoltan e89b0629b9 target/ppc/mmu-hash32.c: Drop a local variable
In ppc_hash32_xlate() the value of need_prop is checked in two places
but precalculating it does not help because when we reach the first
check we always return and not reach the second place so the value
will only be used once. We can drop the local variable and calculate
it when needed, which makes these checks using it similar to other
places with such checks.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:12 +10:00
BALATON Zoltan 581eea5d65 target/ppc: Split off common embedded TLB init
Several 4xx CPUs and e200 share the same TLB settings enclosed in an
ifdef. Split it off in a common function to reduce code duplication
and the number of ifdefs.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:11 +10:00
BALATON Zoltan 5fd257f599 target/ppc: Remove id_tlbs flag from CPU env
This flag for split instruction/data TLBs is only set for 6xx soft TLB
MMU model and not used otherwise so no need to have a separate flag
for that.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:11 +10:00
BALATON Zoltan 306b532030 target/ppc: Move mmu_ctx_t type to mmu_common.c
Remove mmu_ctx_t definition from internal.h as this type is only used
within mmu_common.c.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:11 +10:00
BALATON Zoltan 6b9ea7f345 target/ppc: Transform ppc_jumbo_xlate() into ppc_6xx_xlate()
Now that only 6xx cases left in ppc_jumbo_xlate() we can change it
to ppc_6xx_xlate() also removing get_physical_address_wtlb().

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:10 +10:00
BALATON Zoltan 58b0132553 target/ppc: Split off 40x cases from ppc_jumbo_xlate()
Introduce ppc_40x_xlate() to split off 40x handlning leaving only 6xx
in ppc_jumbo_xlate() now.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:10 +10:00
BALATON Zoltan c29f808af5 target/ppc: Split off real mode handling from get_physical_address_wtlb()
Add ppc_real_mode_xlate() to handle real mode translation and allow
removing this case from ppc_jumbo_xlate().

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:10 +10:00
BALATON Zoltan b18489b326 target/ppc: Simplify ppc_booke_xlate() part 2
Merge the code fetch and data access cases in a common switch.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:09 +10:00
BALATON Zoltan aa20e1c8c6 target/ppc: Simplify ppc_booke_xlate() part 1
Move setting error_code that appears in every case out in front and
hoist the common fall through case for BOOKE206 as well which allows
removing the nested switches.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:08 +10:00
BALATON Zoltan ba91e5d027 target/ppc: Split off BookE handling from ppc_jumbo_xlate()
Introduce ppc_booke_xlate() to handle BookE and BookE 2.06 cases to
reduce ppc_jumbo_xlate() further.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:08 +10:00
BALATON Zoltan aa30aa7d8e target/ppc: Remove BookE from direct store handling
As BookE never returns -4 we can drop BookE from the direct store case
in ppc_jumbo_xlate().

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:08 +10:00
BALATON Zoltan e8a9c0fbff target/ppc: Don't use mmu_ctx_t in mmubooke206_get_physical_address()
mmubooke206_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:07 +10:00
BALATON Zoltan ecff3394a8 target/ppc: Don't use mmu_ctx_t in mmubooke_get_physical_address()
mmubooke_get_physical_address() only uses the raddr and prot fields
from mmu_ctx_t. Pass these directly instead of using a ctx struct.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:07 +10:00
BALATON Zoltan 5cc867a679 target/ppc: Don't use mmu_ctx_t for mmu40x_get_physical_address()
mmu40x_get_physical_address() only uses the raddr and prot fields from
mmu_ctx_t. Pass these directly instead of using a ctx struct.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:06 +10:00
BALATON Zoltan f178e4f894 target/ppc: Replace hard coded constants in ppc_jumbo_xlate()
The "2" in booke206_update_mas_tlb_miss() call corresponds to
MMU_INST_FETCH which is the value of access_type in this branch;
mmubooke206_esr() only checks for MMU_DATA_STORE and it's called from
code access so using MMU_DATA_LOAD here seems wrong so replace it with
access_type here as well that yields the same result. This also makes
these calls the same as the data access branch further down.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:06 +10:00
BALATON Zoltan 9e9ca54cdb target/ppc: Deindent ppc_jumbo_xlate()
Instead of putting a large block of code in an if, invert the
condition and return early to be able to deindent the code block.

Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:06 +10:00
BALATON Zoltan 47bededc29 target/ppc: Fix misindented qemu_log_mask() calls
Fix several qemu_log_mask() calls that are misindented.

Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:05 +10:00
BALATON Zoltan 77d9607d71 target/ppc: Inline and remove check_physical()
This function just does two assignments and and unnecessary check that
is always true so inline it in the only caller left and remove it.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:05 +10:00
BALATON Zoltan 549685161d target/ppc: Split off real mode cases in get_physical_address_wtlb()
The real mode handling is identical in the remaining switch cases.
Split off these common real mode cases into a separate conditional to
leave only the else branches in the switch that are different.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:05 +10:00
BALATON Zoltan 279fe98d0d target/ppc: Split out BookE xlate cases before checking real mode
BookE does not have real mode so split off and handle it first in
get_physical_address_wtlb() before checking for real mode for other
MMU models.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:04 +10:00
BALATON Zoltan f3f66a3157 target/ppc: Eliminate ret from mmu6xx_get_physical_address()
Return directly, which is simpler than dragging a return value through
multpile if and else blocks.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:04 +10:00
BALATON Zoltan 0af20f35d2 target/ppc: Move some debug logging in ppc6xx_tlb_check()
Move the debug logging within ppc6xx_tlb_check() from after its only
call to simplify the caller.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:43:03 +10:00
BALATON Zoltan f1418bdeb0 target/ppc: Move else branch to avoid large if block in mmu6xx_get_physical_address()
In mmu6xx_get_physical_address() we have a large if block with a two
line else branch that effectively returns. Invert the condition and
move the else there to allow deindenting the large if block to make
the flow easier to follow.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:18 +10:00
BALATON Zoltan 269d6f006b target/ppc: Introduce mmu6xx_get_physical_address()
Repurpose get_segment_6xx_tlb() to do the whole address translation
for POWERPC_MMU_SOFT_6xx MMU model by moving the BAT check there and
renaming it to match other similar functions. These are only called
once together so no need to keep these separate functions and
combining them simplifies the caller allowing further restructuring.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:18 +10:00
BALATON Zoltan cfd5c12832 target/ppc: Drop cases for unimplemented MPC8xx MMU
Drop MPC8xx cases from get_physical_address_wtlb() and ppc_jumbo_xlate().
The default case would still catch this and abort the same way and
there is still a warning about it in ppc_tlb_invalidate_all() which is
called in ppc_cpu_reset_hold() so likely we never get here but to make
sure add a case to ppc_xlate() to the same effect.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:17 +10:00
BALATON Zoltan fef517cd8a target/ppc: Simplify checking for real mode in get_physical_address_wtlb()
In get_physical_address_wtlb() the real_mode flag depends on either
the MSR[IR] or MSR[DR] bit depending on access_type. Extract just the
needed bit in a more straight forward way instead of doing unnecessary
computation.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:17 +10:00
BALATON Zoltan 750fbe3342 target/ppc: Remove unneeded local variable from booke tlb checks
In mmubooke_check_tlb() and mmubooke206_check_tlb() we can assign the
value of prot2 directly to the destination, no need to have a separate
local variable for it.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:17 +10:00
BALATON Zoltan 3f520078de target/ppc: Move calculation of a value closer to its usage in booke tlb checks
In mmubooke_check_tlb() and mmubooke206_check_tlb() prot2 is
calculated first but only used after an unrelated check that can
return before tha value is used. Move the calculation after the check,
closer to where it is used, to keep them together and avoid computing
it when not needed.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:41:15 +10:00
BALATON Zoltan 2b92822acc target/ppc: Remove unused helper_rac()
The helper_rac function is defined but not used, remove it.

Fixes: 005b69fdcc (target/ppc: Remove PowerPC 601 CPUs)
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:42 +10:00
Dr. David Alan Gilbert 41e9a098d1 target/ppc: Remove unused struct 'mmu_ctx_hash32'
I think it's use was removed by
Commit 5883d8b296 ("mmu-hash*: Don't use full ppc_hash{32,
64}_translate() path for get_phys_page_debug()")

Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:41 +10:00
Nicholas Piggin 0dfe59fe77 target/ppc: add SMT support to msgsnd broadcast
msgsnd has a broadcast mode that sends hypervisor doorbells to all
threads belonging to the same core as the target. A "subcore" mode
sends to all or one thread depending on 1LPAR mode.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:41 +10:00
Nicholas Piggin 2736432ffc target/ppc: Implement SPRC/SPRD SPRs
This implements the POWER SPRC/SPRD SPRs, and SCRATCH0-7 registers that
can be accessed via these indirect SPRs.

SCRATCH registers only provide storage, but they are used by firmware
for low level crash and progress data, so this implementation logs
writes to the registers to help with analysis.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:40 +10:00
Nicholas Piggin c9d5aedf40 target/ppc: Implement LDBAR, TTR SPRs
LDBAR, TTR are a Power-specific SPRs. These simple implementations
are enough for IBM proprietary firmware for now.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:40 +10:00
Nicholas Piggin 4d2b0ad32a target/ppc: Add SMT support to PTCR SPR
PTCR is a per-core register.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:40 +10:00
Nicholas Piggin e5c2ac9dc1 target/ppc: Add SMT support to simple SPRs
AMOR, MMCRC, HRMOR, TSCR, HMEER, RPR SPRs are per-core or per-LPAR
registers with simple (generic) implementations.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:39 +10:00
Nicholas Piggin 5fa7efe473 target/ppc: add helper to write per-LPAR SPRs
An SPR can be either per-thread, per-core, or per-LPAR. Per-LPAR means
per-thread or per-core, depending on 1LPAR mode.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:39 +10:00
Nicholas Piggin 1cbcbcb8d6 target/ppc: Add PPR32 SPR
PPR32 provides access to the upper half of PPR.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:39 +10:00
Nicholas Piggin e89294b27e target/ppc: BookE DECAR SPR is 32-bit
The DECAR SPR is 32-bits width.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:38 +10:00
Nicholas Piggin 45693f94dd target/ppc: Implement attn instruction on BookS 64-bit processors
attn is an implementation-specific instruction that on POWER (and G5/
970) can be enabled with a HID bit (disabled = illegal), and executing
it causes the host processor to stop and the service processor to be
notified. Generally used for debugging.

Implement attn and make it checkstop the system, which should be good
enough for QEMU debugging.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:38 +10:00
Nicholas Piggin 9728fb5c22 target/ppc: improve checkstop logging
Change the logging not to print to stderr as well, because a
checkstop is a guest error (or perhaps a simulated machine error)
rather than a QEMU error, so send it to the log.

Update the checkstop message, and log CPU registers too.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:38 +10:00
Nicholas Piggin cce7aee8dd target/ppc: Make checkstop actually stop the system
checkstop state does not halt the system, interrupts continue to be
serviced, and other CPUs run. Make it stop the machine with
qemu_system_guest_panicked.

Reviewed-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:38 +10:00
Nicholas Piggin c10c6ce032 target/ppc: Remove redundant MEMOP_GET_SIZE macro
There is a memop_size() function for this.

Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:37 +10:00
Nicholas Piggin 21cfc36a6c target/ppc: larx/stcx generation need only apply DEF_MEMOP() once
Use DEF_MEMOP() consistently in larx and stcx. generation, and apply it
once when it's used rather than where the macros are expanded, to reduce
typing.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:34:32 +10:00
Glenn Miles dabd6d3c3a target/ppc: Add migration support for BHRB
Adds migration support for Branch History Rolling
Buffer (BHRB) internal state.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:33:44 +10:00
Glenn Miles 6bfcf1dc23 target/ppc: Add clrbhrb and mfbhrbe instructions
Add support for the clrbhrb and mfbhrbe instructions.

Since neither instruction is believed to be critical to
performance, both instructions were implemented using helper
functions.

Access to both instructions is controlled by bits in the
HFSCR (for privileged state) and MMCR0 (for problem state).
A new function, helper_mmcr0_facility_check, was added for
checking MMCR0[BHRBA] and raising a facility_unavailable exception
if required.

NOTE: For P8 and P9, due to a performance issue, branch history will
not be kept, but the instructions will be allowed to execute
as normal with the exception that the mfbhrbe instruction will
always return a zero value.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:33:32 +10:00
Glenn Miles 4de4a4705f target/ppc: Add recording of taken branches to BHRB
This commit continues adding support for the Branch History
Rolling Buffer (BHRB) as is provided starting with the P8
processor and continuing with its successors.  This commit
is limited to the recording and filtering of taken branches.

The following changes were made:

  - Enabled functionality on P10 processors only due to
    performance impact seen with P8 and P9 where it is not
    disabled for non problem state branches.
  - Added a BHRB buffer for storing branch instruction and
    target addresses for taken branches
  - Renamed gen_update_cfar to gen_update_branch_history and
    added a 'target' parameter to hold the branch target
    address and 'inst_type' parameter to use for filtering
  - Added TCG code to gen_update_branch_history that stores
    data to the BHRB and updates the BHRB offset.
  - Added BHRB resource initialization and reset functions

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 09:33:06 +10:00
Glenn Miles a7138e28a2 target/ppc: Add new hflags to support BHRB
This commit is preparatory to the addition of Branch History
Rolling Buffer (BHRB) functionality, which is being provided
today starting with the P8 processor.

BHRB uses several SPR register fields to control whether or not
a branch instruction's address (and sometimes target address)
should be recorded.  Checking each of these fields with each
branch instruction using jitted code would lead to a significant
decrease in performance.

Therefore, it was decided that BHRB configuration bits that are
not expected to change frequently should have their state summarized
in an hflag so that the amount of checking done by jitted code can
be reduced.

This commit contains the changes for summarizing the state of the
following register fields in the HFLAGS_BHRB_ENABLE hflag:

	MMCR0[FCP] - Determines if BHRB recording is frozen in the
                     problem state

	MMCR0[FCPC] - A modifier for MMCR0[FCP]

	MMCRA[BHRBRD] - Disables all BHRB recording for a thread

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 687a30ad3c target/ppc: Move VMX integer max/min instructions to decodetree.
Moving the following instructions to decodetree specification :

	v{max, min}{u, s}{b, h, w, d}	: VX-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 664eb39ec9 target/ppc: Move VMX integer logical instructions to decodetree.
Moving the following instructions to decodetree specification:

	v{and, andc, nand, or, orc, nor, xor, eqv}	: VX-form

The changes were verified by validating that the tcp ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 21b5f5464f target/ppc: Move VMX storage access instructions to decodetree
Moving the following instructions to decodetree specification :

	{l,st}ve{b,h,w}x,
	{l,st}v{x,xl},
	lvs{l,r}		: X-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 948e257c48 target/ppc: Move logical fixed-point instructions to decodetree.
Moving the below instructions to decodetree specification :

	andi[s]., {ori, xori}[s]			: D-form

	{and, andc, nand, or, orc, nor, xor, eqv}[.],
	exts{b, h, w}[.],  cnt{l, t}z{w, d}[.],
	popcnt{b, w, d},  prty{w, d}, cmp, bpermd	: X-form

With this patch, all the fixed-point logical instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath ae556c6a49 target/ppc: Move cmp{rb, eqb}, tw[i], td[i], isel instructions to decodetree.
Moving the following instructions to decodetree specification :

	cmp{rb, eqb}, t{w, d}	: X-form
	t{w, d}i		: D-form
	isel			: A-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also for CMPRB, following review comments :
Replaced repetition of arithmetic right shifting (tcg_gen_shri_i32) followed
by extraction of last 8 bits (tcg_gen_ext8u_i32) with extraction of the required
bits using offsets (tcg_gen_extract_i32).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath f424bc10eb target/ppc: Move div/mod fixed-point insns (64 bits operands) to decodetree.
Moving the below instructions to decodetree specification :

	divd[u, e, eu][o][.]	: XO-form
	mod{sd, ud}		: X-form

With this patch, all the fixed-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is
now used to divide doubleword operands as well, and not just words.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 703e88f723 target/ppc: Move multiply fixed-point insns (64-bit operands) to decodetree.
Moving the following instructions to decodetree :

	mul{ld, ldo, hd, hdu}[.]	: XO-form
	madd{hd, hdu, ld}		: VA-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op'
flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath a81b5c1867 target/ppc: Move neg, darn, mod{sw, uw} to decodetree.
Moving the below instructions to decodetree specification :

	neg[o][.]       	: XO-form
	mod{sw, uw}, darn	: X-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 2871921d85 target/ppc: Move divw[u, e, eu] instructions to decodetree.
Moving the following instructions to decodetree specification :
	 divw[u, e, eu][o][.] 	: XO-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 86e6202a57 target/ppc: Make divw[u] handler method decodetree compatible.
The handler methods for divw[u] instructions internally use Rc(ctx->opcode),
for extraction of Rc field of instructions, which poses a problem if we move
the above said instructions to decodetree, as the ctx->opcode field is not
popluated in decodetree. Hence, making it decodetree compatible, so that the
mentioned insns can be safely move to decodetree specs.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath a1faff873a target/ppc: Move mul{li, lw, lwo, hw, hwu} instructions to decodetree.
Moving the following instructions to decodetree specification :
	mulli                   	: D-form
	mul{lw, lwo, hw, hwu}[.]	: XO-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.
Also cleaned up code for mullw[o][.] as per review comments while
keeping the logic of the tcg ops generated semantically same.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 177fcc06dc target/ppc: Move floating-point arithmetic instructions to decodetree.
This patch moves the below instructions to decodetree specification :

    f{add, sub, mul, div, re, rsqrte, madd, msub, nmadd, nmsub}[s][.] : A-form
    ft{div, sqrt}                                                     : X-form

With this patch, all the floating-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Chinmay Rath 5747926fec target/ppc: Merge various fpu helpers
This patch merges the definitions of the following set of fpu helper methods,
which are similar, using macros :

1. f{add, sub, mul, div}(s)
2. fre(s)
3. frsqrte(s)

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Nicholas Piggin b3cfa2dd2b target/ppc: Add ISA v3.1 variants of sync instruction
POWER10 adds a new field to sync for store-store syncs, and some
new variants of the existing syncs that include persistent memory.

Implement the store-store syncs and plwsync/phwsync.

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Nicholas Piggin ab4f174bae target/ppc: Fix embedded memory barriers
Memory barriers are supposed to do something on BookE systems, these
were probably just missed during MTTCG enablement, maybe no targets
support SMP. Either way, add proper BookE implementations.

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Nicholas Piggin 13f5086783 target/ppc: Move sync instructions to decodetree
This tries to faithfully reproduce the odd BookE logic. Note the
e206 check in gen_msync_4xx() is always false, so not carried over.

It does change the handling of non-zero reserved bits outside the
defined fields from being illegal to being ignored, which the
architecture specifies ot help with backward compatibility of new
fields. The existing behaviour causes illegal instruction exceptions
when using new POWER10 sync variants that add new fields, after this
the instructions are accepted and are implemented as supersets of
the new behaviour, as intended.

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Nicholas Piggin 82676f1fc4 target/ppc: Fix broadcast tlbie synchronisation
With mttcg, broadcast tlbie instructions do not wait until other vCPUs
have been kicked out of TCG execution before they complete (including
necessary subsequent tlbsync, etc., instructions). This is contrary to
the ISA, and it permits other vCPUs to use translations after the TLB
flush. For example:

   CPU0
   // *memP is initially 0, memV maps to memP with *pte
   *pte = 0;
   ptesync ; tlbie ; eieio ; tlbsync ; ptesync
   *memP = 1;

   CPU1
   assert(*memV == 0);

It is possible for the assertion to fail because CPU1 translates memV
using the TLB after CPU0 has stored 1 to the underlying memory. This
race was observed with a careful test case where CPU1 checks run in a
very large expensive TB so it can run for the entire CPU0 period between
clearing the pte and storing the memory, but host vCPU thread preemption
could cause the race to hit anywhere.

As explained in commit 4ddc104689 ("target/ppc: Fix tlbie"), it is not
enough to just use tlb_flush_all_cpus_synced(), because that does not
execute until the calling CPU has finished its TB. It is also required
that the TB is ended at the point where the TLB flush must subsequently
take effect.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Nicholas Piggin c700b5e162 spapr: avoid overhead of finding vhyp class in critical operations
PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:

   12.01%  [.] g_str_hash
    8.94%  [.] g_hash_table_lookup
    8.06%  [.] object_class_dynamic_cast
    6.21%  [.] address_space_ldq
    4.94%  [.] __strcmp_avx2
    4.28%  [.] tlb_set_page_full
    4.08%  [.] address_space_translate_internal
    3.17%  [.] object_class_dynamic_cast_assert
    2.84%  [.] ppc_radix64_xlate

Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.

Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Richard Henderson 7b68a5fe2f * hw/i386/pc_sysfw: Alias rather than copy isa-bios region
* target/i386: add control bits support for LAM
 * target/i386: tweaks to new translator
 * target/i386: add support for LAM in CPUID enumeration
 * hw/i386/pc: Support smp.modules for x86 PC machine
 * target-i386: hyper-v: Correct kvm_hv_handle_exit return value
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZOMlAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTSwf8DOPgipepNcsxUQoV9nOBfNXqEWa6
 DilQGwuu/3eMSPITUCGKVrtLR5azwCwvNfYYErVBPVIhjImnk3XHwfKpH1csadgq
 7Np8WGjAyKEIP/yC/K1VwsanFHv3hmC6jfcO3ZnsnlmbHsRINbvU9uMlFuiQkKJG
 lP/dSUcTVhwLT6eFr9DVDUnq4Nh7j3saY85pZUoDclobpeRLaEAYrawha1/0uQpc
 g7MZYsxT3sg9PIHlM+flpRvJNPz/ZDBdj4raN1xo4q0ET0KRLni6oEOVs5GpTY1R
 t4O8a/IYkxeI15K9U7i0HwYI2wVwKZbHgp9XPMYVZFJdKBGT8bnF56pV9A==
 =lp7q
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* hw/i386/pc_sysfw: Alias rather than copy isa-bios region
* target/i386: add control bits support for LAM
* target/i386: tweaks to new translator
* target/i386: add support for LAM in CPUID enumeration
* hw/i386/pc: Support smp.modules for x86 PC machine
* target-i386: hyper-v: Correct kvm_hv_handle_exit return value

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZOMlAUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTSwf8DOPgipepNcsxUQoV9nOBfNXqEWa6
# DilQGwuu/3eMSPITUCGKVrtLR5azwCwvNfYYErVBPVIhjImnk3XHwfKpH1csadgq
# 7Np8WGjAyKEIP/yC/K1VwsanFHv3hmC6jfcO3ZnsnlmbHsRINbvU9uMlFuiQkKJG
# lP/dSUcTVhwLT6eFr9DVDUnq4Nh7j3saY85pZUoDclobpeRLaEAYrawha1/0uQpc
# g7MZYsxT3sg9PIHlM+flpRvJNPz/ZDBdj4raN1xo4q0ET0KRLni6oEOVs5GpTY1R
# t4O8a/IYkxeI15K9U7i0HwYI2wVwKZbHgp9XPMYVZFJdKBGT8bnF56pV9A==
# =lp7q
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 10:58:40 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
  target-i386: hyper-v: Correct kvm_hv_handle_exit return value
  i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[4]
  i386: Add cache topology info in CPUCacheInfo
  hw/i386/pc: Support smp.modules for x86 PC machine
  tests: Add test case of APIC ID for module level parsing
  i386/cpu: Introduce module-id to X86CPU
  i386: Support module_id in X86CPUTopoIDs
  i386: Expose module level in CPUID[0x1F]
  i386: Support modules_per_die in X86CPUTopoInfo
  i386: Introduce module level cpu topology to CPUX86State
  i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level
  i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
  i386/cpu: Introduce bitmap to cache available CPU topology levels
  i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]
  i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  target/i386: add control bits support for LAM
  target/i386: add support for LAM in CPUID enumeration
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-23 08:14:03 -07:00
Bibo Mao f83434f3dc target/loongarch: Add loongarch vector property unconditionally
Currently LSX/LASX vector property is decided by the default value.
Instead vector property should be added unconditionally, and it is
irrelative with its default value. If vector is disabled by default,
vector also can be enabled from command line.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240521080549.434197-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-05-23 09:30:41 +08:00
Song Gao 07c0866103 target/loongarch/kvm: fpu save the vreg registers high 192bit
On kvm side, get_fpu/set_fpu save the vreg registers high 192bits,
but QEMU missing.

Cc: qemu-stable@nongnu.org
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240514110752.989572-1-gaosong@loongson.cn>
2024-05-23 09:30:41 +08:00
Song Gao 0eb285c362 target/loongarch/kvm: Fix VM recovery from disk failures
vmstate does not save kvm_state_conter,
which can cause VM recovery from disk to fail.

Cc: qemu-stable@nongnu.org
Signed-off-by: Song Gao <gaosong@loongson.cn>
Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20240508024732.3127792-1-gaosong@loongson.cn>
2024-05-23 09:30:41 +08:00
donsheng 84d4b72854 target-i386: hyper-v: Correct kvm_hv_handle_exit return value
This bug fix addresses the incorrect return value of kvm_hv_handle_exit for
KVM_EXIT_HYPERV_SYNIC, which should be EXCP_INTERRUPT.

Handling of KVM_EXIT_HYPERV_SYNIC in QEMU needs to be synchronous.
This means that async_synic_update should run in the current QEMU vCPU
thread before returning to KVM, returning EXCP_INTERRUPT to guarantee this.
Returning 0 can cause async_synic_update to run asynchronously.

One problem (kvm-unit-tests's hyperv_synic test fails with timeout error)
caused by this bug:

When a guest VM writes to the HV_X64_MSR_SCONTROL MSR to enable Hyper-V SynIC,
a VM exit is triggered and processed by the kvm_hv_handle_exit function of the
QEMU vCPU. This function then calls the async_synic_update function to set
synic->sctl_enabled to true. A true value of synic->sctl_enabled is required
before creating SINT routes using the hyperv_sint_route_new() function.

If kvm_hv_handle_exit returns 0 for KVM_EXIT_HYPERV_SYNIC, the current QEMU
vCPU thread may return to KVM and enter the guest VM before running
async_synic_update. In such case, the hyperv_synic test’s subsequent call to
synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, ...) immediately after writing to
HV_X64_MSR_SCONTROL can cause QEMU’s hyperv_sint_route_new() function to return
prematurely (because synic->sctl_enabled is false).

If the SINT route is not created successfully, the SINT interrupt will not be
fired, resulting in a timeout error in the hyperv_synic test.

Fixes: 267e071bd6 (“hyperv: make overlay pages for SynIC”)
Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Message-ID: <20240521200114.11588-1-dongsheng.x.zhang@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:56:28 +02:00
Zhao Liu 5eb608a13b i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache.

The number of logical processors sharing this cache is
NumSharingCache + 1.

After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[0x8000001D].EAX[bits 25:14].

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-22-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:56:27 +02:00
Zhao Liu f602eb925a i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[4]
CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.

After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[4].EAX[bits 25:14].

And since with the helper max_processor_ids_for_cache(), the filed
CPUID[4].EAX[bits 25:14] (original virable "num_apic_ids") is parsed
based on cpu topology levels, which are verified when parsing -smp, it's
no need to check this value by "assert(num_apic_ids > 0)" again, so
remove this assert().

Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-21-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:56:27 +02:00
Zhao Liu 9fcba76ab9 i386: Add cache topology info in CPUCacheInfo
Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.

Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.

Except legacy_l2_cache_cpuid2 (its default topo level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. In order to be compatible with the existing
cache topology, set the CPU_TOPO_LEVEL_CORE level for the i/d cache, set
the CPU_TOPO_LEVEL_CORE level for L2 cache, and set the
CPU_TOPO_LEVEL_DIE level for L3 cache.

The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-20-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 588208346f i386/cpu: Introduce module-id to X86CPU
Introduce module-id to be consistent with the module-id field in
CpuInstanceProperties.

Following the legacy smp check rules, also add the module_id validity
into x86_cpu_pre_plug().

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-17-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 5304873acd i386: Expose module level in CPUID[0x1F]
Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
erroneous smp_num_siblings on Intel Hybrid platforms") is able to
handle platforms with Module level enumerated via CPUID.1F.

Expose the module level in CPUID[0x1F] if the machine has more than 1
modules.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 3568adc995 i386: Support modules_per_die in X86CPUTopoInfo
Support module level in i386 cpu topology structure "X86CPUTopoInfo".

Since x86 does not yet support the "modules" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1.

Therefore, the module level width in APIC ID, which can be calculated by
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0 for
now, so we can directly add APIC ID related helpers to support module
level parsing.

In addition, update topology structure in test-x86-topo.c.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-14-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 81c392ab5c i386: Introduce module level cpu topology to CPUX86State
Intel CPUs implement module level on hybrid client products (e.g.,
ADL-N, MTL, etc) and E-core server products.

A module contains a set of cores that share certain resources (in
current products, the resource usually includes L2 cache, as well as
module scoped features and MSRs).

Module level support is the prerequisite for L2 cache topology on
module level. With module level, we can implement the Guest's CPU
topology and future cache topology to be consistent with the Host's on
Intel hybrid client/E-core server platforms.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 822bce9f58 i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level
At present, the subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.

In fact, the specific topology level exposed in 0x1F depends on the
platform's support for extension levels (module, tile and die).

To help expose "module" level in 0x1F, decouple CPUID[0x1F] subleaf
with specific topology level.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-12-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 0f6ed7ba13 i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
CPUID[0xB] defines SMT, Core and Invalid types, and this leaf is shared
by Intel and AMD CPUs.

But for extended topology levels, Intel CPU (in CPUID[0x1F]) and AMD CPU
(in CPUID[0x80000026]) have the different definitions with different
enumeration values.

Though CPUID[0x80000026] hasn't been implemented in QEMU, to avoid
possible misunderstanding, split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB] and introduce CPUID[0x1F]-specific topology
types.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-11-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 6ddeb0ec8c i386/cpu: Introduce bitmap to cache available CPU topology levels
Currently, QEMU checks the specify number of topology domains to detect
if there's extended topology levels (e.g., checking nr_dies).

With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) could be easier to detect without touching the
topology details.

This is also in preparation for the follow-up to decouple CPUID[0x1F]
subleaf with specific topology level.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 2613747a79 i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
In cpu_x86_cpuid(), there are many variables in representing the cpu
topology, e.g., topo_info, cs->nr_cores and cs->nr_threads.

Since the names of cs->nr_cores and cs->nr_threads do not accurately
represent its meaning, the use of cs->nr_cores or cs->nr_threads is
prone to confusion and mistakes.

And the structure X86CPUTopoInfo names its members clearly, thus the
variable "topo_info" should be preferred.

In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
replace env->dies with topo_info.dies_per_pkg as well.

Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-9-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 9a085c4b4a i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
The commit 8f4202fb10 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.

From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

From the description above, the calculation of this field should be same
as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
APIC ID to calculate this field.

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
     Information

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-8-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu 88dd4ca06c i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]
Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
nearest power-of-2 integer.

The nearest power-of-2 integer can be calculated by pow2ceil() or by
using APIC ID offset/width (like L3 topology using 1 << die_offset [3]).

But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with APIC ID. For example, in linux kernel, the field
"num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
matched with actual core numbers and it's calculated by:
"(1 << (pkg_offset - core_offset)) - 1".

Therefore the topology information of APIC ID should be preferred to
calculate nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. d/i cache is shared in a core, 1 << core_offset should be used
   instead of "cs->nr_threads" in encode_cache_cpuid4() for
   CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. L2 cache is supposed to be shared in a core as for now, thereby
   1 << core_offset should also be used instead of "cs->nr_threads" in
   encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
   calculated with the bit width between the package and SMT levels in
   the APIC ID (1 << (pkg_offset - core_offset) - 1).

In addition, use APIC ID bits calculations to replace "pow2ceil()" for
cache_info_passthrough case.

[1]: efb3934adf ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a8 ("i386: Update new x86_apicid parsing rules with die_offset support")

Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-7-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:26 +02:00
Zhao Liu 12f6b8280f i386/cpu: Fix i/d-cache topology to core level for Intel CPU
For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs
sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits
25:14]) to 0, and this means i-cache and d-cache are shared in the SMT
level.

This is correct if there's single thread per core, but is wrong for the
hyper threading case (one core contains multiple threads) since the
i-cache and d-cache are shared in the core level other than SMT level.

For AMD CPU, commit 8f4202fb10 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced i/d cache
topology as core level by default.

Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, we should set i-cache and d-cache be shared
at the core level by default.

This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not specifically specify the
topology or bind the vCPU. However, the way to achieve optimal
performance should be to create a reasonable topology and set the
appropriate vCPU affinity without relying on QEMU's default topology
structure.

Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-6-zhao1.liu@intel.com>
[Add compat property. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:39:33 +02:00
Binbin Wu 0117067131 target/i386: add control bits support for LAM
LAM uses CR3[61] and CR3[62] to configure/enable LAM on user pointers.
LAM uses CR4[28] to configure/enable LAM on supervisor pointers.

For CR3 LAM bits, no additional handling needed:
- TCG
  LAM is not supported for TCG of target-i386.  helper_write_crN() and
  helper_vmrun() check max physical address bits before calling
  cpu_x86_update_cr3(), no change needed, i.e. CR3 LAM bits are not allowed
  to be set in TCG.
- gdbstub
  x86_cpu_gdb_write_register() will call cpu_x86_update_cr3() to update cr3.
  Allow gdb to set the LAM bit(s) to CR3, if vcpu doesn't support LAM,
  KVM_SET_SREGS will fail as other reserved bits.

For CR4 LAM bit, its reservation depends on vcpu supporting LAM feature or
not.
- TCG
  LAM is not supported for TCG of target-i386.  helper_write_crN() and
  helper_vmrun() check CR4 reserved bit before calling cpu_x86_update_cr4(),
  i.e. CR4 LAM bit is not allowed to be set in TCG.
- gdbstub
  x86_cpu_gdb_write_register() will call cpu_x86_update_cr4() to update cr4.
  Mask out LAM bit on CR4 if vcpu doesn't support LAM.
- x86_cpu_reset_hold() doesn't need special handling.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-3-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Robert Hoo ba67809059 target/i386: add support for LAM in CPUID enumeration
Linear Address Masking (LAM) is a new Intel CPU feature, which allows
software to use of the untranslated address bits for metadata.

The bit definition:
CPUID.(EAX=7,ECX=1):EAX[26]

Add CPUID definition for LAM.

Note LAM feature is not supported for TCG of target-i386, LAM CPIUD bit
will not be added to TCG_7_1_EAX_FEATURES.

More info can be found in Intel ISE Chapter "LINEAR ADDRESS MASKING(LAM)"
https://cdrdv2.intel.com/v1/dl/getContent/671368

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-2-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Paolo Bonzini ec56891984 target/i386: clean up AAM/AAD
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX].  Clean them up so that the table correctly includes AX
as a 16-bit input and output.

No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Paolo Bonzini d0414d71f6 target/i386: generate simpler code for ROL/ROR with immediate count
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero.  However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable.  Just pass can_be_zero as a separate argument.

gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Richard Henderson 922582ace2 target/hppa:
- Use TCG_COND_TST where applicable.
   - Use CF_BP_PAGE instead of a local breakpoint search.
   - Clean up IAOQ handling during translation.
   - Implement CF_PCREL.
   - Implement PSW.B.
   - Implement PSW.X.
   - Log cpu state on interrupt and rfi.
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
 obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
 WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
 TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
 NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
 SDxS8Q==
 =A4Qf
 -----END PGP SIGNATURE-----

Merge tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu into staging

target/hppa:
  - Use TCG_COND_TST where applicable.
  - Use CF_BP_PAGE instead of a local breakpoint search.
  - Clean up IAOQ handling during translation.
  - Implement CF_PCREL.
  - Implement PSW.B.
  - Implement PSW.X.
  - Log cpu state on interrupt and rfi.

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
# 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
# obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
# WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
# TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
# NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
# SDxS8Q==
# =A4Qf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 11:38:04 AM CEST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]

* tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu: (43 commits)
  target/hppa: Log cpu state on return-from-interrupt
  target/hppa: Log cpu state at interrupt
  target/hppa: Implement CF_PCREL
  target/hppa: Adjust priv for B,GATE at runtime
  target/hppa: Drop tlb_entry return from hppa_get_physical_address
  target/hppa: Implement PSW_X
  target/hppa: Implement PSW_B
  target/hppa: Manage PSW_X and PSW_B in translator
  target/hppa: Split PSW X and B into their own field
  target/hppa: Improve hppa_cpu_dump_state
  target/hppa: Do not mask in copy_iaoq_entry
  target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
  linux-user/hppa: Force all code addresses to PRIV_USER
  target/hppa: Use delay_excp for conditional trap on overflow
  target/hppa: Use delay_excp for conditional traps
  target/hppa: Introduce DisasDelayException
  target/hppa: Remove cond_free
  target/hppa: Use TCG_COND_TST* in trans_ftest
  target/hppa: Use registerfields.h for FPSR
  target/hppa: Use TCG_COND_TST* in trans_bb_imm
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 11:46:58 +02:00
Richard Henderson 9e035f0078 target/hppa: Log cpu state on return-from-interrupt
Inverse of the logging on taking an interrupt.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:45 +02:00
Richard Henderson 12959fcdcf target/hppa: Log cpu state at interrupt
This contains all of the information logged before, plus more.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:45 +02:00
Richard Henderson 6dd9b145f6 target/hppa: Implement CF_PCREL
Now that the groundwork has been laid, enabling CF_PCREL within the
translator proper is a simple matter of updating copy_iaoq_entry
and install_iaq_entries.

We also need to modify the unwind info, since we no longer have
absolute addresses to install.

As expected, this reduces the runtime overhead of compilation when
running a Linux kernel with address space randomization enabled.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:45 +02:00
Richard Henderson 804cd52d3a target/hppa: Adjust priv for B,GATE at runtime
Do not compile in the priv change based on the first translation;
look up the PTE at execution time.  This is required for CF_PCREL,
where a page may be mapped multiple times with different attributes.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:45 +02:00
Richard Henderson 190d7fa572 target/hppa: Drop tlb_entry return from hppa_get_physical_address
The return-by-reference is never used.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson d8bc138125 target/hppa: Implement PSW_X
Use PAGE_WRITE_INV to temporarily enable write permission
on for a given page, driven by PSW_X being set.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 5ae8adbb01 target/hppa: Implement PSW_B
PSW_B causes B,GATE to trap as an illegal instruction, removing our
previous sequential execution test that was merely an approximation.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson d27fe7c3af target/hppa: Manage PSW_X and PSW_B in translator
PSW_X is cleared after every instruction, and only set by RFI.
PSW_B is cleared after every non-branch, or branch not taken,
and only set by taken branches.  We can clear both bits with a
single store, at most once per TB.  Taken branches set PSW_B,
at most once per TB.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson ebc9401a40 target/hppa: Split PSW X and B into their own field
Generally, both of these bits are cleared at the end of each
instruction.  By separating these, we will be able to clear
both with a single insn, instead of 2 or 3.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson d2e22fde14 target/hppa: Improve hppa_cpu_dump_state
Print both raw IAQ_Front and IAQ_Back as well as the GVAs.
Print control registers in system mode.
Print floating point registers if CPU_DUMP_FPU.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 081a0ed188 target/hppa: Do not mask in copy_iaoq_entry
As with loads and stores, code offsets are kept intact until the
full gva is formed.  In qemu, this is in cpu_get_tb_cpu_state.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 9dfcd24349 target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
In preparation for CF_PCREL. store the iaoq_f in 3 parts: high
bits in cs_base, middle bits in pc, and low bits in priv.
For iaoq_b, set a bit for either of space or page differing,
else the page offset.

Install iaq entries before goto_tb. The change to not record
the full direct branch difference in TB means that we have to
store at least iaoq_b before goto_tb.  But since a later change
to enable CF_PCREL will require both iaoq_f and iaoq_b to be
updated before goto_tb, go ahead and update both fields now.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 3c13b0ffe7 linux-user/hppa: Force all code addresses to PRIV_USER
The kernel does this along the return path to user mode.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson a0ea4becca target/hppa: Use delay_excp for conditional trap on overflow
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 269ca0a9cc target/hppa: Use delay_excp for conditional traps
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00
Richard Henderson 806030074b target/hppa: Introduce DisasDelayException
Allow an exception to be emitted at the end of the TranslationBlock,
leaving only the conditional branch inline.  Use it for simple
exception instructions like break, which happen to be nullified.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 10:03:44 +02:00