xemu/target/arm/tcg
Peter Maydell 55f9f4ee01 target/arm: Handle denormals correctly for FMOPA (widening)
The FMOPA (widening) SME instruction takes pairs of half-precision
floating point values, widens them to single-precision, does a
two-way dot product and accumulates the results into a
single-precision destination.  We don't quite correctly handle the
FPCR bits FZ and FZ16 which control flushing of denormal inputs and
outputs.  This is because at the moment we pass a single float_status
value to the helper function, which then uses that configuration for
all the fp operations it does.  However, because the inputs to this
operation are float16 and the outputs are float32 we need to use the
fp_status_f16 for the float16 input widening but the normal fp_status
for everything else.  Otherwise we will apply the flushing control
FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and
incorrectly flush a denormal output to zero when we should not (or
vice-versa).

(In commit 207d30b5fd we tried to fix the FZ handling but
didn't get it right, switching from "use FPCR.FZ for everything" to
"use FPCR.FZ16 for everything".)

Pass the CPU env to the sme_fmopa_h helper instead of an fp_status
pointer, and have the helper pass an extra fp_status into the
f16_dotadd() function so that we can use the right status for the
right parts of this operation.
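
The split between the two status values can be illustrated with a small
standalone sketch. It is not the QEMU helper: fp_ctl, widen_f16() and
f16_dotadd_sketch() are hypothetical names, the struct models only the two
flush controls, and rounding modes, exception flags and NaN propagation are
ignored. The point it tries to show is that the float16 inputs are widened
under the FZ16-derived control, while the float32 result is flushed (or not)
under the FZ-derived control.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a float_status: only the flush controls. */
typedef struct {
    bool flush_inputs_to_zero;  /* flush denormal inputs before use */
    bool flush_to_zero;         /* flush denormal results to zero */
} fp_ctl;

/* Widen an IEEE half-precision bit pattern to float under s_f16. */
static float widen_f16(uint16_t h, const fp_ctl *s_f16)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp = (h >> 10) & 0x1f;
    uint32_t frac = h & 0x3ff;
    uint32_t bits;
    float r;

    if (exp == 0) {
        /* Zero or denormal half: the value is frac * 2^-24. */
        if (s_f16->flush_inputs_to_zero) {
            frac = 0;
        }
        r = (float)frac * 0x1p-24f;
        return sign ? -r : r;
    }
    if (exp == 0x1f) {
        bits = sign | 0x7f800000u | (frac << 13);        /* Inf or NaN */
    } else {
        bits = sign | ((exp + 112) << 23) | (frac << 13); /* rebias 15 -> 127 */
    }
    memcpy(&r, &bits, sizeof(r));
    return r;
}

/* Flush a denormal float32 result to a same-signed zero if s says so. */
static float flush_output(float f, const fp_ctl *s)
{
    float mag = f < 0 ? -f : f;

    if (s->flush_to_zero && mag != 0.0f && mag < FLT_MIN) {
        return f < 0 ? -0.0f : 0.0f;
    }
    return f;
}

/*
 * Two-way dot product of half-precision pairs accumulated into float32.
 * s_f16 governs only the input widening; s_std governs the result.
 */
static float f16_dotadd_sketch(float sum,
                               uint16_t a0, uint16_t a1,
                               uint16_t b0, uint16_t b1,
                               const fp_ctl *s_f16, const fp_ctl *s_std)
{
    assert(s_f16 && s_std);
    float p = widen_f16(a0, s_f16) * widen_f16(b0, s_f16)
            + widen_f16(a1, s_f16) * widen_f16(b1, s_f16);
    return flush_output(sum + p, s_std);
}
```

In this sketch, passing the same fp_ctl for both parameters reproduces the
kind of behaviour described above: with FZ16 set and FZ clear, a denormal
float32 accumulator value would be flushed to zero even though FPCR.FZ says
it should be kept.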

Cc: qemu-stable@nongnu.org
Fixes: 207d30b5fd ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-08-01 10:15:03 +01:00
a32-uncond.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
a32.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
a64.decode target/arm: Fix handling of LDAPR/STLR with negative offset 2024-07-18 13:49:28 +01:00
arm_ldst.h target/arm: Move translate-a32.h, arm_ldst.h, sve_ldst_internal.h to tcg/ 2023-05-12 15:43:36 +01:00
cpu-v7m.c target/arm: Set arm_v7m_tcg_ops cpu_exec_halt to arm_cpu_exec_halt() 2024-07-11 11:41:34 +01:00
cpu32.c target/arm: Enable FEAT_Debugv8p8 for -cpu max 2024-07-01 15:40:53 +01:00
cpu64.c target/arm: Enable FEAT_Debugv8p8 for -cpu max 2024-07-01 15:40:53 +01:00
crypto_helper.c crypto: Create sm4_subword 2023-09-11 11:45:55 +10:00
gengvec.c target/arm: Tidy SQDMULH, SQRDMULH (vector) 2024-05-30 15:24:41 +01:00
gengvec64.c target/arm: Inline scalar SUQADD and USQADD 2024-05-30 15:24:39 +01:00
helper-a64.c target/arm: Use set/clear_helper_retaddr in helper-a64.c 2024-07-23 10:56:04 +10:00
helper-a64.h target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree 2024-05-28 14:29:01 +01:00
helper-mve.h target/arm: Move helper-{a64,mve,sme,sve}.h to tcg/ 2023-05-12 15:43:37 +01:00
helper-sme.h target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-01 10:15:03 +01:00
helper-sve.h target/arm: Move helper-{a64,mve,sme,sve}.h to tcg/ 2023-05-12 15:43:37 +01:00
hflags.c target/arm: Restrict translation disabled alignment check to VMSA 2024-04-30 15:01:07 +01:00
iwmmxt_helper.c target/arm: move helpers to tcg/ 2023-02-27 13:27:04 +00:00
m-nocp.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
m_helper.c exec/cpu: Extract page-protection definitions to page-protection.h 2024-05-06 11:17:15 +02:00
meson.build target/arm: Split out gengvec64.c 2024-05-28 14:29:01 +01:00
mte_helper.c target/arm: Make some MTE helpers widely available 2024-07-05 12:35:11 +01:00
mte_helper.h target/arm: Make some MTE helpers widely available 2024-07-05 12:35:11 +01:00
mve.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
mve_helper.c target/arm: Rename FPCR_ QC, NZCV macros to FPSR_ 2024-07-11 11:41:33 +01:00
neon-dp.decode target/arm: Convert SQRSHL and UQRSHL (register) to gvec 2024-05-30 15:24:40 +01:00
neon-ls.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
neon-shared.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
neon_helper.c target/arm: Convert SRHADD, URHADD to gvec 2024-05-30 15:24:41 +01:00
op_helper.c target/arm: Implement FEAT WFxT and enable for '-cpu max' 2024-05-30 16:35:17 +01:00
pauth_helper.c target/arm: Move feature test functions to their own header 2023-10-27 11:44:32 +01:00
psci.c target/arm: Expose arm_cpu_mp_affinity() in 'multiprocessing.h' header 2024-01-26 11:30:48 +00:00
sme-fa64.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
sme.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
sme_helper.c target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-01 10:15:03 +01:00
sve.decode target/arm: Demultiplex AESE and AESMC 2023-07-08 07:30:18 +01:00
sve_helper.c target/arm: Use set/clear_helper_retaddr in SVE and SME helpers 2024-07-23 10:56:04 +10:00
sve_ldst_internal.h target/arm: Move translate-a32.h, arm_ldst.h, sve_ldst_internal.h to tcg/ 2023-05-12 15:43:36 +01:00
t16.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
t32.decode target/arm: Use PLD, PLDW, PLI not NOP for t32 2024-05-28 14:23:52 +01:00
tlb_helper.c target/arm: Split out arm_env_mmu_index 2024-02-03 08:52:25 +10:00
translate-a32.h target/arm: Implement store_cpu_field_low32() macro 2024-07-11 11:41:33 +01:00
translate-a64.c target/arm: LDAPR should honour SCTLR_ELx.nAA 2024-07-18 13:49:28 +01:00
translate-a64.h target/arm: Inline scalar SUQADD and USQADD 2024-05-30 15:24:39 +01:00
translate-m-nocp.c target/arm: Rename FPCR_ QC, NZCV macros to FPSR_ 2024-07-11 11:41:33 +01:00
translate-mve.c tcg: Rename cpu_env to tcg_env 2023-10-03 08:01:02 -07:00
translate-neon.c target/arm: Tidy SQDMULH, SQRDMULH (vector) 2024-05-30 15:24:41 +01:00
translate-sme.c target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-01 10:15:03 +01:00
translate-sve.c target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl() 2024-07-29 16:56:46 +01:00
translate-vfp.c target/arm: Rename FPCR_ QC, NZCV macros to FPSR_ 2024-07-11 11:41:33 +01:00
translate.c target/arm: Split out gengvec.c 2024-05-28 14:29:01 +01:00
translate.h target/arm: Store FPSR and FPCR in separate CPU state fields 2024-07-11 11:41:33 +01:00
vec_helper.c target/arm: Convert SQRDMLAH, SQRDMLSH to decodetree 2024-07-01 15:40:52 +01:00
vec_internal.h target/arm: Use clmul_16* routines 2023-09-15 13:57:00 +00:00
vfp-uncond.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00
vfp.decode target/arm: move translate modules to tcg/ 2023-02-27 13:27:04 +00:00