ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
Peter Maydell	e6b2fa1b81	target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed) Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got the calculation of the inner loop terminator wrong. Although we correctly account for the element size when we calculate the terminator for the first iteration: intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n); we don't do that when we move it forward after the first inner loop completes. The intention is that we process the vector in 128-bit segments, which for a 64-bit element size should mean (1, 2), (3, 4), (5, 6), etc. This bug meant that we would iterate (1, 2), (3, 4, 5, 6), (7, 8, 9, 10) etc and apply the wrong indexed element to some of the operations, and also index off the end of the vector. You don't see this bug if the vector length is small enough that we don't need to iterate the outer loop, i.e. if it is only 128 bits, or if it is the 64-bit special case from AA32/AA64 AdvSIMD. If the vector length is 256 bits then we calculate the right results for the elements in the vector but do index off the end of the vector. Vector lengths greater than 256 bits see wrong answers. The instructions that produce 32-bit results behave correctly. Fix the recalculation of 'segend' for subsequent iterations, and restore a version of the comment that was lost in the refactor of commit `7020ffd656` that explains why we only need to clamp segend to opr_sz_n for the first iteration, not the later ones. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595 Fixes: `7020ffd656` ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org	2024-11-05 10:09:58 +00:00
Peter Maydell	0e1850182a	target/arm: Implement FPCR.EBF=1 semantics for bfdotadd() Implement the FPCR.EBF=1 semantics for bfdotadd() operations: * is_ebf() sets up fpst and fpst_odd * bfdotadd_ebf() implements the fused paired-multiply-and-add operation that we need The paired-multiply-and-add is similar to f16_dotadd() and we use the same trick here as in that function, but the inputs here are bfloat16 rather than float16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-09-05 13:12:36 +01:00
Peter Maydell	09b0d9e0ad	target/arm: Prepare bfdotadd() callers for FEAT_EBF support We use bfdotadd() in four callsites for various helper functions. Currently this all assumes that we have the FPCR.EBF=0 semantics. For FPCR.EBF=1 we will need to: * call a different routine to bfdotadd() because we need to do a fused multiply-add rather than separate multiply and add steps * use a different float_status that honours the FPCR rounding mode and denormal-flushing fields * pass in an extra float_status that has been set up to perform round-to-odd rounding To prepare for this, refactor all the callsites so that instead of for (...) { x = bfdotadd(...); } they are: float_status fpst, fpst_odd; if (is_ebf(env, &fpst, &fpst_odd)) { for (...) { x = bfdotadd_ebf(..., &fpst, &fpst_odd); } } else { for (...) { x = bfdotadd(..., &fpst); } } For the moment the is_ebf() function always returns false, sets up fpst for EBF=0 semantics and never sets up fpst_odd; bfdotadd_ebf() will assert if called. We'll fill in the handling for EBF=1 in the next commit. This change should be a zero-behaviour-change refactor. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-09-05 13:12:36 +01:00
Peter Maydell	2da2d7dc90	target/arm: Pass env pointer through to gvec_bfmmla helper Pass the env pointer through to the gvec_bfmmla helper, so we can use it to add support for FEAT_EBF16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-09-05 13:12:36 +01:00
Peter Maydell	c8d644b951	target/arm: Pass env pointer through to gvec_bfdot_idx helper Pass the env pointer through to the gvec_bfdot_idx helper, so we can use it to add support for FEAT_EBF16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-09-05 13:12:35 +01:00
Peter Maydell	75a6784dad	target/arm: Pass env pointer through to gvec_bfdot helper Pass the env pointer through to the gvec_bfdot helper, so we can use it to add support for FEAT_EBF16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-09-05 13:12:35 +01:00
Richard Henderson	f698e45270	target/arm: Convert SQRDMLAH, SQRDMLSH to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240625183536.1672454-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-07-01 15:40:52 +01:00
Richard Henderson	a5b72ccc0f	target/arm: Fix SQDMULH (by element) with Q=0 The inner loop, bounded by eltspersegment, must not be larger than the outer loop, bounded by elements. Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240625183536.1672454-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-07-01 12:48:55 +01:00
Richard Henderson	76bccf3cb9	target/arm: Fix VCMLA Dd, Dn, Dm[idx] The inner loop, bounded by eltspersegment, must not be larger than the outer loop, bounded by elements. Cc: qemu-stable@nongnu.org Fixes: `18fc240578` ("target/arm: Implement SVE fp complex multiply add (indexed)") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2376 Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240625183536.1672454-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-07-01 12:48:55 +01:00
Richard Henderson	f80701cb44	target/arm: Convert SQDMULH, SQRDMULH to decodetree These are the last instructions within disas_simd_three_reg_same and disas_simd_scalar_three_reg_same, so remove them. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20240528203044.612851-32-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-30 15:24:41 +01:00
Richard Henderson	8f6343ae18	target/arm: Convert SUQADD and USQADD to gvec Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240528203044.612851-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-30 15:24:39 +01:00
Richard Henderson	28b5451bec	target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree These are the last instructions within handle_simd_3same_pair so remove it. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-34-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	a7e4eec6fb	target/arm: Convert ADDP to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-32-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	c43a23e1aa	target/arm: Use gvec for neon faddp, fmaxp, fminp Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-31-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	a13f9fb5bf	target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree These are the last instructions within disas_simd_three_reg_same_fp16, so remove it. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-30-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	57801ca0ea	target/arm: Convert FADDP to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-29-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	43454734c4	target/arm: Convert FABD to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-27-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	4fe068fac0	target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-26-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	2d558efbf5	target/arm: Convert FMLA, FMLS to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-25-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	a1e250fc71	target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-21-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	e0300a9a5e	target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-20-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	cb1c77feef	target/arm: Convert FMULX to decodetree Convert all forms (scalar, vector, scalar indexed, vector indexed), which allows us to remove switch table entries elsewhere. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240524232121.284515-19-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-05-28 14:29:01 +01:00
Richard Henderson	a50cfdf0be	target/arm: Use clmul_64 Use generic routine for 64-bit carry-less multiply. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-09-15 13:57:00 +00:00
Richard Henderson	bae25f648e	target/arm: Use clmul_32* routines Use generic routines for 32-bit carry-less multiply. Remove our local version of pmull_d. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-09-15 13:57:00 +00:00
Richard Henderson	c6f0dcb1fd	target/arm: Use clmul_16* routines Use generic routines for 16-bit carry-less multiply. Remove our local version of pmull_w. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-09-15 13:57:00 +00:00
Richard Henderson	8e3da4c716	target/arm: Use clmul_8* routines Use generic routines for 8-bit carry-less multiply. Remove our local version of pmull_h. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-09-15 13:56:59 +00:00
Michael Tokarev	673d821541	arm: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2023-07-25 17:13:53 +03:00
Claudio Fontana	a3ef070ea9	target/arm: move helpers to tcg/ Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2023-02-27 13:27:04 +00:00

28 Commits