Commit Graph

Rajnesh Kanwal 74112400df target/riscv: More accurately model priv mode filtering.
When programmable counters are configured to count instructions or cycles,
we often end up with the counter not incrementing at all from the kernel's
perspective.

For example:
- The kernel configures hpmcounter3 to count instructions, sets it to
  -10000, and inhibits all modes except U mode.
- In QEMU we configure a timer to expire after ~10000 instructions.
- The problem is that the kernel often doesn't schedule a U-mode task
  before the timer callback fires in QEMU.
- In the timer callback we inject the interrupt into the kernel; the
  kernel runs the handler and reads the hpmcounter3 value.
- Since QEMU maintains an individual counter for each privilege mode,
  and U mode never ran, the U-mode counter didn't increment, so QEMU
  returns the same value the kernel programmed when starting the
  counter.
- The kernel checks for overflow using the previous and current values
  of the counter and, seeing no overflow in the counter value,
  reprograms the counter. (This is itself a problem: QEMU tells the
  kernel that counter3 overflowed, but the counter value returned by
  QEMU doesn't reflect that.)

This change makes sure the timer is reprogrammed from the handler if
the counter didn't overflow according to its value.

Second, this change makes sure that whenever the counter is read, its
value is updated to reflect the latest count.
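
A minimal self-contained sketch of the two fixes (types and helpers are
illustrative, not the actual QEMU symbols):

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        uint64_t value;      /* guest-visible count */
        uint64_t last_ticks; /* host ticks at the last sync */
        bool overflowed;
    } PMUCounter;

    /* On every CSR read, fold in the elapsed ticks so the guest
     * never sees a stale value. */
    static uint64_t counter_read(PMUCounter *c, uint64_t now_ticks)
    {
        c->value += now_ticks - c->last_ticks;
        c->last_ticks = now_ticks;
        return c->value;
    }

    /* Timer callback: only report an overflow if the value really
     * wrapped; otherwise re-arm the timer for the remaining count. */
    static void timer_cb(PMUCounter *c, uint64_t now_ticks)
    {
        uint64_t prev = c->value;
        uint64_t cur = counter_read(c, now_ticks);

        if (cur < prev) {
            c->overflowed = true; /* and inject the overflow IRQ */
        } else {
            /* rearm_timer(c, UINT64_MAX - cur + 1); -- hypothetical */
        }
    }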

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-11-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:45 +10:00
Rajnesh Kanwal 22c721c34c target/riscv: Start counters from both mhpmcounter and mcountinhibit
Currently we start the timer counter only from the write_mhpmcounter
path, without checking the mcountinhibit bit. This change adds the
mcountinhibit check and also programs the counter from
write_mcountinhibit.

When a counter is stopped via mcountinhibit we simply update its value
based on the current host ticks and save it for future reads.

We don't need to disable the running timer, as pmu_timer_trigger_irq
will discard the interrupt if the counter has been inhibited.
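
A sketch of the write_mcountinhibit idea, reusing the illustrative
PMUCounter model from above (helper names hypothetical):

    static void write_mcountinhibit(uint32_t *inhibit, PMUCounter *ctrs,
                                    int nctrs, uint32_t val,
                                    uint64_t now_ticks)
    {
        uint32_t changed = *inhibit ^ val;

        for (int i = 0; i < nctrs; i++) {
            if (!(changed & (1u << i))) {
                continue;
            }
            if (val & (1u << i)) {
                /* Newly inhibited: sync the value now and save it
                 * for future reads. */
                counter_read(&ctrs[i], now_ticks);
            } else {
                /* Newly enabled: reset the baseline and program the
                 * overflow timer, as write_mhpmcounter already does. */
                ctrs[i].last_ticks = now_ticks;
                /* pmu_setup_timer(&ctrs[i]); -- hypothetical */
            }
        }
        *inhibit = val;
    }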

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-10-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:45 +10:00
Atish Patra 8cff74c26d target/riscv: Enforce WARL behavior for scounteren/hcounteren
scounteren/hcounteren are also WARL registers, similar to mcounteren.
Only set the bits for the available counters during the write to
preserve the WARL behavior.
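
The WARL behavior amounts to masking the written value; a one-line
sketch (mask name illustrative):

    /* Only counters that actually exist can be enabled; the other
     * bits are dropped and read back as zero. */
    static void write_scounteren(uint32_t *scounteren, uint32_t val,
                                 uint32_t present_ctr_mask)
    {
        *scounteren = val & present_ctr_mask;
    }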

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-9-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Atish Patra 46023470e0 target/riscv: Save counter values during countinhibit update
Currently, if a counter monitoring cycle/instret is stopped via
mcountinhibit, we just update the state, and the value is only saved
at the next read. This is not accurate, as the read may happen many
cycles after the counter is stopped. Ideally, the read should return
the value saved when the counter was stopped.

Thus, save the value of the counter during the inhibit update
operation and return that value during the read if the corresponding
bit in mcountinhibit is set.

Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-8-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Atish Patra b2d7a7c7e4 target/riscv: Implement privilege mode filtering for cycle/instret
Privilege mode filtering can also be emulated for cycle/instret by
tracking host_ticks/icount during each privilege mode switch. This
patch implements that for both cycle/instret and mhpmcounters. The
former requires Smcntrpmf while the latter requires Sscofpmf to be
enabled.

The cycle/instret counters are still computed using host ticks when
icount is not enabled. Otherwise, they are computed using raw icount,
which is more accurate in icount mode.
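
A sketch of the underlying bookkeeping, with hypothetical names: on each
privilege-mode switch, the ticks since the previous switch are charged
to the mode being left, and a filtered counter then sums only the modes
it is not inhibited for.

    /* Charge elapsed ticks to the outgoing privilege mode. */
    static void pmu_priv_switch(uint64_t mode_ticks[/* NUM_MODES */],
                                int old_mode, uint64_t *last_switch,
                                uint64_t now_ticks)
    {
        mode_ticks[old_mode] += now_ticks - *last_switch;
        *last_switch = now_ticks;
    }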

Co-Developed-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-7-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Atish Patra 3b31b7baff target/riscv: Only set INH fields if priv mode is available
Currently, the INH fields are set in mhpmevent unconditionally,
without checking whether a particular priv mode is supported.

Suggested-by: Alistair Francis <alistair23@gmail.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-6-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Kaiwen Xue b54a84c15e target/riscv: Add cycle & instret privilege mode filtering support
QEMU only calculates dummy cycles and instructions, so there is no
actual means to stop the icount in QEMU. Hence this patch merely adds
the functionality of accessing the cfg registers, and causes no actual
effect on the counting of the cycle and instret counters.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-5-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Kaiwen Xue 6d1e3893cf target/riscv: Add cycle & instret privilege mode filtering definitions
This adds the definitions for ISA extension smcntrpmf.

Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-4-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Kaiwen Xue 251dccc09a target/riscv: Add cycle & instret privilege mode filtering properties
This adds the properties for ISA extension smcntrpmf. Patches
implementing it will follow.

Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-3-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Atish Patra be470e5977 target/riscv: Fix the predicate functions for mhpmeventhX CSRs
mhpmeventhX CSRs are available only on RV32. The predicate function
should check that first before checking for the sscofpmf extension.

Fixes: 1466448345 ("target/riscv: Add sscofpmf extension support")
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20240711-smcntrpmf_v7-v8-2-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Rajnesh Kanwal 68c05fb530 target/riscv: Combine set_mode and set_virt functions.
Combine the riscv_cpu_set_virt_enabled() and riscv_cpu_set_mode()
functions to make complete mode-change information available through
a single function.

This makes it easy to differentiate between HS->VS, VS->HS and VS->VS
transitions when executing state-update code. For example, one use
case that inspired this change is updating the mode-specific
instruction and cycle counters, which requires knowing both the
previous and the current mode.
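
A sketch of the combined entry point (signature assumed from this
series; body illustrative):

    void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv,
                            bool virt_en)
    {
        /* Both (env->priv, env->virt_enabled) and (newpriv, virt_en)
         * are visible here, so HS->VS, VS->HS and VS->VS transitions
         * can be told apart in one place, e.g. to update the
         * mode-specific cycle/instret counters. */
        env->priv = newpriv;
        env->virt_enabled = virt_en;
    }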

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240711-smcntrpmf_v7-v8-1-b7c38ae7b263@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Daniel Henrique Barboza 3cb9f20499 target/riscv/kvm: update KVM regs to Linux 6.10-rc5
Two new regs added: ztso and zacas.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709085431.455541-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
Jiayi Li 910c18a917 target/riscv: Validate the mode in write_vstvec
Based on the riscv-privileged spec, vstvec substitutes for the usual
stvec. Therefore, the encoding of the MODE field should also be
restricted to 0 and 1.
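
A sketch of the restricted write, mirroring how stvec is handled
(WARL: invalid MODE values leave the register unchanged):

    static void write_vstvec(uint64_t *vstvec, uint64_t val)
    {
        if ((val & 3) < 2) { /* only Direct (0) and Vectored (1) */
            *vstvec = val;
        }
    }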

Signed-off-by: Jiayi Li <lijiayi@eswincomputing.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-ID: <20240701022553.1982-1-lijiayi@eswincomputing.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:08:44 +10:00
LIU Zhiwei 8aebaa2591 target/riscv: Expose zabha extension as a cpu property
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-11-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei d34e406602 target/riscv: Add amocas.[b|h] for Zabha
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-10-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei 8d07887bcb target/riscv: Move gen_cmpxchg before adding amocas.[b|h]
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-9-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei be4a8db7f3 target/riscv: Add AMO instructions for Zabha
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-8-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei 24da9cbaca target/riscv: Move gen_amo before implement Zabha
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-7-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei a60ce58fd9 target/riscv: Support Zama16b extension
Zama16b is the property that misaligned loads/stores/atomics within
a naturally aligned 16-byte region are atomic.

According to the specification, Zama16b applies only to AMOs, loads
and stores defined in the base ISAs, and loads and stores of no more
than XLEN bits defined in the F, D, and Q extensions. Thus it should
not apply to zacas or RVC instructions.

For an instruction in that set, if all accessed bytes lie within a
naturally aligned 16-byte granule, the instruction will not raise an
exception for reasons of address alignment, and it will give rise to
only one memory operation for the purposes of RVWMO, i.e. it will
execute atomically.
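
In TCG terms this maps naturally onto the MO_ATOM_WITHIN16 memop flag;
a hedged fragment of what the load/store emitters might do (the
ext_zama16b config field is assumed):

    /* Inside the translator, when building the MemOp for a
     * qualifying load/store/AMO: */
    if (ctx->cfg_ptr->ext_zama16b) {
        memop |= MO_ATOM_WITHIN16; /* atomic within a 16-byte granule */
    }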

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240709113652.1239-6-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei 197e4d2988 target/riscv: Add zcmop extension
Zcmop defines eight 16-bit MOP instructions named C.MOP.n, where n is
an odd integer between 1 and 15, inclusive. C.MOP.n is encoded in
the reserved encoding space corresponding to C.LUI xn, 0.

Unlike the MOPs defined in the Zimop extension, the C.MOP.n instructions
are defined to not write any register.

In the current implementation, C.MOP.n only performs a check, without
any further behavior.

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Message-ID: <20240709113652.1239-4-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
LIU Zhiwei 6eab278d38 target/riscv: Add zimop extension
The Zimop extension defines an encoding space for 40 MOPs: 32 MOP
instructions named MOP.R.n, where n is an integer between 0 and 31,
inclusive, and 8 MOP instructions named MOP.RR.n, where n is an
integer between 0 and 7.

These 40 MOPs initially are defined to simply write zero to x[rd],
but are designed to be redefined by later extensions to perform some
other action.
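
A sketch of what such a translation stub can look like (REQUIRE_ZIMOP
and the argument struct are assumed from the decodetree plumbing):

    static bool trans_mop_r_n(DisasContext *ctx, arg_mop_r_n *a)
    {
        REQUIRE_ZIMOP(ctx); /* illegal instruction if Zimop is absent */
        gen_set_gpr(ctx, a->rd, tcg_constant_tl(0)); /* rd = 0 for now */
        return true;
    }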

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Message-ID: <20240709113652.1239-2-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-07-18 12:00:42 +10:00
Richard Henderson 58ee924b97 * target/i386/tcg: fixes for seg_helper.c
* SEV: Don't allow automatic fallback to legacy KVM_SEV_INIT,
   but also don't use it by default
 * scsi: honor bootindex again for legacy drives
 * hpet, utils, scsi, build, cpu: miscellaneous bugfixes

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  target/i386/tcg: save current task state before loading new one
  target/i386/tcg: use X86Access for TSS access
  target/i386/tcg: check for correct busy state before switching to a new task
  target/i386/tcg: Compute MMU index once
  target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl
  target/i386/tcg: Reorg push/pop within seg_helper.c
  target/i386/tcg: use PUSHL/PUSHW for error code
  target/i386/tcg: Allow IRET from user mode to user mode with SMAP
  target/i386/tcg: Remove SEG_ADDL
  target/i386/tcg: fix POP to memory in long mode
  hpet: fix HPET_TN_SETVAL for high 32-bits of the comparator
  hpet: fix clamping of period
  docs: Update description of 'user=username' for '-run-with'
  qemu/timer: Add host ticks function for LoongArch
  scsi: fix regression and honor bootindex again for legacy drives
  hw/scsi/lsi53c895a: bump instruction limit in scripts processing to fix regression
  disas: Fix build against Capstone v6
  cpu: Free queued CPU work
  Revert "qemu-char: do not operate on sources from finalize callbacks"
  i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-17 15:40:28 +10:00
Peter Maydell de680286b5 accel/tcg: Make cpu_exec_interrupt hook mandatory
The TCGCPUOps::cpu_exec_interrupt hook is currently not mandatory; if
it is left NULL then we treat it as if it had returned false. However,
since pretty much every architecture needs to handle interrupts,
almost every target we have provides the hook. The one exception is
Tricore, which doesn't currently implement architectural interrupt
handling.

Add a "do nothing" implementation of cpu_exec_interrupt for Tricore,
assert on startup that the CPU does provide the hook, and remove
the runtime NULL check before calling it.
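
The "do nothing" hook is a few lines; a sketch matching the described
behavior (returning false means no interrupt was handled, exactly as
the old NULL fallback did):

    static bool tricore_cpu_exec_interrupt(CPUState *cs,
                                           int interrupt_request)
    {
        /* Architectural interrupt handling is not implemented yet. */
        return false;
    }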

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240712113949.4146855-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-16 20:04:08 +02:00
Paolo Bonzini 6a079f2e68 target/i386/tcg: save current task state before loading new one
This is how the steps are ordered in the manual.  EFLAGS.NT is
overwritten after the fact in the saved image.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:25 +02:00
Paolo Bonzini 8b13106508 target/i386/tcg: use X86Access for TSS access
This takes care of probing the vaddr range in advance, and is also faster
because it avoids repeated TLB lookups.  It also matches the Intel manual
better, as it says "Checks that the current (old) TSS, new TSS, and all
segment descriptors used in the task switch are paged into system memory";
note however that it's not clear how the processor checks for segment
descriptors, and this check is not included in the AMD manual.
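
A hedged sketch of the pattern (helper names as in target/i386/tcg's
access API; the offset is illustrative): probe the whole TSS range
once, then perform the individual accesses without further faults or
TLB walks.

    X86Access old_tss;

    access_prepare(&old_tss, env, old_tss_base, old_tss_limit + 1,
                   MMU_DATA_WRITE, mmu_index, retaddr);
    /* Subsequent reads/writes through old_tss cannot fault. */
    uint32_t old_eip = access_ldl(&old_tss, old_tss_base + 0x20);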

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:25 +02:00
Paolo Bonzini 05d41bbcb3 target/i386/tcg: check for correct busy state before switching to a new task
This step is listed in the Intel manual: "Checks that the new task is available
(call, jump, exception, or interrupt) or busy (IRET return)".

The AMD manual lists the same operation under the "Preventing recursion"
paragraph of "12.3.4 Nesting Tasks", though it is not clear if the processor
checks the busy bit in the IRET case.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini 8053862af9 target/i386/tcg: Compute MMU index once
Add the MMU index to the StackAccess struct, so that it can be cached
or (in the next patch) computed from information that is not in
CPUX86State.

Co-developed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson fffe424b38 target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl
Disconnect mmu index computation from the current pl
as stored in env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-2-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson 059368bcf5 target/i386/tcg: Reorg push/pop within seg_helper.c
Interrupts and call gates should use accesses with the DPL as
the privilege level.  While computing the applicable MMU index
is easy, the harder thing is how to plumb it in the code.

One possibility could be to add a single argument to the PUSH* macros
for the privilege level, but this is repetitive and risks confusion
between the involved privilege levels.

Another possibility is to pass both CPL and DPL, and adjusting both
PUSH* and POP* to use specific privilege levels (instead of using
cpu_{ld,st}*_data). This makes the code more symmetric.

However, a more complicated but much nicer approach is to use a structure
to contain the stack parameters, env, unwind return address, and rewrite
the macros into functions.  The struct provides an easy home for the MMU
index as well.
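
An illustrative reconstruction of the struct and one of the rewritten
helpers (field set approximate):

    typedef struct StackAccess {
        CPUX86State *env;
        uintptr_t ra;          /* unwind return address */
        target_ulong ss_base;
        target_ulong sp;
        target_ulong sp_mask;
        int mmu_index;         /* computed once, reused per access */
    } StackAccess;

    static void pushl(StackAccess *sa, uint32_t val)
    {
        sa->sp -= 4;
        cpu_stl_mmuidx_ra(sa->env,
                          sa->ss_base + (sa->sp & sa->sp_mask),
                          val, sa->mmu_index, sa->ra);
    }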

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini 312ef3243e target/i386/tcg: use PUSHL/PUSHW for error code
Do not pre-decrement esp, let the macros subtract the appropriate
operand size.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini 0bd385e7e3 target/i386/tcg: Allow IRET from user mode to user mode with SMAP
This fixes a bug wherein i386/tcg assumed an interrupt return using
the IRET instruction was always returning from kernel mode to either
kernel mode or user mode. This assumption is violated when IRET is used
as a clever way to restore thread state, as for example in the dotnet
runtime. There, IRET returns from user mode to user mode.

The bug is that stack accesses from IRET and RETF, as well as
accesses to the parameters in a call gate, are normal data accesses
using the current CPL. This manifested itself as a page fault in the
guest Linux kernel due to SMAP preventing the access.

This bug appears to have been in QEMU since the beginning.

Analyzed-by: Robert R. Henry <rrh.henry@gmail.com>
Co-developed-by: Robert R. Henry <rrh.henry@gmail.com>
Signed-off-by: Robert R. Henry <rrh.henry@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson a7cf494993 target/i386/tcg: Remove SEG_ADDL
This truncation is now handled by MMU_*32_IDX.  The introduction of
MMU_*32_IDX in fact applied correct 32-bit wraparound to 16-bit accesses
with a high segment base (e.g.  big real mode or vm86 mode), which did
not use SEG_ADDL.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-3-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini 3afc6539a8 target/i386/tcg: fix POP to memory in long mode
In long mode, POP to memory will write a full 64-bit value.  However,
the call to gen_writeback() in gen_POP will use MO_32 because the
decoding table is incorrect.

The bug was latent until commit aea49fbb01 ("target/i386: use gen_writeback()
within gen_POP()", 2024-06-08), and then became visible because gen_op_st_v
now receives op->ot instead of the "ot" returned by gen_pop_T0.

Analyzed-by: Clément Chigot <chigot@adacore.com>
Fixes: 5e9e21bcc4 ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Tested-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Michael Roth 9d38d9dca2 i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT
Currently if the 'legacy-vm-type' property of the sev-guest object is
'off', QEMU will attempt to use the newer KVM_SEV_INIT2 kernel
interface in conjunction with the newer KVM_X86_SEV_VM and
KVM_X86_SEV_ES_VM KVM VM types.

This can lead to measurement changes if, for instance, an SEV guest was
created on a host that originally had an older kernel that didn't
support KVM_SEV_INIT2, but is booted on the same host later on after the
host kernel was upgraded.

Instead, if legacy-vm-type is 'off', QEMU should fail if the
KVM_SEV_INIT2 interface is not provided by the current host kernel.
Modify the fallback handling accordingly.

In the future, VMSA features and other flags might be added to QEMU
which will require legacy-vm-type to be 'off' because they will rely
on the newer KVM_SEV_INIT2 interface. It may be difficult to convey to
users what values of legacy-vm-type are compatible with which
features/options, so as part of this rework, switch legacy-vm-type to a
tri-state OnOffAuto option. 'auto' in this case will automatically
switch to using the newer KVM_SEV_INIT2, but only if it is required to
make use of new VMSA features or other options only available via
KVM_SEV_INIT2.

Defining 'auto' in this way avoids inadvertently breaking
compatibility with older kernels, since it will only be used in cases
where users opt into newer features that are only available via
KVM_SEV_INIT2 and newer kernels. It also provides better default
behavior than the previous legacy-vm-type=off, so make it the default
for 9.1+ machine types.
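
A hedged usage example (other required sev-guest options elided):
explicitly requiring the new interface rather than relying on any
fallback behavior.

    qemu-system-x86_64 \
        -machine q35,confidential-guest-support=sev0 \
        -object sev-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,legacy-vm-type=off \
        ...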

Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
cc: kvm@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240710041005.83720-1-michael.roth@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 10:45:06 +02:00
Song Gao 3ef4b21a5c target/loongarch: Fix cpu_reset set wrong CSR_CRMD
After cpu_reset, DATF in CSR_CRMD is 0 and DATM is 0.
See section 6.4 of the manual[1].

  [1]: https://github.com/loongson/LoongArch-Documentation/releases/download/2023.04.20/LoongArch-Vol1-v1.10-EN.pdf

Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240705021839.1004374-2-gaosong@loongson.cn>
2024-07-12 09:41:18 +08:00
Song Gao bba1c36da0 target/loongarch: Set CSR_PRCFG1 and CSR_PRCFG2 values
We set the value of register CSR_PRCFG3, but left out CSR_PRCFG1
and CSR_PRCFG2. Set CSR_PRCFG1 and CSR_PRCFG2 according to the
default values of the physical machine.

Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240705021839.1004374-1-gaosong@loongson.cn>
2024-07-12 09:41:18 +08:00
Feiyang Chen 785875874d target/loongarch: Remove avail_64 in trans_srai_w() and simplify it
Since srai.w is a valid instruction on la32, remove the avail_64 check
and simplify trans_srai_w().

Fixes: c0c0461e3a ("target/loongarch: Add avail_64 to check la64-only instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Feiyang Chen <chris.chenfeiyang@gmail.com>
Message-Id: <20240628033357.50027-1-chris.chenfeiyang@gmail.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-07-12 09:41:18 +08:00
Bibo Mao d38e31ef74 target/loongarch/kvm: Add software breakpoint support
With KVM virtualization, a debug exception for a normal break
instruction is injected into the guest kernel rather than the host.
Here a hypercall instruction with a special code is used for software
breakpoints, and the exact instruction encoding comes from the KVM
kernel via the user API KVM_REG_LOONGARCH_DEBUG_INST.

For now only software breakpoints are supported; they can be inserted
and removed. We can debug the guest kernel with gdb once the kernel is
loaded; hardware breakpoint support will be added later.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Tested-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240607035016.2975799-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-07-12 09:41:18 +08:00
Richard Henderson 7f49089158 target/arm: Convert PMULL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson f7a8456586 target/arm: Convert ADDHN, SUBHN, RADDHN, RSUBHN to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson 26cb9dbed8 target/arm: Convert SADDW, SSUBW, UADDW, USUBW to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson 7575c5710c target/arm: Convert SQDMULL, SQDMLAL, SQDMLSL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson eb191187f6 target/arm: Convert SADDL, SSUBL, SABDL, SABAL, and unsigned to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson 97b06ab705 target/arm: Convert SMULL, UMULL, SMLAL, UMLAL, SMLSL, UMLSL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:34 +01:00
Peter Maydell 4f7b1ecba8 target: Set TCGCPUOps::cpu_exec_halt to target's has_work implementation
Currently the TCGCPUOps::cpu_exec_halt method is optional, and if it
is not set then the default is to call the CPUClass::has_work
method (which has an identical function signature).

We would like to make the cpu_exec_halt method mandatory so we can
remove the runtime check and fallback handling.  In preparation for
that, make all the targets which don't need special handling in their
cpu_exec_halt set it to their cpu_has_work implementation instead of
leaving it unset.  (This is every target except for arm and i386.)

In the riscv case this requires making the function no longer local
to the source file it's defined in.
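
The per-target wiring is one line in the TCGCPUOps definition; a
sketch using the riscv case mentioned above:

    static const TCGCPUOps riscv_tcg_ops = {
        /* ... */
        .cpu_exec_halt = riscv_cpu_has_work, /* now non-static */
        /* ... */
    };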

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-11 11:41:34 +01:00
Peter Maydell fcee3707eb target/arm: Set arm_v7m_tcg_ops cpu_exec_halt to arm_cpu_exec_halt()
In commit a96edb687e we set the cpu_exec_halt field of the
TCGCPUOps arm_tcg_ops to arm_cpu_exec_halt(), but we left the
arm_v7m_tcg_ops struct unchanged.  That isn't wrong, because for
M-profile FEAT_WFxT doesn't exist and the default handling for "no
cpu_exec_halt method" is correct, but it's perhaps a little
confusing.  We would also like to make setting the cpu_exec_halt
method mandatory.

Initialize arm_v7m_tcg_ops cpu_exec_halt to the same function we use
for A-profile.  (On M-profile we never set up the wfxt timer so there
is no change in behaviour here.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-11 11:41:34 +01:00
Richard Henderson efceb7d2bd target/arm: Use cpu_env in cpu_untagged_addr
In a completely artificial memset benchmark,
object_dynamic_cast_assert dominates the profile, even above guest
address resolution and the underlying host memset.
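
A sketch of the change (condition and tag-stripping details
illustrative): cpu_env() is a plain pointer dereference, while
ARM_CPU(cs) goes through object_dynamic_cast_assert.

    static inline vaddr cpu_untagged_addr(CPUState *cs, vaddr x)
    {
        CPUARMState *env = cpu_env(cs); /* no QOM cast on the hot path */

        if (is_a64(env)) {
            x = sextract64(x, 0, 56); /* strip the top-byte tag */
        }
        return x;
    }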

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240702154911.1667418-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-11 11:41:33 +01:00
Peter Maydell a8ab8706d4 target/arm: Allow FPCR bits that aren't in FPSCR
In order to allow FPCR bits that aren't in the FPSCR (like the new
bits that are defined for FEAT_AFP), we need to make sure that writes
to the FPSCR only write to the bits of FPCR that are architecturally
mapped, and not the others.

Implement this with a new function vfp_set_fpcr_masked() which
takes a mask of which bits to update.

(We could do the same for FPSR, but we leave that until we actually
are likely to need it.)
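
A minimal sketch of the masked update, assuming a hypothetical raw
setter: bits outside the mask keep their current value.

    static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val,
                                    uint32_t mask)
    {
        uint32_t new_fpcr = (vfp_get_fpcr(env) & ~mask) | (val & mask);

        vfp_set_fpcr_raw(env, new_fpcr); /* hypothetical raw setter */
    }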

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-10-peter.maydell@linaro.org
2024-07-11 11:41:33 +01:00
Peter Maydell db397a81ee target/arm: Rename FPSR_MASK and FPCR_MASK and define them symbolically
Now that we store FPSR and FPCR separately, the FPSR_MASK and
FPCR_MASK macros are slightly confusingly named and the comment
describing them is out of date.  Rename them to FPSCR_FPSR_MASK and
FPSCR_FPCR_MASK, document that they are the mask of which FPSCR bits
are architecturally mapped to which AArch64 register, and define them
symbolically rather than as hex values.  (This latter requires
defining some extra macros for bits which we haven't previously
defined.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-9-peter.maydell@linaro.org
2024-07-11 11:41:33 +01:00
Peter Maydell a26db547b7 target/arm: Rename FPCR_ QC, NZCV macros to FPSR_
The QC, N, Z, C, V bits live in the FPSR, not the FPCR. Rename the
macros that define these bits accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240628142347.1283015-8-peter.maydell@linaro.org
2024-07-11 11:41:33 +01:00