LAM uses CR3[61] and CR3[62] to configure/enable LAM on user pointers.
LAM uses CR4[28] to configure/enable LAM on supervisor pointers.
For CR3 LAM bits, no additional handling needed:
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check max physical address bits before calling
cpu_x86_update_cr3(), no change needed, i.e. CR3 LAM bits are not allowed
to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr3() to update cr3.
Allow gdb to set the LAM bit(s) to CR3, if vcpu doesn't support LAM,
KVM_SET_SREGS will fail as other reserved bits.
For CR4 LAM bit, its reservation depends on vcpu supporting LAM feature or
not.
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check CR4 reserved bit before calling cpu_x86_update_cr4(),
i.e. CR4 LAM bit is not allowed to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr4() to update cr4.
Mask out LAM bit on CR4 if vcpu doesn't support LAM.
- x86_cpu_reset_hold() doesn't need special handling.
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-3-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linear Address Masking (LAM) is a new Intel CPU feature, which allows
software to use of the untranslated address bits for metadata.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[26]
Add CPUID definition for LAM.
Note LAM feature is not supported for TCG of target-i386, LAM CPIUD bit
will not be added to TCG_7_1_EAX_FEATURES.
More info can be found in Intel ISE Chapter "LINEAR ADDRESS MASKING(LAM)"
https://cdrdv2.intel.com/v1/dl/getContent/671368
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-2-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX]. Clean them up so that the table correctly includes AX
as a 16-bit input and output.
No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero. However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable. Just pass can_be_zero as a separate argument.
gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
SDxS8Q==
=A4Qf
-----END PGP SIGNATURE-----
Merge tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu into staging
target/hppa:
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
# 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
# obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
# WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
# TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
# NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
# SDxS8Q==
# =A4Qf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 11:38:04 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu: (43 commits)
target/hppa: Log cpu state on return-from-interrupt
target/hppa: Log cpu state at interrupt
target/hppa: Implement CF_PCREL
target/hppa: Adjust priv for B,GATE at runtime
target/hppa: Drop tlb_entry return from hppa_get_physical_address
target/hppa: Implement PSW_X
target/hppa: Implement PSW_B
target/hppa: Manage PSW_X and PSW_B in translator
target/hppa: Split PSW X and B into their own field
target/hppa: Improve hppa_cpu_dump_state
target/hppa: Do not mask in copy_iaoq_entry
target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
linux-user/hppa: Force all code addresses to PRIV_USER
target/hppa: Use delay_excp for conditional trap on overflow
target/hppa: Use delay_excp for conditional traps
target/hppa: Introduce DisasDelayException
target/hppa: Remove cond_free
target/hppa: Use TCG_COND_TST* in trans_ftest
target/hppa: Use registerfields.h for FPSR
target/hppa: Use TCG_COND_TST* in trans_bb_imm
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the groundwork has been laid, enabling CF_PCREL within the
translator proper is a simple matter of updating copy_iaoq_entry
and install_iaq_entries.
We also need to modify the unwind info, since we no longer have
absolute addresses to install.
As expected, this reduces the runtime overhead of compilation when
running a Linux kernel with address space randomization enabled.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not compile in the priv change based on the first translation;
look up the PTE at execution time. This is required for CF_PCREL,
where a page may be mapped multiple times with different attributes.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return-by-reference is never used.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use PAGE_WRITE_INV to temporarily enable write permission
on for a given page, driven by PSW_X being set.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_B causes B,GATE to trap as an illegal instruction, removing our
previous sequential execution test that was merely an approximation.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_X is cleared after every instruction, and only set by RFI.
PSW_B is cleared after every non-branch, or branch not taken,
and only set by taken branches. We can clear both bits with a
single store, at most once per TB. Taken branches set PSW_B,
at most once per TB.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Generally, both of these bits are cleared at the end of each
instruction. By separating these, we will be able to clear
both with a single insn, instead of 2 or 3.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Print both raw IAQ_Front and IAQ_Back as well as the GVAs.
Print control registers in system mode.
Print floating point registers if CPU_DUMP_FPU.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As with loads and stores, code offsets are kept intact until the
full gva is formed. In qemu, this is in cpu_get_tb_cpu_state.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for CF_PCREL. store the iaoq_f in 3 parts: high
bits in cs_base, middle bits in pc, and low bits in priv.
For iaoq_b, set a bit for either of space or page differing,
else the page offset.
Install iaq entries before goto_tb. The change to not record
the full direct branch difference in TB means that we have to
store at least iaoq_b before goto_tb. But since a later change
to enable CF_PCREL will require both iaoq_f and iaoq_b to be
updated before goto_tb, go ahead and update both fields now.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The kernel does this along the return path to user mode.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow an exception to be emitted at the end of the TranslationBlock,
leaving only the conditional branch inline. Use it for simple
exception instructions like break, which happen to be nullified.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we do not need to free tcg temporaries, the only
thing cond_free does is reset the condition to never.
Instead, simply write a new condition over the old, which
may be simply cond_make_f() for the never condition.
The do_*_cond functions do the right thing with c or cf == 0,
so there's no need for a special case anymore.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Define all of the context dependent field definitions.
Use FIELD_EX32 and FIELD_DP32 with named fields instead
of extract32 and deposit32 with raw constants.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use 'v' for a variable that needs copying, 't' for a temp that
doesn't need copying, and 'i' for an immediate, and use this
naming for both arguments of the comparison. So:
cond_make_tmp -> cond_make_tt
cond_make_0_tmp -> cond_make_ti
cond_make_0 -> cond_make_vi
cond_make -> cond_make_vv
Pass 0 explictly, rather than implicitly in the function name.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a first step in enabling CF_PCREL, but for now
we regenerate the absolute address before writeback.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap offset and space together in one structure, ensuring
that they're copied together as required.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This simplifies callers, which might otherwise have
to make another copy.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using umax is clearer than the same operation using movcond.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows unification of BE, BLR, BV, BVE with a common helper.
Since we can now track space with IAQ_Next, we can now let the
TranslationBlock continue across the delay slot with BE, BVE.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move space assighments to a central location.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add variable to track space changes to IAQ. So far, no such changes
are introduced, but the new checks vs ctx->iasq_b may eliminate an
unnecessary copy to cpu_iasq_f with e.g. BLR.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Minimize the amount of code in hppa_tr_translate_insn advancing the
insn queue for the next insn. Move the goto_tb path to hppa_tr_tb_stop.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We no longer have to allocate a temp and perform an
addition before translation of the rest of the insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a common routine for writing the return address.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of two separate cpu_iaoq_entry calls, use one call to update
both IAQ_Front and IAQ_Back. Simplify with an argument combination
that automatically handles a simple increment from Front to Back.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The generic tcg driver will have already checked for breakpoints.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify the function by not attempting a conditional move
on the branch destination -- just use nullify_over normally.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a displacement instead of an absolute value.
In trans_be, remove the user-only do_dbranch case. The branch we are
attempting to optimize is to the zero page, which is perforce on a
different page than the code currently executing, which means that
we will *not* use a goto_tb. Use a plain indirect branch instead,
which is what we got out of the attempted direct branch anyway.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Share this check between gen_goto_tb and hppa_tr_translate_insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function is for log_pc(), which needs to produce a
similar result to cpu_get_tb_cpu_state().
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>