We will need a backend interface for performing 16-bit sign-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 8-bit zero-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 8-bit sign-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For both _CALL_SYSV and _CALL_DARWIN, the return is by reference,
not in 4 integer registers. For _CALL_SYSV, the argument is also
by reference.
This error resulted in
    $ ./qemu-system-i386 -nographic
    qemu-system-i386: tcg/ppc/tcg-target.c.inc:185: \
    tcg_target_call_oarg_reg: Assertion `slot >= 0 && slot <= 1' failed.
Fixes: 5427a9a760 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128")
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return is by reference, not in 4 integer registers.
This error resulted in
    qemu-system-i386: tcg/mips/tcg-target.c.inc:140: \
    tcg_target_call_oarg_reg: Assertion `slot >= 0 && slot <= 1' failed.
Fixes: 5427a9a760 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can arrive here on _WIN64 because Int128 is passed by reference.
Change the assert to check that the immediate is in range,
instead of attempting to check the host ABI.
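A minimal sketch of the revised check (illustrative only; the function name
is made up and the displacement form is assumed to be a signed 32-bit
immediate):
```
#include <assert.h>
#include <stdint.h>

/* Assert that the immediate fits the signed 32-bit displacement we are
 * about to emit, rather than asserting which host ABI we are on. */
static void check_addi_ptr_imm(intptr_t imm)
{
    assert(imm == (int32_t)imm);
}
```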
Fixes: 6a6d772e30 ("tcg: Introduce tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1581
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Something is wrong with this code, and also wrong with gdb on the
sparc systems to which I have access, so I cannot debug it either.
Disable for now, so the release is not broken.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu-user can hang in a multi-threaded fork. One common
reason is that when creating a TB, between fork and exec
we manipulate a GTree whose memory allocator (GSlice) is
not fork-safe.
Although POSIX does not mandate it, the system's allocator
(e.g. tcmalloc, libc malloc) is probably fork-safe.
Fix some of these hangs by using QTree, which uses the system's
allocator regardless of the Glib version that we used at
configuration time.
Tested with the test program in the original bug report, i.e.:
```
#include <cstdlib>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

void garble() {
    int pid = fork();
    if (pid == 0) {
        exit(0);
    } else {
        int wstatus;
        waitpid(pid, &wstatus, 0);
    }
}

void supragarble(unsigned depth) {
    if (depth == 0)
        return;
    std::thread a(supragarble, depth-1);
    std::thread b(supragarble, depth-1);
    garble();
    a.join();
    b.join();
}

int main() {
    supragarble(10);
}
```
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/285
Reported-by: Valentin David <me@valentindavid.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230205163758.416992-3-cota@braap.org>
[rth: Add QEMU_DISABLE_CFI for all callback using functions.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we call qemu_plugin_disable_mem_helpers in cpu_tb_exec,
we don't need to do this in generated code as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230310195252.210956-3-richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-13-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
These functions are no longer used.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace with tcg_constant_vec*.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These three instances got missed in previous conversion.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the tcg_temp_free_* and tcg_temp_ebb_new_* declarations
and inlines to the new header. These are private to the
implementation, and will prevent tcg_temp_free_* from creeping
back into the guest front ends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since all temps allocated by guest front-ends are now TEMP_TB,
and we don't recycle TEMP_TB, there's no point in requiring
that the front-ends free the temps at all. Begin by dropping
the inner-most checks that all temps have been freed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While we do not include these in tcg_target_reg_alloc_order,
and therefore they ought never be allocated, it seems safer
to mark them reserved as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the two uses of asm to expand xgetbv with an inline function.
Since one of the two has been using the mnemonic, assume that the
comment about "older versions of the assembler" is obsolete, as even
that is 4 years old.
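A minimal sketch of such an inline wrapper, assuming the mnemonic is
accepted by the assembler (names here are illustrative, not the ones used):
```
#include <stdint.h>

/* xgetbv returns EDX:EAX for the extended control register in ECX. */
static inline uint64_t xgetbv_sketch(uint32_t index)
{
    uint32_t eax, edx;
    asm("xgetbv" : "=a"(eax), "=d"(edx) : "c"(index));
    return eax | ((uint64_t)edx << 32);
}
```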
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove the first label and redirect all uses to the second.
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows us to easily find all branches that use a label.
Since 'refs' is only tested vs zero, remove it and test for
an empty list instead. Drop the use of bitfields, which had
been used to pack refs into a single 32-bit word.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When CONFIG_PROFILER is set there are various undefined references to
profile_getclock. Include the header which defines this function.
For example:
../tcg/tcg.c: In function ‘tcg_gen_code’:
../tcg/tcg.c:4905:51: warning: implicit declaration of function ‘profile_getclock’ [-Wimplicit-function-declaration]
4905 | qatomic_set(&prof->opt_time, prof->opt_time - profile_getclock());
| ^~~~~~~~~~~~~~~~
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230303084948.3351546-1-rjones@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reusing TEMP_TB interferes with detecting whether the
temp can be adjusted to TEMP_EBB.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All of these have obvious and quite local scope.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While the argument can only be TEMP_EBB or TEMP_TB,
it's more obvious this way.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TEMP_NORMAL is a subset of TEMP_EBB. Promote single basic
block temps to single extended basic block.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Attempt to reduce the lifetime of TEMP_TB.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This makes it easier to assign blame with perf.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use TEMP_TB as that is more explicit about the default
lifetime of the data. While "global" and "local" used
to be contrasting, we have more lifetimes than that now.
Do not yet rename tcg_temp_local_new_*, just the enum.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Just because the label reference count is more than 1 does
not mean we cannot remove a branch-to-next. By doing this
first, the label reference count may drop to 0, and then
the label itself gets removed as before.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change the temps_in_use check to use assert not fprintf.
Move the assert for double-free before the check for count,
since that is the more immediate problem.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-3-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-2-philmd@linaro.org>
This commit was created with scripts/clean-includes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
'offset' should be bits [23:5] of the LDR instruction, rather than [4:0].
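An illustrative helper (not the actual patch) showing the corrected field
placement, assuming 'imm19' already holds the scaled 19-bit offset:
```
#include <stdint.h>

/* LDR (literal) keeps its 19-bit immediate in bits [23:5] of the insn. */
static inline uint32_t ldr_lit_set_imm19(uint32_t insn, int32_t imm19)
{
    return (insn & ~(0x7ffffu << 5)) | (((uint32_t)imm19 & 0x7ffff) << 5);
}
```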
Fixes: d59d83a1c3 ("tcg/aarch64: Reorg goto_tb implementation")
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Normally this is automatically handled by the CF_PARALLEL checks
within tcg_gen_atomic_cmpxchg_i{32,64}, but x86 has a special
case of !PREFIX_LOCK where it always wants the non-atomic version.
Split these out so that x86 does not have to roll its own.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will allow targets to avoid rolling their own.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are not yet considering atomicity of the 16-byte value;
this is a direct replacement for the current target code which
uses a pair of 8-byte operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add code generation functions for data movement between
TCGv_i128 (mov) and to/from TCGv_i64 (concat, extract).
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This enables allocation of i128. The type is not yet
usable, as we have not yet added data movement ops.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for the host ABI for Int128 for
those backends which require no extra modification.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for libffi for Int128.
Adjust the interpreter to allow for 16-byte return values.
Adjust tcg_out_call to record the return value length.
Call parameters are no longer all the same size, so we
cannot reuse the same call_slots array for every function.
Compute it each time now, but only fill in slots required
for the call we're about to make.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We expect the backend to require register pairs in
host-endian ordering, thus for big-endian the first
register of a pair contains the high part.
We were forcing R0 to contain the low part for calls.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for the host ABI for Int128.
Adjust tcg_target_call_oarg_reg for _WIN64, and
tcg_out_call for i386 sysv. Allow TCG_TYPE_V128
stores without AVX enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be used by _WIN64 to return i128. Not yet used,
because allocation is not yet enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the flat array tcg_target_call_oarg_regs[] with
a function call including the TCGCallReturnKind.
Extend the set of registers for ARM to r0-r3 to match the ABI:
https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#result-return
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These will be used by some hosts, both 32 and 64-bit, to pass and
return i128. Not yet used, because allocation is not yet enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement the function for arm, i386, and s390x, which will use it.
Add stubs for all other backends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When allocating a temp to the stack frame, consider the
base type and allocate all parts at once.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Many hosts pass and return 128-bit quantities like sequential
64-bit quantities. Treat this just like we currently break
down 64-bit quantities for a 32-bit host.
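A conceptual illustration in plain C (not TCG internals) of treating a
128-bit quantity as two sequential 64-bit pieces:
```
#include <stdint.h>

/* Split a 128-bit value into low and high 64-bit pieces, just as a 64-bit
 * value is already split into two 32-bit pieces on a 32-bit host.
 * __uint128_t is a GCC/Clang extension, used here only for illustration. */
typedef struct { uint64_t lo, hi; } Pieces128;

static Pieces128 split128(__uint128_t x)
{
    return (Pieces128){ .lo = (uint64_t)x, .hi = (uint64_t)(x >> 64) };
}
```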
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Correctly handle large types while lowering.
Fixes: fac87bd2a4 ("tcg: Add temp_subindex to TCGTemp")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There are actually a whole bunch of helpers that don't affect memory
and that we shouldn't instrument. They are helpfully identified by the
TCG_CALL_NO_SIDE_EFFECTS flag, which marks out lookup_tb_ptr as well as
a lot of the maths helpers. To avoid the string compare, we introduce a
new flag for plugin internals so we skip that too.
Related: #1381
Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230108164731.61469-4-cota@braap.org>
[AJB: updated to skip all no SE plugins, add flag for plugin helper]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230124180127.1881110-34-alex.bennee@linaro.org>
The old implementation replaces two insns, swapping between
    b         <dest>
    nop
and
    pcaddu18i tmp, <dest>
    jirl      zero, tmp, <dest> & 0xffff
There is a race condition in which a thread could be stopped at
the jirl, i.e. with the top of the address loaded, and when
restarted we have re-linked to a different TB, so that the top
half no longer matches the bottom half.
Note that while we never directly re-link to a different TB, we
can link, unlink, and link again all while the stopped thread
remains stopped.
The new implementation replaces only one insn, swapping between
    b       <dest>
and
    pcadd   tmp, <jmp_addr>
falling through to load the address from tmp, and branch.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Take the w^x split into account when computing the
pc-relative distance to an absolute pointer.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out a helper function, tcg_out_setcond_int, which
does not always produce the complete boolean result, but
returns a set of flags to do so.
Accept all int32_t as constant input, so that LE/GT can
adjust the constant to LT.
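A sketch of the constant adjustment referred to above (illustrative names,
not the backend code):
```
#include <stdbool.h>
#include <stdint.h>

/* For a signed compare against an int32_t constant c:
 *   x <= c   is   x < c + 1
 *   x >  c   is   !(x < c + 1)
 * Accepting any int32_t means c + 1 always remains representable. */
typedef struct { int64_t c; bool invert; } LtForm;

static LtForm le_gt_as_lt(bool is_gt, int32_t c)
{
    return (LtForm){ .c = (int64_t)c + 1, .invert = is_gt };
}
```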
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Adjust the constraints to allow any int32_t for immediate
addition. Split immediate adds into addu16i + addi, which
covers quite a lot of the immediate space. For the hole in
the middle, load the constant into TMP0 instead.
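A sketch of the decomposition (not the backend code): express a 32-bit
immediate as (hi << 16) + lo with both halves signed 16-bit, matching
ADDU16I followed by ADDI; the values for which this fails are presumably
the hole that goes through TMP0.
```
#include <stdbool.h>
#include <stdint.h>

static bool split_add_imm32(int64_t imm, int16_t *hi, int16_t *lo)
{
    int64_t l = (int16_t)imm;       /* sign-extended low 16 bits */
    int64_t h = (imm - l) >> 16;    /* what remains for the upper half */

    if (h != (int16_t)h) {
        return false;               /* not representable this way */
    }
    *hi = h;
    *lo = l;
    return true;
}
```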
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Regenerate with ADDU16I included:
    $ cd loongarch-opcodes/scripts/go
    $ go run ./genqemutcgdefs > $QEMU/tcg/loongarch64/tcg-insn-defs.c.inc
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Although we still can't use ldrd and strd for all operations,
increase the chances by getting the register allocation correct.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have a test for one of TCG_TARGET_HAS_mulu2_i32 or
TCG_TARGET_HAS_muluh_i32 being defined, but the test
became non-functional when we changed to always define
all of these macros.
Replace this with a build-time test in tcg_gen_mulu2_i32.
Fixes: 25c4d9cc84 ("tcg: Always define all of the TCGOpcode enum members.")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1435
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We failed to update this with the w^x split, so it misses the fact
that true pc-relative offsets are usually small.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230117230415.354239-1-richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Now that tcg can handle direct and indirect goto_tb simultaneously,
we can optimistically leave space for a direct branch and fall back
to loading the pointer from the TB for an indirect branch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that tcg can handle direct and indirect goto_tb
simultaneously, we can optimistically leave space for
a direct branch and fall back to loading the pointer
from the TB for an indirect branch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The old sparc64 implementation may replace two insns, which leaves
a race condition in which a thread could be stopped at a PC in the
middle of the sequence, and when restarted does not see the complete
address computation and branches to nowhere.
The new implementation replaces only one insn, swapping between a
direct branch and a direct call. The TCG_REG_TB register is loaded
from tb->jmp_target_addr[] in the delay slot.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is always true for sparc64, so the code has been dead since 3a5f6805c7.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The old ppc64 implementation replaces 2 or 4 insns, which leaves a race
condition in which a thread could be stopped at a PC in the middle of
the sequence, and when restarted does not see the complete address
computation and branches to nowhere.
The new implementation replaces only one insn, swapping between
    b       <dest>
and
    mtctr   r31
falling through to a general-case indirect branch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The old implementation replaces two insns, swapping between
    b       <dest>
    nop
    br      x30
and
    adrp    x30, <dest>
    addi    x30, x30, lo12:<dest>
    br      x30
There is a race condition in which a thread could be stopped at
the PC of the second insn, and when restarted does not see the
complete address computation and branches to nowhere.
The new implementation replaces only one insn, swapping between
    b       <dest>
    br      tmp
and
    ldr     tmp, <jmp_addr>
    br      tmp
Reported-by: hev <r@hev.cc>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We now have the option to generate direct or indirect
goto_tb depending on the dynamic displacement, thus
the define is no longer necessary or completely accurate.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Install empty versions for !TCG_TARGET_HAS_direct_jump hosts.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Stop overloading jmp_target_arg for both offset and address,
depending on TCG_TARGET_HAS_direct_jump. Instead, add a new
field to hold the jump insn offset and always set the target
address in jmp_target_addr[]. This will allow a tcg backend
to use either direct or indirect depending on displacement.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This can replace four other variables that are references
into the TranslationBlock structure.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will shortly be used for more than reset.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The INDEX_op_goto_tb opcode needs no register allocation.
Split out a dedicated helper function for it.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Similar to the existing set_jmp_reset_offset. Include the
rw->rx address space conversion done by arm and s390x, and
forgotten by mips and riscv.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Similar to the existing set_jmp_reset_offset. Move any assert for
TCG_TARGET_HAS_direct_jump into the new function (which now cannot
be build-time). Will be unused if TCG_TARGET_HAS_direct_jump is
constant 0, but we can't test for constant in the preprocessor,
so just mark it G_GNUC_UNUSED.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Test TCG_TARGET_HAS_direct_jump instead of testing an
implementation pointer.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The INDEX_op_exit_tb opcode needs no register allocation.
Split out a dedicated helper function for it.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.
Example of use:
    perf record qemu-x86_64 -perfmap ./a.out
    perf report
or
    perf record -k 1 qemu-x86_64 -jitdump ./a.out
    DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
    perf report -i perf.data.jitted
Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging
tcg/s390x improvements:
- drop support for pre-z196 cpus (eol before 2017)
- add support for misc-instruction-extensions-3
- misc cleanups
# gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits)
tcg/s390x: Avoid the constant pool in tcg_out_movi
tcg/s390x: Cleanup tcg_out_movi
tcg/s390x: Tighten constraints for 64-bit compare
tcg/s390x: Implement ctpop operation
tcg/s390x: Use tgen_movcond_int in tgen_clz
tcg/s390x: Support SELGR instruction in movcond
tcg/s390x: Generalize movcond implementation
tcg/s390x: Create tgen_cmp2 to simplify movcond
tcg/s390x: Support MIE3 logical operations
tcg/s390x: Tighten constraints for and_i64
tcg/s390x: Tighten constraints for or_i64 and xor_i64
tcg/s390x: Issue XILF directly for xor_i32
tcg/s390x: Support MIE2 MGRK instruction
tcg/s390x: Support MIE2 multiply single instructions
tcg/s390x: Distinguish RIE formats
tcg/s390x: Distinguish RRF-a and RRF-c formats
tcg/s390x: Use LARL+AGHI for odd addresses
tcg/s390x: Remove DISTINCT_OPERANDS facility check
tcg/s390x: Remove FAST_BCR_SER facility check
tcg/s390x: Check for load-on-condition facility at startup
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Load constants in no more than two insns, which turns
out to be faster than using the constant pool.
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge maybe_out_small_movi, as it no longer has additional users.
Use is_const_p{16,32}.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Give the second operand of 64-bit comparisons a signed 33-bit immediate.
This is the smallest superset of uint32_t and int32_t, as used
by CLGFI and CGFI respectively. The rest of the 33-bit space
can be loaded into TCG_TMP0. Drop use of the constant pool.
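A small sketch of the range check implied here (illustrative, not the
constraint code):
```
#include <stdbool.h>
#include <stdint.h>

/* A signed 33-bit immediate spans [-2^32, 2^32 - 1], which contains every
 * uint32_t (for CLGFI) and every int32_t (for CGFI). */
static bool fits_signed_33(int64_t val)
{
    return val >= -(INT64_C(1) << 32) && val < (INT64_C(1) << 32);
}
```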
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is an older form that produces per-byte results,
and a newer form that produces per-register results.
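To illustrate the difference, with only the per-byte form the eight byte
counts still have to be summed into one result; a generic way to do that
in plain C (not the s390x code):
```
#include <stdint.h>

/* Each byte of per_byte holds a count in 0..8, so multiplying by
 * 0x0101...01 accumulates the total into the top byte without any
 * carry spilling between byte lanes. */
static unsigned popcount_from_bytes(uint64_t per_byte)
{
    return (per_byte * UINT64_C(0x0101010101010101)) >> 56;
}
```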
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reuse code from movcond to conditionally copy a2 to dest,
based on the condition codes produced by FLOGR.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The new select instruction provides two separate register inputs,
whereas the old load-on-condition instruction overlaps one of the
register inputs with the destination.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Generalize movcond to support pre-computed conditions, and the same
set of arguments at all times. This will be assumed by a following
patch, which needs to reuse tgen_movcond_int.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Return both regular and inverted condition codes from tgen_cmp2.
This lets us choose after the fact which comparison we want.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is andc, orc, nand, nor, eqv.
We can use nor for implementing not.
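The identity relied on for the last sentence, spelled out in plain C
(not the generated code):
```
#include <stdint.h>

/* nor(x, x) == ~(x | x) == ~x, so NOR can implement NOT. */
static inline uint64_t not_via_nor(uint64_t x)
{
    return ~(x | x);
}
```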
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let the register allocator handle such immediates by matching
only what one insn can achieve.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>