we cannot remove the call to gen_arith() in decode_RV32_64G() since it
is used to translate multiply instructions.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
this splits the 64-bit only instructions into its own decode file such
that we generate the decoder for these instructions only for the RISC-V
64 bit target.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
for now only LUI & AUIPC are decoded and translated. If decodetree fails, we
fall back to the old decoder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
The current code assumes that we don't need to exit the TB
if a Data Cache Flush or Insert has happend. However, as we
have a shared Data/Instruction TLB, a Data cache flush also
flushes Instruction TLB entries, and a Data cache TLB insert
might also evict a Instruction TLB entry.
So exit the TB in all cases if Instruction translation is enabled.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-11-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-10-svens@stackframe.org>
[rth: Add required tlb flushing when prot id registers change.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ODE software calls itlbp on existing TLB entries without
calling itlba first, so this seems to be valid.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-9-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
b,gate does GR[t] ← cat(GR[t]{0..29},IAOQ_Front{30..31});
instead of saving the link address to register t.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-8-svens@stackframe.org>
[rth: Move link check outside of ifndef CONFIG_USER_ONLY;
use ctx->privilege; nullify the insn earlier.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
DIAG is usually only used by diagnostics software as it's CPU
specific. In most of the cases it's better to ignore it and log
a message that it's not implemented.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-7-svens@stackframe.org>
[rth: Free the nullify condition.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
HP ODE use rfi to set the Q bit, and i don't see anything in the
documentation that this is forbidden. So remove it.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-6-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To ease TLB debugging add a few trace events, which are disabled
by default so that there's no performance impact.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-5-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-4-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Assume the following sequence:
pitlbe r0(sr0,r0)
iitlba r4,(sr0,r0)
ldil L%3000000,r5
iitlbp r5,(sr0,r0)
This will purge the whole TLB and add an entry for page 0. However
the current TLB implementation in helper_iitlba() will store to
the last empty TLB entry, while helper_iitlbp() will write to the
first empty entry. That is because an empty entry will match address
0 in helper_iitlba()
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-3-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When one of the source registers is the same as the destination register,
the source register gets overwritten with the destionation value before
do_add_sv() is called, which leads to unexpection condition matches.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-2-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We got away with eliding this check when target/hppa was user-only,
but missed adding this check when adding system support.
Fixes an early crash in the HP-UX 11 installer.
Reported-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-3-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The t0 tcg_temp register is now unused, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-2-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We now have enough support to boot a PowerNV machine with a POWER9
processor. Allow HV mode on POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-16-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that all VSX registers are stored in host endian order, there is no need
to go via different accessors depending upon the register number. Instead we
introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and
set_cpu_vsr{l,h}().
This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the
new vsr64_offset() function to more clearly express the relationship between the
VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer
required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-8-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.
Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
By using the VsrD macro in avr64_offset() the same offset calculation can be
used regardless of the host endian. This allows get_avr64() and set_avr64() to
be simplified accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-6-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert the VMX TCG
instructions to use a new avr_full_offset() function instead of avr64_offset()
which can then itself be written as a simple wrapper onto vsr_full_offset().
This same function can also reused in cpu_avr_ptr() to avoid having more than
one copy of the offset calculation logic.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-5-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-4-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.
This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The H_CALL H_PAGE_INIT can be used to zero or copy a page of guest
memory. Enable the in-kernel H_PAGE_INIT handler.
The in-kernel handler takes half the time to complete compared to
handling the H_CALL in userspace.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190306060608.19935-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are four scenarios being handled in this function:
- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)
A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190228225759.21328-5-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is in preparation for a refactoring of the kvm_handle_debug
function in the next patch.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20190228225759.21328-4-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.
The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.
The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by diabling the count cache).
Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.
To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try and set the bit, then read
it back to check the host allowed us to set it, if so we can use it but
if we were unable to set it the host cannot support it and we must not
use the large decrementer.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as a
n-bit decrementing register which is visible as a 64-bit register, that
is the value of the decrementer is sign extended to 64 bits (where n is
implementation dependant).
The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical paritition control register (LPCR).
>From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h-bits, also sign extended to 64 bits (where h is implementation
dependant). Note this isn't configurable and is always enabled.
On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.
We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Intel Processor Trace required CPUID[0x14] but the cpuid_level
have no change when create a kvm guest with
e.g. "-cpu qemu64,+intel-pt".
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1548805979-12321-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID code will call kvm_arch_get_supported_cpuid() and, even though
it is undef kvm_enabled() so it never runs for user-mode emulators,
sometimes clang will not optimize it out at -O0.
That could be considered a compiler bug, however at -O0 we give it
a pass and just add the stubs.
Reported-by: Kamil Rytarowski <n54@gmx.com>
Tested-by: Kamil Rytarowski <n54@gmx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Combine all variant in a single handler. As source and destination
have different element sizes, we can't use gvec expansion. Expand
manually. Also watch out for overlapping source and destination
registers. Use a safe evaluation order depending on the operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-33-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to VECTOR LOAD WITH LENGTH, just the opposite direction.
Properly probe write access before modifying memory.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-32-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Similar to VECTOR LOAD MULTIPLE, just the opposite direction. Probe
write access first.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-31-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
As we only store one element, there is nothing to consider regarding
exceptions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-30-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of checking e.g. the first access on every touched page, we should
check the actual access, otherwise we might get false positives when Low
Address Protection (LAP) is active. As probe_write() can only deal with
accesses to one page, we have to loop.
Use i64 for the length, although not needed - easier to reuse
TCG temps we already have in the translation functions where this will
be used. Also allow it to be used from other helpers.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-28-david@redhat.com>
[CH: add missing page_check_range()]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Load both elements signed and store them into the two 64 bit elements.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-27-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Provide an implementation based on i64 and on real host vectors.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-26-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Similar to VECTOR GATHER ELEMENT, but the other direction.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-25-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Like VECTOR REPLICATE, but the element to be replicated comes from an
immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-24-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Replicate via the special gvec helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-23-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Read the whole input before modifying the destination vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-22-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Take care of overlying inputs and outputs by using a temporary vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-21-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a big one. Luckily we only have a limited set of such nasty
instructions.
We'll implement all variants with helpers, except when sources and
the destination don't overlap for VECTOR PACK. Provide different helpers
when the cc is to be modified. We'll return the cc then via env->cc_op.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-20-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We cannot use gvec expansion as source and destination elements are
have different element numbers. So we'll expand using a fancy loop.
Also, we have to take care of overlapping source and destination
registers, therefore use a safe evaluation irder depending on the
operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-19-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can reuse the helper introduced along with VECTOR LOAD TO BLOCK
BOUNDARY. We just have to take care of converting the highest index into
a length.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, just load from to gprs into a single vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-17-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to VECTOR LOAD GR FROM VR ELEMENT, just the opposite
direction. Also provide a fast path in case we don't care about the
register content.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-16-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to LOAD COUNT TO BLOCK BOUNDARY, but instead of only
calculating, the actual vector is loaded. Use a temporary vector to
not modify the real vector on exceptions. Initialize that one to zero,
to not leak any data. Provide a fast path if we're loading a full
vector.
As we don't have gvec ool handlers for single vectors, just calculate
the vector address manually.
We can reuse the helper later on for VECTOR LOAD WITH LENGTH. In fact,
we are going to name it "vll" right from the beginning, because that's
a better match.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Try to load the last element first. Access to the first element will
be checked afterwards. This way, we can guarantee that the vector is
not modified before we checked for all possible exceptions. (16 vectors
cannot cross more than two pages)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, zero out the vector before we load the desired element.
Load the element before touching the vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-13-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
To avoid an helper, we have to do the actual calculation of the element
address (offset in cpu_env + cpu_env) manually. Factor that out into
get_vec_element_ptr_i64(). The same logic will be reused for "VECTOR
LOAD VR ELEMENT FROM GR".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Take care of properly sign-extending the immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, load with desired size and store it into the right element.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can use tcg_gen_gvec_dup_i64() to carry out the duplication.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When loading from memory, load both elements into temps first before
modifying the target vector
Loading with strange alingment from the end of the address space will
not properly wrap, we can ignore that for now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Add gen_gvec_dupi() for handling duplication of immediates, so it can
be reused later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's optimize it for the common cases (setting a vector to zero or all
ones) - courtesy of Richard.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's start with a more involved one, but it is the first in the list
of vector support instructions (introduced with the vector facility).
Good thing is, we need a lot of basic infrastructure for this. Reading
and writing vector elements as well as checking element validity.
All vector instruction related translation functions will reside in
translate_vx.inc.c, to be included in translate.c - similar to how
other architectures handle it.
While at it, directly add some documentation (which contains parts about
things added in follow-up patches, but splitting this up does not make
too much sense). Also add ES_* defines heavily used later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We'll have to read/write vector elements quite frequently from helpers.
The tricky bit is properly taking care of endianess. Handle it similar
to aarch64.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Check them at a central point. We'll use a new instruction flag to
flag all vector instructions (IF_VEC) and handle it very similar to
AFP, whereby we use another unused position in the PSW mask to store
the state of vector register enablement per translation block.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
These are the new instruction formats related to vector instructions as
up to the z14 (a.k.a. latest PoP).
As v2 appeares (like x2 in VRX) with d2/b2 in VRV, we have to assign it a
higher field number to avoid collisions.
Properly take care of the MSB (to be able to address 32 registers) for
each vector register field stored in the RXB field (Bit 36 - 30 for all
vector instructions). As we have 32 bit vector registers and the
"v" fields are only 4 bit in size, the 5th bit is stored in the RXB.
We use a new type to indicate that the MSB has to be fetched from the
RXB.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There are some fields in our struct LowCore which apparently have
been copied from a very old version of the Linux kernel. These
fields are not architected in the "Principles of Operation", and
only used on these memory locations in Linux kernels older than
2.6.29. Newer Linux kernels moved the entries to different locations
or are not using them at all anymore. Thus we should never access
these fields from the QEMU side, so they should be removed.
While we're at it, also add a QEMU_BUILD_BUG_ON() statement to
assert that struct LowCore has the right size.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1551775581-27989-1-git-send-email-thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can eliminate an extra TB in this case, which merely
loads a "return address" into rn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For priv levels 1 & 2, we were doing so from do_ibranch_priv.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add the kvm_arm_get_max_vm_ipa_size() helper that returns the
number of bits in the IPA address space supported by KVM.
This capability needs to be known to create the VM with a
specific IPA max size (kvm_type passed along KVM_CREATE_VM ioctl.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-6-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will allow sharing code that adjusts rmode beyond
the existing users.
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-10-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This decoding more closely matches the ARMv8.4 Table C4-6,
Encoding table for Data Processing - Register Group.
In particular, op2 == 0 is now more than just Add/sub (with carry).
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We do not need an out-of-line helper for manipulating bits in pstate.
While changing things, share the implementation of gen_ss_advance.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The EL0+UMA check is unique to DAIF. While SPSel had avoided the
check by nature of already checking EL >= 1, the other post v8.0
extensions to MSR (imm) allow EL0 and do not require UMA. Avoid
the unconditional write to pc and use raise_exception_ra to unwind.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Minimize the number of places that will need updating when
the virtual host extensions are added.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Found by inspection: Rn is the base register against which the
load began; I is the register within the mask being processed.
The exception return should of course be processed from the loaded PC.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190301202921.21209-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The floating-point extension facility implemented certain changes to
BFP, HFP and DFP instructions.
As we don't implement HFP/DFP, we can ignore those completely. Related
to BFP, the changes include
- SET BFP ROUNDING MODE (SRNMB) instruction
- BFP-rounding-mode field in the FPC register is changed to 3 bits
- CONVERT FROM LOGICAL instructions
- CONVERT TO LOGICAL instructions
- Changes (rounding mode + XxC) added to
-- CONVERT TO FIXED
-- CONVERT FROM FIXED
-- LOAD FP INTEGER
-- LOAD ROUNDED
-- DIVIDE TO INTEGER
For TCG, we don't implement DIVIDE TO INTEGER, and it is harder to
implement, so skip that. Also, as we don't implement PFPO, we can skip
changes to that as well. The other parts are now implemented, we can
indicate the facility.
z14 PoP mentions that "The floating-point extension facility is installed
in the z/Architecture architectural mode. When bit 37 is one, bit 42 is
also one.", meaning that the DFP (decimal-floating-point) facility also
has to be indicated. We can ignore that for now.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-16-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
"round to nearest with ties away from 0" maps to float_round_ties_away.
"round to prepare for shorter precision" maps to float_round_to_odd.
As all instructions properly check for valid rounding modes in translate.c
we can add an assert. Fix one missing empty line.
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the floating-point extension facility, LOAD ROUNDED has
a rounding mode specification and the inexact-exception control (XxC).
Handle them just like e.g. LOAD FP INTEGER.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the floating-point extension facility
- CONVERT FROM LOGICAL
- CONVERT TO LOGICAL
- CONVERT TO FIXED
- CONVERT FROM FIXED
- LOAD FP INTEGER
have both, a rounding mode specification and the inexact-exception control
(XxC). Other instructions will be handled separatly.
Check for valid rounding modes and forward also the XxC (via m4). To avoid
a lot of boilerplate code and changes to the helpers, combine both, the
m3 and m4 field in a combined 32 bit TCG variable. Perform checks at
a central place, taking in account if the m3 or m4 field was ignore
before the floating-point extension facility was introduced.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-13-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Some instructions allow to suppress IEEE inexact exceptions.
z14 PoP, 9-23, "Suppression of Certain IEEE Exceptions"
IEEE-inexact-exception control (XxC): Bit 1 of
the M4 field is the XxC bit. If XxC is zero, recogni-
tion of IEEE-inexact exception is not suppressed;
if XxC is one, recognition of IEEE-inexact excep-
tion is suppressed.
Especially, handling for overflow/unerflow remains as is, inexact is
reported along
z14 PoP, 9-23, "Suppression of Certain IEEE Exceptions"
For example, the IEEE-inexact-exception control (XxC)
has no effect on the DXC; that is, the DXC for IEEE-
overflow or IEEE-underflow exceptions along with the
detail for exact, inexact and truncated, or inexact and
incremented, is reported according to the actual con-
dition.
Follow up patches will wire it correctly up for the applicable
instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to reuse this in the context of vector instructions. So use
better matching names and introduce s390_restore_bfp_rounding_mode().
While at it, add proper newlines.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's split handling of BFP/DFP rounding mode configuration. Also,
let's not reuse the sfpc handler, use a separate handler so we can
properly check for specification exceptions for SRNMB.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We already forward the 3 bits correctly in the translation functions. We
also have to handle them properly and check for specification
exceptions.
Setting an invalid rounding mode (BFP only, all DFP rounding modes)
results in a specification exception. Setting unassigned bits in the
fpc, results in a specification exception.
This fixes LOAD FPC (AND SIGNAL), SET FPC (AND SIGNAL). Also for,
SET BFP ROUNDING MODE, 3-bit rounding mode is now explicitly checked.
Note: TCG_CALL_NO_WG is required for sfpc handler, as we now inject
exceptions.
We won't be modeling abscence of the "floating-point extension facility"
for now, not necessary as most take the facility for granted without
checking.
z14 PoP, 9-23, "LOAD FPC"
When the floating-point extension facility is
installed, bits 29-31 of the second operand must
specify a valid BFP rounding mode and bits 6-7,
14-15, 24, and 28 must be zero; otherwise, a
specification exception is recognized.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-9-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The trap is triggered based on priority of the enabled signaling flags.
Only overflow and underflow allow a concurrent inexact exception.
z14 PoP, 9-33, Figure 9-21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can directly work on the uint64_t value, no need for a temporary
uint32_t value.
Also cleanup and shorten the comments.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
IEEE underflows are not reported when the mask bit is off and we don't
also have an inexact exception.
z14 PoP, 9-20, "IEEE Underflow":
An IEEE-underflow exception is recognized for an
IEEE target when the tininess condition exists and
either: (1) the IEEE-underflow mask bit in the FPC
register is zero and the result value is inexact, or (2)
the IEEE-underflow mask bit in the FPC register is
one.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Many things are wrong and some parts cannot be fixed yet. Fix what we
can fix easily and add two FIXMEs:
The fpc flags are not updated in case an exception is actually injected.
Inexact exceptions have to be handled separately, as they are the only
exceptions that can coexist with underflows and overflows.
I reread the horribly complicated chapters in the PoP at least 5 times
and hope I got it right.
For references:
- z14 PoP, 9-18, "IEEE Exceptions"
- z14 PoP, 19-9, Figure 19-8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to reuse that function in vector instruction context. While at it,
cleanup the code, using defines for magic values and avoiding the
handcrafted bit conversion.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's use the proper conversion functions now that we have them.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's detect normal and denormal ("subnormal") numbers reliably. Also
test for quiet NaN's. As only one class is possible, test common cases
first.
While at it, use a better check to test for the mask bits in the data
class mask. The data class mask has 12 bits, whereby bit 0 is the
leftmost bit and bit 11 the rightmost bit. In the PoP an easy to read
table with the numbers is provided for the VECTOR FP TEST DATA CLASS
IMMEDIATE instruction, the table for TEST DATA CLASS is more confusing
as it is based on 64 bit values.
Factor the checks out into separate functions, as they will also be
needed for floating point vector instructions. We can use a makro to
generate the functions.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-2-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Use a new CC helper to calculate the CC lazily if needed. While the
PoP mentions that "A 32-bit unsigned binary integer" is placed into the
first operand, there is no word telling that the other 32 bits (high
part) are left untouched. Maybe the other 32-bit are unpredictable.
So store 64 bit for now.
Bit magic courtesy of Richard.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-8-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Nice trick to load a 32 bit value into vector element 0 (32 bit element
size) from memory, zeroing out element1. The short HFP to long HFP
conversion really only is a shift.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Also properly wrap in 24bit mode. While at it, convert the comment (and
drop the comment about fundamental TCG optimizations).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We'll use that a lot along with gvec helpers, to calculate the start
address of a vector.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We will use s390x speak "Element Size" (es) for MO_8 == 0, MO_16 == 1
... Simple rename of variables.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Will be needed, so add it to the format description.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190225200318.16102-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If we have vector registers and the designation is not zero, we have
to try to write the vector registers. If the designation is zero or
if storing fails, we must not indicate validity. s390_build_validity_mcic()
automatically already sets validity if the vector instruction facility
is installed.
As long as we don't support the guarded-storage facility, the alignment
and size of the area is always 1024 bytes.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190222081153.14206-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Convert this to QEMU style.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190222081153.14206-3-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
As we will support vector instructions soon, and vector registers are
stored in 64bit host chunks, let's use cpu_to_be64. Same applies to the
guarded storage control block.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190222081153.14206-2-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
* add MHU and dual-core support to Musca boards
* refactor some VFP insns to be gated by ID registers
* Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
* Implement ARMv8.2-FHM extension
* Advertise JSCVT via HWCAP for linux-user
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAlx3wM8ZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3t+yD/4hbg4UCNDNHvnHv5N0dwVo
xDnEwN8Ath5jhcIlwjB4sPg44wO1dTy9PXK75UskGbUXnJfl4VFQsTVOg6GELVPc
RJJ7S1hBjaipRxaS7tgBl+sE03JFSFniGaYuU5cpwxh62HWlZRBZ85+Pw3iNb9So
UgrnQeThPNb9STKt2x0T8TvgjmwuS6fRYqA0DSVqUWT7FRNgIpfJ+dVkGxAhC8Mh
YJVmLfR1Z/HS3lWRHkZHDBkv036by7XnrRdTEb7yftNflmFHaX0OdSO/4+Uueslf
Lz9uem7LUOwnz9x0tBDSdaUrfJ4hmJSNXZhoeINR0V4MUKQBVWvRUrlfymRlFL15
SlI7i19FS0OleFTZs26TflGutgLwvMTRzAvhVR/F+pBqlYs1UxvNk4eMPLZFYPuc
OlRsgoUUtmF722TjW2l+Uewixo22AMatyv9VsiR6Ut7etmLIj8HHABkDX5kQbqFc
wz60pkUvPcywGGATaMImQJ+uoHOTXZhegBPyfYZYhbTVXshjvEYxFSLtmfhoyVAo
SyUUhsQyu4KGRVm4zGXKQuAPALElaDcKJ/T1H11pobMrCgM48C3br3EGsSZyOEFp
2A7ulT73sYL+7EjQ2fS/4kTXUGOiIWijo1oR9ANvqYbcROQiKDYsl023oEz0dpVY
n2tWg1Gzt/KjeM0md8B/Lg==
=7MnB
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190228-1' into staging
target-arm queue:
* add MHU and dual-core support to Musca boards
* refactor some VFP insns to be gated by ID registers
* Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
* Implement ARMv8.2-FHM extension
* Advertise JSCVT via HWCAP for linux-user
# gpg: Signature made Thu 28 Feb 2019 11:06:55 GMT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20190228-1:
linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT
target/arm: Enable ARMv8.2-FHM for -cpu max
target/arm: Implement VFMAL and VFMSL for aarch32
target/arm: Implement FMLAL and FMLSL for aarch64
target/arm: Add helpers for FMLAL
Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
target/arm: Gate "miscellaneous FP" insns by ID register field
target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions
hw/arm/armsse: Unify init-svtor and cpuwait handling
hw/arm/iotkit-sysctl: Implement CPUWAIT and INITSVTOR*
hw/arm/iotkit-sysctl: Add SSE-200 registers
hw/misc/iotkit-sysctl: Correct typo in INITSVTOR0 register name
target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset()
target/arm/cpu: Allow init-svtor property to be set after realize
hw/arm/armsse: Wire up the MHUs
hw/misc/armsse-mhu.c: Model the SSE-200 Message Handling Unit
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJcdpBIAAoJENSXKoln91plXDcH/377ByoCFuKu0uYN0f8m3eJ5
wCwOrcwExM36vga/zaMCkkj44TbrzpNtjeo/frBn+8pabFDpfF6NOXlBSC+CE/hg
i3G4Wm09GeNOyPH9JIdvItE1LvL3EEOf10pbheNdv6PeuFPRnUAV4pyQ/Rcu9USC
7pAwIJvR3GYXAEhsqa8sKbbuCBq1oiFXWpsEuBNwybWKgdVEpia6IJVYDi+xwnVc
FcpMF7BAqZDIX13kCSIgOAaa/XCKRFgxUYnZMd3bwD9m+x3iC442eS7Idx/HyXXK
5HDeyubff8bMKBzTUWFuM1J8t0uuQsRDqR61WptQ0rxf5Qf9Uiv9OcAww+LIENI=
=9h9l
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-feb-27-2019' into staging
MIPS queue for February 27th, 2019
# gpg: Signature made Wed 27 Feb 2019 13:27:36 GMT
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-feb-27-2019:
target/mips: Preparing for adding MMI instructions
tests/tcg: target/mips: Add tests for MSA integer max/min instructions
tests/tcg: target/mips: Add wrappers for MSA integer max/min instructions
qemu-doc: Add section on MIPS' Boston board
qemu-doc: Add section on MIPS' Fulong 2E board
qemu-doc: Move section on MIPS' mipssim pseudo board
disas: nanoMIPS: Fix a function misnomer
tests/tcg: target/mips: Add tests for MSA integer compare instructions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Load/store opcodes may raise MMU exceptions. Normally exceptions should
be checked in priority order before any actual operations, but since MMU
exceptions are tightly coupled with actual memory access, there's
currently no way to do it.
Approximate this behavior by executing all load, then all store, and
then all other opcodes in the FLIX bundles. Use opcode dependency
mechanism to express ordering. Mark load/store opcodes with
XTENSA_OP_{LOAD,STORE} flags. Newer libisa has classifier functions that
can tell whether opcode is a load or store, but this information is not
available in the existing overlays.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Currently topologic opcode sorting stops at the first detected
dependency loop. Introduce struct opcode_arg_copy that describes
temporary register copy. Scan remaining opcodes searching for
dependencies that can be broken, break them by introducing temporary
register copies and record them in an array. In case of success
create local temporaries and initialize them with current register
values. Share single temporary copy between all register users. Delete
temporaries after translation.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
libisa represents boolean registers b0..b16 as a BR register file and as
BR4 and BR8 register groups. Add these register files and use
OpcodeArg::{in,out} parameters to access boolean registers in
translators.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
libisa represents MAC16 registers m0..m3 as an MR register file. Add
this register file and reference its registers directly from the
translate_mac16. Drop translator parameter that indicates whether opcode
argument is in ar or in mr.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
To support circular register dependencies in FLIX bundles opcode inputs
and outputs must be separate and adjustable. Circular dependencies can
be broken by making temporary copies of opcode inputs and substituting
them into the arguments array instead of the original registers.
E.g. the circular register dependency in the following bundle:
{ mov a2, a3 ; mov a3, a2 }
can be resolved by making copy a2' = a2 and substituting it as input
argument of the second opcode:
{ mov a2, a3 ; mov a3, a2' }
Change opcode translator prototype to accept OpcodeArg array as
argument. For each register argument initialize OpcodeArg::{in,out} with
TCGv_* of the respective register. Don't explicitly use cpu_R in the
opcode translators, use OpcodeArg::{in,out} instead.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Move return address calculation and WINDOW_START adjustment out of the
retw helper to simplify logic a bit and avoid using registers directly.
Pass a0 as a parameter to the helper.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Opcodes that modify WINDOW_BASE SR don't have dependency on opcodes that
use windowed registers. If such opcodes are combined in a single
instruction they may not be correctly ordered. Instead of adding said
dependency use temporary register to store changed WINDOW_BASE value and
do actual register window rotation as a postprocessing step.
Not all opcodes that change WINDOW_BASE need this: retw, rfwo and rfwu
are also jump opcodes, so they are guaranteed to be translated last and
thus will not affect other opcodes in the same instruction.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Some opcodes may need additional actions at every exit from the
translated instruction or may need to amend TB exit slots available to
jumps generated for the instruction. Add gen_postprocess function and
call it from the gen_jump_slot and from the disas_xtensa_insn.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Opcodes in different slots may read and write same resources (registers,
states). In the absence of resource dependency loops it must be possible
to sort opcodes to avoid interference.
Record resources used by each opcode in the bundle. Build opcode
dependency graph and use topological sort to order its nodes. In case of
success translate opcodes in sort order. In case of failure report and
raise invalid opcode exception.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Note that float16_to_float32 rightly squashes SNaN to QNaN.
But of course pickNaNMulAdd, for ARM, selects SNaNs first.
So we have to preserve SNaN long enough for the correct NaN
to be selected. Thus float16_to_float32_by_bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit 823e1b3818,
which introduces a regression running EDK2 guest firmware
under KVM:
error: kvm run failed Function not implemented
PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a
X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000
X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710
X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756
X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0
X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0
X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0
X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2
X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010
X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0
X29=000000013e3d12a0 X30=000000013f5a2518 SP=000000013b7be0b0
PSTATE=404003c4 -Z-- EL1t
with
[ 3507.926571] kvm [35042]: load/store instruction decoding not implemented
in the host dmesg.
Revert the change for the moment until we can investigate the
cause of the regression.
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is a set of VFP instructions which we implement in
disas_vfp_v8_insn() and gate on the ARM_FEATURE_V8 bit.
These were all first introduced in v8 for A-profile, but in
M-profile they appeared in v7M. Gate them on the MVFR2
FPMisc field instead, and rename the function appropriately.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190222170936.13268-3-peter.maydell@linaro.org
Instead of gating the A32/T32 FP16 conversion instructions on
the ARM_FEATURE_VFP_FP16 flag, switch to our new approach of
looking at ID register bits. In this case MVFR1 fields FPHP
and SIMDHP indicate the presence of these insns.
This change doesn't alter behaviour for any of our CPUs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190222170936.13268-2-peter.maydell@linaro.org
Currently the Arm arm-powerctl.h APIs allow:
* arm_set_cpu_on(), which powers on a CPU and sets its
initial PC and other startup state
* arm_reset_cpu(), which resets a CPU which is already on
(and fails if the CPU is powered off)
but there is no way to say "power on a CPU as if it had
just come out of reset and don't do anything else to it".
Add a new function arm_set_cpu_on_and_reset(), which does this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-5-peter.maydell@linaro.org
Make the M-profile "init-svtor" property be settable after realize.
This matches the hardware, where this is a config signal which
is sampled on CPU reset and can thus be changed between one
reset and another. To do this we have to change the API we
use to add the property.
(We will need this capability for the SSE-200.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-4-peter.maydell@linaro.org
Set up MMI code to be compiled only for TARGET_MIPS64. This is
needed so that GPRs are 64 bit, and combined with MMI registers,
they will form full 128 bit registers.
Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1551183797-13570-2-git-send-email-mateja.marjanovic@rt-rk.com>
No guest support yet
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(Might need more patch splitting)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-12-clg@kaod.org>
[dwg: Hack to fix compile with some earlier include tweaks of mine]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
That "b" means "base address" and thus shouldn't be in the name
of actual entries and related constants.
This patch keeps the synthetic patb_entry field of the spapr
virtual hypervisor unchanged until I figure out if that has
an impact on the migration stream.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Our TCG TLB only tags whether it's a HV vs a guest access, so it must
be flushed when the LPIDR is changed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>