TYPE_AMD_IOMMU_PCI is user-creatable but not well described.
Implement its class_init() handler to add it to the 'Misc
devices' category, and add a description.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210926175648.1649075-4-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Various functions are SysBus specific. Rename them using the
consistent amdvi_sysbus_XXX() pattern, to differentiate them
from PCI specific functions (which we'll add in the next
commit).
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210926175648.1649075-3-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per 'QEMU Coding Style':
Naming
======
Variables are lower_case_with_underscores; easy to type and read.
Rename amdviPCI variable as amdvi_pci.
amdviPCI_register_types() register more than PCI types:
TYPE_AMD_IOMMU_DEVICE inherits TYPE_X86_IOMMU_DEVICE which
itself inherits TYPE_SYS_BUS_DEVICE.
Rename it more generically as amdvi_register_types().
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210926175648.1649075-2-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210624110415.187164-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
The subsection name for page-poison was typo'd as:
vitio-balloon-device/page-poison
Note the missing 'r' in virtio.
When we have a machine type that enables page poison, and the guest
enables it (which needs a new kernel), things fail rather unpredictably.
The fallout from this is that most of the other subsections fail to
load, including things like the feature bits in the device, one
possible fallout is that the physical addresses of the queues
then get aligned differently and we fail with an error about
last_avail_idx being wrong.
It's not obvious to me why this doesn't produce a more obvious failure,
but virtio's vmstate loading is a bit open-coded.
Fixes: 7483cbbaf8 ("virtio-balloon: Implement support for page poison reporting feature")
bz: https://bugzilla.redhat.com/show_bug.cgi?id=1984401
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210914131716.102851-1-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-35-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian
conversions when building table and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-34-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it, replace packed structure with endian agnostic
build_append_FOO() API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-33-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it, replace packed structure with endian agnostic
build_append_FOO() API.
PS:
Spec is Microsoft hosted, however 1.02 is no where to be found
(MS lists only the current revision) and the current revision is 1.07,
so bring comments in line with 1.07 as this is the only available spec.
There is no content change between originally implemented 1.02
(using QEMU code as reference) and 1.07. The only change is renaming
'Reserved2' field to 'Language', with the same 0 value.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-32-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
implicit cast to structure uint8_t member didn't raise error when
assigning value from incorrect enum, but when using build_append_gas()
(next patch) it will error out with (clang):
implicit conversion from enumeration type 'AmlRegionSpace'
to different enumeration type 'AmlAddressSpace'
fix cast error by using correct AML_AS_SYSTEM_MEMORY enum
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-31-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building IORT table use endian agnostic build_append_int_noprefix()
API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210924122802.1455362-30-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-29-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-28-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-27-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building MADT table for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-26-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building MADT table for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-25-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Instead of composing disabled _MAT entry and then later on
patching it to enabled for hotpluggbale CPUs in DSDT,
set it to enabled at the time _MAT entry is built.
It will allow to drop usage of packed structures in
following patches when build_madt() is switched to use
build_append_int_noprefix() API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-24-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-22-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-21-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-20-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries tables.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-19-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building SRAT tables for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries (which also removes some manual offset
calculations)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-17-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-16-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it convert build_hpet() to endian agnostic
build_append_FOO() API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210924122802.1455362-15-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-14-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-13-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Also since acpi_table_begin() reserves space only for standard header
while previous acpi_data_push() reserved the header + 4 bytes field,
add 4 bytes 'Reserved' field into nvdimm_build_nfit() which didn't
have it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-11-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Also since acpi_table_begin() reserves space only for standard header
while previous acpi_data_push() reserved the header + 4 bytes field,
add 4 bytes 'Reserved' field into hmat_build_table_structs()
which didn have it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-10-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-9-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-7-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
entries to other tables (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offests magic from API user.
While at it switch to build_append_int_noprefix() to build
entries to other tables (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Patch introduces acpi_table_begin()/ acpi_table_end() API
that hides pointer/offset arithmetic from user as opposed
to build_header(), to prevent errors caused by it [1].
acpi_table_begin():
initializes table header and keeps track of
table data/offsets
acpi_table_end():
sets actual table length and tells bios loader
where table is for the later initialization on
guest side.
1) commits
bb9feea431 x86: acpi: use offset instead of pointer when using build_header()
4d027afeb3 Virt: ACPI: fix qemu assert due to re-assigned table data address
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
virtio-vsock features, like VIRTIO_VSOCK_F_SEQPACKET, can be handled
by vhost-vsock-common parent class. In this way, we can reuse the
same code for all virtio-vsock backends (i.e. vhost-vsock,
vhost-user-vsock).
Let's move `seqpacket` property to vhost-vsock-common class, add
vhost_vsock_common_get_features() used by children, and disable
`seqpacket` for vhost-user-vsock device for machine types < 6.2.
The behavior of vhost-vsock device doesn't change; vhost-user-vsock
device now supports `seqpacket` property.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-3-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
This commit is released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but machine type version is less than
6.1, we get the following errors:
Features 0x130000002 unsupported. Allowed features: 0x179000000
Failed to load virtio-vhost_vsock:virtio
error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
load of migration failed: Operation not permitted
Let's disable the feature bit for machine types < 6.1.
We add a new OnOffAuto property for this, called `seqpacket`.
When it is `auto` (default), QEMU behaves as before, trying to enable the
feature, when it is `on` QEMU will fail if the backend (vhost-vsock
kernel module) doesn't support it.
Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Both virtqueue_packed_get_avail_bytes() and
virtqueue_split_get_avail_bytes() access the region cache, but
their caller also does. Simplify by having virtqueue_get_avail_bytes
calling both with RCU lock held, and passing the caches as argument.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210906104318.1569967-4-philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
vring_get_region_caches() must be called with the RCU read lock
acquired. virtqueue_packed_drop_all() does not, and uses the
'caches' pointer. Fix that by using the RCU_READ_LOCK_GUARD()
macro.
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210906104318.1569967-3-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
If SEV is enabled and a kernel is passed via -kernel, pass the hashes of
kernel/initrd/cmdline in an encrypted guest page to OVMF for SEV
measured boot.
Co-developed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210930054915.13252-3-dovmurik@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We might not start at the beginning of the memory region. Let's
calculate the offset into the memory region via the difference in the
host addresses.
Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Fixes: ffab1be706 ("tpm: clear RAM when "memory overwrite" requested")
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210727082545.17934-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As we might not always have a device id, it is impossible to always
match MEMORY_DEVICE_SIZE_CHANGE events to an actual device. Let's
include the qom-path in the event, which allows for reliable mapping of
events to devices.
Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210929162445.64060-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Apparently, we don't have to duplicate the string.
Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210929162445.64060-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM implements some Hyper-V 2016 functions so providing WS2008R2 version
is somewhat incorrect. While generally guests shouldn't care about it
and always check feature bits, it is known that some tools in Windows
actually check version info.
For compatibility reasons make the change for 6.2 machine types only.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-9-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Put both sanity-check of the input SMP configuration and sanity-check
of the output SMP configuration uniformly in the generic parser. Then
machine_set_smp() will become cleaner, also all the invalid scenarios
can be tested only by calling the parser.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-16-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now we have a common structure SMPCompatProps used to store information
about SMP compatibility stuff, so we can also move smp_prefer_sockets
there for cleaner code.
No functional change intended.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now we have a generic smp parser for all arches, and there will
not be any other arch specific ones, so let's remove the callback
from MachineClass and call the parser directly.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-14-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently the only difference between smp_parse and pc_smp_parse
is the support of dies parameter and the related error reporting.
With some arch compat variables like "bool dies_supported", we can
make smp_parse generic enough for all arches and the PC specific
one can be removed.
Making smp_parse() generic enough can reduce code duplication and
ease the code maintenance, and also allows extending the topology
with more arch specific members (e.g., clusters) in the future.
Suggested-by: Andrew Jones <drjones@redhat.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-13-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that all the possible topology parameters are integrated in struct
CpuTopology, tweak the order of topology members to be "cpus/sockets/
dies/cores/threads/maxcpus" for readability and consistency. We also
tweak the comment by adding explanation of dies parameter.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210929025816.21076-12-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the sanity-check of smp_cpus and max_cpus against mc in function
machine_set_smp(), we are now using ms->smp.max_cpus for the check
but using current_machine->smp.max_cpus in the error message.
Tweak this by uniformly using the local ms.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210929025816.21076-11-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the real SMP hardware topology world, it's much more likely that
we have high cores-per-socket counts and few sockets totally. While
the current preference of sockets over cores in smp parsing results
in a virtual cpu topology with low cores-per-sockets counts and a
large number of sockets, which is just contrary to the real world.
Given that it is better to make the virtual cpu topology be more
reflective of the real world and also for the sake of compatibility,
we start to prefer cores over sockets over threads in smp parsing
since machine type 6.2 for different arches.
In this patch, a boolean "smp_prefer_sockets" is added, and we only
enable the old preference on older machines and enable the new one
since type 6.2 for all arches by using the machine compat mechanism.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have two requirements for a valid SMP configuration:
the product of "sockets * cores * threads" must represent all the
possible cpus, i.e., max_cpus, and then must include the initially
present cpus, i.e., smp_cpus.
So we only need to ensure 1) "sockets * cores * threads == maxcpus"
at first and then ensure 2) "maxcpus >= cpus". With a reasonable
order of the sanity check, we can simplify the error reporting code.
When reporting an error message we also report the exact value of
each topology member to make users easily see what's going on.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210929025816.21076-7-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently we directly calculate the omitted cpus based on the given
incomplete collection of parameters. This makes some cmdlines like:
-smp maxcpus=16
-smp sockets=2,maxcpus=16
-smp sockets=2,dies=2,maxcpus=16
-smp sockets=2,cores=4,maxcpus=16
not work. We should probably set the value of cpus to match maxcpus
if it's omitted, which will make above configs start to work.
So the calculation logic of cpus/maxcpus after this patch will be:
When both maxcpus and cpus are omitted, maxcpus will be calculated
from the given parameters and cpus will be set equal to maxcpus.
When only one of maxcpus and cpus is given then the omitted one
will be set to its counterpart's value. Both maxcpus and cpus may
be specified, but maxcpus must be equal to or greater than cpus.
Note: change in this patch won't affect any existing working cmdlines
but allows more incomplete configs to be valid.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-6-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are currently using maxcpus to calculate the omitted sockets
but using cpus to calculate the omitted cores/threads. This makes
cmdlines like:
-smp cpus=8,maxcpus=16
-smp cpus=8,cores=4,maxcpus=16
-smp cpus=8,threads=2,maxcpus=16
work fine but the ones like:
-smp cpus=8,sockets=2,maxcpus=16
-smp cpus=8,sockets=2,cores=4,maxcpus=16
-smp cpus=8,sockets=2,threads=2,maxcpus=16
break the sanity check.
Since we require for a valid config that the product of "sockets * cores
* threads" should equal to the maxcpus, we should uniformly use maxcpus
to calculate their omitted values.
Also the if-branch of "cpus == 0 || sockets == 0" was split into two
branches of "cpus == 0" and "sockets == 0" so that we can clearly read
that we are parsing the configuration with a preference on cpus over
sockets over cores over threads.
Note: change in this patch won't affect any existing working cmdlines
but improves consistency and allows more incomplete configs to be valid.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-5-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To pave the way for the functional improvement in later patches,
make some refactor/cleanup for the smp parsers, including using
local maxcpus instead of ms->smp.max_cpus in the calculation,
defaulting dies to 0 initially like other members, cleanup the
sanity check for dies.
We actually also fix a hidden defect by avoiding directly using
the provided *zero value* in the calculation, which could cause
a segment fault (e.g. using dies=0 in the calculation).
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-4-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the SMP configuration, we should either provide a topology
parameter with a reasonable value (greater than zero) or just
omit it and QEMU will compute the missing value.
The users shouldn't provide a configuration with any parameter
of it specified as zero (e.g. -smp 8,sockets=0) which could
possibly cause unexpected results in the -smp parsing. So we
deprecate this kind of configurations since 6.2 by adding the
explicit sanity check.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-3-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Here's the next batch of ppc related patches for qemu-6.2. Highlights
are:
* Fixes for several TCG math instructions from the El Dorado Institute
* A number of improvements to the powernv machine type
* Support for a new DEVICE_UNPLUG_GUEST_ERROR QAPI event from Daniel
Barboza
* Support for the new FORM2 PAPR NUMA representation. This allows
more specific NUMA distances, as well as asymmetric configurations
* Fix for 64-bit decrementer (used on MicroWatt CPUs)
* Assorted fixes and cleanups
* A number of updates to MAINTAINERS
Note that the DEVICE_UNPLUG_GUEST_ERROR stuff includes changes to
files outside my normal area, but has suitable Acks.
The MAINTAINERS updates are mostly about marking minor platforms
unmaintained / orphaned, and moving some pieces away from myself and
Greg. As we move onto other projects, we're going to need to drop
more of the ppc maintainership, though we're hoping we can avoid too
abrupt a change.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmFVTlEACgkQbDjKyiDZ
s5IATRAAi3qB5riIXQEO41SoG5EjwJ9+2VQmagMkgwyNTQp5SyrqiWpTJ7Cq8IMF
inEgV8vHv40GFJLdXXjTvdsRkAxh1siVZR7okmbd144ZdwL/ntJ4FFiGG5BjFgf2
DajbDW/RgNDsWkTFbnhKVDNTzFbxHgeGf/g+KOPmQ5MtPPJk/Og2jJVOVE+CCY+Q
FsF7xe9ehffrGP4ObkwKM8pNMWEaJ8KP+Z1ZV5WQwQSHPZD4x1qb6aaH8JQWv7Fl
pqep5nlUv0KNsRPFj4hGhzjB9JUMMnWtSEuw5GNL0bQAj1v6tPNESRMw6fInAR/A
JlvG6iYba2ZKZ7+SGipwYyVzBPnE7ur9i2E5MaANTSQF1zgsSwcYT6J5dYSCZmNH
XrEZYYm1MyCTem8PNKrP44QRQAQz+vfHb8hUkwzNsZC7DLLsZ3JQ2AWcAvWFOKRe
/s3WyODeXa3gZMCBFbvCWVAPiKNNFGQMsa2/bkF7FpvNGl99xL3o6MDskA2dSdTK
0oreMT9OKdDS2W2LpDCasq6HHVsoJ+hc+TkWCEshEEPb7jUAIOnKqhHh/aqlXVf7
J3r68FFbNbBoPR0SgdLc4teBmUKDJIGpZT8gaJ7Vbx6IiHXH21ZzOHKuTjbJZWWN
u1vcjBBqvbQmrwhCyi6Fe5gGM63GcB/ulizM1rNyqIpTUuzyx9I=
=shYQ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.2-20210930' into staging
ppc patch queue for 2021-09-30
Here's the next batch of ppc related patches for qemu-6.2. Highlights
are:
* Fixes for several TCG math instructions from the El Dorado Institute
* A number of improvements to the powernv machine type
* Support for a new DEVICE_UNPLUG_GUEST_ERROR QAPI event from Daniel
Barboza
* Support for the new FORM2 PAPR NUMA representation. This allows
more specific NUMA distances, as well as asymmetric configurations
* Fix for 64-bit decrementer (used on MicroWatt CPUs)
* Assorted fixes and cleanups
* A number of updates to MAINTAINERS
Note that the DEVICE_UNPLUG_GUEST_ERROR stuff includes changes to
files outside my normal area, but has suitable Acks.
The MAINTAINERS updates are mostly about marking minor platforms
unmaintained / orphaned, and moving some pieces away from myself and
Greg. As we move onto other projects, we're going to need to drop
more of the ppc maintainership, though we're hoping we can avoid too
abrupt a change.
# gpg: Signature made Thu 30 Sep 2021 06:42:41 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.2-20210930: (44 commits)
MAINTAINERS: Demote sPAPR from "Supported" to "Maintained"
MAINTAINERS: Add information for OpenPIC
MAINTAINERS: Remove David & Greg as reviewers/co-maintainers of powernv
MAINTAINERS: Orphan obscure ppc platforms
MAINTAINERS: Remove David & Greg as reviewers for a number of boards
MAINTAINERS: Remove machine specific files from ppc TCG CPUs entry
spapr/xive: Fix kvm_xive_source_reset trace event
spapr_numa.c: fixes in spapr_numa_FORM2_write_rtas_tables()
hw/intc: openpic: Clean up the styles
hw/intc: openpic: Drop Raven related codes
hw/intc: openpic: Correct the reset value of IPIDR for FSL chipset
target/ppc: Fix 64-bit decrementer
target/ppc: Convert debug to trace events (decrementer and IRQ)
spapr_numa.c: handle auto NUMA node with no distance info
spapr_numa.c: FORM2 NUMA affinity support
spapr: move FORM1 verifications to post CAS
spapr_numa.c: rename numa_assoc_array to FORM1_assoc_array
spapr_numa.c: parametrize FORM1 macros
spapr_numa.c: scrap 'legacy_numa' concept
spapr_numa.c: split FORM1 code into helpers
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide a name field for all the memory listeners. It can be used to identify
which memory listener is which.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210817013553.30584-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Libvirt can use query-sgx-capabilities to get the host
sgx capabilities to decide how to allocate SGX EPC size to VM.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QMP and HMP interfaces can be used by monitor or QMP tools to retrieve
the SGX information from VM side when SGX is enabled on Intel platform.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since there is no fill_device_info() callback support, and when we
execute "info memory-devices" command in the monitor, the segfault
will be found.
This patch will add this callback support and "info memory-devices"
will show sgx epc memory exposed to guest. The result as below:
qemu) info memory-devices
Memory device [sgx-epc]: ""
memaddr: 0x180000000
size: 29360128
memdev: /objects/mem1
Memory device [sgx-epc]: ""
memaddr: 0x181c00000
size: 10485760
memdev: /objects/mem2
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-33-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Enable SGX EPC virtualization, which is currently only support by KVM.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-22-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Enable SGX EPC virtualization, which is currently only support by KVM.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-21-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The ACPI Device entry for SGX EPC is essentially a hack whose primary
purpose is to provide software with a way to autoprobe SGX support,
e.g. to allow software to implement SGX support as a driver. Details
on the individual EPC sections are not enumerated through ACPI tables,
i.e. software must enumerate the EPC sections via CPUID. Furthermore,
software expects to see only a single EPC Device in the ACPI tables
regardless of the number of EPC sections in the system.
However, several versions of Windows do rely on the ACPI tables to
enumerate the address and size of the EPC. So, regardless of the number
of EPC sections exposed to the guest, create exactly *one* EPC device
with a _CRS entry that spans the entirety of all EPC sections (which are
guaranteed to be contiguous in Qemu).
Note, NUMA support for EPC memory is intentionally not considered as
enumerating EPC NUMA information is not yet defined for bare metal.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-20-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Note that SGX EPC is currently guaranteed to reside in a single
contiguous chunk of memory regardless of the number of EPC sections.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-19-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers to detect if SGX EPC exists above 4g, and if so, where SGX
EPC above 4g ends. Use the helpers to adjust the device memory range
if SGX EPC exists above 4g.
For multiple virtual EPC sections, we just put them together physically
contiguous for the simplicity because we don't support EPC NUMA affinity
now. Once the SGX EPC NUMA support in the kernel SGX driver, we will
support this in the future.
Note that SGX EPC is currently hardcoded to reside above 4g.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-18-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Request SGX an SGX Launch Control to be enabled in FEATURE_CONTROL
when the features are exposed to the guest. Our design is the SGX
Launch Control bit will be unconditionally set in FEATURE_CONTROL,
which is unlike host bios.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-17-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose SGX to the guest if and only if KVM is enabled and supports
virtualization of SGX. While the majority of ENCLS can be emulated to
some degree, because SGX uses a hardware-based root of trust, the
attestation aspects of SGX cannot be emulated in software, i.e.
ultimately emulation will fail as software cannot generate a valid
quote/report. The complexity of partially emulating SGX in Qemu far
outweighs the value added, e.g. an SGX specific simulator for userspace
applications can emulate SGX for development and testing purposes.
Note, access to the PROVISIONKEY is not yet advertised to the guest as
KVM blocks access to the PROVISIONKEY by default and requires userspace
to provide additional credentials (via ioctl()) to expose PROVISIONKEY.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because SGX EPC is enumerated through CPUID, EPC "devices" need to be
realized prior to realizing the vCPUs themselves, i.e. long before
generic devices are parsed and realized. From a virtualization
perspective, the CPUID aspect also means that EPC sections cannot be
hotplugged without paravirtualizing the guest kernel (hardware does
not support hotplugging as EPC sections must be locked down during
pre-boot to provide EPC's security properties).
So even though EPC sections could be realized through the generic
-devices command, they need to be created much earlier for them to
actually be usable by the guest. Place all EPC sections in a
contiguous block, somewhat arbitrarily starting after RAM above 4g.
Ensuring EPC is in a contiguous region simplifies calculations, e.g.
device memory base, PCI hole, etc..., allows dynamic calculation of the
total EPC size, e.g. exposing EPC to guests does not require -maxmem,
and last but not least allows all of EPC to be enumerated in a single
ACPI entry, which is expected by some kernels, e.g. Windows 7 and 8.
The new compound properties command for sgx like below:
......
-object memory-backend-epc,id=mem1,size=28M,prealloc=on \
-object memory-backend-epc,id=mem2,size=10M \
-M sgx-epc.0.memdev=mem1,sgx-epc.1.memdev=mem2
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-6-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SGX EPC is enumerated through CPUID, i.e. EPC "devices" need to be
realized prior to realizing the vCPUs themselves, which occurs long
before generic devices are parsed and realized. Because of this,
do not allow 'sgx-epc' devices to be instantiated after vCPUS have
been created.
The 'sgx-epc' device is essentially a placholder at this time, it will
be fully implemented in a future patch along with a dedicated command
to create 'sgx-epc' devices.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-5-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add new CONFIG_SGX for sgx support in the Qemu, and the Kconfig
default enable sgx in the i386 platform.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-32-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new RAMBlock flag to denote "protected" memory, i.e. memory that
looks and acts like RAM but is inaccessible via normal mechanisms,
including DMA. Use the flag to skip protected memory regions when
mapping RAM for DMA in VFIO.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Linux spi-imx driver does not work on QEMU. The reason is that the
state of m25p80 loops in STATE_READING_DATA state after receiving
RDSR command, the new command is ignored. Before sending a new command,
CS line should be pulled high to make the state of m25p80 back to IDLE.
Currently the SPI flash CS line is connected to the SPI controller, but
on the real board, it's connected to GPIO3_19. This matches the ecspi1
device node in the board dts.
ecspi1 node in imx6qdl-sabrelite.dtsi:
&ecspi1 {
cs-gpios = <&gpio3 19 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_ecspi1>;
status = "okay";
flash: m25p80@0 {
compatible = "sst,sst25vf016b", "jedec,spi-nor";
spi-max-frequency = <20000000>;
reg = <0>;
};
};
Should connect the SSI_GPIO_CS to GPIO3_19 when adding a spi-nor to
spi1 on sabrelite machine.
Verified this patch on Linux v5.14.
Logs:
# echo "01234567899876543210" > test
# mtd_debug erase /dev/mtd0 0x0 0x1000
Erased 4096 bytes from address 0x00000000 in flash
# mtd_debug write /dev/mtdblock0 0x0 20 test
Copied 20 bytes from test to address 0x00000000 in flash
# mtd_debug read /dev/mtdblock0 0x0 20 test_out
Copied 20 bytes from address 0x00000000 in flash to test_out
# cat test_out
01234567899876543210#
Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210927142825.491-1-xchengl.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The function ide_bus_new() does an in-place initialization. Rename
it to ide_bus_init() to follow our _init vs _new convention.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: John Snow <jsnow@redhat.com> (Feel free to merge.)
Message-id: 20210923121153.23754-7-peter.maydell@linaro.org
Rename the "allocate and return" qbus creation function to
qbus_new(), to bring it into line with our _init vs _new convention.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20210923121153.23754-6-peter.maydell@linaro.org
Rename qbus_create_inplace() to qbus_init(); this is more in line
with our usual naming convention for functions that in-place
initialize objects.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-5-peter.maydell@linaro.org
Rename the pci_root_bus_new_inplace() function to
pci_root_bus_init(); this brings the bus type in to line with a
"_init for in-place init, _new for allocate-and-return" convention.
To do this we need to rename the implementation-internal function
that was using the pci_root_bus_init() name to
pci_root_bus_internal_init().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-4-peter.maydell@linaro.org
Rename ipack_bus_new_inplace() to ipack_bus_init(), to bring it in to
line with a "_init for in-place init, _new for allocate-and-return"
convention. Drop the 'name' argument, because the only caller does
not pass in a name. If a future caller does need to specify the bus
name, we should create an ipack_bus_init_named() function at that
point.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-3-peter.maydell@linaro.org
The function scsi_bus_new() creates a new SCSI bus; callers can
either pass in a name argument to specify the name of the new bus, or
they can pass in NULL to allow the bus to be given an automatically
generated unique name. Almost all callers want to use the
autogenerated name; the only exception is the virtio-scsi device.
Taking a name argument that should almost always be NULL is an
easy-to-misuse API design -- it encourages callers to think perhaps
they should pass in some standard name like "scsi" or "scsi-bus". We
don't do this anywhere for SCSI, but we do (incorrectly) do it for
other bus types such as i2c.
The function name also implies that it will return a newly allocated
object, when it in fact does in-place allocation. We more commonly
name such functions foo_init(), with foo_new() being the
allocate-and-return variant.
Replace all the scsi_bus_new() callsites with either:
* scsi_bus_init() for the usual case where the caller wants
an autogenerated bus name
* scsi_bus_init_named() for the rare case where the caller
needs to specify the bus name
and document that for the _named() version it's then the caller's
responsibility to think about uniqueness of bus names.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210923121153.23754-2-peter.maydell@linaro.org
Connect the support for ZynqMP eFUSE one-time field-programmable
bit array.
The command argument:
-drive if=pflash,index=3,...
Can be used to optionally connect the bit array to a
backend storage, such that field-programmed values
in one invocation can be made available to next
invocation.
The backend storage must be a seekable binary file, and
its size must be 768 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-9-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Xilinx ZynqMP Battery-Backed RAM (BBRAM)
The command argument:
-drive if=pflash,index=2,...
Can be used to optionally connect the bbram to a backend
storage, such that field-programmed values in one
invocation can be made available to next invocation.
The backend storage must be a seekable binary file, and
its size must be 36 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-8-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Versal eFUSE one-time field-programmable
bit array.
The command argument:
-drive if=pflash,index=1,...
Can be used to optionally connect the bit array to a
backend storage, such that field-programmed values
in one invocation can be made available to next
invocation.
The backend storage must be a seekable binary file, and
its size must be 3072 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-7-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Versal Battery-Backed RAM (BBRAM)
The command argument:
-drive if=pflash,index=0,...
Can be used to optionally connect the bbram to a backend
storage, such that field-programmed values in one
invocation can be made available to next invocation.
The backend storage must be a seekable binary file, and
its size must be 36 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-6-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This device is present in Versal and ZynqMP product
families to store a 256-bit encryption key.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-5-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This implements the Xilinx ZynqMP eFuse, an one-time
field-programmable non-volatile storage device. There is
only one such device in the Xilinx ZynqMP product family.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-4-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This implements the Xilinx Versal eFuse, an one-time
field-programmable non-volatile storage device. There is
only one such device in the Xilinx Versal product family.
This device has two separate mmio interfaces, a controller
and a flatten readback.
The controller provides interfaces for field-programming,
configuration, control, and status.
The flatten readback is a cache to provide a byte-accessible
read-only interface to efficiently read efuse array.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-3-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This introduces the QOM for Xilinx eFuse, an one-time
field-programmable storage bit array.
The actual mmio interface to the array varies by device
families and will be provided in different change-sets.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-2-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner H3 SoC uses Cortex-A7 cores which support virtualization.
However, today we are configuring QEMU to use HVC as PSCI conduit.
That means HVC calls get trapped into QEMU instead of the guest's own
emulated CPU and thus break the guest's ability to execute virtualization.
Fix this by moving to SMC as conduit, freeing up HYP completely to the VM.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20210920203931.66527-1-agraf@csgraf.de
Fixes: 740dafc0ba ("hw/arm: add Allwinner H3 System-on-Chip")
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The trace event was placed in the wrong routine. Move it under
kvmppc_xive_source_reset_one().
Fixes: 4e960974d4 ("xive: Add trace events")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210922070205.1235943-1-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch has a handful of modifications for the recent added
FORM2 support:
- to not allocate more than the necessary size in 'distance_table'.
At this moment the array is oversized due to allocating uint32_t for
all elements, when most of them fits in an uint8_t. Fix it by
changing the array to uint8_t and allocating the exact size;
- use stl_be_p() to store the uint32_t at the start of 'distance_table';
- use sizeof(uint32_t) to skip the uint32_t length when populating the
distances;
- use the NUMA_DISTANCE_MIN macro from sysemu/numa.h to avoid hardcoding
the local distance value.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210922122852.130054-2-danielhb413@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Correct the multi-line comment format. No functional changes.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210918032653.646370-3-bin.meng@windriver.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There is no machine that uses Motorola MCP750 (aka Raven) model.
Drop the related codes.
While we are here, drop the mentioning of Intel GW80314 I/O
companion chip in the comments as it has been obsolete for years,
and correct a typo too.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210918032653.646370-2-bin.meng@windriver.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The reset value of IPIDR should be zero for Freescale chipset, per
the following 2 manuals I checked:
- P2020RM (https://www.nxp.com/webapp/Download?colCode=P2020RM)
- P4080RM (https://www.nxp.com/webapp/Download?colCode=P4080RM)
Currently it is set to 1, which leaves the IPI enabled on core 0
after power-on reset. Such may cause unexpected interrupt to be
delivered to core 0 if the IPI is triggered from core 0 to other
cores later.
Fixes: ffd5e9fe02 ("openpic: Reset IRQ source private members")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/584
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210918032653.646370-1-bin.meng@windriver.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current way the mask is built can overflow with a 64-bit decrementer.
Use sextract64() to extract the signed values and remove the logic to
handle negative values which has become useless.
Cc: Luis Fernando Fujita Pires <luis.pires@eldorado.org.br>
Fixes: a8dafa5251 ("target/ppc: Implement large decrementer support for TCG")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210920061203.989563-5-clg@kaod.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
numa_complete_configuration() in hw/core/numa.c always adds a NUMA node
for the pSeries machine if none was specified, but without node distance
information for the single node created.
NUMA FORM1 affinity code didn't rely on numa_state information to do its
job, but FORM2 does. As is now, this is the result of a pSeries guest
with NUMA FORM2 affinity when no NUMA nodes is specified:
$ numactl -H
available: 1 nodes (0)
node 0 cpus: 0
node 0 size: 16222 MB
node 0 free: 15681 MB
No distance information available.
This can be amended in spapr_numa_FORM2_write_rtas_tables(). We're
enforcing that the local distance (the distance to the node to itself) is
always 10. This allows for the proper creation of the NUMA distance tables,
fixing the output of 'numactl -H' in the guest:
$ numactl -H
available: 1 nodes (0)
node 0 cpus: 0
node 0 size: 16222 MB
node 0 free: 15685 MB
node distances:
node 0
0: 10
CC: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-8-danielhb413@gmail.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The main feature of FORM2 affinity support is the separation of NUMA
distances from ibm,associativity information. This allows for a more
flexible and straightforward NUMA distance assignment without relying on
complex associations between several levels of NUMA via
ibm,associativity matches. Another feature is its extensibility. This base
support contains the facilities for NUMA distance assignment, but in the
future more facilities will be added for latency, performance, bandwidth
and so on.
This patch implements the base FORM2 affinity support as follows:
- the use of FORM2 associativity is indicated by using bit 2 of byte 5
of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1
or FORM2 affinity. Setting both forms will default to FORM2. We're not
advertising FORM2 for pseries-6.1 and older machine versions to prevent
guest visible changes in those;
- ibm,associativity-reference-points has a new semantic. Instead of
being used to calculate distances via NUMA levels, it's now used to
indicate the primary domain index in the ibm,associativity domain of
each resource. In our case it's set to {0x4}, matching the position
where we already place logical_domain_id;
- two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table
and ibm,numa-distance-table. The index table is used to list all the
NUMA logical domains of the platform, in ascending order, and allows for
spartial NUMA configurations (although QEMU ATM doesn't support that).
ibm,numa-distance-table is an array that contains all the distances from
the first NUMA node to all other nodes, then the second NUMA node
distances to all other nodes and so on;
- get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity()
now checks for OV5_FORM2_AFFINITY and returns FORM2 values if the guest
selected FORM2 affinity during CAS.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less)
NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM
device that has a different latency than the original NUMA node from the
regular memory. FORM2 is also able to deal with asymmetric NUMA
distances gracefully, something that our FORM1 implementation doesn't
do.
Move these FORM1 verifications to a new function and wait until after
CAS, when we're sure that we're sticking with FORM1, to enforce them.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-6-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introducing a new NUMA affinity, FORM2, requires a new mechanism to
switch between affinity modes after CAS. Also, we want FORM2 data
structures and functions to be completely separated from the existing
FORM1 code, allowing us to avoid adding new code that inherits the
existing complexity of FORM1.
The idea of switching values used by the write_dt() functions in
spapr_numa.c was already introduced in the previous patch, and
the same approach will be used when dealing with the FORM1 and FORM2
arrays.
We can accomplish that by that by renaming the existing numa_assoc_array
to FORM1_assoc_array, which now is used exclusively to handle FORM1 affinity
data. A new helper get_associativity() is then introduced to be used by the
write_dt() functions to retrieve the current ibm,associativity array of
a given node, after considering affinity selection that might have been
done during CAS. All code that was using numa_assoc_array now needs to
retrieve the array by calling this function.
This will allow for an easier plug of FORM2 data later on.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-5-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The next preliminary step to introduce NUMA FORM2 affinity is to make
the existing code independent of FORM1 macros and values, i.e.
MAX_DISTANCE_REF_POINTS, NUMA_ASSOC_SIZE and VCPU_ASSOC_SIZE. This patch
accomplishes that by doing the following:
- move the NUMA related macros from spapr.h to spapr_numa.c where they
are used. spapr.h gets instead a 'NUMA_NODES_MAX_NUM' macro that is used
to refer to the maximum number of NUMA nodes, including GPU nodes, that
the machine can support;
- MAX_DISTANCE_REF_POINTS and NUMA_ASSOC_SIZE are renamed to
FORM1_DIST_REF_POINTS and FORM1_NUMA_ASSOC_SIZE. These FORM1 specific
macros are used in FORM1 init functions;
- code that uses MAX_DISTANCE_REF_POINTS now retrieves the
max_dist_ref_points value using get_max_dist_ref_points().
NUMA_ASSOC_SIZE is replaced by get_numa_assoc_size() and VCPU_ASSOC_SIZE
is replaced by get_vcpu_assoc_size(). These functions are used by the
generic device tree functions and h_home_node_associativity() and will
allow them to switch between FORM1 and FORM2 without changing their core
logic.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-4-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When first introduced, 'legacy_numa' was a way to refer to guests that
either wouldn't be affected by associativity domain calculations, namely
the ones with only 1 NUMA node, and pre 5.2 guests that shouldn't be
affected by it because it would be an userspace change. Calling these
cases 'legacy_numa' was a convenient way to label these cases.
We're about to introduce a new NUMA affinity, FORM2, and this concept
of 'legacy_numa' is now a bit misleading because, although it is called
'legacy' it is in fact a FORM1 exclusive contraint.
This patch removes spapr_machine_using_legacy_numa() and open code the
conditions in each caller. While we're at it, move the chunk inside
spapr_numa_FORM1_affinity_init() that sets all numa_assoc_array domains
with 'node_id' to spapr_numa_define_FORM1_domains(). This chunk was
being executed if !pre_5_2_numa_associativity and num_nodes => 1, the
same conditions in which spapr_numa_define_FORM1_domains() is called
shortly after.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The upcoming FORM2 NUMA affinity will support asymmetric NUMA topologies
and doesn't need be concerned with all the legacy support for older
pseries FORM1 guests.
We're also not going to calculate associativity domains based on numa
distance (via spapr_numa_define_associativity_domains) since the
distances will be written directly into new DT properties.
Let's split FORM1 code into its own functions to allow for easier
insertion of FORM2 logic later on.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If an unknown pin of the IRQ controller is raised, something is very
wrong in the QEMU model. It is better to abort.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210920061203.989563-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
MEM_UNPLUG_ERROR is deprecated since the introduction of
DEVICE_UNPLUG_GUEST_ERROR. Keep emitting both while the deprecation of
MEM_UNPLUG_ERROR is pending.
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210907004755.424931-8-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linux Kernel 5.12 is now unisolating CPU DRCs in the device_removal
error path, signalling that the hotunplug process wasn't successful.
This allow us to send a DEVICE_UNPLUG_GUEST_ERROR in drc_unisolate_logical()
to signal this error to the management layer.
We also have another error path in spapr_memory_unplug_rollback() for
configured LMB DRCs. Kernels older than 5.13 will not unisolate the LMBs
in the hotunplug error path, but it will reconfigure them. Let's send
the DEVICE_UNPLUG_GUEST_ERROR event in that code path as well to cover the
case of older kernels.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210907004755.424931-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The error_report() call in drc_unisolate_logical() is not considering
that drc->dev->id can be NULL, and the underlying functions error_report()
calls to do its job (vprintf(), g_strdup_printf() ...) has undefined
behavior when trying to handle "%s" with NULL arguments.
Besides, there is no utility into reporting that an unknown device was
rejected by the guest.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210907004755.424931-4-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As done in hw/acpi/memory_hotplug.c, pass an empty string if dev->id
is NULL to qapi_event_send_mem_unplug_error() to avoid relying on
a behavior that can be changed in the future.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210907004755.424931-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qapi_event_send_mem_unplug_error() deals with @device being NULL by
replacing it with an empty string ("") when emitting the event. Aside
from the fact that this behavior (qapi visitor mapping NULL pointer to
"") can be patched/changed someday, there's also the lack of utility
that the event brings to listeners, e.g. "a memory unplug error happened
somewhere".
In theory we should just avoit emitting this event at all if dev->id is
NULL, but this would be an incompatible change to existing guests.
Instead, let's make the forementioned behavior explicit: if dev->id is
NULL, pass an empty string to qapi_event_send_mem_unplug_error().
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210907004755.424931-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210902130928.528803-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This to avoid possible conflicts with the "id" property of QOM objects.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210901094153.227671-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On P10, the chip id is calculated from the "Primary topology table
index". See skiboot commits for more information [1].
This information is extracted from the hdata on real systems which
QEMU needs to emulate. Add this property for all machines even if it
is only used on POWER10.
[1] https://github.com/open-power/skiboot/commit/2ce3f083f399https://github.com/open-power/skiboot/commit/a2d4d7f9e14a
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210901094153.227671-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210901094153.227671-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Drop abs64() and use uabs64() from host-utils, which avoids
an undefined behavior when taking abs of the most negative value.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210910112624.72748-5-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Slot 0x9 is reserved for use by the in-built framebuffer whilst only slots
0xc, 0xd and 0xe physically exist on the Quadra 800.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-21-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Nubus IRQs are routed to the CPU through the VIA2 device so wire up the IRQs
using gpios accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-20-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Each Nubus slot has an IRQ line that can be used to request service from the
CPU. Connect the IRQs to the Nubus bridge so that they can be wired up using qdev
gpios accordingly, and introduce a new nubus_set_irq() function that can be used
by Nubus devices to control the slot IRQ.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210924073808.1041-19-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This is to allow Macintosh machines to further specify which slots are available
since the number of addressable slots may not match the number of physical slots
present in the machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210924073808.1041-18-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Since nubus-bridge is a container for NubusBus then it should be embedded
directly within the bridge device using qbus_create_inplace().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-17-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Now that Nubus has its own address space rather than mapping directly into the
system bus, move the Nubus reference from MacNubusBridge to NubusBridge.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-16-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This better reflects that the mac-nubus-bridge device is derived from the
nubus-bridge device, and that the structure represents the state of the bridge
device and not the Nubus itself. Also update the comment in the file header to
reflect that mac-nubus-bridge is specific to the Macintosh.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-15-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This is to allow the Nubus bridge to store its own additional state. Also update
the comment in the file header to reflect that nubus-bridge is not specific to
the Macintosh.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-14-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
According to "Designing Cards and Drivers for the Macintosh Family" the Nubus
has its own 32-bit address space based upon physical slot addressing.
Move Nubus to its own 32-bit address space and then use memory region aliases
to map available slot and super slot ranges into the q800 system address
space via the Macintosh Nubus bridge.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The declaration ROM is located at the top-most address of the standard slot
space.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Since there is no need to generate a dummy declaration ROM, remove both
nubus_register_rom() and nubus_register_format_block(). These will shortly be
replaced with a mechanism to optionally load a declaration ROM from disk to
allow real images to be used within QEMU.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The macfb device is an on-board framebuffer and so is initialised by the
system declaration ROM included within the MacOS toolbox ROM.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
According to "Designing Cards and Drivers for the Macintosh Family" any attempt
to access an unimplemented address location on Nubus generates a bus error. MacOS
uses a custom bus error handler to detect empty Nubus slots, and with the current
implementation assumes that all slots are occupied as the Nubus transactions
never fail.
Switch nubus_slot_ops and nubus_super_slot_ops over to use {read,write}_with_attrs
and hard-code them to return MEMTX_DECODE_ERROR so that unoccupied Nubus slots
will generate the expected bus error.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Increase the max_access_size to 4 bytes for empty Nubus slot and super slot
accesses to allow tracing of the Nubus enumeration process by the guest OS.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Allow Nubus to manage the slot allocations itself using the BusClass check_address()
virtual function rather than managing this during NubusDevice realize().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210924073808.1041-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Convert nubus_device_realize() to use a bitmap to manage available slots to allow
for future Nubus devices to be plugged into arbitrary slots from the command line
using a new qdev "slot" parameter for nubus devices.
Update mac_nubus_bridge_init() to only allow slots 0x9 to 0xe on Macintosh machines
as documented in "Designing Cards and Drivers for the Macintosh Family".
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210924073808.1041-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
According to "Designing Cards and Drivers for the Macintosh Family" each physical
nubus slot can access 2 separate address ranges: a super slot memory region which
is 256MB and a standard slot memory region which is 16MB.
Currently a Nubus device uses the physical slot number to determine whether it is
using a standard slot memory region or a super slot memory region rather than
exposing both memory regions for use as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This is in preparation for creating a qdev property of the same name.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210924073808.1041-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The PC_ROM_* definitions are only used by the PC machine,
and are irrelevant to the other architectures / machines.
Reduce their scope by moving them to hw/i386/pc.c.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210917185949.2244956-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Currently, FUSED operations are not supported by QEMU. As per the 1.4 SPEC,
controller should abort the command that requested a fused operation with
an INVALID FIELD error code if they are not supported.
Changes from v1:
Added FUSE flag check also to the admin cmd processing as the FUSED
operations are mentioned in the general SQE section in the SPEC.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Fix is added to check for reserved value in select field for
namespace attachment
CC: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Address 0x0 is a valid address. Fix the admin submission and completion
queue address validation to not error out on this.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
- ePMP CSR address updates
- Convert internal interrupts to use QEMU GPIO lines
- SiFive PWM support
- Support for RISC-V ACLINT
- SiFive PDMA fixes
- Update to u-boot instructions for sifive_u
- mstatus.SD bug fix for hypervisor extensions
- OpenTitan fix for USB dev address
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmFJgSoACgkQIeENKd+X
cFQTOwf8DC7rBqOWQS3v/r+H2hlfDqW+4G3pPPBcoyCEiqO+cL26ox+EmTHDbieh
+0yWyp7L6SU/zcJ86oBAFNGH46ltXuUKOYWhkfA1QwlGzAwjZ82hnZ3jJqXf1jin
Wq0ElzKk6rvcRkHTVhdjkGvoxskaXPQ/kFzyTHrxMDlkmHO3L4IaYe0xsamRI11D
E7UJC97YmpSAsCNUc5irpkeLyiFobyR8TEL3nBEPK/6Xj0ojRT4zoGe1EotC7+sN
zL8a9ZuU0bL3rQH8Ai7wnXBP8D2PQa0tZQV6wne/BzeEUSpKrC/rGW73vQCz0Pps
U8VNkIlbAqD1s6aXlqE24H535x10Mw==
=WYF5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/alistair23/tags/pull-riscv-to-apply-20210921' into staging
Second RISC-V PR for QEMU 6.2
- ePMP CSR address updates
- Convert internal interrupts to use QEMU GPIO lines
- SiFive PWM support
- Support for RISC-V ACLINT
- SiFive PDMA fixes
- Update to u-boot instructions for sifive_u
- mstatus.SD bug fix for hypervisor extensions
- OpenTitan fix for USB dev address
# gpg: Signature made Mon 20 Sep 2021 11:52:26 PM PDT
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054
* remotes/alistair23/tags/pull-riscv-to-apply-20210921: (21 commits)
hw/riscv: opentitan: Correct the USB Dev address
target/riscv: csr: Rename HCOUNTEREN_CY and friends
target/riscv: Backup/restore mstatus.SD bit when virtual register swapped
docs/system/riscv: sifive_u: Update U-Boot instructions
hw/dma: sifive_pdma: don't set Control.error if 0 bytes to transfer
hw/dma: sifive_pdma: allow non-multiple transaction size transactions
hw/dma: sifive_pdma: claim bit must be set before DMA transactions
hw/dma: sifive_pdma: reset Next* registers when Control.claim is set
hw/riscv: virt: Add optional ACLINT support to virt machine
hw/riscv: virt: Re-factor FDT generation
hw/intc: Upgrade the SiFive CLINT implementation to RISC-V ACLINT
hw/intc: Rename sifive_clint sources to riscv_aclint sources
sifive_u: Connect the SiFive PWM device
hw/timer: Add SiFive PWM support
hw/intc: ibex_timer: Convert the timer to use RISC-V CPU GPIO lines
hw/intc: sifive_plic: Convert the PLIC to use RISC-V CPU GPIO lines
hw/intc: ibex_plic: Convert the PLIC to use RISC-V CPU GPIO lines
hw/intc: sifive_clint: Use RISC-V CPU GPIO lines
target/riscv: Expose interrupt pending bits as GPIO lines
target/riscv: Fix satp write
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Setting Control.claim clears all of the chanel's Next registers.
This is effective only when Control.claim is set from 0 to 1.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Tested-by: Max Hsu <max.hsu@sifive.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210912130553.179501-2-frank.chang@sifive.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We extend virt machine to emulate ACLINT devices only when "aclint=on"
parameter is passed along with machine name in QEMU command-line.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210831110603.338681-5-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We re-factor and break the FDT generation into smaller functions
so that it is easier to modify FDT generation for different
configurations of virt machine.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210831110603.338681-4-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The RISC-V ACLINT is more modular and backward compatible with
original SiFive CLINT so instead of duplicating the original
SiFive CLINT implementation we upgrade the current SiFive CLINT
implementation to RISC-V ACLINT implementation.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210831110603.338681-3-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We will be upgrading SiFive CLINT implementation into RISC-V ACLINT
implementation so let's first rename the sources.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210831110603.338681-2-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This is the initial commit of the SiFive PWM timer. This is used by
guest software as a timer and is included in the SiFive FU540 SoC.
Signed-off-by: Justin Restivo <jrestivo@draper.com>
Signed-off-by: Alexandra Clifford <aclifford@draper.com>
Signed-off-by: Amanda Strnad <astrnad@draper.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 9f70a210acbfaf0e1ea6ad311ab892ac69134d8b.1631159656.git.alistair.francis@wdc.com
Instead of using riscv_cpu_update_mip() let's instead use the new RISC-V
CPU GPIO lines to set the timer MIP bits.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 84d5b1d5783d2e79eee69a2f7ac480cc0c070db3.1630301632.git.alistair.francis@wdc.com
Instead of using riscv_cpu_update_mip() let's instead use the new RISC-V
CPU GPIO lines to set the external MIP bits.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 0364190bfa935058a845c0fa1ecf650328840ad5.1630301632.git.alistair.francis@wdc.com
Instead of using riscv_cpu_update_mip() let's instead use the new RISC-V
CPU GPIO lines to set the external MIP bits.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 0a76946981852f5bd15f0c37ab35b253371027a8.1630301632.git.alistair.francis@wdc.com
Instead of using riscv_cpu_update_mip() let's instead use the new RISC-V
CPU GPIO lines to set the timer and soft MIP bits.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-id: 946e1ef5e268b24084c7ddad84c146de62a56736.1630301632.git.alistair.francis@wdc.com
During sbsa acs level 3 testing, it is seen that the GIC maintenance
interrupts are not triggered and the related test cases fail. This
is because we were incorrectly passing the value of the MISR register
(from maintenance_interrupt_state()) to qemu_set_irq() as the level
argument, whereas the device on the other end of this irq line
expects a 0/1 value.
Fix the logic to pass a 0/1 level indication, rather than a
0/not-0 value.
Fixes: c5fc89b36c ("hw/intc/arm_gicv3: Implement gicv3_cpuif_virt_update()")
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210915205809.59068-1-shashi.mallela@linaro.org
[PMM: tweaked commit message; collapsed nested if()s into one]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When you run QEMU with an Aspeed machine and a single serial device
using stdio like this:
qemu -machine ast2600-evb -drive ... -serial stdio
The guest OS can read and write to the UART5 registers at 0x1E784000 and
it will receive from stdin and write to stdout. The Aspeed SoC's have a
lot more UART's though (AST2500 has 5, AST2600 has 13) and depending on
the board design, may be using any of them as the serial console. (See
"stdout-path" in a DTS to check which one is chosen).
Most boards, including all of those currently defined in
hw/arm/aspeed.c, just use UART5, but some use UART1. This change adds
some flexibility for different boards without requiring users to change
their command-line invocation of QEMU.
I tested this doesn't break existing code by booting an AST2500 OpenBMC
image and an AST2600 OpenBMC image, each using UART5 as the console.
Then I tested switching the default to UART1 and booting an AST2600
OpenBMC image that uses UART1, and that worked too.
Signed-off-by: Peter Delevoryas <pdel@fb.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210901153615.2746885-2-pdel@fb.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
UART5 is typically used as the default debug UART on the AST2600, but
UART1 is also designed to be a debug UART. All the AST2600 UART's have
semi-configurable clock rates through registers in the System Control
Unit (SCU), but only UART5 works out of the box with zero-initialized
values. The rest of the UART's expect a few of the registers to be
initialized to non-zero values, or else the clock rate calculation will
yield zero or undefined (due to a divide-by-zero).
For reference, the U-Boot clock rate driver here shows the calculation:
https://github.com/facebook/openbmc-uboot/blob/15f7e0dc01d8/drivers/clk/aspeed/clk_ast2600.c#L357
To summarize, UART5 allows selection from 4 rates: 24 MHz, 192 MHz, 24 /
13 MHz, and 192 / 13 MHz. The other UART's allow selecting either the
"low" rate (UARTCLK) or the "high" rate (HUARTCLK). UARTCLK and HUARTCLK
are configurable themselves:
UARTCLK = UXCLK * R / (N * 2)
HUARTCLK = HUXCLK * HR / (HN * 2)
UXCLK and HUXCLK are also configurable, and depend on the APLL and/or
HPLL clock rates, which also derive from complicated calculations. Long
story short, there's lots of multiplication and division from
configurable registers, and most of these registers are zero-initialized
in QEMU, which at best is unexpected and at worst causes this clock rate
driver to hang from divide-by-zero's. This can also be difficult to
diagnose, because it may cause U-Boot to hang before serial console
initialization completes, requiring intervention from gdb.
This change just initializes all of these registers with default values
from the datasheet.
To test this, I used Facebook's AST2600 OpenBMC image for "fuji", with
the following diff applied (because fuji uses UART1 for console output,
not UART5).
@@ -323,8 +323,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp)
}
/* UART - attach an 8250 to the IO space as our UART5 */
- serial_mm_init(get_system_memory(), sc->memmap[ASPEED_DEV_UART5], 2,
- aspeed_soc_get_irq(s, ASPEED_DEV_UART5),
+ serial_mm_init(get_system_memory(), sc->memmap[ASPEED_DEV_UART1], 2,
+ aspeed_soc_get_irq(s, ASPEED_DEV_UART1),
38400, serial_hd(0), DEVICE_LITTLE_ENDIAN);
/* I2C */
Without these clock rate registers being initialized, U-Boot hangs in
the clock rate driver from a divide-by-zero, because the UART1 clock
rate register reads return zero, and there's no console output. After
initializing them with default values, fuji boots successfully.
Signed-off-by: Peter Delevoryas <pdel@fb.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
[ clg: Removed _PARAM suffix ]
Message-Id: <20210906134023.3711031-2-pdel@fb.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Witherspoon uses the DPS310 as a temperature sensor. Rainier uses it as
a temperature and humidity sensor.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210629142336.750058-5-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This contains some hardcoded register values that were obtained from the
hardware after reading the temperature.
It does enough to test the Linux kernel driver. The FIFO mode, IRQs and
operation modes other than the default as used by Linux are not modelled.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-Id: <20210616073358.750472-2-joel@jms.id.au>
[ clg: - Fixed sequential reading
- Reworked regs_reset_state array
- Moved model under hw/sensor/ ]
Message-Id: <20210629142336.750058-4-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is the latest revision of the ASPEED 2600 SoC. As there is no
need to model multiple revisions of the same SoC for the moment,
update the SCU AST2600 to model the A3 revision instead of the A1 and
adapt the AST2600 SoC and machines.
Reset values are taken from v8 of the datasheet.
Signed-off-by: Joel Stanley <joel@jms.id.au>
[ clg: - Introduced an Aspeed "ast2600-a3" SoC class
- Commit log update ]
Message-Id: <20210629142336.750058-3-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
These are the devices documented by the Rainier device tree. With this
we can see the guest discovering the multiplexers and probing the eeprom
devices:
i2c i2c-2: Added multiplexed i2c bus 16
i2c i2c-2: Added multiplexed i2c bus 17
i2c i2c-2: Added multiplexed i2c bus 18
i2c i2c-2: Added multiplexed i2c bus 19
i2c-mux-gpio i2cmux: 4 port mux on 1e78a180.i2c-bus adapter
at24 20-0050: 8192 byte 24c64 EEPROM, writable, 1 bytes/write
i2c i2c-4: Added multiplexed i2c bus 20
at24 21-0051: 8192 byte 24c64 EEPROM, writable, 1 bytes/write
i2c i2c-4: Added multiplexed i2c bus 21
at24 22-0052: 8192 byte 24c64 EEPROM, writable, 1 bytes/write
Signed-off-by: Joel Stanley <joel@jms.id.au>
[ clg: Introduced aspeed_eeprom_init ]
Message-Id: <20210629142336.750058-2-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
There was a bit of a thinko in the state calculation where every odd pin
in was reported in e.g. "pwm0" mode rather than "off". This was the
result of an incorrect bit shift for the 2-bit field representing each
LED state.
Fixes: a90d8f8467 ("misc/pca9552: Add qom set and get")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210723043624.348158-1-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
There are two GPIO controllers in the ast2600; one is 3.3V and the other
is 1.8V.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210713065854.134634-4-joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
There's no need to define the registers relative to the 0x800 offset
where the controller is mapped, as the device is instantiated as it's
own model at the correct memory address.
Simplify the defines and remove the offset to save future confusion.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210713065854.134634-3-joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The logic in the handling for the control register required toggling the
enable state for writes to stick. Rework the condition chain to allow
sequential writes that do not update the enable state.
Fixes: 854123bf8d ("wdt: Add Aspeed watchdog device model")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210709053107.1829304-3-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
While some of the critical fields remain the same, there is variation in
the definition of the control register across the SoC generations.
Reserved regions are adjusted, while in other cases the mutability or
behaviour of fields change.
Introduce a callback to sanitize the value on writes to ensure model
behaviour reflects the hardware.
Fixes: 854123bf8d ("wdt: Add Aspeed watchdog device model")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210709053107.1829304-2-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
According to its dts file in the Linux kernel, we need mac0 and mac1 enabled
instead of mac1 and mac2. Also, g220a is based on aspeed-g5 (ast2500) which
doesn't even have the third interface.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210810035742.550391-1-linux@roeck-us.net>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Commit 7582591ae7 ("aspeed: Support AST2600A1 silicon revision") switched
the silicon revision for AST2600 to revision A1. On revision A1, the first
Ethernet interface is operational. Enable it.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210808200457.889955-1-linux@roeck-us.net>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().
Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.
This addresses CVE-2021-3748.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Adding this callback provides a way to resume the processing of
cmds in fenceq and cmdq that were not processed because the UI
was waiting on a fence and blocked cmd processing.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Message-Id: <20210914211837.3229977-6-vivek.kasireddy@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Create sync objects and fences only for dmabufs that are blobs. Once a
fence is created (after glFlush) and is signalled,
graphic_hw_gl_flushed() will be called and virtio-gpu cmd processing
will be resumed.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Message-Id: <20210914211837.3229977-4-vivek.kasireddy@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* mark MPS2/MPS3 board-internal i2c buses as 'full' so that command
line user-created devices are not plugged into them
* Take an exception if PSTATE.IL is set
* Support an emulated ITS in the virt board
* Add support for kudo-bmc board
* Probe for KVM_CAP_ARM_VM_IPA_SIZE when creating scratch VM
* cadence_uart: Fix clock handling issues that prevented
u-boot from running
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmE/ruQZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3krdD/sHLxbPua1IOA1+uxLJwRnr
N7BZa0GVNX8+dKi3w3jtYHOyFG1u9NeOp/VI93I7G9k0vRvYT8eMN4cMWwsaG5rr
PPjiLIFAIFwxV9QkafIONLxLYFfc6T48tstG6BYaJU2tLPwIlSZK4ZbKqrxWesAm
mMw75AtESjYI77yQcsEXDflmcvbvM++IrqQAa190i2D8rizbbv/gqZtzJJpU2OGy
My51t+g1SPPJvoih6edpURGmKH1vmB0UwadnOG3GFv76c9nYeVPXAtdXS+8Rs+vU
QJpvJ0MSRc5ZztsltvXQefH4aseSHrZybpZGI0tNpZ1G2oRwZHIXEMDcZwtRHKlZ
o5M6oeNOUZFRFrLM8FRv4ErIFhgMwWUghy+oVejCF791j1WeasDpFL+ZZTWUNYiP
qmNdh6z7Dt7F1fxBxMiCw9PTRNB2zudyz/ZtymPGYEDj7leIpQ/HudRmaDKZ+zMG
A8omXNEw1LFsVrTE5MjLT7tr2Eq+71V2m0OkDB+Tvmpl4AXVG9b7kCoOp6NiAXZd
Y4Vdi5I8NN3OHK0yO1vMxOlNk7qo4BTqT7FYaSb1qaTZ/6TQtrWb7ThU989JJaQE
28H1p8uezMDC8NsaEBa2eBsen6Uf45jYKxgUpG0jB9QuXtRY1xUdaU06fQlz4dpn
7SyfLZbzeB0v+Bqd7z3Y9A==
=7BH/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210913-3' into staging
target-arm queue:
* mark MPS2/MPS3 board-internal i2c buses as 'full' so that command
line user-created devices are not plugged into them
* Take an exception if PSTATE.IL is set
* Support an emulated ITS in the virt board
* Add support for kudo-bmc board
* Probe for KVM_CAP_ARM_VM_IPA_SIZE when creating scratch VM
* cadence_uart: Fix clock handling issues that prevented
u-boot from running
# gpg: Signature made Mon 13 Sep 2021 21:04:52 BST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20210913-3: (23 commits)
hw/arm/mps2.c: Mark internal-only I2C buses as 'full'
hw/arm/mps2-tz.c: Mark internal-only I2C buses as 'full'
hw/arm/mps2-tz.c: Add extra data parameter to MakeDevFn
qdev: Support marking individual buses as 'full'
target/arm: Merge disas_a64_insn into aarch64_tr_translate_insn
target/arm: Take an exception if PSTATE.IL is set
tests/data/acpi/virt: Update IORT files for ITS
hw/arm/virt: add ITS support in virt GIC
tests/data/acpi/virt: Add IORT files for ITS
hw/intc: GICv3 redistributor ITS processing
hw/intc: GICv3 ITS Feature enablement
hw/intc: GICv3 ITS Command processing
hw/intc: GICv3 ITS command queue framework
hw/intc: GICv3 ITS register definitions added
hw/intc: GICv3 ITS initial framework
hw/arm: Add support for kudo-bmc board.
hw/arm/virt: KVM: Probe for KVM_CAP_ARM_VM_IPA_SIZE when creating scratch VM
hw/char: cadence_uart: Log a guest error when device is unclocked or in reset
hw/char: cadence_uart: Ignore access when unclocked or in reset for uart_{read, write}()
hw/char: cadence_uart: Convert to memop_with_attrs() ops
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The various MPS2 boards implemented in mps2.c have multiple I2C
buses: a bus dedicated to the audio configuration, one for the LCD
touchscreen controller, and two which are connected to the external
Shield expansion connector. Mark the buses which are used only for
board-internal devices as 'full' so that if the user creates i2c
devices on the commandline without specifying a bus name then they
will be connected to the I2C controller used for the Shield
connector, where guest software will expect them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210903151435.22379-5-peter.maydell@linaro.org
The various MPS2 boards have multiple I2C buses: typically a bus
dedicated to the audio configuration, one for the LCD touchscreen
controller, one for a DDR4 EEPROM, and two which are connected to the
external Shield expansion connector. Mark the buses which are used
only for board-internal devices as 'full' so that if the user creates
i2c devices on the commandline without specifying a bus name then
they will be connected to the I2C controller used for the Shield
connector, where guest software will expect them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210903151435.22379-4-peter.maydell@linaro.org
The mps2-tz boards use a data-driven structure to create the devices
that sit behind peripheral protection controllers. Currently the
functions which create these devices are passed an 'opaque' pointer
which is always the address within the machine struct of the device
to create, and some "all devices need this" information like irqs and
addresses.
If a specific device needs more information than this, it is
currently not possible to pass that through from the PPCInfo
data structure. Add support for passing an extra data parameter,
so that we can more flexibly handle the needs of specific
device types. To provide some type-safety we make this extra
parameter a pointer to a union (which initially has no members).
In particular, we would like to be able to indicate which of the
i2c controllers are for on-board devices only and which are
connected to the external 'shield' expansion port; a subsequent
patch will use this mechanism for that purpose.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210903151435.22379-3-peter.maydell@linaro.org
Included creation of ITS as part of virt platform GIC
initialization. This Emulated ITS model now co-exists with kvm
ITS and is enabled in absence of kvm irq kernel support in a
platform.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210910143951.92242-9-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implemented lpi processing at redistributor to get lpi config info
from lpi configuration table,determine priority,set pending state in
lpi pending table and forward the lpi to cpuif.Added logic to invoke
redistributor lpi processing with translated LPI which set/clear LPI
from ITS device as part of ITS INT,CLEAR,DISCARD command and
GITS_TRANSLATER processing.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210910143951.92242-7-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Added properties to enable ITS feature and define qemu system
address space memory in gicv3 common,setup distributor and
redistributor registers to indicate LPI support.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Message-id: 20210910143951.92242-6-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Added ITS command queue handling for MAPTI,MAPI commands,handled ITS
translation which triggers an LPI via INT command as well as write
to GITS_TRANSLATER register,defined enum to differentiate between ITS
command interrupt trigger and GITS_TRANSLATER based interrupt trigger.
Each of these commands make use of other functionalities implemented to
get device table entry,collection table entry or interrupt translation
table entry required for their processing.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210910143951.92242-5-shashi.mallela@linaro.org
[PMM: use INTERRUPT for ItsCmdType enum name to avoid
conflict with INT type defined by Windows headers]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Added functionality to trigger ITS command queue processing on
write to CWRITE register and process each command queue entry to
identify the command type and handle commands like MAPD,MAPC,SYNC.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Message-id: 20210910143951.92242-4-shashi.mallela@linaro.org
[PMM: fixed format string nit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Defined descriptors for ITS device table,collection table and ITS
command queue entities.Implemented register read/write functions,
extract ITS table parameters and command queue parameters,extended
gicv3 common to capture qemu address space(which host the ITS table
platform memories required for subsequent ITS processing) and
initialize the same in ITS device.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Message-id: 20210910143951.92242-3-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Added register definitions relevant to ITS,implemented overall
ITS device framework with stubs for ITS control and translater
regions read/write,extended ITS common to handle mmio init between
existing kvm device and newer qemu device.
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Message-id: 20210910143951.92242-2-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We've got SW that expects FSBL (Bootlooader) to setup clocks and
resets. It's quite common that users run that SW on QEMU without
FSBL (FSBL typically requires the Xilinx tools installed). That's
fine, since users can stil use -device loader to enable clocks etc.
To help folks understand what's going, a log (guest-error) message
would be helpful here. In particular with the serial port since
things will go very quiet if they get things wrong.
Suggested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210901124521.30599-7-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Read or write to uart registers when unclocked or in reset should be
ignored. Add the check there, and as a result of this, the check in
uart_write_tx_fifo() is now unnecessary.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210901124521.30599-6-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This converts uart_read() and uart_write() to memop_with_attrs() ops.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210901124521.30599-5-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently the clock/reset check is done in uart_receive(), but we
can move the check to uart_can_receive() which is earlier.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210901124521.30599-4-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At present when input clock is disabled, any character transmitted
to tx fifo can still show on the serial line, which is wrong.
Fixes: b636db306e ("hw/char/cadence_uart: add clock support")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20210901124521.30599-3-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As of today, when booting upstream U-Boot for Xilinx Zynq, the UART
does not receive anything. Debugging shows that the UART input clock
frequency is zero which prevents the UART from receiving anything as
per the logic in uart_receive().
From zynq_slcr_reset_exit() comment, it intends to compute output
clocks according to ps_clk and registers. zynq_slcr_compute_clocks()
is called to accomplish the task, inside which device_is_in_reset()
is called to actually make the attempt in vain.
Rework reset_hold() and reset_exit() so that in the reset exit phase,
the logic can really compute output clocks in reset_exit().
With this change, upstream U-Boot boots properly again with:
$ qemu-system-arm -M xilinx-zynq-a9 -m 1G -display none -serial null -serial stdio \
-device loader,file=u-boot-dtb.bin,addr=0x4000000,cpu-num=0
Fixes: 38867cb7ec ("hw/misc/zynq_slcr: add clock generation for uarts")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20210901124521.30599-2-bmeng.cn@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The file already existed, but nobody had noticed the warning until now.
Add it at the bottom, since that is where unknown files go in legacy mode.
Fixes: 217f1b4a72 ("target-i386: Publish advised value of MSR_IA32_FEATURE_CONTROL via fw_cfg")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A PS/2 keyboard has a separate command reply queue that is
independent of the key queue. This prevents that command replies
and keyboard input mix. Keyboard command replies take precedence
over queued keystrokes. A new keyboard command removes any
remaining command replies from the command reply queue.
Implement a separate keyboard command reply queue and clear the
command reply queue before command execution. This brings the
PS/2 keyboard emulation much closer to a real PS/2 keyboard.
The command reply queue is located in a few free bytes directly
in front of the scancode queue. Because the scancode queue has
a maximum length of 16 bytes there are 240 bytes available for
the command reply queue. At the moment only a maximum of 3 bytes
are required. For compatibility reasons rptr, wptr and count kept
their function. rptr is the start, wptr is the end and count is
the length of the entire keyboard queue. The new variable cwptr
is the end of the command reply queue or -1 if the queue is
empty. To write to the command reply queue, rptr is moved
backward by the number of required bytes and the command replies
are written to the buffer starting at the new rptr position.
After writing, cwptr is at the old rptr position. Copying cwptr
to rptr clears the command reply queue. The command reply queue
can't overflow because each new keyboard command clears the
command reply queue.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/501
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/502
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20210810133258.8231-2-vr_qemu@t-online.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Extend the used ps2 buffer size to the available buffer size but
keep the maximum ps2 queue size.
The next patch needs a few bytes of the larger buffer size.
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20210810133258.8231-1-vr_qemu@t-online.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
These will soon be required to enable nubus devices to support interrupts.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Now that q800 VIA1 and VIA2 are completely separate devices there is no need to
add a specific device prefix to ensure that the IRQ lines remain separate.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210830102447.10806-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Remove the mac_via device and wire up both q800 VIA1 and VIA2 directly for the
m68k q800 machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
After this change mac_via_reset() is now empty and can be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210830102447.10806-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
These variables are already present in MOS6522Q800VIA1State and so it is just
the VMStateDescription move that is needed.
With this change the mac_via VMStateDescription is now empty and can be removed
completely.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The ADB is accessed using clock and data pins on q800 VIA1 port B and so can be
moved to MOS6522Q800VIA1State.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210830102447.10806-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The PRAM/RTC is accessed using clock and data pins on q800 VIA1 port B and so
can be moved to MOS6522Q800VIA1State.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210830102447.10806-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The PRAM contents are accessed using clock and data pins on q800 VIA1 port B
and so can be moved to MOS6522Q800VIA1State.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This variable is already present in MOS6522Q800VIA1State and can be moved
immediately into the q800 VIA1 VMStateDescription.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Move the parent mos6522 objects from vmstate_mac_via into the new VMStateDescription
structures to begin the process of splitting MacVIAState into separate VIA1 and
VIA2 devices.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210830102447.10806-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
After an SDLC "Enter hunt" command has been sent the STATUS_SYNC bit should remain
high until the flag byte has been detected. Whilst the ESCC device doesn't yet
implement SDLC mode, without this change the active low STATUS_SYNC is constantly
asserted causing the MacOS OpenTransport extension to hang on startup as it thinks
it is constantly receiving LocalTalk responses during its initial negotiation
phase.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This removes duplication of the internal device state initialisation between
device reset and soft reset.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Now that register values at reset are handled elsewhere for all of device reset,
soft reset and hard reset, escc_reset_chn() only needs to handle initialisation
of internal device state.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The hardware reset differs from a device reset in that it only changes the contents
of specific registers. Remove the code that resets all the registers to zero during
hardware reset and implement the default values using the existing soft reset code
with the additional changes listed in the table in the "Z85C30 Reset" section.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The software reset differs from a device reset in that it only changes the contents
of specific registers. Remove the code that resets all the registers to zero during
soft reset and implement the default values listed in the table in the "Z85C30 Reset"
section.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This new hardware reset function is to be called for both channels when the
hardware reset bit is written to register WR9. Its initial implementation is
the same as the existing escc_reset_chn() function used for device reset.
Add a new trace event when the guest initiates a hard reset via the WR9 register
to help diagnose guest reset issues.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This new software reset function is to be called when the appropriate channel
software reset bit is written to register WR9. Its initial implementation is
the same as the existing escc_reset_chn() function used for device reset.
Add a new trace event when the guest initiates a soft reset via the WR9 register
to help diagnose guest reset issues.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This is to ensure that a device reset always returns the ESCC to a known state.
Note that this is currently redundant with the same code in escc_reset_chn()
but that will change shortly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Also fix a couple of spelling mistakes in comments.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210903113223.19551-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Commit 24f675cd3b ("sparc/sun4m: Use start-powered-off CPUState property") changed
the sun4m CPU reset code to use the start-powered-off property and so split the
creation of the CPU into separate instantiation and realization phases to enable
the new start-powered-off property to be set.
This accidentally broke sun4m machines with more than one CPU present since
sparc_cpu_realizefn() sets a default CPU id, and now that realization occurs after
calling cpu_sparc_set_id() in cpu_devinit() the CPU id gets reset back to the
default instead of being uniquely encoded based upon the CPU number. As soon as
another CPU is brought online, the OS gets confused between them and promptly
panics.
Resolve the issue by moving the cpu_sparc_set_id() call in cpu_devinit() to after
the point where the CPU device has been realized as before.
Fixes: 24f675cd3b ("sparc/sun4m: Use start-powered-off CPUState property")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210825095100.20180-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Add the new gen16 features to the default model and fence them for
machine version 6.1 and earlier.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210907101017.27126-1-borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The PAGE_SIZE macro is causing trouble on Alpine Linux since it
clashes with a macro from a system header there. We already have
the TARGET_PAGE_SIZE, TARGET_PAGE_MASK and TARGET_PAGE_BITS macros
in QEMU anyway, so let's simply replace the PAGE_SIZE, PAGE_MASK
and PAGE_SHIFT macro with their TARGET_* counterparts.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/572
Message-Id: <20210901125800.611183-1-thuth@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Let's enable storage keys lazily under TCG, just as we do under KVM.
Only fairly old Linux versions actually make use of storage keys, so it
can be kind of wasteful to allocate quite some memory and track
changes and references if nobody cares.
We have to make sure to flush the TLB when enabling storage keys after
the VM was already running: otherwise it might happen that we don't
catch references or modifications afterwards.
Add proper documentation to all callbacks.
The kvm-unit-tests skey tests keeps on working with this change.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210903155514.44772-14-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
... and make it return a bool instead.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210903155514.44772-13-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Let's validate the given address and report a proper error in case it's
not. All call paths now properly check the validity of the given GFN.
Remove the TODO.
The errors inside the getter and setter should only trigger if something
really goes wrong now, for example, with a broken migration stream. Or
when we forget to update the storage key allocation with memory hotplug.
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210903155514.44772-12-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Handle it similar to migration. Assert that we're holding the BQL, to
make sure we don't see concurrent modifications.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210903155514.44772-11-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Let's use the guest_phys_blocks API to get physical memory regions
that are well defined inside our physical address space and migrate the
storage keys of these.
This is a preparation for having memory besides initial ram defined in
the guest physical address space, for example, via memory devices. We
get rid of the ms->ram_size dependency.
Please note that we will usually have very little (--> 1) physical
ranges. With virtio-mem might have significantly more ranges in the
future. If that turns out to be a problem (e.g., total memory
footprint of the list), we could look into a memory mapping
API that avoids creation of a list and instead triggers a callback for
each range.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210903155514.44772-10-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
hsch and csch basically have two parts: execute the command,
and perform the halt/clear function. For fully emulated
subchannels, it is pretty clear how it will work: check the
subchannel state, and actually 'perform the halt/clear function'
and set cc 0 if everything looks good.
For passthrough subchannels, some of the checking is done
within QEMU, but some has to be done within the kernel. QEMU's
subchannel state may be such that we can perform the async
function, but the kernel may still get a cc != 0 when it is
actually executing the instruction. In that case, we need to
set the condition actually encountered by the kernel; if we
set cc 0 on error, we would actually need to inject an interrupt
as well.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Jared Rossi <jrossi@linux.ibm.com>
Message-Id: <20210705163952.736020-2-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch switches to initialize dev.nvqs from the VhostNetOptions
instead of assuming it was 2. This is useful for implementing control
virtqueue support which will be a single vhost_net structure with a
single cvq.
Note that nvqs is still set to 2 for all users and this patch does not
change functionality.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-6-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The nvqs and vqs have been initialized during vhost_net_init() and are
not expected to change during the life cycle of vhost_net
structure. So this patch removes the meaningless assignment.
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-4-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We should return error code instead of zero, otherwise there's no way
for the caller to detect the failure.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Qemu will crash on vhost backend unexpected exit and re-connect │
in some case due to access released memory.
Signed-off-by: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
Message-Id: <20210830123433.45727-1-zhangyuwei.9149@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio_free_region_cache() is called within call_rcu(),
always with a non-NULL argument. Ensure new code keep it
that way by replacing the NULL check by an assertion.
Add a comment this function is called within call_rcu().
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210826172658.2116840-3-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
While virtio_queue_packed_empty_rcu() uses the '_rcu' suffix,
it is not obvious it is called within rcu_read_lock(). All other
functions from this file called with the RCU locked have a comment
describing it. Document this one similarly for consistency.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210826172658.2116840-2-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
There is no need to use fresh typecasts to get references to pci device structs
when there is an existing reference to pci device struct. Use existing reference.
Minor cleanup.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210825031949.919376-3-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit c0e427d6eb ("hw/acpi/ich9: Enable ACPI PCI hot-plug") removed all
uses of find_i440fx() function. This has been replaced by the more generic call
acpi_get_i386_pci_host() which maybe able to find the root bus both for i440fx
machine type as well as for the q35 machine type. There seems to be no more any
need to maintain a i440fx specific version of the api call. Remove it.
Tested by building from a clean tree successfully.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210825031949.919376-2-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commits aa57020774 ("numa: move numa global variable
nb_numa_nodes into MachineState") and 7e721e7b10 ("numa: move
numa global variable numa_info into MachineState"), we can get
NUMA information completely from MachineState::numa_state.
Remove PCMachineState::numa_nodes and PCMachineState::node_mem,
since they are just copied from MachineState::numa_state.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <20210823011254.28506-1-jingqi.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vhost used to compare the dma_as against the address_space_memory to
detect whether the IOMMU is enabled or not. This might not work well
since the virito-bus may call get_dma_as if VIRTIO_F_IOMMU_PLATFORM is
set without an actual IOMMU enabled when device is plugged. In the
case of PCI where pci_get_address_space() is used, the bus master as
is returned. So vhost actually tries to enable device IOTLB even if
the IOMMU is not enabled. This will lead a lots of unnecessary
transactions between vhost and Qemu and will introduce a huge drop of
the performance.
For PCI, an ideal approach is to use pci_device_iommu_address_space()
just for get_dma_as. But Qemu may choose to initialize the IOMMU after
the virtio-pci which lead a wrong address space is returned during
device plugged. So this patch switch to use transport specific way via
iommu_enabled() to detect the IOMMU during vhost start. In this case,
we are fine since we know the IOMMU is initialized correctly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-4-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements the PCI transport version of iommu_enabled. This
is done by comparing the address space returned by
pci_device_iommu_address_space() against address_space_memory.
Note that an ideal approach is to use pci_device_iommu_address_space()
in get_dma_as(), but it might not work well since the IOMMU could be
initialized after the virtio-pci device is initialized.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduce a new method for the virtio-bus for the transport
to report whether or not the IOMMU is enabled for the device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's compress the code a bit to improve readability. We can drop the
vm_running check in virtio_balloon_free_page_start() as it's already
properly checked in the single caller.
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:
1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
and consequently won't release free pages back to the OS once
migration finishes.
The issue is that for postcopy, we won't do a final bitmap sync while
the guest is stopped on the source and
virtio_balloon_free_page_hint_notify() will only call
virtio_balloon_free_page_done() on the source during
PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
the destination.
2) Once the VM touches a page on the destination that has been excluded
from migration on the source via qemu_guest_free_page_hint() while
postcopy is active, that thread will stall until postcopy finishes
and all threads are woken up. (with older Linux kernels that won't
retry faults when woken up via userfaultfd, we might actually get a
SEGFAULT)
The issue is that the source will refuse to migrate any pages that
are not marked as dirty in the dirty bmap -- for example, because the
page might just have been sent. Consequently, the faulting thread will
stall, waiting for the page to be migrated -- which could take quite
a while and result in guest OS issues.
While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].
As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.
Note 1: in the future we might either adjust migration code on the source
to track pages that have actually been sent or adjust
migration code on source and destination to eventually send
pages multiple times from the source and and deal with pages
that are sent multiple times on the destination.
Note 2: virtio-mem has similar issues, however, access to "unplugged"
memory by the guest is very rare and we would have to be very
lucky for it to happen during migration. The spec states
"The driver SHOULD NOT read from unplugged memory blocks ..."
and "The driver MUST NOT write to unplugged memory blocks".
virtio-mem will move away from virtio_balloon_free_page_done()
soon and handle this case explicitly on the destination.
[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com
Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
OBJECT_CHECK(PciHostState, ..., TYPE_PCI_HOST_BRIDGE) is exactly
what the PCI_HOST_BRIDGE macro does. We can just use the macro
instead of using OBJECT_CHECK manually.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210805193431.307761-7-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This would previously give error messages like
> Received unexpected msg type.Expected 0 received 1
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Message-Id: <20210806143926.315725-1-hi@alyssa.is>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Just a small refactor patch.
vhost_set_backend_type() gets called only in vhost.c, so we can move the
function there and make it static. We can then extern the visibility of
kernel_ops, to match the other VhostOps in vhost-backend.h.
The VhostOps constants now make more sense in vhost.h
Suggested-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Tiberiu Georgescu <tiberiu.georgescu@nutanix.com>
Message-Id: <20210809134015.67941-1-tiberiu.georgescu@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently various acpi hotplug modules like cpu hotplug, memory hotplug, pci
hotplug, nvdimm hotplug are all pulled in when CONFIG_ACPI_X86 is turned on.
This brings in support for whole lot of subsystems that some targets like
mips does not need. They are added just to satisfy symbol dependencies. This
is ugly and should be avoided. Targets should be able to pull in just what they
need and no more. For example, mips only needs support for PIIX4 and does not
need acpi pci hotplug support or cpu hotplug support or memory hotplug support
etc. This change is an effort to clean this up.
In this change, new config variables are added for various acpi hotplug
subsystems. Targets like mips can only enable PIIX4 support and not the rest
of all the other modules which were being previously pulled in as a part of
CONFIG_ACPI_X86. Function stubs make sure that symbols which piix4 needs but
are not required by mips (for example, symbols specific to pci hotplug etc)
are available to satisfy the dependencies.
Currently, this change only addresses issues with mips malta targets. In future
we might be able to clean up other targets which are similarly pulling in lot
of unnecessary hotplug modules by enabling ACPI_X86.
This change should also address issues such as the following:
https://gitlab.com/qemu-project/qemu/-/issues/221https://gitlab.com/qemu-project/qemu/-/issues/193
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20210812071409.492299-1-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Related: https://bugzilla.redhat.com//show_bug.cgi?id=1985924
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210812102341.3316254-1-kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that we have "acpi-pci-hotplug-with-bridge-support" PIIX4 PM property being
used for both q35 and i440fx machine types, it is better that we defined this
property string at a single place within a header file like other PIIX4
properties. We can then use this single definition at all the places that needs
it instead of duplicating the string everywhere. While at it, this change also
adds a definition for "acpi-root-pci-hotplug" PIIX4 PM property and uses
this definition at all places that were formally using the string value.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20210816083214.105740-1-ani@anisinha.ca>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On vhost-user-blk migration, qemu normally sends a number of commands
to enable logging if VHOST_USER_PROTOCOL_F_LOG_SHMFD is negotiated.
Qemu sends VHOST_USER_SET_FEATURES to enable buffers logging and
VHOST_USER_SET_VRING_ADDR per each started ring to enable "used ring"
data logging.
The issue is that qemu doesn't wait for reply from the vhost daemon
for these commands which may result in races between qemu expectation
of logging starting and actual login starting in vhost daemon.
The race can appear as follows: on migration setup, qemu enables dirty page
logging by sending VHOST_USER_SET_FEATURES. The command doesn't arrive to a
vhost-user-blk daemon immediately and the daemon needs some time to turn the
logging on internally. If qemu doesn't wait for reply, after sending the
command, qemu may start migrateing memory pages to a destination. At this time,
the logging may not be actually turned on in the daemon but some guest pages,
which the daemon is about to write to, may have already been transferred
without logging to the destination. Since the logging wasn't turned on,
those pages won't be transferred again as dirty. So we may end up with
corrupted data on the destination.
The same scenario is applicable for "used ring" data logging, which is
turned on with VHOST_USER_SET_VRING_ADDR command.
To resolve this issue, this patch makes qemu wait for the command result
explicitly if VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated and logging enabled.
Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Message-Id: <20210809104824.78830-1-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If call virtio_queue_set_host_notifier_mr fails, should free
host-notifier memory-region.
Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Message-Id: <1629077555-19907-1-git-send-email-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With the introduction of the batch hinting, meaningless batches can be
created with no IOTLB updates if the memory region was skipped by
vhost_vdpa_listener_skipped_section. This is the case of host notifiers
memory regions, device un/realize, and others. This causes the vdpa
device to receive dma mapping settings with no changes, a possibly
expensive operation for nothing.
To avoid that, VHOST_IOTLB_BATCH_BEGIN hint is delayed until we have a
meaningful (not skipped section) mapping or unmapping operation, and
VHOST_IOTLB_BATCH_END is not written unless at least one of _UPDATE /
_INVALIDATE has been issued.
v3:
* Use a bool instead of a counter avoiding potential number wrapping
* Fix bad check on _commit
* Move VHOST_BACKEND_F_IOTLB_BATCH check to
vhost_vdpa_iotlb_batch_begin_once
v2 (from RFC):
* Rename misleading name
* Abstract start batching function for listener_add/del
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20210812140933.226288-1-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
- Make the backup-top filter driver available for user-created block
nodes (i.e. via blockdev-add)
- Allow running iotests with gdb or valgrind being attached to qemu
instances
- Fix the raw format driver's permissions: There is no metadata, so we
only need WRITE or RESIZE when the parent needs it
- Basic reopen implementation for win32 files (file-win32.c) so that
qemu-img commit can work
- uclibc/musl build fix for the FUSE export code
- Some iotests delinting
- block-hmp-cmds.c refactoring
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmEvleISHGhyZWl0ekBy
ZWRoYXQuY29tAAoJEKH6QNCYAZzfdOEP/j4gutKzxqHEgaOxus3e1u77JnIX5OBO
E3wr0W8IaILp3a3N8f8lcq8frw6aSvTmW8l2woKNbg/C1yuX4NFN9tyQ6jpoFAP9
9X9GoPlU7YG5c1bEnJlO/ySt3xHRssIsZBpKWnzWwUI5nMpGUrNPem3rW8T2DaPy
RwnRhBl2kzHYqyPXDx13lA3zKIunAISWRM9adWyKDdRo6Lqk0Us7ND+f6nRHJSG1
uJ26uKWWXx+qYC7F8uc45vrOjesWwC0sqUn7RC/0pbBGp9L6Bgc3yWbnJWZBNEBM
zbv47B6HsJs2tqHGj0T+EKkhqChGz3B/vMeSSw5c3dXFBfQ53Rjm4Nlr9YBzuCGV
erMoq0j/Ytz8+T865N/kjzwdgkl+xcKWF/GIaM5rxiJ2syyCV9CY2SxD6AC+WPBk
yCezNnZEAx2POS2ylRy+EQvJm3YdoWrXZr05Blj28TtqNLs3qCP7evG6IjH58idU
A4YgmltwN5UdajOK9b7O7zFAFhCZCKqAVJNKI0NCTYaT3zEim5dduXfn3gHTu5Wl
jgWvpicNgsEXC4/etp5jOVkbBXtelh66ibdDIQJEanAG1W0Br+gQIBLN84pf7gpY
8R9BBpZRk2DzYJnYJS2FV6xRGKY+XjU4zd7yVfyq6jZYyvfUxBDhCghCB6M37EmJ
oE64kzab7uVm
=0tJJ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/hreitz/tags/pull-block-2021-09-01' into staging
Block patches:
- Make the backup-top filter driver available for user-created block
nodes (i.e. via blockdev-add)
- Allow running iotests with gdb or valgrind being attached to qemu
instances
- Fix the raw format driver's permissions: There is no metadata, so we
only need WRITE or RESIZE when the parent needs it
- Basic reopen implementation for win32 files (file-win32.c) so that
qemu-img commit can work
- uclibc/musl build fix for the FUSE export code
- Some iotests delinting
- block-hmp-cmds.c refactoring
# gpg: Signature made Wed 01 Sep 2021 16:01:54 BST
# gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg: issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF
* remotes/hreitz/tags/pull-block-2021-09-01: (56 commits)
block/file-win32: add reopen handlers
block/export/fuse.c: fix fuse-lseek on uclibc or musl
block/block-copy: block_copy_state_new(): drop extra arguments
iotests/image-fleecing: add test-case for copy-before-write filter
iotests/image-fleecing: prepare for adding new test-case
iotests/image-fleecing: rename tgt_node
iotests/image-fleecing: proper source device
iotests.py: hmp_qemu_io: support qdev
iotests: move 222 to tests/image-fleecing
iotests/222: constantly use single quotes for strings
iotests/222: fix pylint and mypy complains
python:QEMUMachine: template typing for self returning methods
python/qemu/machine: QEMUMachine: improve qmp() method
python/qemu/machine.py: refactor _qemu_args()
qapi: publish copy-before-write filter
block/copy-before-write: make public block driver
block/block-copy: make setting progress optional
block/copy-before-write: initialize block-copy bitmap
block/copy-before-write: cbw_init(): use options
block/copy-before-write: bdrv_cbw_append(): drop unused compress arg
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
v9fs_walk() utilizes the v9fs_co_run_in_worker({...}) macro to run the
supplied fs driver code block on a background worker thread.
When either the 'Twalk' client request was interrupted or if the client
requested fid for that 'Twalk' request caused a stat error then that
fs driver code block was left by 'break' keyword, with the intention to
return from worker thread back to main thread as well:
v9fs_co_run_in_worker({
if (v9fs_request_cancelled(pdu)) {
err = -EINTR;
break;
}
err = s->ops->lstat(&s->ctx, &dpath, &fidst);
if (err < 0) {
err = -errno;
break;
}
...
});
However that 'break;' statement also skipped the v9fs_co_run_in_worker()
macro's final and mandatory
/* re-enter back to qemu thread */
qemu_coroutine_yield();
call and thus caused the rest of v9fs_walk() to be continued being
executed on the worker thread instead of main thread, eventually
leading to a crash in the transport virtio transport driver.
To fix this issue and to prevent the same error from happening again by
other users of v9fs_co_run_in_worker() in future, auto wrap the supplied
code block into its own
do { } while (0);
loop inside the 'v9fs_co_run_in_worker' macro definition.
Full discussion and backtrace:
https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg05209.htmlhttps://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00174.html
Fixes: 8d6cb10073
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1mLTBg-0002Bh-2D@lizzy.crudebyte.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <b51670d2a39399535a035f6bc77c3cbeed85edae.1629208359.git.qemu_oss@crudebyte.com>
The v9fs_walk() function resolves all client submitted path nodes to the
local 'pathes' array. Using a separate string scalar variable 'path'
inside the background worker thread loop and copying that local 'path'
string scalar variable subsequently to the 'pathes' array (at the end of
each loop iteration) is not necessary.
Instead simply resolve each path directly to the 'pathes' array and
don't use the string scalar variable 'path' inside the fs worker thread
loop at all.
The only advantage of the 'path' scalar was that in case of an error
the respective 'pathes' element would not be filled. Right now this is
not an issue as the v9fs_walk() function returns as soon as any error
occurs.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <7dacbecf25b2c9b4a0ce12d689a8a535f09a31e3.1629208359.git.qemu_oss@crudebyte.com>
We need an ability to insert filters above top block node, attached to
block device. It can't be achieved with blockdev-reopen command. So, we
want do it with help of qom-set.
Intended usage:
Assume there is a node A that is attached to some guest device.
1. blockdev-add to create a filter node B that has A as its child.
2. qom-set to change the node attached to the guest device’s
BlockBackend from A to B.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210824083856.17408-5-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Add field, so property can declare support for setting the property
when device is realized. To be used in the following commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210824083856.17408-4-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
All the devices that used to use system_clock_scale have now been
converted to use Clock inputs instead, so the global is no longer
needed; remove it and all the code that sets it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210812093356.1946-26-peter.maydell@linaro.org
The stellaris-gptm timer currently uses system_clock_scale for one of
its timer modes where the timer runs at the CPU clock rate. Make it
use a Clock input instead.
We don't try to make the timer handle changes in the clock frequency
while the downcounter is running. This is not a change in behaviour
from the previous system_clock_scale implementation -- we will pick
up the new frequency only when the downcounter hits zero. Handling
dynamic clock changes when the counter is running would require state
that the current gptm implementation doesn't have.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-25-peter.maydell@linaro.org
The implementation of the Stellaris general purpose timer module
device stellaris-gptm is currently in the same source file as the
board model. Split it out into its own source file in hw/timer.
Apart from the new file comment headers and the Kconfig and
meson.build changes, this is just code movement.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-24-peter.maydell@linaro.org
Fix the code style issues in the Stellaris general purpose timer
module code, so that when we move it to a different file in a
following patch checkpatch doesn't complain.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-23-peter.maydell@linaro.org
Now that all users of the systick devices wire up the clock inputs,
use those instead of the system_clock_scale and the hardwired 1MHz
value for the reference clock.
This will fix various board models where we were incorrectly
providing a 1MHz reference clock instead of some other value or
instead of providing no reference clock at all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-22-peter.maydell@linaro.org
Wire up the refclk for the msf2 SoC. This SoC runs the refclk at a
frequency which is programmably either /4, /8, /16 or /32 of the main
CPU clock. We don't currently model the register which allows the
guest to set the divisor, so implement the refclk as a fixed /32 of
the CPU clock (which is the value of the divisor at reset).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-21-peter.maydell@linaro.org
Instead of passing the MSF2 SoC an integer property specifying the
CPU clock rate, pass it a Clock instead. This lets us wire that
clock up to the armv7m object.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-20-peter.maydell@linaro.org
In the realize method of the msf2-soc SoC object, we call g_new() to
create new MemoryRegion objects for the nvm, nvm_alias, and sram.
This is unnecessary; make these MemoryRegions member fields of the
device state struct instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-19-peter.maydell@linaro.org
Connect the sysclk to the armv7m object. This board's SoC does not
connect up the systick reference clock, so we don't need to connect a
refclk.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-18-peter.maydell@linaro.org
Currently the stellaris_sys_init() function creates the
TYPE_STELLARIS_SYS object, sets its properties, realizes it, maps its
MMIO region and connects its IRQ. In order to support wiring the
sysclk up to the armv7m object, we need to split this function apart,
because to connect the clock output of the STELLARIS_SYS object to
the armv7m object we need to create the STELLARIS_SYS object before
the armv7m object, but we can't wire up the IRQ until after we've
created the armv7m object.
Remove the stellaris_sys_init() function, and instead put the
create/configure/realize parts before we create the armv7m object and
the mmio/irq connection parts afterwards.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-17-peter.maydell@linaro.org
Wire up the sysclk input to the armv7m object.
Strictly this SoC should not have a systick device at all, but our
armv7m container object doesn't currently support disabling the
systick device. For the moment, add a TODO comment, but note that
this is why we aren't wiring up a refclk (no need for one).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Message-id: 20210812093356.1946-16-peter.maydell@linaro.org
Delete the trailing blank line at the end of the source file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-15-peter.maydell@linaro.org
Wire up the sysclk and refclk for the stm32f405 SoC. This SoC always
runs the systick refclk at 1/8 the frequency of the main CPU clock,
so the board code only needs to provide a single sysclk clock.
Because there is only one board using this SoC, we convert the SoC
and the board together, rather than splitting it into "add clock to
SoC; connect clock in board; add error check in SoC code that clock
is wired up".
When the systick device starts honouring its clock inputs, this will
fix an emulation inaccuracy in the netduinoplus2 board where the
systick reference clock was running at 1MHz rather than 21MHz.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-14-peter.maydell@linaro.org
Wire up the sysclk and refclk for the stm32f205 SoC. This SoC always
runs the systick refclk at 1/8 the frequency of the main CPU clock,
so the board code only needs to provide a single sysclk clock.
Because there is only one board using this SoC, we convert the SoC
and the board together, rather than splitting it into "add clock to
SoC; connect clock in board; add error check in SoC code that clock
is wired up".
When the systick device starts honouring its clock inputs, this will
fix an emulation inaccuracy in the netduino2 board where the systick
reference clock was running at 1MHz rather than 15MHz.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-13-peter.maydell@linaro.org
Wire up the sysclk and refclk for the stm32f100 SoC. This SoC always
runs the systick refclk at 1/8 the frequency of the main CPU clock,
so the board code only needs to provide a single sysclk clock.
Because there is only one board using this SoC, we convert the SoC
and the board together, rather than splitting it into "add clock to
SoC; connect clock in board; add error check in SoC code that clock
is wired up".
When the systick device starts honouring its clock inputs, this will
fix an emulation inaccuracy in the stm32vldiscovery board where the
systick reference clock was running at 1MHz rather than 3MHz.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-12-peter.maydell@linaro.org
In the realize methods of the stm32f100 and stm32f205 SoC objects, we
call g_new() to create new MemoryRegion objects for the sram, flash,
and flash_alias. This is unnecessary (and leaves open the
possibility of leaking the allocations if we exit from realize with
an error). Make these MemoryRegions member fields of the device
state struct instead, as stm32f405 already does.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-11-peter.maydell@linaro.org
It is quite common for a clock tree to involve possibly programmable
clock multipliers or dividers, where the frequency of a clock is for
instance divided by 8 to produce a slower clock to feed to a
particular device.
Currently we provide no convenient mechanism for modelling this. You
can implement it by having an input Clock and an output Clock, and
manually setting the period of the output clock in the period-changed
callback of the input clock, but that's quite clunky.
This patch adds support in the Clock objects themselves for setting a
multiplier or divider. The effect of setting this on a clock is that
when the clock's period is changed, all the children of the clock are
set to period * multiplier / divider, rather than being set to the
same period as the parent clock.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-10-peter.maydell@linaro.org
Connect up the armv7m clocks on the mps2-an385/386/500/511.
Connect up the armv7m object's clocks on the MPS boards defined in
mps2.c. The documentation for these FPGA images doesn't specify what
systick reference clock is used (if any), so for the moment we
provide a 1MHz refclock, which will result in no behavioural change
from the current hardwired 1MHz clock implemented in
armv7m_systick.c:systick_scale().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-9-peter.maydell@linaro.org
Wire up the cpuclk for the systick devices to the SSE object's
existing mainclk clock.
We do not wire up the refclk because the SSE subsystems do not
provide a refclk. (This is documented in the IoTKit and SSE-200
TRMs; the SSE-300 TRM doesn't mention it but we assume it follows the
same approach.) When we update the systick device later to honour "no
refclk connected" this will fix a minor emulation inaccuracy for the
SSE-based boards.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-8-peter.maydell@linaro.org
Create input clocks on the armv7m container object which pass through
to the systick timers, so that users of the armv7m object can specify
the clocks being used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-7-peter.maydell@linaro.org
The v7M systick timer can be programmed to run from either of
two clocks:
* an "external reference clock" (when SYST_CSR.CLKSOURCE == 0)
* the main CPU clock (when SYST_CSR.CLKSOURCE == 1)
Our implementation currently hardwires the external reference clock
to be 1MHz, and allows boards to set the main CPU clock frequency via
the global 'system_clock_scale'. (Most boards set that to a constant
value; the Stellaris boards allow the guest to reprogram it via the
board-specific RCC registers).
As the first step in converting this to use the Clock infrastructure,
add input clocks to the systick device for the reference clock and
the CPU clock. The device implementation ignores them; once we have
made all the users of the device correctly wire up the new Clocks we
will switch the implementation to use them and ignore the old
system_clock_scale.
This is a migration compat break for all M-profile boards, because of
the addition of the new clock objects to the vmstate struct.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-6-peter.maydell@linaro.org
Instead of having the NVIC device provide a single sysbus memory
region covering the whole of the "System PPB" space, which implements
the default behaviour for unimplemented ranges and provides the NS
alias window to the sysregs as well as the main sysreg MR, move this
handling to the container armv7m device. The NVIC now provides a
single memory region which just implements the system registers.
This consolidates all the handling of "map various devices in the
PPB" into the armv7m container where it belongs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-4-peter.maydell@linaro.org
There's no particular reason why the NVIC should be owning the
SysTick device objects; move them into the ARMv7M container object
instead, as part of consolidating the "create the devices which are
built into an M-profile CPU and map them into their architected
locations in the address space" work into one place.
This involves temporarily creating a duplicate copy of the
nvic_sysreg_ns_ops struct and its read/write functions (renamed as
v7m_sysreg_ns_*), but we will delete the NVIC's copy of this code in
a subsequent patch.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20210812093356.1946-3-peter.maydell@linaro.org
Currently we implement the RAS register block within the NVIC device.
It isn't really very tightly coupled with the NVIC proper, so instead
move it out into a sysbus device of its own and have the top level
ARMv7M container create it and map it into memory at the right
address.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20210812093356.1946-2-peter.maydell@linaro.org
Add -cpu a64fx to use A64FX processor when -machine virt option is
specified. In addition, add a64fx to the Supported guest CPU types
in the virt.rst document.
Signed-off-by: Shuuichirou Ishii <ishii.shuuichir@fujitsu.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>