In current code, when guest does S3, virtio-gpu are reset due to the
bit No_Soft_Reset is not set. After resetting, the display resources
of virtio-gpu are destroyed, then the display can't come back and only
show blank after resuming.
Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check
this bit, if this bit is set, the devices resetting will not be done, and
then the display can work after resuming.
No_Soft_Reset bit is implemented for all virtio devices, and was tested
only on virtio-gpu device. Set it false by default for safety.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shutdown requests are normally hardware dependent.
By extending pvpanic to also handle shutdown requests, guests can
submit such requests with an easily implementable and cross-platform
mechanism.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240527-pvpanic-shutdown-v8-5-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The different components of pvpanic duplicate the list of supported
events. Move it to the shared header file to minimize changes when new
events are added.
MST: tweak: keep header included in pvpanic.c to avoid header
dependency, rebase.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240527-pvpanic-shutdown-v8-3-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240527-pvpanic-shutdown-v8-2-5a28ec02558b@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All DPA ranges in the DC regions are invalid to access until an extent
covering the range has been successfully accepted by the host. A bitmap
is added to each region to record whether a DC block in the region has
been backed by a DC extent. Each bit in the bitmap represents a DC block.
When a DC extent is accepted, all the bits representing the blocks in the
extent are set, which will be cleared when the extent is released.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-13-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To simulate FM functionalities for initiating Dynamic Capacity Add
(Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
add/release dynamic capacity extents requests.
With the change, we allow to release an extent only when its DPA range
is contained by a single accepted extent in the device. That is to say,
extent superset release is not supported yet.
1. Add dynamic capacity extents:
For example, the command to add two continuous extents (each 128MiB long)
to region 0 (starting at DPA offset 0) looks like below:
{ "execute": "qmp_capabilities" }
{ "execute": "cxl-add-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"selection-policy": "prescriptive",
"region": 0,
"extents": [
{
"offset": 0,
"len": 134217728
},
{
"offset": 134217728,
"len": 134217728
}
]
}
}
2. Release dynamic capacity extents:
For example, the command to release an extent of size 128MiB from region 0
(DPA offset 128MiB) looks like below:
{ "execute": "cxl-release-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"removal-policy":"prescriptive",
"region": 0,
"extents": [
{
"offset": 134217728,
"len": 134217728
}
]
}
}
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-12-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per CXL spec 3.1, two mailbox commands are implemented:
Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
For the process of the above two commands, we use two-pass approach.
Pass 1: Check whether the input payload is valid or not; if not, skip
Pass 2 and return mailbox process error.
Pass 2: Do the real work--add or release extents, respectively.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-11-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add dynamic capacity extent list representative to the definition of
CXLType3Dev and implement get DC extent list mailbox command per
CXL.spec.3.1:.8.2.9.9.9.2.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-10-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add (file/memory backed) host backend for DCD. All the dynamic capacity
regions will share a single, large enough host backend. Set up address
space for DC regions to support read/write operations to dynamic capacity
for DCD.
With the change, the following support is added:
1. Add a new property to type3 device "volatile-dc-memdev" to point to host
memory backend for dynamic capacity. Currently, all DC regions share one
host backend;
2. Add namespace for dynamic capacity for read/write support;
3. Create cdat entries for each dynamic capacity region.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-9-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename mem_size as static_mem_size for type3 memdev to cover static RAM and
pmem capacity, preparing for the introduction of dynamic capacity to support
dynamic capacity devices.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-6-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per cxl spec r3.1, add dynamic capacity (DC) region representative based on
Table 8-165 and extend the cxl type3 device definition to include DC region
information. Also, based on info in 8.2.9.9.9.1, add 'Get Dynamic Capacity
Configuration' mailbox support.
Note: we store region decode length as byte-wise length on the device, which
should be divided by 256 * MiB before being returned to the host
for "Get Dynamic Capacity Configuration" mailbox command per
specification.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-5-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This enables wrapper devices to customize the base device's CCI
(for example, with custom commands outside the specification)
without the need to change the base device.
The also enabled the base device to dispatch those commands without
requiring additional driver support.
Heavily edited by Jonathan Cameron to increase code reuse
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-3-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows devices to have fully customized CCIs, along with complex
devices where wrapper devices can override or add additional CCI
commands without having to replicate full command structures or
pollute a base device with every command that might ever be used.
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Message-Id: <20240523174651.1089554-2-nifan.cxl@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit f02a4b8e64.
Since the current patch cannot completely fix the lost reconnect
problem, there is a scenario that is not considered:
- When the virtio-blk driver is removed from the guest os,
s->connected has no chance to be set to false, resulting in
subsequent reconnection not being executed.
The next patch will completely fix this issue with a better approach.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-2-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support to virtio-pci devices for handling the extra data sent
from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA
transport feature has been negotiated.
The extra data that's passed to the virtio-pci device when this
feature is enabled varies depending on the device's virtqueue
layout.
In a split virtqueue layout, this data includes:
- upper 16 bits: shadow_avail_idx
- lower 16 bits: virtqueue index
In a packed virtqueue layout, this data includes:
- upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
- lower 16 bits: virtqueue index
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-2-jonah.palmer@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On setups with one or more virtio-net devices with vhost on,
dirty tracking iteration increases cost the bigger the number
amount of queues are set up e.g. on idle guests migration the
following is observed with virtio-net with vhost=on:
48 queues -> 78.11% [.] vhost_dev_sync_region.isra.13
8 queues -> 40.50% [.] vhost_dev_sync_region.isra.13
1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
2 devices, 1 queue -> 18.60% [.] vhost_dev_sync_region.isra.14
With high memory rates the symptom is lack of convergence as soon
as it has a vhost device with a sufficiently high number of queues,
the sufficient number of vhost devices.
On every migration iteration (every 100msecs) it will redundantly
query the *shared log* the number of queues configured with vhost
that exist in the guest. For the virtqueue data, this is necessary,
but not for the memory sections which are the same. So essentially
we end up scanning the dirty log too often.
To fix that, select a vhost device responsible for scanning the
log with regards to memory sections dirty tracking. It is selected
when we enable the logger (during migration) and cleared when we
disable the logger. If the vhost logger device goes away for some
reason, the logger will be re-selected from the rest of vhost
devices.
After making mem-section logger a singleton instance, constant cost
of 7%-9% (like the 1 queue report) will be seen, no matter how many
queues or how many vhost devices are configured:
48 queues -> 8.71% [.] vhost_dev_sync_region.isra.13
2 devices, 8 queues -> 7.97% [.] vhost_dev_sync_region.isra.14
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1710448055-11709-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Introduce global xen_is_stubdomain variable when qemu is running inside
a stubdomain instead of dom0. This will be relevant for subsequent
patches, as few things like accessing PCI config space need to be done
differently.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <e66aa97dca5120f22e015c19710b2ff04f525720.1711506237.git-series.marmarek@invisiblethingslab.com>
Signed-off-by: Anthony PERARD <anthony@xenproject.org>
hmp_info_roms() was removed in commit dd98234c05 ("qapi:
introduce x-query-roms QMP command"),
hmp_info_numa() in commit 1b8ae799d8 ("qapi: introduce
x-query-numa QMP command"),
hmp_info_ramblock() in commit ca411b7c8a ("qapi: introduce
x-query-ramblock QMP command")
and hmp_info_irq() in commit 91f2fa7045 ("qapi: introduce
x-query-irq QMP command").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The typeof_strip_qual() is most useful for the atomic fetch-and-modify
operations in atomic.h, but it can be used elsewhere as well. For example,
QAPI_LIST_LENGTH() assumes that the argument is not const, which is not a
requirement.
Move the macro to compiler.h and, while at it, move it under #ifndef
__cplusplus to emphasize that it uses C-only constructs. A C++ version
of typeof_strip_qual() using type traits is possible[1], but beyond the
scope of this patch because the little C++ code that is in QEMU does not
use QAPI.
The patch was tested by changing the declaration of strv_from_str_list()
in qapi/qapi-type-helpers.c to:
char **strv_from_str_list(const strList *const list)
This is valid C code, and it fails to compile without this change.
[1] https://lore.kernel.org/qemu-devel/20240624205647.112034-1-flwu@google.com/
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Tested-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
void* pointer arithmetic is a GCC extentension which could not be
available in other build tools (e.g. C++). This changes removes this
assumption.
Signed-off-by: Roman Kiryanov <rkir@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20240620201654.598024-1-rkir@google.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bdrv_file_open and bdrv_open are completely equivalent, they are
never checked except to see which one to invoke. So merge them
into a single one.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need #address-cells properties in all interrupt controllers that are
referred by an interrupt-map [1]. For the RISC-V machine, both PLIC and
APLIC controllers must have this property.
PLIC already sets it in create_fdt_socket_plic(). Set the property for
APLIC in create_fdt_one_aplic().
[1] https://lore.kernel.org/linux-arm-kernel/CAL_JsqJE15D-xXxmELsmuD+JQHZzxGzdXvikChn6KFWqk6NzPw@mail.gmail.com/
Suggested-by: Anup Patel <apatel@ventanamicro.com>
Fixes: e6faee6585 ("hw/riscv: virt: Add optional AIA APLIC support to virt machine")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240531202759.911601-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
One fix and various cleanups for the SD card model.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmZ5cRUACgkQ4+MsLN6t
wN59Qw//cUdjD287pB5Ml5aQqr9sOTyVnHUceZtz7AOZ5w8RM2tlPDgOImeLOvU6
OV7qfWvNaUxtQxhfh5jpe8Pj4eHBtRQzA6a1AWToEvnN4189QWHZpqf5TUa4AlFS
uAk7k2TkoNv9zbNKca0bP3L1x6sT9l0VPZBLaLbgdXDIX2ycD0r3NVQxXb/bJRgM
6pFRcLCF/isKzLQDwqnTa11hB/JDTvOU7xnY0kazGRvyWjbSvE2sOJzLNJXHkW0I
/FNfRbOKJo2t+47Z5qSXUFFLeIEBTy7VqNBsOQ6sMIgrWzbOSrtBcuxKp0p9NCGH
fdZHlDVRnNGXewUya4RjbmXiCNuGL4zJ82b2BaQZVd5ZwU2opIr8xO96WCojQ4dZ
+Dq3uv7su3PUVOh95i38Eo93OG9jXFx642XD4q2uKu5j70IoGXAkIoLUcFkZZdGS
9rCsaNUHyHJrN6nXf3Cekvkqxz36p6QXaUF9I1vB0JF6CrexMD35sBUK+RE9k4uW
LnqL7ZwQDGDGVl3kPS/VCXv1mMim4aRLSEIveq7Ui6dKzaaJMIIodZ8CFMuyTTsD
cGE+Cd053nf6SzX3+kEZftNdjtJ906O8xIAw+RNdARYx003l4kUxgsPDk7ELyzIP
Tb+VlZl2P+ROJmeWvRMTW7ZQ49M9IEMrg8zlGF4hLCxB1JndeOA=
=O5er
-----END PGP SIGNATURE-----
Merge tag 'sdmmc-20240624' of https://github.com/philmd/qemu into staging
SD/MMC patches queue
One fix and various cleanups for the SD card model.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmZ5cRUACgkQ4+MsLN6t
# wN59Qw//cUdjD287pB5Ml5aQqr9sOTyVnHUceZtz7AOZ5w8RM2tlPDgOImeLOvU6
# OV7qfWvNaUxtQxhfh5jpe8Pj4eHBtRQzA6a1AWToEvnN4189QWHZpqf5TUa4AlFS
# uAk7k2TkoNv9zbNKca0bP3L1x6sT9l0VPZBLaLbgdXDIX2ycD0r3NVQxXb/bJRgM
# 6pFRcLCF/isKzLQDwqnTa11hB/JDTvOU7xnY0kazGRvyWjbSvE2sOJzLNJXHkW0I
# /FNfRbOKJo2t+47Z5qSXUFFLeIEBTy7VqNBsOQ6sMIgrWzbOSrtBcuxKp0p9NCGH
# fdZHlDVRnNGXewUya4RjbmXiCNuGL4zJ82b2BaQZVd5ZwU2opIr8xO96WCojQ4dZ
# +Dq3uv7su3PUVOh95i38Eo93OG9jXFx642XD4q2uKu5j70IoGXAkIoLUcFkZZdGS
# 9rCsaNUHyHJrN6nXf3Cekvkqxz36p6QXaUF9I1vB0JF6CrexMD35sBUK+RE9k4uW
# LnqL7ZwQDGDGVl3kPS/VCXv1mMim4aRLSEIveq7Ui6dKzaaJMIIodZ8CFMuyTTsD
# cGE+Cd053nf6SzX3+kEZftNdjtJ906O8xIAw+RNdARYx003l4kUxgsPDk7ELyzIP
# Tb+VlZl2P+ROJmeWvRMTW7ZQ49M9IEMrg8zlGF4hLCxB1JndeOA=
# =O5er
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 24 Jun 2024 06:13:57 AM PDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'sdmmc-20240624' of https://github.com/philmd/qemu:
hw/sd/sdcard: Add comments around registers and commands
hw/sd/sdcard: Inline BLK_READ_BLOCK / BLK_WRITE_BLOCK macros
hw/sd/sdcard: Add sd_invalid_mode_for_cmd to report invalid mode switch
hw/sd/sdcard: Only call sd_req_get_address() where address is used
hw/sd/sdcard: Factor sd_req_get_address() method out
hw/sd/sdcard: Only call sd_req_get_rca() where RCA is used
hw/sd/sdcard: Factor sd_req_get_rca() method out
hw/sd/sdcard: Have cmd_valid_while_locked() return a boolean value
hw/sd/sdcard: Trace update of block count (CMD23)
hw/sd/sdcard: Remove explicit entries for illegal commands
hw/sd/sdcard: Remove ACMD6 handler for SPI mode
hw/sd/sdcard: Use Load/Store API to fill some CID/CSD registers
hw/sd/sdcard: Use registerfield CSR::CURRENT_STATE definition
hw/sd/sdcard: Use HWBLOCK_SHIFT definition instead of magic values
hw/sd/sdcard: Fix typo in SEND_OP_COND command name
hw/sd/sdcard: Rewrite sd_cmd_ALL_SEND_CID using switch case (CMD2)
hw/sd/sdcard: Correct code indentation
hw/sd/sdcard: Avoid OOB in sd_read_byte() during unexpected CMD switch
bswap: Add st24_be_p() to store 24 bits in big-endian order
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
vfio_container_destroy() clears the resources allocated
VFIOContainerBase object. Now that VFIOContainerBase is a QOM object,
add an instance_finalize() handler to do the cleanup. It will be
called through object_unref().
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
It's now empty.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Instead, use VFIO_IOMMU_GET_CLASS() to get the class pointer.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIOContainerBase was made a QOM interface because we believed that a
QOM object would expose all the IOMMU backends to the QEMU machine and
human interface. This only applies to user creatable devices or objects.
Change the VFIOContainerBase nature from interface to object and make
the necessary adjustments in the VFIO_IOMMU hierarchy.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Assign the base container VFIOAddressSpace 'space' pointer in
vfio_address_space_insert(). The ultimate goal is to remove
vfio_container_init() and instead rely on an .instance_init() handler
to perfom the initialization of VFIOContainerBase.
To be noted that vfio_connect_container() will assign the 'space'
pointer later in the execution flow. This should not have any
consequence.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
It prepares ground for a future change initializing the 'space' pointer
of VFIOContainerBase. The goal is to replace vfio_container_init() by
an .instance_init() handler when VFIOContainerBase is QOMified.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
These were forgotten in the recent cleanups.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since the host IOVA ranges are now passed through the
PCIIOMMUOps set_host_resv_regions and we have removed
the only implementation of iommu_set_iova_range() in
the virtio-iommu and the only call site in vfio/common,
let's retire the IOMMU MR API and its memory wrapper.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Store the aliased bus and devfn in the HostIOMMUDevice.
This will be useful to handle info that are iommu group
specific and not device specific (such as reserved
iova ranges).
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a new HostIOMMUDevice callback that allows to
retrieve the usable IOVA ranges.
Implement this callback in the legacy VFIO and IOMMUFD VFIO
host iommu devices. This relies on the VFIODevice agent's
base container iova_ranges resource.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Implement PCIIOMMUOPs [set|unset]_iommu_device() callbacks.
In set(), the HostIOMMUDevice handle is stored in a hash
table indexed by PCI BDF. The object will allow to retrieve
information related to the physical IOMMU.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Store the agent device (VFIO or VDPA) in the host IOMMU device.
This will allow easy access to some of its resources.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
In set call, we take a reference of HostIOMMUDevice and store it
in hash table indexed by PCI BDF.
Note this BDF index is device's real BDF not the aliased one which
is different from the index of VTDAddressSpace. There can be multiple
assigned devices under same virtual iommu group and share same
VTDAddressSpace, but each has its own HostIOMMUDevice.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Create host IOMMU device instance in vfio_attach_device() and call
.realize() to initialize it further.
Introuduce attribute VFIOIOMMUClass::hiod_typename and initialize
it based on VFIO backend type. It will facilitate HostIOMMUDevice
creation in vfio_attach_device().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a helper function iommufd_backend_get_device_info() to get
host IOMMU related information through iommufd uAPI.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
The realize function populates the capabilities. For now only the
aw_bits caps is computed for legacy backend.
Introduce a helper function vfio_device_get_aw_bits() which calls
range_get_last_bit() to get host aw_bits and package it in
HostIOMMUDeviceCaps for query with .get_cap(). This helper will
also be used by iommufd backend.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
This helper get the highest 1 bit position of the upper bound.
If the range is empty or upper bound is zero, -1 is returned.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under
iommufd backend. It is abstract, because it is going to be derived
into VFIO or VDPA type'd device.
It will have its own .get_cap() implementation.
TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO is a sub-class of
TYPE_HOST_IOMMU_DEVICE_IOMMUFD, represents a VFIO type'd host IOMMU
device under iommufd backend. It will be created during VFIO device
attaching and passed to vIOMMU.
It will have its own .realize() implementation.
Opportunistically, add missed header to include/sysemu/iommufd.h.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO represents a host IOMMU device under
VFIO legacy container backend.
It will have its own realize implementation.
Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
Different platform IOMMU can support different elements.
Currently only two elements, type and aw_bits, type hints the host
platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
host IOMMU address width.
Introduce .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
is supported.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
A HostIOMMUDevice is an abstraction for an assigned device that is protected
by a physical IOMMU (aka host IOMMU). The userspace interaction with this
physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
backend or the new iommufd backend. The assigned device can be a VFIO device
or a VDPA device. The HostIOMMUDevice is needed to interact with the host
IOMMU that protects the assigned device. It is especially useful when the
device is also protected by a virtual IOMMU as this latter use the translation
services of the physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
capabilities such as the supported address width. In the future, the virtual
IOMMU will use the HostIOMMUDevice to program the guest page tables in the
first translation stage of the physical IOMMU.
Introduce .realize() to initialize HostIOMMUDevice further after instance init.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
- add missing include guard comment to gdbstub.h
- move gdbstub enums into separate header
- move qtest_[get|set]_virtual_clock functions
- allow plugins to manipulate the virtual clock
- introduce an Instructions Per Second plugin
- fix inject_mem_cb rw mask tests
- allow qemu_plugin_vcpu_mem_cb to shortcut when no memory cbs
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZ5OjoACgkQ+9DbCVqe
KkQPlwf/VK673BAjYktuCLnf3DgWvIkkiHWwzBREP5MmseUloLjK2CQPLY/xWZED
pbA/1OSzHViD/mvG5wTxwef36b9PIleWj5/YwBxGlrb/rh6hCd9004pZK4EMI3qU
53SK8Qron8TIXjey6XfmAY8rcl030GsHr0Zqf5i2pZKE5g0iaGlM3Cwkpo0SxQsu
kMNqiSs9NzX7LxB+YeuAauIvC1YA2F/MGTXeFCTtO9Beyp5oV7oOI+2zIvLjlG5M
Z5hKjG/STkNOteoIBGZpe1+QNpoGHSBoGE3nQnGpXb82iLx1KVBcKuQ6GoWGv1Wo
hqiSh9kJX479l0mLML+IzaDsgSglbg==
=pvWx
-----END PGP SIGNATURE-----
Merge tag 'pull-maintainer-june24-240624-1' of https://gitlab.com/stsquad/qemu into staging
maintainer updates (plugins, gdbstub):
- add missing include guard comment to gdbstub.h
- move gdbstub enums into separate header
- move qtest_[get|set]_virtual_clock functions
- allow plugins to manipulate the virtual clock
- introduce an Instructions Per Second plugin
- fix inject_mem_cb rw mask tests
- allow qemu_plugin_vcpu_mem_cb to shortcut when no memory cbs
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmZ5OjoACgkQ+9DbCVqe
# KkQPlwf/VK673BAjYktuCLnf3DgWvIkkiHWwzBREP5MmseUloLjK2CQPLY/xWZED
# pbA/1OSzHViD/mvG5wTxwef36b9PIleWj5/YwBxGlrb/rh6hCd9004pZK4EMI3qU
# 53SK8Qron8TIXjey6XfmAY8rcl030GsHr0Zqf5i2pZKE5g0iaGlM3Cwkpo0SxQsu
# kMNqiSs9NzX7LxB+YeuAauIvC1YA2F/MGTXeFCTtO9Beyp5oV7oOI+2zIvLjlG5M
# Z5hKjG/STkNOteoIBGZpe1+QNpoGHSBoGE3nQnGpXb82iLx1KVBcKuQ6GoWGv1Wo
# hqiSh9kJX479l0mLML+IzaDsgSglbg==
# =pvWx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 24 Jun 2024 02:19:54 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-june24-240624-1' of https://gitlab.com/stsquad/qemu:
accel/tcg: Avoid unnecessary call overhead from qemu_plugin_vcpu_mem_cb
plugins: fix inject_mem_cb rw masking
contrib/plugins: add Instructions Per Second (IPS) example for cost modeling
plugins: add migration blocker
plugins: add time control API
qtest: move qtest_{get, set}_virtual_clock to accel/qtest/qtest.c
sysemu: generalise qtest_warp_clock as qemu_clock_advance_virtual_time
qtest: use cpu interface in qtest_clock_warp
sysemu: add set_virtual_time to accel ops
plugins: Ensure register handles are not NULL
gdbstub: move enums into separate header
include/exec: add missing include guard comment
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expose the ability to control time through the plugin API. Only one
plugin can control time so it has to request control when loaded.
There are probably more corner cases to catch here.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
[AJB: tweaked user-mode handling, merged QEMU_PLUGIN_API fix]
Message-Id: <20240530220610.1245424-6-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240620152220.2192768-9-alex.bennee@linaro.org>