Commit Graph

33727 Commits

Author SHA1 Message Date
Leonardo Bras 625b370c45 pcie: Add hotplug detect state register to cmask
When trying to migrate a machine type pc-q35-6.0 or lower, with this
cmdline options,

-device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \
-device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1

the following bug happens after all ram pages were sent:

qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port'
qemu-kvm: load of migration failed: Invalid argument

This happens on pc-q35-6.0 or lower because of:
{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }

In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(),
which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI
hotplug for the guest. After a while the guest will deal with this hotplug
and qemu will clear the above bit.

Then, during migration, get_pci_config_device() will compare the
configs of both the freshly created device and the one that is being
received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit
and cause the bug to reproduce.

To avoid this fake incompatibility, there are tree fields in PCIDevice that
can help:

- wmask: Used to implement R/W bytes, and
- w1cmask: Used to implement RW1C(Write 1 to Clear) bytes
- cmask: Used to enable config checks on load.

According to PCI Express® Base Specification Revision 5.0 Version 1.0,
table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is
listed as RO (read-only), so it only makes sense to make use of the cmask
field.

So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on
get_pci_config_device() does not abort the migration.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230706045546.593605-3-leobras@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2023-07-10 18:59:32 -04:00
Eric Auger 587a7641d5 virtio-iommu: Rework the traces in virtio_iommu_set_page_size_mask()
The current error messages in virtio_iommu_set_page_size_mask()
sound quite similar for different situations and miss the IOMMU
memory region that causes the issue.

Clarify them and rework the comment.

Also remove the trace when the new page_size_mask is not applied as
the current frozen granule is kept. This message is rather confusing
for the end user and anyway the current granule would have been used
by the driver.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20230705165118.28194-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
2023-07-10 18:59:32 -04:00
Eric Auger 94df5b2180 virtio-iommu: Fix 64kB host page size VFIO device assignment
When running on a 64kB page size host and protecting a VFIO device
with the virtio-iommu, qemu crashes with this kind of message:

qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible
with mask 0x20010000
qemu: hardware error: vfio: DMA mapping failed, unable to continue

This is due to the fact the IOMMU MR corresponding to the VFIO device
is enabled very late on domain attach, after the machine init.
The device reports a minimal 64kB page size but it is too late to be
applied. virtio_iommu_set_page_size_mask() fails and this causes
vfio_listener_region_add() to end up with hw_error();

To work around this issue, we transiently enable the IOMMU MR on
machine init to collect the page size requirements and then restore
the bypass state.

Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device")
Signed-off-by: Eric Auger <eric.auger@redhat.com>

Message-Id: <20230705165118.28194-2-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2023-07-10 18:59:32 -04:00
Ani Sinha ca92eb5def hw/pci: warn when PCIe device is plugged into non-zero slot of downstream port
PCIe downstream ports only have a single device 0, so PCI Express devices can
only be plugged into slot 0 on a PCIe port. Add a warning to let users know
when the invalid configuration is used. We may enforce this more strongly later
once we get more clarity on whether we are introducing a bad regression for
users currently using the wrong configuration.

The change has been tested to not break or alter behaviors of ARI capable
devices by instantiating seven vfs on an emulated igb device (the maximum
number of vfs the igb device supports). The vfs are instantiated correctly
and are seen to have non-zero device/slot numbers in the conventional PCI BDF
representation.

CC: jusual@redhat.com
CC: imammedo@redhat.com
CC: mst@redhat.com
CC: akihiko.odaki@daynix.com

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20230705115925.5339-6-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
2023-07-10 18:59:32 -04:00
Laurent Vivier 77812aa7b1 vhost-vdpa: mute unaligned memory error report
With TPM CRM device, vhost-vdpa reports an error when it tries
to register a listener for a non aligned memory region:

  qemu-system-x86_64: vhost_vdpa_listener_region_add received unaligned region
  qemu-system-x86_64: vhost_vdpa_listener_region_del received unaligned region

This error can be confusing for the user whereas we only need to skip
the region (as it's already done after the error_report())

Rather than introducing a special case for TPM CRB memory section
to not display the message in this case, simply replace the
error_report() by a trace function (with more information, like the
memory region name).

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230704071931.575888-2-lvivier@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 18:59:32 -04:00
Akihiko Odaki 08f6328480 pcie: Release references of virtual functions
pci_new() automatically retains a reference to a virtual function when
registering it so we need to release the reference when unregistering.

Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230411090408.48366-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2023-07-10 18:59:32 -04:00
Bernhard Beschow c925f40a29 hw/pci/pci: Remove multifunction parameter from pci_new_multifunction()
There is also pci_new() which creates non-multifunction PCI devices.
Accordingly the parameter is always set to true when a multi function PCI
device is to be created.

The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be resolved by factoring out an internal helper
function.

Remove this redundant, error-prone parameter.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 18:59:32 -04:00
Bernhard Beschow e052944a96 hw/pci/pci: Remove multifunction parameter from pci_create_simple_multifunction()
There is also pci_create_simple() which creates non-multifunction PCI
devices. Accordingly the parameter is always set to true when a multi
function PCI device is to be created.

The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be replaced by trivial code.

Remove this redundant, error-prone parameter.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow c589f7cf08 hw/i386/pc_piix: Move i440fx' realize near its qdev_new()
I440FX realization is currently mixed with PIIX3 creation. Furthermore, it is
common practice to only set properties between a device's qdev_new() and
qdev_realize(). Clean up to resolve both issues.

Since I440FX spawns a PCI bus let's also move the pci_bus initialization there.

Note that when running `qemu-system-x86_64 -M pc -S` before and after this
patch, `info mtree` in the QEMU console doesn't show any differences except that
the ordering is different.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-18-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow ce5ac09a75 hw/pci-host/i440fx: Resolve i440fx_init()
i440fx_init() is a legacy init function. The previous patches worked towards
TYPE_I440FX_PCI_HOST_BRIDGE to be instantiated the QOM way. Do this now by
transforming the parameters passed to i440fx_init() into property assignments.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-17-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-07-10 16:29:17 -04:00
Bernhard Beschow ff0a8cc4be hw/pci-host/i440fx: Add I440FX_HOST_PROP_PCI_TYPE property
I440FX needs a different PCI device model if the "igd-passthru" property is
enabled. The type name is currently passed as a parameter to i440fx_init(). This
parameter will be replaced by a property assignment once i440fx_init() gets
resolved.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-16-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 82feef45f4 hw/pci-host/i440fx: Add PCI_HOST_{ABOVE, BELOW}_4G_MEM_SIZE properties
Introduce the properties in anticipation of QOM'ification; Q35 has the same
properties.

Note that we want to avoid a "ram size" property in the QOM interface since it
seems redundant to both properties introduced in this change. Thus the removal
of the ram_size parameter. We assume the invariant of both properties to sum up
to "ram size" which is already asserted in pc_memory_init(). Under Xen the
invariant seems to hold as well, so we now also check it there.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-15-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow c84858fd90 hw/pci-host/i440fx: Add PCI_HOST_PROP_IO_MEM property
Introduce the property in anticipation of QOM'ification; Q35 has the same
property.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-14-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 09f85b7b93 hw/pci-host/i440fx: Make MemoryRegion pointers accessible as properties
The goal is to eliminate i440fx_init() which is a legacy init function. This
neccessitates the memory regions to be properties, like in Q35, which will be
assigned in board code.

Since i440fx needs different PCI devices in Xen mode, and since i440fx shall
be self-contained, the PCI device will be created during realization of the
host. Thus the pointers need to be moved to the host structure to be usable as
properties.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-13-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 44df0552a0 hw/pci-host/i440fx: Move i440fx_realize() into PCII440FXState section
i440fx_realize() realizes the PCI device inside the host bridge
(PCII440FXState), but is implemented between i440fx_pcihost_realize() and
i440fx_init() which deal with the host bridge itself (I440FXState). Since we
want to append i440fx_init() to i440fx_pcihost_realize() later let's move
i440fx_realize() out of the way.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-12-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow a707466dd6 hw/pci-host/i440fx: Have common names for some local variables
`PCIHostState` is often referred to as `phb`, own device state usually as `s`.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-11-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow cda39f134b hw/pci-host/i440fx: Replace magic values by existing constants
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-10-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow f00f5e4b00 hw/pci-host/i440fx: Add "i440fx" child property in board code
The parent-child relation is usually established near a child's qdev_new(). For
i440fx this allows for reusing the machine parameter, thus avoiding
qdev_get_machine() which relies on a global variable.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-9-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow b90d7bff18 hw/i386/pc_piix: Turn some local variables into initializers
Eliminates an else branch.

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 3d664a9a38 hw/pci-host/q35: Make some property name macros reusable by i440fx
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow e36102cb07 hw/pci-host/q35: Initialize PCI_HOST_BYPASS_IOMMU property from board code
The Q35 PCI host already has a PCI_HOST_BYPASS_IOMMU property. However, the
host initializes this property itself by accessing global machine state,
thereby assuming it to be a PC machine. Avoid this by having board code
set this property.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 3b20f4ca59 hw/pci/pci_host: Introduce PCI_HOST_BYPASS_IOMMU macro
Introduce a macro to avoid copy and pasting strings which can easily
cause typos.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230630073720.21297-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 06a492bd2b hw/pci-host/q35: Initialize PCMachineState::bus in board code
The Q35 PCI host currently sets the PC machine's PCI bus attribute
through global state, thereby assuming the machine to be a PC machine.
The Q35 machine code already holds on to Q35's pci bus attribute, so can
easily set its own property while preserving encapsulation.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 8cf08065b1 hw/pci-host/q35: Fix double, contradicting .endianness assignment
Fixes the following clangd warning (-Winitializer-overrides):

  q35.c:297:19: Initializer overrides prior initialization of this subobject
  q35.c:292:19: previous initialization is here

Settle on little endian which is consistent with using pci_host_conf_le_ops.

Fixes: bafc90bdc5 ("q35: implement TSEG")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Bernhard Beschow 00f52e77d7 hw/i386/pc_q35: Resolve redundant q35_host variable
The variable is redundant to "phb" and is never used by its real type.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:29:17 -04:00
Tom Lonergan 667e58aef1 vhost-user: Make RESET_DEVICE a per device message
A device reset is issued per device, not per VQ. The legacy device reset
message, VHOST_USER_RESET_OWNER, is already a per device message. Therefore,
this change adds the proper message, VHOST_USER_RESET_DEVICE, to per device
messages.

Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-3-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-07-10 16:17:08 -04:00
Tom Lonergan 0dcb4172f2 vhost-user: Change one_time to per_device request
Some devices, like virtio-scsi, consist of one vhost_dev, while others, like
virtio-net, contain multiple vhost_devs. The QEMU vhost-user code has a
concept of one-time messages which is misleading. One-time messages are sent
once per operation on the device, not once for the lifetime of the device.
Therefore, as discussed in [1], vhost_user_one_time_request should be
renamed to vhost_user_per_device_request and the relevant comments updated
to match the real functionality.

[1] https://lore.kernel.org/qemu-devel/20230127083027-mutt-send-email-mst@kernel.org/

Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-2-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-07-10 16:17:08 -04:00
Zhao Liu 196ea60a73 hw/smbios: Fix core count in type4
>From SMBIOS 3.0 specification, core count field means:

Core Count is the number of cores detected by the BIOS for this
processor socket. [1]

Before 003f230e37 ("machine: Tweak the order of topology members in
struct CpuTopology"), MachineState.smp.cores means "the number of cores
in one package", and it's correct to use smp.cores for core count.

But 003f230e37 changes the smp.cores' meaning to "the number of cores
in one die" and doesn't change the original smp.cores' use in smbios as
well, which makes core count in type4 go wrong.

Fix this issue with the correct "cores per socket" caculation.

[1] SMBIOS 3.0.0, section 7.5.6, Processor Information - Core Count

Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:08 -04:00
Zhao Liu 7298fd7de5 hw/smbios: Fix thread count in type4
>From SMBIOS 3.0 specification, thread count field means:

Thread Count is the total number of threads detected by the BIOS for
this processor socket. It is a processor-wide count, not a
thread-per-core count. [1]

So here we should use threads per socket other than threads per core.

[1] SMBIOS 3.0.0, section 7.5.8, Processor Information - Thread Count

Fixes: c97294ec1b ("SMBIOS: Build aggregate smbios tables and entry point")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:08 -04:00
Zhao Liu d79a284a44 hw/smbios: Fix smbios_smp_sockets caculation
smp.sockets is the number of sockets which is configured by "-smp" (
otherwise, the default is 1). Trying to recalculate it here with another
rules leads to errors, such as:

1. 003f230e37 ("machine: Tweak the order of topology members in struct
   CpuTopology") changes the meaning of smp.cores but doesn't fix
   original smp.cores uses.

   With the introduction of cluster, now smp.cores means the number of
   cores in one cluster. So smp.cores * smp.threads just means the
   threads in a cluster not in a socket.

2. On the other hand, we shouldn't use smp.cpus here because it
   indicates the initial number of online CPUs at the boot time, and is
   not mathematically related to smp.sockets.

So stop reinventing the another wheel and use the topo values that
has been calculated.

Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:08 -04:00
Zhao Liu a1d027be95 machine: Add helpers to get cores/threads per socket
The number of cores/threads per socket are needed for smbios, and are
also useful for other modules.

Provide the helpers to wrap the calculation of cores/threads per socket
so that we can avoid calculation errors caused by other modules miss
topology changes.

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:08 -04:00
Milan Zamazal c46b20cf83 hw/virtio: Add vhost-user-scmi-pci boilerplate
This allows is to instantiate a vhost-user-scmi device as part of a PCI bus.
It is mostly boilerplate similar to the other vhost-user-*-pci boilerplates
of similar devices.

Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Message-Id: <20230628100524.342666-3-mzamazal@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:08 -04:00
Milan Zamazal a5dab090e1 hw/virtio: Add boilerplate for vhost-user-scmi device
This creates the QEMU side of the vhost-user-scmi device which connects to
the remote daemon.  It is based on code of similar vhost-user devices.

Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Message-Id: <20230628100524.342666-2-mzamazal@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:07 -04:00
Erico Nunes 31f137e3d6 vhost-user-gpu: implement get_edid frontend feature
Implement the frontend side of the get_edid feature in the qemu
vhost-user-gpu frontend device.

Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230626164708.1163239-5-ernunes@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:07 -04:00
Erico Nunes ee3729d9b0 virtio-gpu: refactor generate_edid function to virtio_gpu_base
This functionality can be shared with upcoming use in vhost-user-gpu, so
move it to the shared file to avoid duplicating it.

Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230626164708.1163239-2-ernunes@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 16:17:07 -04:00
Viktor Prutyanov cd9b834688 virtio-net: pass Device-TLB enable/disable events to vhost
If vhost is enabled for virtio-net, Device-TLB enable/disable events
must be passed to vhost for proper IOMMU unmap flag selection.

Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-3-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 15:07:50 -04:00
Viktor Prutyanov ee071f67f7 vhost: register and change IOMMU flag depending on Device-TLB state
The guest can disable or never enable Device-TLB. In these cases,
it can't be used even if enabled in QEMU. So, check Device-TLB state
before registering IOMMU notifier and select unmap flag depending on
that. Also, implement a way to change IOMMU notifier flag if Device-TLB
state is changed.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 15:07:50 -04:00
Eugenio Pérez 2b5de4d7df vdpa: Remove status in reset tracing
It is always 0 and it is not useful to route call through file
descriptor.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153736.472443-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10 15:07:50 -04:00
Alex Williamson c00aac6f14 vfio/pci: Enable AtomicOps completers on root ports
Dynamically enable Atomic Ops completer support around realize/exit of
vfio-pci devices reporting host support for these accesses and adhering
to a minimal configuration standard.  While the Atomic Ops completer
bits in the root port device capabilities2 register are read-only, the
PCIe spec does allow RO bits to change to reflect hardware state.  We
take advantage of that here around the realize and exit functions of
the vfio-pci device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Robin Voetter <robin@streamhpc.com>
Tested-by: Robin Voetter <robin@streamhpc.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Alex Williamson cf2916f6a1 pcie: Add a PCIe capability version helper
Report the PCIe capability version for a device

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Robin Voetter <robin@streamhpc.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Tony Krowiak 1360b2ad1f s390x/ap: Wire up the device request notifier interface
Let's wire up the device request notifier interface to handle device unplug
requests for AP.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20230530225544.280031-1-akrowiak@linux.ibm.com/
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Avihai Horon 8af87a3ec7 vfio: Fix null pointer dereference bug in vfio_bars_finalize()
vfio_realize() has the following flow:
1. vfio_bars_prepare() -- sets VFIOBAR->size.
2. msix_early_setup().
3. vfio_bars_register() -- allocates VFIOBAR->mr.

After vfio_bars_prepare() is called msix_early_setup() can fail. If it
does fail, vfio_bars_register() is never called and VFIOBAR->mr is not
allocated.

In this case, vfio_bars_finalize() is called as part of the error flow
to free the bars' resources. However, vfio_bars_finalize() calls
object_unparent() for VFIOBAR->mr after checking only VFIOBAR->size, and
thus we get a null pointer dereference.

Fix it by checking VFIOBAR->mr in vfio_bars_finalize().

Fixes: 89d5202edc ("vfio/pci: Allow relocating MSI-X MMIO")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Zhenzhong Duan d4a2af747d vfio/migration: Return bool type for vfio_migration_realize()
Make vfio_migration_realize() adhere to the convention of other realize()
callbacks(like qdev_realize) by returning bool instead of int.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Zhenzhong Duan 0520d63c77 vfio/migration: Remove print of "Migration disabled"
Property enable_migration supports [on/off/auto].
In ON mode, error pointer is passed to errp and logged.
In OFF mode, we doesn't need to log "Migration disabled" as it's intentional.
In AUTO mode, we should only ever see errors or warnings if the device
supports migration and an error or incompatibility occurs while further
probing or configuring it. Lack of support for migration shoundn't
generate an error or warning.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Zhenzhong Duan 2b43b2995b vfio/migration: Free resources when vfio_migration_realize fails
When vfio_realize() succeeds, hot unplug will call vfio_exitfn()
to free resources allocated in vfio_realize(); when vfio_realize()
fails, vfio_exitfn() is never called and we need to free resources
in vfio_realize().

In the case that vfio_migration_realize() fails,
e.g: with -only-migratable & enable-migration=off, we see below:

(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
0000:81:11.1: Migration disabled
Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device

If we hotplug again we should see same log as above, but we see:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
Error: vfio 0000:81:11.1: device is already attached

That's because some references to VFIO device isn't released.
For resources allocated in vfio_migration_realize(), free them by
jumping to out_deinit path with calling a new function
vfio_migration_deinit(). For resources allocated in vfio_realize(),
free them by jumping to de-register path in vfio_realize().

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: a22651053b ("vfio: Make vfio-pci device migration capable")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Zhenzhong Duan 3c26c80a0a vfio/migration: Change vIOMMU blocker from global to per device
Contrary to multiple device blocker which needs to consider already-attached
devices to unblock/block dynamically, the vIOMMU migration blocker is a device
specific config. Meaning it only needs to know whether the device is bypassing
or not the vIOMMU (via machine property, or per pxb-pcie::bypass_iommu), and
does not need the state of currently present devices. For this reason, the
vIOMMU global migration blocker can be consolidated into the per-device
migration blocker, allowing us to remove some unnecessary code.

This change also makes vfio_mig_active() more accurate as it doesn't check for
global blocker.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Zhenzhong Duan adee0da036 vfio/pci: Disable INTx in vfio_realize error path
When vfio realize fails, INTx isn't disabled if it has been enabled.
This may confuse host side with unhandled interrupt report.

Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Alex Williamson 0ddcb39c93 hw/vfio/pci-quirks: Sanitize capability pointer
Coverity reports a tained scalar when traversing the capabilities
chain (CID 1516589).  In practice I've never seen a device with a
chain so broken as to cause an issue, but it's also pretty easy to
sanitize.

Fixes: f6b30c1984 ("hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-07-10 09:52:52 +02:00
Richard Henderson 9e7ce9ebce qemu trivial patches for 2023-07-08
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmSo5UsPHG1qdEB0bHMu
 bXNrLnJ1AAoJEHAbT2saaT5ZIygH/0z5heVC4CnU5sNuv8GRbXZL7sOXxvoOQf9K
 dEBD0/Lzu8QL7mJBpqPuzK4FN/CNnY2nUaGGn3k7U44MrsU6g2P5ksD4Y0hUebkZ
 sw+bsaqDeG0J+kZ9IN4V1iLoyGGQ53GDynisZMWY8w/hDmCkoRe5Xy2LZrXS7fgw
 LVT04wlxTYG/EwXqMukBYd/S/lap8pUnzgc+VngHmX5gBF4O7qIAZV1j5WobIW6S
 /OOOOBZ4YJU9Ha18H4DWrE5+ptfE2sfWAO0z+c/v3bpNTUb9bCyNIE3R3mUwg7z6
 i/96za3R9XH0ChoBWrKHtFn5wygUMGHt63JwIph0bwr73ENVlrE=
 =4nBR
 -----END PGP SIGNATURE-----

Merge tag 'trivial-patches-20230708' of https://gitlab.com/mjt0k/qemu into staging

qemu trivial patches for 2023-07-08

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmSo5UsPHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZIygH/0z5heVC4CnU5sNuv8GRbXZL7sOXxvoOQf9K
# dEBD0/Lzu8QL7mJBpqPuzK4FN/CNnY2nUaGGn3k7U44MrsU6g2P5ksD4Y0hUebkZ
# sw+bsaqDeG0J+kZ9IN4V1iLoyGGQ53GDynisZMWY8w/hDmCkoRe5Xy2LZrXS7fgw
# LVT04wlxTYG/EwXqMukBYd/S/lap8pUnzgc+VngHmX5gBF4O7qIAZV1j5WobIW6S
# /OOOOBZ4YJU9Ha18H4DWrE5+ptfE2sfWAO0z+c/v3bpNTUb9bCyNIE3R3mUwg7z6
# i/96za3R9XH0ChoBWrKHtFn5wygUMGHt63JwIph0bwr73ENVlrE=
# =4nBR
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jul 2023 05:25:47 AM BST
# gpg:                using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg:                issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [undefined]
# gpg:                 aka "Michael Tokarev <mjt@debian.org>" [undefined]
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* tag 'trivial-patches-20230708' of https://gitlab.com/mjt0k/qemu:
  hw/arm/virt-acpi-build.c: Add missing header
  migration: unexport migrate_fd_error()
  migration: factor out "resume_requested" in qmp_migrate()
  qemu-options.hx: Fix indentation of some option descriptions
  vdpa: Sort vdpa_feature_bits array alphabetically
  vdpa: Delete duplicated VIRTIO_NET_F_RSS in vdpa_feature_bits
  hw: Simplify calls to pci_nic_init_nofail()
  trivial: man page: document display::gtk::zoom-to-fit
  target/avr: Fix handling of interrupts above 33.
  hw/riscv/virt.c: fix typo in 'aia' description

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:24:30 +01:00
Peng Liang 13a637430b hw/arm/virt-acpi-build.c: Add missing header
virt-acpi-build.c uses warn_report. However, it doesn't include
qemu/error-report.h directly, it include qemu/error-report.h via trace.h
if we enable log trace backend. But if we disable the log trace backend
(e.g., --enable-trace-backends=nop), then virt-acpi-build.c will not
include qemu/error-report.h any more and it will lead to build errors.
Include qemu/error-report.h directly in virt-acpi-build.c to avoid the
errors.

Fixes: 451b157041 ("acpi: Align the size to 128k")
Signed-off-by: Peng Liang <tcx4c70@gmail.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: move the #include higher as suggested by Ani Sinha)
2023-07-08 07:24:38 +03:00