Commit Graph

33106 Commits

Author SHA1 Message Date
David Woodhouse ecb0e98b4f hw/intc/i8259: Implement legacy LTIM Edge/Level Bank Select
Back in the mists of time, before EISA came along and required per-pin
level control in the ELCR register, the i8259 had a single chip-wide
level-mode control in bit 3 of ICW1.

Even in the PIIX3 datasheet from 1996 this is documented as 'This bit is
disabled', but apparently MorphOS is using it in the version of the
i8259 which is in the Pegasos2 board as part of the VT8231 chipset.

It's easy enough to implement, and I think it's harmless enough to do so
unconditionally.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
[balaton: updated commit message as asked by author]
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <3f09b2dd109d19851d786047ad5c2ff459c90cd7.1678188711.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-08 00:37:48 +01:00
BALATON Zoltan 4e02105257 hw/display/sm501: Add debug property to control pixman usage
Add a property to allow disabling pixman and always use the fallbacks
for different operations which is useful for testing different drawing
methods or debugging pixman related issues.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Message-Id: <61768ffaefa71b65a657d1365823bd43c7ee9354.1678188711.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-08 00:37:48 +01:00
BALATON Zoltan 3820001131 Revert "hw/isa/vt82c686: Remove intermediate IRQ forwarder"
To be 'usable', QDev objects (which are QOM objects) must be
1/ initialized (at this point their properties can be modified), then
2/ realized (properties are consumed).
Some devices (objects) might depend on other devices. When creating
the 'QOM composition tree', parent objects can't be 'realized' until
all their children are. We might also have circular dependencies.
A common circular dependency occurs with IRQs. Device (A) has an
output IRQ wired to device (B), and device (B) has one to device (A).
When (A) is realized and connects its IRQ to an unrealized (B), the
IRQ handler on (B) is not yet created. QEMU pass IRQ between objects
as pointer. When (A) poll (B)'s IRQ, it is NULL. Later (B) is realized
and its IRQ pointers are populated, but (A) keeps a reference to a
NULL pointer.
A common pattern to bypass this circular limitation is to use 'proxy'
objects. Proxy (P) is created (and realized) before (A) and (B). Then
(A) and (B) can be created in different order, it doesn't matter: (P)
pointers are already populated.

Commit bb98e0f59c ("hw/isa/vt82c686: Remove intermediate IRQ
forwarder") neglected the QOM/QDev circular dependency issue, and
removed the 'proxy' between the southbridge, its PCI functions and the
interrupt controller, resulting in PCI functions wiring output IRQs to
'NULL', leading to guest failures (IRQ never delivered) [1] [2].

Since we are entering feature freeze, it is safer to revert the
offending patch until we figure a way to strengthen our APIs.

[1] https://lore.kernel.org/qemu-devel/928a8552-ab62-9e6c-a492-d6453e338b9d@redhat.com/
[2] https://lore.kernel.org/qemu-devel/cover.1677628524.git.balaton@eik.bme.hu/

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <cdfb3c5a42e505450f6803124f27856434c5b298.1677628524.git.balaton@eik.bme.hu>
[PMD: Reworded description]
Inspired-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-08 00:37:48 +01:00
Philippe Mathieu-Daudé d1396cc749 Revert "hw/isa/i82378: Remove intermediate IRQ forwarder"
To be 'usable', QDev objects (which are QOM objects) must be
1/ initialized (at this point their properties can be modified), then
2/ realized (properties are consumed).
Some devices (objects) might depend on other devices. When creating
the 'QOM composition tree', parent objects can't be 'realized' until
all their children are. We might also have circular dependencies.
A common circular dependency occurs with IRQs. Device (A) has an
output IRQ wired to device (B), and device (B) has one to device (A).
When (A) is realized and connects its IRQ to an unrealized (B), the
IRQ handler on (B) is not yet created. QEMU pass IRQ between objects
as pointer. When (A) poll (B)'s IRQ, it is NULL. Later (B) is realized
and its IRQ pointers are populated, but (A) keeps a reference to a
NULL pointer.
A common pattern to bypass this circular limitation is to use 'proxy'
objects. Proxy (P) is created (and realized) before (A) and (B). Then
(A) and (B) can be created in different order, it doesn't matter: (P)
pointers are already populated.

Commit cef2e7148e ("hw/isa/i82378: Remove intermediate IRQ forwarder")
neglected the QOM/QDev circular dependency issue, and removed the
'proxy' between the southbridge, its PCI functions and the interrupt
controller, resulting in PCI functions wiring output IRQs to
'NULL', leading to guest failures (IRQ never delivered) [1] [2].

Since we are entering feature freeze, it is safer to revert the
offending patch until we figure a way to strengthen our APIs.

[1] https://lore.kernel.org/qemu-devel/928a8552-ab62-9e6c-a492-d6453e338b9d@redhat.com/
[2] https://lore.kernel.org/qemu-devel/cover.1677628524.git.balaton@eik.bme.hu/

This reverts commit cef2e7148e.

Reported-by: Thomas Huth <thuth@redhat.com>
Inspired-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-08 00:37:48 +01:00
Philippe Mathieu-Daudé 4c921e3fb2 hw/mips/itu: Pass SAAR using QOM link property
QOM objects shouldn't access each other internals fields
except using the QOM API.

mips_cps_realize() instantiates a TYPE_MIPS_ITU object, and
directly sets the 'saar' pointer:

   if (saar_present) {
       s->itu.saar = &env->CP0_SAAR;
   }

In order to avoid that, pass the MIPS_CPU object via a QOM
link property, and set the 'saar' pointer in mips_itu_realize().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230203113650.78146-10-philmd@linaro.org>
2023-03-08 00:37:48 +01:00
Philippe Mathieu-Daudé 10997f2d1d hw/mips: Declare all length properties as unsigned
Some length properties are signed, other unsigned:

  hw/mips/cps.c:183:    DEFINE_PROP_UINT32("num-vp", MIPSCPSState, num_vp, 1),
  hw/mips/cps.c:184:    DEFINE_PROP_UINT32("num-irq", MIPSCPSState, num_irq, 256),
  hw/misc/mips_cmgcr.c:215:    DEFINE_PROP_INT32("num-vp", MIPSGCRState, num_vps, 1),
  hw/misc/mips_cpc.c:167:    DEFINE_PROP_UINT32("num-vp", MIPSCPCState, num_vp, 0x1),
  hw/misc/mips_itu.c:552:    DEFINE_PROP_INT32("num-fifo", MIPSITUState, num_fifo,
  hw/misc/mips_itu.c:554:    DEFINE_PROP_INT32("num-semaphores", MIPSITUState,

Since negative values are not used (the minimum is '0'),
unify by declaring all properties as unsigned.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230203113650.78146-9-philmd@linaro.org>
2023-03-08 00:37:48 +01:00
Cédric Le Goater 969dae5448 vfio: Fix vfio_get_dev_region() trace event
Simply transpose 'x8' to fix the typo and remove the ending '8'

Fixes: e61a424f05 ("vfio: Create device specific region info helper")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1526
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20230303074330.2609377-1-clg@kaod.org
[aw: commit log s/revert/transpose/]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 11:19:07 -07:00
Alex Williamson 8249cffc62 vfio/migration: Rename entry points
Pick names that align with the section drivers should use them from,
avoiding the confusion of calling a _finalize() function from _exit()
and generalizing the actual _finalize() to handle removing the viommu
blocker.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/167820912978.606734.12740287349119694623.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 11:19:07 -07:00
Jonathan Cameron 415442a1b4 hw/mem/cxl_type3: Add CXL RAS Error Injection Support.
CXL uses PCI AER Internal errors to signal to the host that an error has
occurred. The host can then read more detailed status from the CXL RAS
capability.

For uncorrectable errors: support multiple injection in one operation
as this is needed to reliably test multiple header logging support in an
OS. The equivalent feature doesn't exist for correctable errors, so only
one error need be injected at a time.

Note:
 - Header content needs to be manually specified in a fashion that
   matches the specification for what can be in the header for each
   error type.

Injection via QMP:
{ "execute": "qmp_capabilities" }
...
{ "execute": "cxl-inject-uncorrectable-errors",
  "arguments": {
    "path": "/machine/peripheral/cxl-pmem0",
    "errors": [
        {
            "type": "cache-address-parity",
            "header": [ 3, 4]
        },
        {
            "type": "cache-data-parity",
            "header": [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]
        },
        {
            "type": "internal",
            "header": [ 1, 2, 4]
        }
        ]
  }}
...
{ "execute": "cxl-inject-correctable-error",
    "arguments": {
        "path": "/machine/peripheral/cxl-pmem0",
        "type": "physical"
    } }

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230302133709.30373-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 4a295211f7 hw/pci/aer: Make PCIE AER error injection facility available for other emulation to use.
This infrastructure will be reused for CXL RAS error injection
in patches that follow.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230302133709.30373-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron cb4e642cfa hw/cxl: Fix endian issues in CXL RAS capability defaults / masks
As these are about to be modified, fix the endian handle for
this set of registers rather than making it worse.

Note that CXL is currently only supported in QEMU on
x86 (arm64 patches out of tree) so we aren't going to yet hit
an problems with big endian. However it is good to avoid making
things worse for that support in the future.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230302133709.30373-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 6be947bdfc hw/mem/cxl-type3: Add AER extended capability
This enables AER error injection to function as expected.
It is intended as a building block in enabling CXL RAS error injection
in the following patches.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20230302133709.30373-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 7e33517fdd hw/pci-bridge/cxl_root_port: Wire up MSI
Done to avoid fixing ACPI route description of traditional PCI interrupts on q35
and because we should probably move with the times anyway.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20230302133709.30373-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 47f0e7ab32 hw/pci-bridge/cxl_root_port: Wire up AER
We are missing necessary config write handling for AER emulation in
the CXL root port. Add it based on pcie_root_port.c

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20230302133709.30373-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 9a6ef182c0 hw/pci/aer: Add missing routing for AER errors
PCIe r6.0 Figure 6-3 "Pseudo Logic Diagram for Selected Error Message Control
and Status Bits" includes a right hand branch under "All PCI Express devices"
that allows for messages to be generated or sent onwards without SERR#
being set as long as the appropriate per error class bit in the PCIe
Device Control Register is set.

Implement that branch thus enabling routing of ERR_COR, ERR_NONFATAL
and ERR_FATAL under OSes that set these bits appropriately (e.g. Linux)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20230302133709.30373-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Jonathan Cameron 010746ae1d hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register
This register in AER should be both writeable and should
have a default value with a couple of the errors masked
including the Uncorrectable Internal Error used by CXL for
it's error reporting.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Message-Id: <20230302133709.30373-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov f18e29fc90 pcihp: add ACPI PCI hotplug specific is_hotpluggable_bus() callback
Provide pcihp specific callback to check if bus is hotpluggable
and consolidate its scattered hotplug criteria there.
While at it clean up no longer needed
   qbus_set_hotplug_handler(BUS(bus), NULL)
workarounds since callback makes qbus_is_hotpluggable() return
correct answer even if hotplug_handler is set on bus.

PS:
see ("pci: fix 'hotplugglable' property behavior") for details
why callback was introduced.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-35-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 6536e427ce pcihp: move fields enabling hotplug into AcpiPciHpState
... instead of duplicating them in piix4 and lpc and then
trying to pass them to pcihp routines as arguments.
it simplifies call sites and places pcihp specific in
its own structure.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-34-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 02c106139a acpi: pci: move out ACPI PCI hotplug generator from generic slot generator build_append_pci_bus_devices()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-33-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 62dd55fcf7 acpi: pci: move BSEL into build_append_pcihp_slots()
Generic PCI enumeration code doesn't really need access to
BSEL value, it is only used as means to decide if hotplug
enumerator should be called.

Use stateless object_property_find() to do that, and move
the rest of BSEL handling into build_append_pcihp_slots()
where it belongs.

This cleans up generic code a bit from hotplug stuff
and follow up patch will remove remaining call to
build_append_pcihp_slots() from generic code, making
it possible to use without ACPI PCI hotplug dependencies.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-32-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 419233b2b4 acpi: pci: drop BSEL usage when deciding that device isn't hotpluggable
previous commit ("pci: fix 'hotplugglable' property behavior") fixed
pcie root port's 'hotpluggable' property to behave consistently.

So we don't need a BSEL crutch anymore to see of device is not
hotpluggable, drop it from 'generic' PCI slots description handling.

BSEL is still used to decide if hotplug part should be called
but that will be moved out of generic code to hotplug one by
followup patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-31-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 041b1c40f3 pci: move acpi-index uniqueness check to generic PCI device code
acpi-index is now working with non-hotpluggable buses
(pci/q35 machine hostbridge), it can be used even if
ACPI PCI hotplug is disabled and as result acpi-index
uniqueness check will be omitted (since the check is
done by ACPI PCI hotplug handler, which isn't wired
when ACPI PCI hotplug is disabled).
Move check and related code to generic PCIDevice so it
would be independent of ACPI PCI hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-30-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 05a49b9c2f acpi: pci: describe all functions on populated slots
describing all present devices on functions other than
0 was complicated when non hotplug and hotplug code
was intermixed. So QEMU has been excluding non zero
functions since they are not supported by hotplug code,
then a condition to whitelist coldplugged bridges was
added and later whitelisting of devices that advertise
presence of their own AML description.

With non hotplug and hotplug code separated, it is
possible to relax rules and allow describing all
non-hotpluggble functions and hence simplify
conditions whether PCI device should be enumerated by
generic (non-hotplug) code.

Price of that simplification is an extra few Device()
descriptors in DSDT exposing built-in chipset functions,
which has no functional effect on guest side.

Apart from that, the enumeration of non zero functions,
allows to attach more NICs with acpi-index enabled
directly on hostbridge (if hotplug is not required).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-25-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 7fb1d7388b acpi: pci: support acpi-index for non-hotpluggable devices
Inject static _DSM (EDSM) if non-hotpluggable device has
acpi-index configured on it.
It lets use acpi-index non-hotpluggable devices / devices
attached to non-hotpluggable bus.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-22-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov fe0d5f5319 acpi: pci: add EDSM method to DSDT
it's a helper method for acpi-index support on PCI buses
that do no support or have disabled ACPI PCI hotplug
or for non-hotpluggble endpoint devices.
(like non-hotpluggble NICs, integrated endpoints and
later for machines that do not support ACPI PCI hotplug)

no functional change, commit adds only EDSM method in DSDT
without any users. (the follow up patches will use it)

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov 0a4584fca3 pcihp: move PCI _DSM function 0 prolog into separate function
it will be reused by follow up patches that will implement
static _DSM for non-hotpluggable devices.

no functional AML change, only context one, where 'cap' (Local1)
initialization is moved after UUID/revision checks.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-15-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov ceefa0b746 pci: fix 'hotplugglable' property behavior
Currently the property may flip its state
during VM bring up or just doesn't work as
the name implies.

In particular with PCIE root port that has
'hotplug={on|off}' property, and when it's
turned off, one would expect
  'hotpluggable' == false
for any devices attached to it.
Which is not the case since qbus_is_hotpluggable()
used by the property just checks for presence
of any hotplug_handler set on bus.

The problem is that name BusState::hotplug_handler
from its inception is misnomer, as it handles
not only hotplug but also in many cases coldplug
as well (i.e. generic wiring interface), and
it's fine to have hotplug_handler set on bus
while it doesn't support hotplug (ex. pcie-slot
with hotplug=off).

Another case of root port flipping 'hotpluggable'
state when ACPI PCI hotplug is enabled in this
case root port with 'hotplug=off' starts as
hotpluggable and then later on, pcihp
hotplug_handler clears hotplug_handler
explicitly after checking root port's 'hotplug'
property.

So root-port hotpluggablity check sort of works
if pcihp is enabled but is broken if pcihp is
disabled.

One way to deal with the issue is to ask
hotplug_handler if bus it controls is hotpluggable
or not. To do that add is_hotpluggable_bus()
hook to HotplugHandler interface and use it in
'hotpluggable' property + teach pcie-slot to
actually look into 'hotplug' property state
before deciding if bus is hotpluggable.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-13-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov f40e6a4cc1 pcihp: piix4: do not redirect hotplug controller to piix4 when ACPI hotplug is disabled
commit [1] added ability to disable ACPI PCI hotplug
on hostbridge but forgot to take into account that it
should disable all ACPI hotplug machinery in case both
hostbridge and bridge hotplug are disabled.

Commit [2] tried to fix that, however it forgot to
remove hotplug_handler override which hands hotplug
control over to piix4 hotplug controller
(uninitialized after [2]).

As result at the time bridge is plugged in, its default
(SHPC) hotplug handler is replaced by piix4 one in
  acpi_pcihp_device_plug_cb()
    ...
    if (!s->legacy_piix &&
       ...
       qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev));

which is acting on uninitialized s->legacy_piix value
(0 by default) that was supposed to be initialized by
acpi_pcihp_init(), that is no longer called due to
following condition being false:

  piix4_acpi_system_hot_add_init()
    if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug) {

and the bridge ends up with piix4 as hotplug handler
instead of shpc one.

Followup hotplug on that bridge as result yields
piix4 specific error:

  Error: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set

1) 3d7e78aa77 (Introduce a new flag for i440fx to disable PCI hotplug on the root bus)
2) df4008c9c5 (piix4: don't reserve hw resources when hotplug is off globally)

Fixes: df4008c9c5 (piix4: don't reserve hw resources when hotplug is off globally)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov 0e84fd3b98 x86: pcihp: fix missing bridge AML when intermediate root-port has 'hotplug=off' set
(I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
never worked)

When deciding if bridge should be described, the original
condition was

  cold_plugged_bridge && pcihp_bridge_en

which was replaced [1] by

  bridge has ACPI_PCIHP_PROP_BSEL

the later however is not the same thing as the original
and flips to false if intermediate bridge has hotplug
turned off (root-port with 'hotplug=off' option).

Since we already in build_pci_bridge_aml(), the question
if it's bridge is answered. Use DeviceState::hotplugged
to make decision if bridge should describe its slots.

What's left out is pcihp_bridge_en, which tells us if
ACPI bridge hotplug is enabled.

With hotplug and non hotplug part now being mostly
separated, omitting this check will only lead to
colplugged bridges describe occupied slots in case
when ACPI bridge hotplug is disabled.
Which makes behavior consistent with occupied slots
on hostbridge.

Ex (pc/DSDT.hpbrroot diff):
  ...
               Device (S20)
               {
                   Name (_ADR, 0x00040000)  // _ADR: Address
  +                Device (S08)
  +                {
  +                    Name (_ADR, 0x00010000)  // _ADR: Address
  +                }
  +
  +                Device (S10)
  +                {
  +                    Name (_ADR, 0x00020000)  // _ADR: Address
  +                }
               }
  ...

PS:
testing shows that above doesn't affect adversely guest OS
behavior: i.e. if ACPI bridge hotplug is enabled it's
expected behaviour, and with ACPI bridge hotplug is disabled
(a.k. native hotplug), it doesn't break slot enumeration
nor native hotplug. (tested with RHEL9.0 and WS2022).

1)
Fixes: 6c36ec46b0 ("pcihp: make bridge describe itself using AcpiDevAmlIfClass:build_dev_aml")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-10-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov 11215a349e x86: pcihp: fix missing PCNT callchain when intermediate root-port has 'hotplug=off' set
Beside BSEL numbers change (due to 2 extra root-ports in q35/miltibridge test),
following change is expected:

       Scope (\_SB.PCI0)
       {
  ...
  +        Scope (S50)
  +        {
  +            Scope (S00)
  +            {
  +                Method (PCNT, 0, NotSerialized)
  +                {
  +                    BNUM = Zero
  +                    DVNT (PCIU, One)
  +                    DVNT (PCID, 0x03)
  +                }
  +            }
  +
  +            Method (PCNT, 0, NotSerialized)
  +            {
  +                ^S00.PCNT
  +            }
  +        }
  ...
           Method (PCNT, 0, NotSerialized)
           {
  +            ^S50.PCNT ()
               ^S13.PCNT ()
               ^S12.PCNT ()
               ^S11.PCNT ()

I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
not been properly wired and as result not worked.

1)
Fixes: ddab4d3fae ("pcihp: compose PCNT callchain right before its user _GPE._E01")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez ab7337e3b2 vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
vhost-vdpa devices can return this feature now that blockers have been
set in case some features are not met.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-15-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 57ac831865 vdpa: block migration if SVQ does not admit a feature
Next patches enable devices to be migrated even if vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block migration if we detect the device expose a feature SVQ does not
know how to work with.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-13-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 9c363cf6d5 vdpa net: block migration if the device has CVQ
Devices with CVQ need to migrate state beyond vq state.  Leaving this to
future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez a230c4712b vdpa: disable RAM block discard only for the first device
Although it does not make a big difference, its more correct and
simplifies the cleanup path in subsequent patches.

Move ram_block_discard_disable(false) call to the top of
vhost_vdpa_cleanup because:
* We cannot use vhost_vdpa_first_dev after dev->opaque = NULL
  assignment.
* Improve the stack order in cleanup: since it is the last action taken
  in init, it should be the first at cleanup.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-10-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez c3716f260b vdpa: move vhost reset after get vring base
The function vhost.c:vhost_dev_stop calls vhost operation
vhost_dev_start(false). In the case of vdpa it totally reset and wipes
the device, making the fetching of the vring base (virtqueue state) totally
useless.

The kernel backend does not use vhost_dev_start vhost op callback, but
vhost-user do. A patch to make vhost_user_dev_start more similar to vdpa
is desirable, but it can be added on top.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 0bb302a996 vdpa: add vhost_vdpa_suspend
The function vhost.c:vhost_dev_stop fetches the vring base so the vq
state can be migrated to other devices.  However, this is unreliable in
vdpa, since we didn't signal the device to suspend the queues, making
the value fetched useless.

Suspend the device if possible before fetching first and subsequent
vring bases.

Moreover, vdpa totally reset and wipes the device at the last device
before fetch its vrings base, making that operation useless in the last
device. This will be fixed in later patches of this series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez b6662cb7e5 vdpa: add vhost_vdpa->suspended parameter
This allows vhost_vdpa to track if it is safe to get the vring base from
the device or not.  If it is not, vhost can fall back to fetch idx from
the guest buffer again.

No functional change intended in this patch, later patches will use this
field.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-6-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 4241e8bd72 vdpa: rewind at get_base, not set_base
At this moment it is only possible to migrate to a vdpa device running
with x-svq=on. As a protective measure, the rewind of the inflight
descriptors was done at the destination. That way if the source sent a
virtqueue with inuse descriptors they are always discarded.

Since this series allows to migrate also to passthrough devices with no
SVQ, the right thing to do is to rewind at the source so the base of
vrings are correct.

Support for inflight descriptors may be added in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez d83b494580 vdpa: Negotiate _F_SUSPEND feature
This is needed for qemu to know it can suspend the device to retrieve
its status and enable SVQ with it, so all the process is transparent to
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez b276524386 vdpa: Remember last call fd set
As SVQ can be enabled dynamically at any time, it needs to store call fd
always.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi 2cb0692768 cryptodev: Use CryptoDevBackendOpInfo for operation
Move queue_index, CryptoDevCompletionFunc and opaque into struct
CryptoDevBackendOpInfo, then cryptodev_backend_crypto_operation()
needs an argument CryptoDevBackendOpInfo *op_info only. And remove
VirtIOCryptoReq from cryptodev. It's also possible to hide
VirtIOCryptoReq into virtio-crypto.c in the next step. (In theory,
VirtIOCryptoReq is a private structure used by virtio-crypto only)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-9-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi bc304a6442 cryptodev: Introduce server type in QAPI
Introduce cryptodev service type in cryptodev.json, then apply this
to related codes. Now we can remove VIRTIO_CRYPTO_SERVICE_xxx
dependence from QEMU cryptodev.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-5-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi 999c789f00 cryptodev: Introduce cryptodev alg type in QAPI
Introduce cryptodev alg type in cryptodev.json, then apply this to
related codes, and drop 'enum CryptoDevBackendAlgType'.

There are two options:
1, { 'enum': 'QCryptodevBackendAlgType',
  'prefix': 'CRYPTODEV_BACKEND_ALG',
  'data': ['sym', 'asym']}
Then we can keep 'CRYPTODEV_BACKEND_ALG_SYM' and avoid lots of
changes.
2, changes in this patch(with prefix 'QCRYPTODEV_BACKEND_ALG').

To avoid breaking the rule of QAPI, use 2 here.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-4-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Joao Martins 95b29658b6 vfio/migration: Query device dirty page tracking support
Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-15-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins e46883204c vfio/migration: Block migration with vIOMMU
Migrating with vIOMMU will require either tracking maximum
IOMMU supported address space (e.g. 39/48 address width on Intel)
or range-track current mappings and dirty track the new ones
post starting dirty tracking. This will be done as a separate
series, so add a live migration blocker until that is fixed.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-14-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins b153402a89 vfio/common: Add device dirty page bitmap sync
Add device dirty page bitmap sync functionality. This uses the device
DMA logging uAPI to sync dirty page bitmap from the device.

Device dirty page bitmap sync is used only if all devices within a
container support device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-13-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Avihai Horon 6607109f05 vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
Extract the VFIO_IOMMU_DIRTY_PAGES ioctl code in vfio_get_dirty_bitmap()
to its own function.

This will help the code to be more readable after next patch will add
device dirty page bitmap sync functionality.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-12-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins 5255bbf4ec vfio/common: Add device dirty page tracking start/stop
Add device dirty page tracking start/stop functionality. This uses the
device DMA logging uAPI to start and stop dirty page tracking by device.

Device dirty page tracking is used only if all devices within a
container support device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20230307125450.62409-11-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:11 -07:00
Alex Bennée 548c96095d includes: move tb_flush into its own header
This aids subsystems (like gdbstub) that want to trigger a flush
without pulling target specific headers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Message-Id: <20230302190846.2593720-8-alex.bennee@linaro.org>
Message-Id: <20230303025805.625589-8-richard.henderson@linaro.org>
2023-03-07 17:06:33 +00:00
David Woodhouse a78c54c4f9 i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation
Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

   qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
     -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
     -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
     -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
     -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse de26b26197 hw/xen: Implement soft reset for emulated gnttab
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse d05864d23b hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
We don't actually access the guest's page through the grant, because
this isn't real Xen, and we can just use the page we gave it in the
first place. Map the grant anyway, mostly for cosmetic purposes so it
*looks* like it's in use in the guest-visible grant table.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 0324751272 hw/xen: Add emulated implementation of XenStore operations
Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse b08d88e30f hw/xen: Add emulated implementation of grant table operations
This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 4dfd5fb178 hw/xen: Hook up emulated implementation for event channel operations
We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 072519037d hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it
Whem emulating Xen, multi-page grants are distinctly non-trivial and we
have elected not to support them for the time being. Don't advertise
them to the guest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant 240cc11369 hw/xen: Avoid crash when backend watch fires too early
The xen-block code ends up calling aio_poll() through blkconf_geometry(),
which means we see watch events during the indirect call to
xendev_class->realize() in xen_device_realize(). Unfortunately this call
is made before populating the initial frontend and backend device nodes
in xenstore and hence xen_block_frontend_changed() (which is called from
a watch event) fails to read the frontend's 'state' node, and hence
believes the device is being torn down. This in-turn sets the backend
state to XenbusStateClosed and causes the device to be deleted before it
is fully set up, leading to the crash.
By simply moving the call to xendev_class->realize() after the initial
xenstore nodes are populated, this sorry state of affairs is avoided.

Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 4ca8cf092d hw/xen: Build PV backend drivers for CONFIG_XEN_BUS
Now that we have the redirectable Xen backend operations we can build the
PV backends even without the Xen libraries.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse e2abfe5ec6 hw/xen: Rename xen_common.h to xen_native.h
This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse a9ae1418b3 hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 7a8a749da7 hw/xen: Move xenstore_store_pv_console_info to xen_console.c
There's no need for this to be in the Xen accel code, and as we want to
use the Xen console support with KVM-emulated Xen we'll want to have a
platform-agnostic version of it. Make it use GString to build up the
path while we're at it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant ba2a92db1f hw/xen: Add xenstore operations to allow redirection to internal emulation
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 15e283c5b6 hw/xen: Add foreignmem operations to allow redirection to internal emulation
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse f80fad16af hw/xen: Pass grant ref to gnttab unmap operation
The previous commit introduced redirectable gnttab operations fairly
much like-for-like, with the exception of the extra arguments to the
->open() call which were always NULL/0 anyway.

This *changes* the arguments to the ->unmap() operation to include the
original ref# that was mapped. Under real Xen it isn't necessary; all we
need to do from QEMU is munmap(), then the kernel will release the grant,
and Xen does the tracking/refcounting for the guest.

When we have emulated grant tables though, we need to do all that for
ourselves. So let's have the back ends keep track of what they mapped
and pass it in to the ->unmap() method for us.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse c412ba47b2 hw/xen: Add gnttab operations to allow redirection to internal emulation
Move the existing code using libxengnttab to xen-operations.c and allow
the operations to be redirected so that we can add emulation of grant
table mapping for backend drivers.

In emulation, mapping more than one grant ref to be virtually contiguous
would be fairly difficult. The best way to do it might be to make the
ram_block mappings actually backed by a file (shmem or a deleted file,
perhaps) so that we can have multiple *shared* mappings of it. But that
would be fairly intrusive.

Making the backend drivers cope with page *lists* instead of expecting
the mapping to be contiguous is also non-trivial, since some structures
would actually *cross* page boundaries (e.g. the 32-bit blkif responses
which are 12 bytes).

So for now, we'll support only single-page mappings in emulation. Add a
XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE flag to indicate that the native Xen
implementation *does* support multi-page maps, and a helper function to
query it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse b6cacfea0b hw/xen: Add evtchn operations to allow redirection to internal emulation
The existing implementation calling into the real libxenevtchn moves to
a new file hw/xen/xen-operations.c, and is called via a function table
which in a subsequent commit will also be able to invoke the emulated
event channel support.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant 831b0db8ab hw/xen: Create initial XenStore nodes
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 766804b101 hw/xen: Implement core serialize/deserialize methods for xenstore_impl
This implements the basic migration support in the back end, with unit
tests that give additional confidence in the node-counting already in
the tree.

However, the existing PV back ends like xen-disk don't support migration
yet. They will reset the ring and fail to continue where they left off.
We will fix that in future, but not in time for the 8.0 release.

Since there's also an open question of whether we want to serialize the
full XenStore or only the guest-owned nodes in /local/domain/${domid},
for now just mark the XenStore device as unmigratable.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant be1934dfef hw/xen: Implement XenStore permissions
Store perms as a GList of strings, check permissions.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 7cabbdb70d hw/xen: Watches on XenStore transactions
Firing watches on the nodes that still exist is relatively easy; just
walk the tree and look at the nodes with refcount of one.

Firing watches on *deleted* nodes is more fun. We add 'modified_in_tx'
and 'deleted_in_tx' flags to each node. Nodes with those flags cannot
be shared, as they will always be unique to the transaction in which
they were created.

When xs_node_walk would need to *create* a node as scaffolding and it
encounters a deleted_in_tx node, it can resurrect it simply by clearing
its deleted_in_tx flag. If that node originally had any *data*, they're
gone, and the modified_in_tx flag will have been set when it was first
deleted.

We then attempt to send appropriate watches when the transaction is
committed, properly delete the deleted_in_tx nodes, and remove the
modified_in_tx flag from the others.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 7248b87cb0 hw/xen: Implement XenStore transactions
Given that the whole thing supported copy on write from the beginning,
transactions end up being fairly simple. On starting a transaction, just
take a ref of the existing root; swap it back in on a successful commit.

The main tree has a transaction ID too, and we keep a record of the last
transaction ID given out. if the main tree is ever modified when it isn't
the latest, it gets a new transaction ID.

A commit can only succeed if the main tree hasn't moved on since it was
forked. Strictly speaking, the XenStore protocol allows a transaction to
succeed as long as nothing *it* read or wrote has changed in the interim,
but no implementations do that; *any* change is sufficient to abort a
transaction.

This does not yet fire watches on the changed nodes on a commit. That bit
is more fun and will come in a follow-on commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 6e1330090d hw/xen: Implement XenStore watches
Starts out fairly simple: a hash table of watches based on the path.

Except there can be multiple watches on the same path, so the watch ends
up being a simple linked list, and the head of that list is in the hash
table. Which makes removal a bit of a PITA but it's not so bad; we just
special-case "I had to remove the head of the list and now I have to
replace it in / remove it from the hash table". And if we don't remove
the head, it's a simple linked-list operation.

We do need to fire watches on *deleted* nodes, so instead of just a simple
xs_node_unref() on the topmost victim, we need to recurse down and fire
watches on them all.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 3ef7ff83ca hw/xen: Add basic XenStore tree walk and write/read/directory support
This is a fairly simple implementation of a copy-on-write tree.

The node walk function starts off at the root, with 'inplace == true'.
If it ever encounters a node with a refcount greater than one (including
the root node), then that node is shared with other trees, and cannot
be modified in place, so the inplace flag is cleared and we copy on
write from there on down.

Xenstore write has 'mkdir -p' semantics and will create the intermediate
nodes if they don't already exist, so in that case we flip the inplace
flag back to true as we populate the newly-created nodes.

We put a copy of the absolute path into the buffer in the struct walk_op,
with *two* NUL terminators at the end. As xs_node_walk() goes down the
tree, it replaces the next '/' separator with a NUL so that it can use
the 'child name' in place. The next recursion down then puts the '/'
back and repeats the exercise for the next path element... if it doesn't
hit that *second* NUL termination which indicates the true end of the
path.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse 0254c4d19d hw/xen: Add xenstore wire implementation and implementation stubs
This implements the basic wire protocol for the XenStore commands, punting
all the actual implementation to xs_impl_* functions which all just return
errors for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Peter Maydell 7b0f0aa55f * Fix missing memory barriers
* Fix comments about memory ordering
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQHIqoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPYBwgArUaS0KGrBM1XmRUUpXnJokmA37n8
 ft477na+XW+p9VYi27B0R01P8j+AkCrAO0Ir1MLG7axjn5KiRMnbf2uBgqasEREv
 repJEXsqISoxA6vvAvnehKHAI9zu8b7frRc/30b6EOrrZpn0JKePSNRTyBu2seGO
 NFDXPVA2Wom+xXaNSEGt0dmoJ6AzEVIZKhUIwyvUWOC7MXuuIkRWn9/nySUdvEt0
 RIFPPk7JCjnEc32vb4Xnq/Ncsy20tMIM1hlDxMOVNq3brjeSCzS0PPPSjE/X5OtW
 Yn5YS0nCyD7wjP2dkXI4I1lUPxUUx6LvMz1aGbJCfyjSX41mNES/agoGgA==
 =KEUo
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream-mb' of https://gitlab.com/bonzini/qemu into staging

* Fix missing memory barriers
* Fix comments about memory ordering

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQHIqoUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPYBwgArUaS0KGrBM1XmRUUpXnJokmA37n8
# ft477na+XW+p9VYi27B0R01P8j+AkCrAO0Ir1MLG7axjn5KiRMnbf2uBgqasEREv
# repJEXsqISoxA6vvAvnehKHAI9zu8b7frRc/30b6EOrrZpn0JKePSNRTyBu2seGO
# NFDXPVA2Wom+xXaNSEGt0dmoJ6AzEVIZKhUIwyvUWOC7MXuuIkRWn9/nySUdvEt0
# RIFPPk7JCjnEc32vb4Xnq/Ncsy20tMIM1hlDxMOVNq3brjeSCzS0PPPSjE/X5OtW
# Yn5YS0nCyD7wjP2dkXI4I1lUPxUUx6LvMz1aGbJCfyjSX41mNES/agoGgA==
# =KEUo
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Mar 2023 11:40:26 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream-mb' of https://gitlab.com/bonzini/qemu:
  async: clarify usage of barriers in the polling case
  async: update documentation of the memory barriers
  physmem: add missing memory barrier
  qemu-coroutine-lock: add smp_mb__after_rmw()
  aio-wait: switch to smp_mb__after_rmw()
  edu: add smp_mb__after_rmw()
  qemu-thread-win32: cleanup, fix, document QemuEvent
  qemu-thread-posix: cleanup, fix, document QemuEvent
  qatomic: add smp_mb__before/after_rmw()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-07 17:02:06 +00:00
Karthikeyan Pasupathi 7840ba985a hw/arm/aspeed: Modified BMC FRU byte data in yosemitev2
Modified BMC FRU data in yosemite v2 platform.

Tested: Tested and Verified in yosemitev2 platform.

Fixes: 34f73a81e6 ("hw/arm/aspeed: Adding new machine Yosemitev2 in QEMU")
Signed-off-by: Karthikeyan Pasupathi <pkarthikeyan1509@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230307104833.3587947-1-pkarthikeyan1509@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Karthikeyan Pasupathi a09d357dd3 hw/arm/aspeed: Added TMP421 type sensor's support in tiogapass
Added TMP421 type sensor support in tiogapass platform.

Tested: Tested and verified in tiogapass platform.

Signed-off-by: Karthikeyan Pasupathi <pkarthikeyan1509@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230307103334.3586755-1-pkarthikeyan1509@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Karthikeyan Pasupathi 0a1f86bac9 hw/arm/aspeed: Added TMP421 type sensor's support in yosemitev2
Added TMP421 type support in yosemite v2 platform.

Tested: Tested and verified in yosemite V2 platform.

Signed-off-by: Karthikeyan Pasupathi <pkarthikeyan1509@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230307095239.3583613-1-pkarthikeyan1509@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Kevin Wolf 3c6f3f65ea pflash: Fix blk_pread_nonzeroes()
Commit a4b15a8b introduced a new function blk_pread_nonzeroes(). Instead
of reading directly from the root node of the BlockBackend, it reads
from its 'file' child node. This can happen to mostly work for raw
images (as long as the 'raw' format driver is in use, but not actually
doing anything), but it breaks everything else.

Fix it to read from the root node instead.

Fixes: a4b15a8b9e
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230307140230.59158-1-kwolf@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Cédric Le Goater 11aeb4b8c1 m25p80: Improve error when the backend file size does not match the device
Currently, when a block backend is attached to a m25p80 device and the
associated file size does not match the flash model, QEMU complains
with the error message "failed to read the initial flash content".
This is confusing for the user.

Instead, use helper blk_check_size_and_read_all() introduced by commit
06f1521795 ("pflash: Require backend size to match device, improve
errors").

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Delevoryas <peter@pjd.dev>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221115151000.2080833-1-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Joao Martins 62c1b0024b vfio/common: Record DMA mapped IOVA ranges
According to the device DMA logging uAPI, IOVA ranges to be logged by
the device must be provided all at once upon DMA logging start.

As preparation for the following patches which will add device dirty
page tracking, keep a record of all DMA mapped IOVA ranges so later they
can be used for DMA logging start.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-10-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Joao Martins 4ead830848 vfio/common: Add helper to consolidate iova/end calculation
In preparation to be used in device dirty tracking, move the code that
calculate a iova/end range from the container/section.  This avoids
duplication on the common checks across listener callbacks.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-9-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Joao Martins b92f237635 vfio/common: Consolidate skip/invalid section into helper
The checks are replicated against region_add and region_del
and will be soon added in another memory listener dedicated
for dirty tracking.

Move these into a new helper for avoid duplication.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/r/20230307125450.62409-8-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Joao Martins 1cd7fa7adc vfio/common: Use a single tracepoint for skipped sections
In preparation to turn more of the memory listener checks into
common functions, one of the affected places is how we trace when
sections are skipped. Right now there is one for each. Change it
into one single tracepoint `vfio_listener_region_skip` which receives
a name which refers to the callback i.e. region_add and region_del.

Suggested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-7-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Joao Martins fbc6c92134 vfio/common: Add helper to validate iova/end against hostwin
Move the code that finds the container host DMA window against a iova
range. This avoids duplication on the common checks across listener
callbacks.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/r/20230307125450.62409-6-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Avihai Horon 725ccd7e41 vfio/common: Add VFIOBitmap and alloc function
There are already two places where dirty page bitmap allocation and
calculations are done in open code.

To avoid code duplication, introduce VFIOBitmap struct and corresponding
alloc function and use them where applicable.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-5-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Avihai Horon 236e0a45f5 vfio/common: Abort migration if dirty log start/stop/sync fails
If VFIO dirty pages log start/stop/sync fails during migration,
migration should be aborted as pages dirtied by VFIO devices might not
be reported properly.

This is not the case today, where in such scenario only an error is
printed.

Fix it by aborting migration in the above scenario.

Fixes: 758b96b61d ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener")
Fixes: b6dd6504e3 ("vfio: Add vfio_listener_log_sync to mark dirty pages")
Fixes: 9e7b0442f2 ("vfio: Add ioctl to get dirty pages bitmap during dma unmap")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-4-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Avihai Horon db9b829b15 vfio/common: Fix wrong %m usages
There are several places where the %m conversion is used if one of
vfio_dma_map(), vfio_dma_unmap() or vfio_get_dirty_bitmap() fail.

The %m usage in these places is wrong since %m relies on errno value while
the above functions don't report errors via errno.

Fix it by using strerror() with the returned value instead.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-3-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Avihai Horon 3e2413a652 vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
Return -errno instead of -1 if VFIO_IOMMU_DIRTY_PAGES ioctl fails in
vfio_get_dirty_bitmap().

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-2-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 07:20:32 -07:00
Peter Maydell 9832009d9d Sixth RISC-V PR for 8.0
* Support for the Zicbiom, ZCicboz, and Zicbop extensions.
 * OpenSBI has been updated to version 1.2, see
   <https://github.com/riscv-software-src/opensbi/releases/tag/v1.2> for
   the release notes.
 * Support for setting the virtual address width (ie, sv39/sv48/sv57) on
   the command line.
 * Support for ACPI on RISC-V.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmQGYGgTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYidmyEAC6FEMbbFM5D++qR6w6xM6hXgzcrev6
 s1kyRRNVa45uSA78ti/Zi0hsDLNf7ZsNPndF0OIkkO5iAE0OVm3LU7tV1TqKcT82
 Dd9VXxe93zEmfnuJazHrMa54SXPhhnNdWHtKlZ6vBfZpbxgx0FFs50xkCsrM5LQZ
 hYHxQUqPWQTvF2MdDHrxCuLcdKl+Wg3ysCcgRh2d049KUBrIu6vNaHC2+AGRjCbj
 BkrGCkB82fTmVJjzAcVWQxLoAV12pCbJS4og1GtP8hA7WevtB39tbPin9siBKRZp
 QBeiIsg0nebkpmZGrb+xWVwlIBNe9yYwJa0KmveQk8v7L5RIzjM1mtDL91VrVljC
 KC2tfT570m0Iq2NoFMb3wd/kESHFzVDM/g+XYqRd4KSoiCNP/RbqYNQBwbMc31Tr
 E27xfA1D8w2vem0Rk20x3KgPf1Z5OmGXjq6YObTpnAzG8cZlA37qKBP+ortt5aHX
 GZSg3CAwknHHVajd4aaegkPsHxm1tRvoTfh38MwkPSNxaA9GD0nz0k9xaYDmeZ2L
 olfanNsaQEwcVUId31+7sAENg1TZU0fnj879/nxkMUCazVTdL8/mz+IoTTx0QCST
 3+9ATWcyJUlmjbDKIs7kr1L+wJdvvHEJggPAbbPI8ekpXaLZvUYOT6ObzYKNAmwY
 wELQBn8QKXcLVA==
 =5gAt
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20230306' of https://gitlab.com/palmer-dabbelt/qemu into staging

Sixth RISC-V PR for 8.0

* Support for the Zicbiom, ZCicboz, and Zicbop extensions.
* OpenSBI has been updated to version 1.2, see
  <https://github.com/riscv-software-src/opensbi/releases/tag/v1.2> for
  the release notes.
* Support for setting the virtual address width (ie, sv39/sv48/sv57) on
  the command line.
* Support for ACPI on RISC-V.

# -----BEGIN PGP SIGNATURE-----
#
# iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmQGYGgTHHBhbG1lckBk
# YWJiZWx0LmNvbQAKCRAuExnzX7sYidmyEAC6FEMbbFM5D++qR6w6xM6hXgzcrev6
# s1kyRRNVa45uSA78ti/Zi0hsDLNf7ZsNPndF0OIkkO5iAE0OVm3LU7tV1TqKcT82
# Dd9VXxe93zEmfnuJazHrMa54SXPhhnNdWHtKlZ6vBfZpbxgx0FFs50xkCsrM5LQZ
# hYHxQUqPWQTvF2MdDHrxCuLcdKl+Wg3ysCcgRh2d049KUBrIu6vNaHC2+AGRjCbj
# BkrGCkB82fTmVJjzAcVWQxLoAV12pCbJS4og1GtP8hA7WevtB39tbPin9siBKRZp
# QBeiIsg0nebkpmZGrb+xWVwlIBNe9yYwJa0KmveQk8v7L5RIzjM1mtDL91VrVljC
# KC2tfT570m0Iq2NoFMb3wd/kESHFzVDM/g+XYqRd4KSoiCNP/RbqYNQBwbMc31Tr
# E27xfA1D8w2vem0Rk20x3KgPf1Z5OmGXjq6YObTpnAzG8cZlA37qKBP+ortt5aHX
# GZSg3CAwknHHVajd4aaegkPsHxm1tRvoTfh38MwkPSNxaA9GD0nz0k9xaYDmeZ2L
# olfanNsaQEwcVUId31+7sAENg1TZU0fnj879/nxkMUCazVTdL8/mz+IoTTx0QCST
# 3+9ATWcyJUlmjbDKIs7kr1L+wJdvvHEJggPAbbPI8ekpXaLZvUYOT6ObzYKNAmwY
# wELQBn8QKXcLVA==
# =5gAt
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Mar 2023 21:51:36 GMT
# gpg:                using RSA key 2B3C3747446843B24A943A7A2E1319F35FBB1889
# gpg:                issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg:                 aka "Palmer Dabbelt <palmerdabbelt@google.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88  6DF8 EF4C A150 2CCB AB41
#      Subkey fingerprint: 2B3C 3747 4468 43B2 4A94  3A7A 2E13 19F3 5FBB 1889

* tag 'pull-riscv-to-apply-20230306' of https://gitlab.com/palmer-dabbelt/qemu: (22 commits)
  MAINTAINERS: Add entry for RISC-V ACPI
  hw/riscv/virt.c: Initialize the ACPI tables
  hw/riscv/virt: virt-acpi-build.c: Add RHCT Table
  hw/riscv/virt: virt-acpi-build.c: Add RINTC in MADT
  hw/riscv/virt: Enable basic ACPI infrastructure
  hw/riscv/virt: Add memmap pointer to RiscVVirtState
  hw/riscv/virt: Add a switch to disable ACPI
  hw/riscv/virt: Add OEM_ID and OEM_TABLE_ID fields
  riscv: Correctly set the device-tree entry 'mmu-type'
  riscv: Introduce satp mode hw capabilities
  riscv: Allow user to set the satp mode
  riscv: Change type of valid_vm_1_10_[32|64] to bool
  riscv: Pass Object to register_cpu_props instead of DeviceState
  roms/opensbi: Upgrade from v1.1 to v1.2
  gitlab/opensbi: Move to docker:stable
  hw: intc: Use cpu_by_arch_id to fetch CPU state
  target/riscv: cpu: Implement get_arch_id callback
  disas/riscv Fix ctzw disassemble
  hw/riscv/virt.c: add cbo[mz]-block-size fdt properties
  target/riscv: add Zicbop cbo.prefetch{i, r, m} placeholder
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-07 12:53:00 +00:00
Paolo Bonzini 2482aeea41 edu: add smp_mb__after_rmw()
Ensure ordering between clearing the COMPUTING flag and checking
IRQFACT, and between setting the IRQFACT flag and checking
COMPUTING.  This ensures that no wakeups are lost.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-07 12:38:40 +01:00
Peter Maydell c29a2f40cd target-arm queue:
* allwinner-h3: Fix I2C controller model for Sun6i SoCs
  * allwinner-h3: Add missing i2c controllers
  * Expose M-profile system registers to gdbstub
  * Expose pauth information to gdbstub
  * Support direct boot for Linux/arm64 EFI zboot images
  * Fix incorrect stage 2 MMU setup validation
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmQGB+wZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3gdQEACVfgbs77mxbOb6u8yWHKGZ
 tVnQr9KZMv2lmwt5H3ROJPXznchrIIAwdMeRgKnbI+lC5jTq9L+Q8RJch3t/EbAd
 f0VMyiPe3DzCbCrAR9cW6EWzbYnEVo3Ioj4k7qjxK6u1BIKhXz99DLYd1KRdTxnx
 BAYmcl857Uir1q2FrBVMZ/ItCLbk4ejn+YaDIawNue2/s1oGa+we473x9rosCFvp
 L9bzT3R46e0o+Mfkn1OYRmgCmURTalWPpWAxyOUFR9YbrzXleLgAKEB3o3PPcvls
 u26uxztyRMqje1q06VjUzwaLw7zN9XPhmir+NXX7KXp2/x9PZjApOpPtt0kl+6qe
 FbByKfl24O9w/OKewsJw+udCBYdYrRPm6tWv2D71iAwjBUzBJgNGe5VPRdPFtPDn
 uSRO65o34w1nPzRpAheUciZueiabYrVmIgVltFxj0JlrKGfgiYHPLVyU0Uu0K/A7
 F2kUEQIzIcWdo+c8SlvlWOEA2ojVd/KoLVLgndqr40Tk5pbc65TRS08kkVVl4cMT
 jUGscl7Dyxe+yo8+nHdycAJpnKYDllJOh2JbGv3r2FqCy5FMuIqW4hHeuUxwpE+O
 nxm7lzjnaVHSAFHdzhk9x4E4uH/GTcdWzX1EsmpgGqe5oejLJOrCINb+Dj44+Y8h
 8aGRvE7kxMs11upxc7BcAw==
 =KIMt
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20230306' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * allwinner-h3: Fix I2C controller model for Sun6i SoCs
 * allwinner-h3: Add missing i2c controllers
 * Expose M-profile system registers to gdbstub
 * Expose pauth information to gdbstub
 * Support direct boot for Linux/arm64 EFI zboot images
 * Fix incorrect stage 2 MMU setup validation

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmQGB+wZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3gdQEACVfgbs77mxbOb6u8yWHKGZ
# tVnQr9KZMv2lmwt5H3ROJPXznchrIIAwdMeRgKnbI+lC5jTq9L+Q8RJch3t/EbAd
# f0VMyiPe3DzCbCrAR9cW6EWzbYnEVo3Ioj4k7qjxK6u1BIKhXz99DLYd1KRdTxnx
# BAYmcl857Uir1q2FrBVMZ/ItCLbk4ejn+YaDIawNue2/s1oGa+we473x9rosCFvp
# L9bzT3R46e0o+Mfkn1OYRmgCmURTalWPpWAxyOUFR9YbrzXleLgAKEB3o3PPcvls
# u26uxztyRMqje1q06VjUzwaLw7zN9XPhmir+NXX7KXp2/x9PZjApOpPtt0kl+6qe
# FbByKfl24O9w/OKewsJw+udCBYdYrRPm6tWv2D71iAwjBUzBJgNGe5VPRdPFtPDn
# uSRO65o34w1nPzRpAheUciZueiabYrVmIgVltFxj0JlrKGfgiYHPLVyU0Uu0K/A7
# F2kUEQIzIcWdo+c8SlvlWOEA2ojVd/KoLVLgndqr40Tk5pbc65TRS08kkVVl4cMT
# jUGscl7Dyxe+yo8+nHdycAJpnKYDllJOh2JbGv3r2FqCy5FMuIqW4hHeuUxwpE+O
# nxm7lzjnaVHSAFHdzhk9x4E4uH/GTcdWzX1EsmpgGqe5oejLJOrCINb+Dj44+Y8h
# 8aGRvE7kxMs11upxc7BcAw==
# =KIMt
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Mar 2023 15:34:04 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20230306' of https://git.linaro.org/people/pmaydell/qemu-arm: (21 commits)
  hw: arm: allwinner-h3: Fix and complete H3 i2c devices
  hw: allwinner-i2c: Fix TWI_CNTR_INT_FLAG on SUN6i SoCs
  hw: arm: Support direct boot for Linux/arm64 EFI zboot images
  target/arm: Rewrite check_s2_mmu_setup
  target/arm: Diagnose incorrect usage of arm_is_secure subroutines
  target/arm: Stub arm_hcr_el2_eff for m-profile
  target/arm: Handle m-profile in arm_is_secure
  target/arm: Implement gdbstub m-profile systemreg and secext
  target/arm: Export arm_v7m_get_sp_ptr
  target/arm: Export arm_v7m_mrs_control
  target/arm: Implement gdbstub pauth extension
  target/arm: Create pauth_ptr_mask
  target/arm: Simplify iteration over bit widths
  target/arm: Add name argument to output_vector_union_type
  target/arm: Fix svep width in arm_gen_dynamic_svereg_xml
  target/arm: Hoist pred_width in arm_gen_dynamic_svereg_xml
  target/arm: Simplify register counting in arm_gen_dynamic_svereg_xml
  target/arm: Split out output_vector_union_type
  target/arm: Move arm_gen_dynamic_svereg_xml to gdbstub64.c
  target/arm: Unexport arm_gen_dynamic_sysreg_xml
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-07 09:58:43 +00:00
Peter Maydell c1feaf7683 hw/nvme updates
* basic support for directives
 * simple support for endurance groups
 * emulation of flexible data placement (tp4146)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmQF+doACgkQTeGvMW1P
 DenIEgf+MLsRQ3kKUmsgVNnPuR69M0COfyaz0AnfX6YEIL9ukFJQPsmASfPmHof5
 tCYIFyKEpZt/givmzSI1jdpm0uX2MRwLGLYRdNhEPVjo+TfGda15x7DgpBEduqjq
 mChUS2wrmgP9TZne+kTAU28pUpU7hcfrt1RkDOO86W8oJmpBeIyGe6vikVhQppKW
 fAIKvhNfN3p5Kxq1fhE6I5YzKd2vvKtBvPpZp2uFe6LHXEcVV/FPcTx3Ph+um/o6
 ScmmxowT4Wqk4EgXh1ohephlxB89aWgwLNHLHcfte6UCU9x4eSmTC2T3pf7piBaE
 pGLpzPoYk6BAurwrMuxCxYgStl6SzQ==
 =CNSk
 -----END PGP SIGNATURE-----

Merge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging

hw/nvme updates

* basic support for directives
* simple support for endurance groups
* emulation of flexible data placement (tp4146)

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmQF+doACgkQTeGvMW1P
# DenIEgf+MLsRQ3kKUmsgVNnPuR69M0COfyaz0AnfX6YEIL9ukFJQPsmASfPmHof5
# tCYIFyKEpZt/givmzSI1jdpm0uX2MRwLGLYRdNhEPVjo+TfGda15x7DgpBEduqjq
# mChUS2wrmgP9TZne+kTAU28pUpU7hcfrt1RkDOO86W8oJmpBeIyGe6vikVhQppKW
# fAIKvhNfN3p5Kxq1fhE6I5YzKd2vvKtBvPpZp2uFe6LHXEcVV/FPcTx3Ph+um/o6
# ScmmxowT4Wqk4EgXh1ohephlxB89aWgwLNHLHcfte6UCU9x4eSmTC2T3pf7piBaE
# pGLpzPoYk6BAurwrMuxCxYgStl6SzQ==
# =CNSk
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Mar 2023 14:34:02 GMT
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [full]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [full]
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu:
  hw/nvme: flexible data placement emulation
  hw/nvme: basic directives support
  hw/nvme: add basic endurance group support
  hw/nvme: store a pointer to the NvmeSubsystem in the NvmeNamespace
  hw/nvme: move adjustment of data_units{read,written}

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-07 09:58:25 +00:00
Sunil V L f709360f0a
hw/riscv/virt.c: Initialize the ACPI tables
Initialize the ACPI tables if the acpi option is not
disabled.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-8-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:07 -08:00
Sunil V L ebfd392893
hw/riscv/virt: virt-acpi-build.c: Add RHCT Table
RISC-V ACPI platforms need to provide RISC-V Hart Capabilities
Table (RHCT). Add this to the ACPI tables.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-7-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:06 -08:00
Sunil V L 6cc40ea211
hw/riscv/virt: virt-acpi-build.c: Add RINTC in MADT
Add Multiple APIC Description Table (MADT) with the
RINTC structure for each cpu.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-6-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:05 -08:00
Sunil V L 7da2fb240f
hw/riscv/virt: Enable basic ACPI infrastructure
Add basic ACPI infrastructure for RISC-V with below tables.
        1) DSDT with below basic objects
                - CPUs
                - fw_cfg
        2) FADT revision 6 with HW_REDUCED flag
        3) XSDT
        4) RSDP

Add this functionality in a new file virt-acpi-build.c and enable
building this infrastructure.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-5-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:04 -08:00
Sunil V L 71302ff3bc
hw/riscv/virt: Add memmap pointer to RiscVVirtState
memmap needs to be exported outside of virt.c so that
modules like acpi can use it. Hence, add a pointer field
in RiscVVirtState structure and initialize it with the
memorymap.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-4-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:03 -08:00
Sunil V L 168b8c29ce
hw/riscv/virt: Add a switch to disable ACPI
ACPI will be enabled by default. Add a switch to turn off
for testing and debug purposes.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-3-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:02 -08:00
Sunil V L 90477a652b
hw/riscv/virt: Add OEM_ID and OEM_TABLE_ID fields
ACPI needs OEM_ID and OEM_TABLE_ID for the machine. Add these fields
in the RISCVVirtState structure and initialize with default values.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230302091212.999767-2-sunilvl@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06 11:35:02 -08:00