commit 51f492e5da

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, vhost, pc, pci: documentation, fixes and cleanups

Lots of fixes all over the place. Unfortunately, this does not yet fix a
regression with vhost introduced by the last pull. The issue typically
manifests as this error:

    kvm_mem_ioeventfd_add: error adding ioeventfd: File exists

followed by QEMU aborting.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* remotes/mst/tags/for_upstream: (28 commits)
  docs: add PCIe devices placement guidelines
  virtio: drop virtio_queue_get_ring_{size,addr}()
  vhost: drop legacy vring layout bits
  vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
  nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE
  nvdimm acpi: use aml_name_decl to define named object
  nvdimm acpi: rename nvdimm_dsm_reserved_root
  nvdimm acpi: fix two comments
  nvdimm acpi: define DSM return codes
  nvdimm acpi: rename nvdimm_acpi_hotplug
  nvdimm acpi: cleanup nvdimm_build_fit
  nvdimm acpi: rename nvdimm_plugged_device_list
  docs: improve the doc of Read FIT method
  nvdimm acpi: clean up nvdimm_build_acpi
  pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug
  pc: memhp: move nvdimm hotplug out of memory hotplug
  nvdimm acpi: drop the lock of fit buffer
  qdev: hotplug: drop HotplugHandler.post_plug callback
  vhost: migration blocker only if shared log is used
  virtio-net: mark VIRTIO_NET_F_GSO as legacy
  ...

Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
@@ -17,6 +17,7 @@ CONFIG_FDC=y
 CONFIG_ACPI=y
 CONFIG_ACPI_X86=y
 CONFIG_ACPI_MEMORY_HOTPLUG=y
+CONFIG_ACPI_NVDIMM=y
 CONFIG_ACPI_CPU_HOTPLUG=y
 CONFIG_APM=y
 CONFIG_I8257=y

docs/pcie.txt (new file):
@@ -0,0 +1,310 @@
PCI EXPRESS GUIDELINES
======================

1. Introduction
================
This document proposes best practices for using PCI Express/PCI devices
in PCI Express based machines and explains the reasoning behind them.

The following presentations accompany this document:
 (1) Q35 overview.
     http://wiki.qemu.org/images/4/4e/Q35.pdf
 (2) A comparison between PCI and PCI Express technologies.
     http://wiki.qemu.org/images/f/f6/PCIvsPCIe.pdf

Note: The usage examples are not intended to replace the full
documentation; please use QEMU help to retrieve all options.

2. Device placement strategy
============================
QEMU does not have a clear socket-device matching mechanism
and allows any PCI/PCI Express device to be plugged into any
PCI/PCI Express slot.
Plugging a PCI device into a PCI Express slot might not always work and
is unusual anyway, since it cannot be done for "bare metal".
Plugging a PCI Express device into a PCI slot will hide the Extended
Configuration Space and is therefore also not recommended.

The recommendation is to separate the PCI Express and PCI hierarchies.
PCI Express devices should be plugged only into PCI Express Root Ports and
PCI Express Downstream Ports.

2.1 Root Bus (pcie.0)
=====================
Place only the following kinds of devices directly on the Root Complex:
    (1) PCI Devices (e.g. network card, graphics card, IDE controller),
        not controllers. Place only legacy PCI devices on
        the Root Complex. These will be considered Integrated Endpoints.
        Note: Integrated Endpoints are not hot-pluggable.

        Although the PCI Express spec does not forbid PCI Express devices as
        Integrated Endpoints, existing hardware mostly integrates legacy PCI
        devices with the Root Complex. Guest OSes are suspected to behave
        strangely when PCI Express devices are integrated
        with the Root Complex.

    (2) PCI Express Root Ports (ioh3420), for starting exclusively
        PCI Express hierarchies.

    (3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
        hierarchies.

    (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses
        are needed.

   pcie.0 bus
   ----------------------------------------------------------------------------
        |                |                    |                  |
   -----------   ------------------   ------------------   --------------
   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
   -----------   ------------------   ------------------   --------------

2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use:
          -device <dev>[,bus=pcie.0]
2.1.2 To expose a new PCI Express Root Bus use:
          -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
      Only PCI Express Root Ports and DMI-PCI bridges can be connected
      to the pcie.1 bus:
          -device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z]  \
          -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1

2.2 PCI Express only hierarchy
==============================
Always use PCI Express Root Ports to start PCI Express hierarchies.

A PCI Express Root Bus supports up to 32 devices. Since each
PCI Express Root Port is a function and a multi-function
device may support up to 8 functions, the maximum possible
number of PCI Express Root Ports per PCI Express Root Bus is 256.

Prefer grouping PCI Express Root Ports into multi-function devices
to keep a simple flat hierarchy that is enough for most scenarios.
Only use PCI Express Switches (x3130-upstream, xio3130-downstream)
if there is no more room for PCI Express Root Ports.
Please see section 4. for further justifications.

Plug only PCI Express devices into PCI Express Ports.

   pcie.0 bus
   ----------------------------------------------------------------------------------
        |                 |                                    |
   -------------    -------------                        -------------
   | Root Port |    | Root Port |                        | Root Port |
   -------------    -------------                        -------------
         |                            -------------------------|------------------------
    ------------                      |                 -----------------               |
    | PCIe Dev |                      |    PCI Express  | Upstream Port |               |
    ------------                      |      Switch     -----------------               |
                                      |                  |            |                 |
                                      |    -------------------    -------------------   |
                                      |    | Downstream Port |    | Downstream Port |   |
                                      |    -------------------    -------------------   |
                                      -------------|-----------------------|------------
                                             ------------
                                             | PCIe Dev |
                                             ------------

2.2.1 Plugging a PCI Express device into a PCI Express Root Port:
          -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z]  \
          -device <dev>,bus=root_port1
2.2.2 Using multi-function PCI Express Root Ports:
      -device ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] \
      -device ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \
      -device ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2] \
2.2.3 Plugging a PCI Express device into a Switch:
      -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z]  \
      -device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x]  \
      -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1] \
      -device <dev>,bus=downstream_port1

Notes:
  - (slot, chassis) pair is mandatory and must be
    unique for each PCI Express Root Port.
  - 'addr' parameter can be 0 for all the examples above.
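
For illustration, a complete invocation combining these pieces might look
as follows (a sketch only: the chassis/slot values and IDs are arbitrary
placeholder choices, and e1000e merely stands in for any PCI Express
device):

      qemu-system-x86_64 -M q35 ... \
      -device ioh3420,id=root_port1,chassis=1,slot=1 \
      -device x3130-upstream,id=upstream_port1,bus=root_port1 \
      -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=2,slot=2 \
      -device e1000e,bus=downstream_port1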

2.3 PCI only hierarchy
======================
Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
but, as mentioned in section 5, doing so means the legacy PCI
device in question will be incapable of hot-unplugging.
Besides that, use DMI-PCI Bridges (i82801b11-bridge) in combination
with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.

Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
(having 32 slots) and several PCI-PCI Bridges attached to it
(each also supporting 32 slots) will support hundreds of legacy devices.
The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI
Bridge until it is full, and then plug a new PCI-PCI Bridge...

   pcie.0 bus
   ----------------------------------------------
        |                            |
   -----------               ------------------
   | PCI Dev |               | DMI-PCI BRIDGE |
   -----------               ------------------
                               |            |
                  ------------------    ------------------
                  | PCI-PCI Bridge |    | PCI-PCI Bridge |   ...
                  ------------------    ------------------
                         |                     |
                  -----------           -----------
                  | PCI Dev |           | PCI Dev |
                  -----------           -----------

2.3.1 To plug a PCI device into pcie.0 as an Integrated Endpoint use:
          -device <dev>[,bus=pcie.0]
2.3.2 Plugging a PCI device into a PCI-PCI Bridge:
          -device i82801b11-bridge,id=dmi_pci_bridge1[,bus=pcie.0]                        \
          -device pci-bridge,id=pci_bridge1,bus=dmi_pci_bridge1[,chassis_nr=x][,addr=y]   \
          -device <dev>,bus=pci_bridge1[,addr=x]
      Note that 'addr' cannot be 0 unless the shpc=off parameter is passed
      to the PCI Bridge.

3. IO space issues
===================
The PCI Express Root Ports and PCI Express Downstream Ports are seen by
Firmware/Guest OS as PCI-PCI Bridges. As required by the PCI spec, each
such Port should have a 4K IO range reserved for it, even though only one
(multifunction) device can be plugged into each Port. This results in
poor IO space utilization.

The firmware used by QEMU (SeaBIOS/OVMF) may try further optimizations
by not allocating IO space for a PCI Express Root Port / PCI Express
Downstream Port if:
    (1) the port is empty, or
    (2) the device behind the port has no IO BARs.

The IO space is very limited, to 65536 byte-wide IO ports, and may even be
fragmented by fixed IO ports owned by platform devices, resulting in at most
10 PCI Express Root Ports or PCI Express Downstream Ports per system
if devices with IO BARs are used in the PCI Express hierarchy. Using the
proposed device placement strategy solves this issue by using only
PCI Express devices within the PCI Express hierarchy.
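
As a back-of-the-envelope check (illustrative numbers, assuming the
standard 4K granularity for bridge IO windows): 65536 bytes of IO space
divided by 4096 bytes per Port yields at most 16 reservable IO ranges;
once the fixed IO ports claimed by platform devices (serial, RTC,
keyboard controller, etc.) fragment that space, roughly 10 usable 4K
ranges remain, matching the limit quoted above.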

The PCI Express spec requires that PCI Express devices work properly
without using IO ports. The PCI hierarchy has no such limitation.


4. Bus numbers issues
======================
Each PCI domain can have up to only 256 buses, and the QEMU PCI Express
machines do not support multiple PCI domains even if extra Root
Complexes (pxb-pcie) are used.

Each element of the PCI Express hierarchy (Root Complexes,
PCI Express Root Ports, PCI Express Downstream/Upstream Ports)
uses one bus number. Since only one (multifunction) device
can be attached to a PCI Express Root Port or PCI Express Downstream
Port, it is advised to plan in advance for the expected number of
devices to prevent bus number starvation.

Avoiding PCI Express Switches (and thereby striving for a 'flatter' PCI
Express hierarchy) enables the hierarchy to not spend bus numbers on
Upstream Ports.

The bus_nr properties of the pxb-pcie devices partition the 0..255 bus
number space. All bus numbers assigned to the buses recursively behind a
given pxb-pcie device's root bus must fit between the bus_nr property of
that pxb-pcie device, and the lowest of the higher bus_nr properties
that the command line sets for other pxb-pcie devices.
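
For example (a sketch; the bus_nr values are arbitrary choices), the
following partitions the bus number space so that buses behind pcie.1
must fall into 40..79 and buses behind pcie.2 into 80..255:

      -device pxb-pcie,id=pcie.1,bus_nr=40 \
      -device pxb-pcie,id=pcie.2,bus_nr=80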

5. Hot-plug
============
The PCI Express root buses (pcie.0 and the buses exposed by pxb-pcie devices)
do not support hot-plug, so any devices plugged into Root Complexes
cannot be hot-plugged/hot-unplugged:
    (1) PCI Express Integrated Endpoints
    (2) PCI Express Root Ports
    (3) DMI-PCI Bridges
    (4) pxb-pcie

Be aware that PCI Express Downstream Ports can't be hot-plugged into
an existing PCI Express Upstream Port.

PCI devices can be hot-plugged into PCI-PCI Bridges. The PCI hot-plug is
ACPI based and can work side by side with the PCI Express native hot-plug.

PCI Express devices can be natively hot-plugged/hot-unplugged into/from
PCI Express Root Ports (and PCI Express Downstream Ports).

5.1 Planning for hot-plug:
    (1) PCI hierarchy
        Leave enough PCI-PCI Bridge slots empty or add one
        or more empty PCI-PCI Bridges to the DMI-PCI Bridge.

        For each such PCI-PCI Bridge the Guest Firmware is expected to reserve
        a 4K IO space and 2M MMIO range to be used for all devices behind it.

        Because of the hard IO limit of around 10 PCI Bridges (~ 40K space)
        per system, don't use more than 9 PCI-PCI Bridges, leaving 4K for the
        Integrated Endpoints. (The PCI Express hierarchy needs no IO space.)

    (2) PCI Express hierarchy:
        Leave enough PCI Express Root Ports empty. Use multifunction
        PCI Express Root Ports (up to 8 ports per pcie.0 slot)
        on the Root Complex(es) to keep the
        hierarchy as flat as possible, thereby saving PCI bus numbers.
        Don't use PCI Express Switches if you don't have to;
        each one of those uses an extra PCI bus (for its Upstream Port)
        that could be put to better use with another Root Port or Downstream
        Port, which may come in handy for hot-plugging another device.

5.2 Hot-plug example:
Using HMP: (add -monitor stdio to QEMU command line)
  device_add <dev>,id=<id>,bus=<PCI Express Root Port Id/PCI Express Downstream Port Id/PCI-PCI Bridge Id/>
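
As a concrete illustration (the device, IDs and chassis/slot values are
placeholder choices), if the guest was started with an empty hot-pluggable
port such as "-device ioh3420,id=root_port1,chassis=1,slot=1", a NIC
could be added and later removed at the monitor prompt:

  (qemu) device_add e1000e,id=nic1,bus=root_port1
  (qemu) device_del nic1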

6. Device assignment
====================
Host devices are mostly PCI Express and should be plugged only into
PCI Express Root Ports or PCI Express Downstream Ports.
PCI-PCI Bridge slots can be used for legacy PCI host devices.

6.1 How to detect if a device is PCI Express:
  > lspci -s 03:00.0 -v (as root)

    03:00.0 Network controller: Intel Corporation Wireless 7260 (rev 83)
            Subsystem: Intel Corporation Dual Band Wireless-AC 7260
            Flags: bus master, fast devsel, latency 0, IRQ 50
            Memory at f0400000 (64-bit, non-prefetchable) [size=8K]
            Capabilities: [c8] Power Management version 3
            Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
            Capabilities: [40] Express Endpoint, MSI 00
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            Capabilities: [100] Advanced Error Reporting
            Capabilities: [140] Device Serial Number 7c-7a-91-ff-ff-90-db-20
            Capabilities: [14c] Latency Tolerance Reporting
            Capabilities: [154] Vendor Specific Information: ID=cafe Rev=1 Len=014

If you can see the "Express Endpoint" capability in the
output, then the device is indeed PCI Express.
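
A quick variant of the same check (the bus address 03:00.0 is of course
specific to this example) is to filter the verbose output:

  > lspci -s 03:00.0 -v | grep Express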

7. Virtio devices
=================
Virtio devices plugged into the PCI hierarchy or as Integrated Endpoints
will remain PCI and have transitional behaviour by default.
Transitional virtio devices work in both IO and MMIO modes, depending on
guest support. The guest firmware will assign both IO and MMIO resources
to transitional virtio devices.

Virtio devices plugged into PCI Express Ports are PCI Express devices and
have "1.0" behavior by default, without IO support.
In both cases the disable-legacy and disable-modern properties can be used
to override the behaviour.

Note that setting disable-legacy=off will enable legacy mode (enabling
legacy behavior) for PCI Express virtio devices, causing them to require
IO space. Given the limited available IO space, this may quickly lead to
resource exhaustion, and is therefore strongly discouraged.
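
For instance (a sketch; virtio-net-pci is an arbitrary example device),
a modern-only virtio NIC behind a PCI Express Root Port can be requested
explicitly with:

      -device ioh3420,id=root_port1,chassis=1,slot=1 \
      -device virtio-net-pci,bus=root_port1,disable-legacy=on,disable-modern=off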

8. Conclusion
==============
The proposal offers a usage model that is easy to understand and follow,
and at the same time overcomes the PCI Express architecture limitations.

docs/specs/acpi_mem_hotplug.txt:
@@ -4,9 +4,6 @@ QEMU<->ACPI BIOS memory hotplug interface
 ACPI BIOS GPE.3 handler is dedicated for notifying OS about memory hot-add
 and hot-remove events.

-ACPI BIOS GPE.4 handler is dedicated for notifying OS about nvdimm device
-hot-add and hot-remove events.
-
 Memory hot-plug interface (IO port 0xa00-0xa17, 1-4 byte access):
 ---------------------------------------------------------------
 0xa00:

docs/specs/acpi_nvdimm.txt:
@@ -65,8 +65,8 @@ _FIT(Firmware Interface Table)
 The detailed definition of the structure can be found at ACPI 6.0: 5.2.25
 NVDIMM Firmware Interface Table (NFIT).

-QEMU NVDIMM Implemention
-========================
+QEMU NVDIMM Implementation
+==========================
 QEMU uses 4 bytes IO Port starting from 0x0a18 and a RAM-based memory page
 for NVDIMM ACPI.

@@ -80,8 +80,17 @@ Memory:
    emulates _DSM access and writes the output data to it.

    ACPI writes _DSM Input Data (based on the offset in the page):
-   [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM
-                Root device.
+   [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle.
+
+                The handle is completely QEMU internal thing, the values in
+                range [1, 0xFFFF] indicate nvdimm device. Other values are
+                reserved for other purposes.
+
+                Reserved handles:
+                0 is reserved for nvdimm root device named NVDR.
+                0x10000 is reserved for QEMU internal DSM function called on
+                the root device.
    [0x4 - 0x7]: 4 bytes, Revision ID, that is the Arg1 of _DSM method.
    [0x8 - 0xB]: 4 bytes. Function Index, that is the Arg2 of _DSM method.
    [0xC - 0xFFF]: 4084 bytes, the Arg3 of _DSM method.

@@ -127,28 +136,17 @@ _DSM process diagram:
 | result from the page     |      |              |
 +--------------------------+      +--------------+

-Device Handle Reservation
--------------------------
-As we mentioned above, byte 0 ~ byte 3 in the DSM memory save NVDIMM device
-handle. The handle is completely QEMU internal thing, the values in range
-[0, 0xFFFF] indicate nvdimm device (0 means nvdimm root device named NVDR),
-other values are reserved by other purpose.
-
-Current reserved handle:
-0x10000 is reserved for QEMU internal DSM function called on the root
-device.
+NVDIMM hotplug
+--------------
+ACPI BIOS GPE.4 handler is dedicated for notifying OS about nvdimm device
+hot-add event.

 QEMU internal use only _DSM function
 ------------------------------------
 UUID, 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, is reserved for QEMU internal
 DSM function.

 There is the function introduced by QEMU and only used by QEMU internal.

 1) Read FIT
-   As we only reserved one page for NVDIMM ACPI it is impossible to map the
-   whole FIT data to guest's address space. This function is used by _FIT
-   method to read a piece of FIT data from QEMU.
+   _FIT method uses _DSM method to fetch NFIT structures blob from QEMU
+   in 1 page sized increments which are then concatenated and returned
+   as _FIT method result.

 Input parameters:
 Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}

@@ -156,29 +154,34 @@ There is the function introduced by QEMU and only used by QEMU internal.
 Arg2 - Function Index, 0x1
 Arg3 - A package containing a buffer whose layout is as follows:

-   +----------+-------------+-------------+-----------------------------------+
-   | Filed    | Byte Length | Byte Offset | Description                       |
-   +----------+-------------+-------------+-----------------------------------+
-   | offset   | 4           | 0           | the offset of FIT buffer          |
-   +----------+-------------+-------------+-----------------------------------+
+   +----------+--------+--------+-------------------------------------------+
+   | Field    | Length | Offset | Description                               |
+   +----------+--------+--------+-------------------------------------------+
+   | offset   | 4      | 0      | offset in QEMU's NFIT structures blob to  |
+   |          |        |        | read from                                 |
+   +----------+--------+--------+-------------------------------------------+

-   Output:
-   +----------+-------------+-------------+-----------------------------------+
-   | Filed    | Byte Length | Byte Offset | Description                       |
-   +----------+-------------+-------------+-----------------------------------+
-   |          |             |             | return status codes               |
-   |          |             |             | 0x100 indicates fit has been      |
-   | status   | 4           | 0           | updated                           |
-   |          |             |             | other follows Chapter 3 in DSM    |
-   |          |             |             | Spec Rev1                         |
-   +----------+-------------+-------------+-----------------------------------+
-   | fit data | Varies      | 4           | FIT data                          |
-   |          |             |             |                                   |
-   +----------+-------------+-------------+-----------------------------------+
+   Output layout in the dsm memory page:
+   +----------+--------+--------+-------------------------------------------+
+   | Field    | Length | Offset | Description                               |
+   +----------+--------+--------+-------------------------------------------+
+   | length   | 4      | 0      | length of entire returned data            |
+   |          |        |        | (including this header)                   |
+   +----------+-----------------+-------------------------------------------+
+   |          |        |        | return status codes                       |
+   |          |        |        | 0x0 - success                             |
+   |          |        |        | 0x100 - error caused by NFIT update while |
+   | status   | 4      | 4      | read by _FIT wasn't completed, other      |
+   |          |        |        | codes follow Chapter 3 in DSM Spec Rev1   |
+   +----------+-----------------+-------------------------------------------+
+   | fit data | Varies | 8      | contains FIT data, this field is present  |
+   |          |        |        | if status field is 0;                     |
+   +----------+--------+--------+-------------------------------------------+

-   The FIT offset is maintained by the caller itself, current offset plugs
-   the length returned by the function is the next offset we should read.
-   When all the FIT data has been read out, zero length is returned.
+   The FIT offset is maintained by the OSPM itself, current offset plus
+   the size of the fit data returned by the function is the next offset
+   OSPM should read. When all FIT data has been read out, zero fit data
+   size is returned.

-   If it returns 0x100, OSPM should restart to read FIT (read from offset 0
-   again).
+   If it returns status code 0x100, OSPM should restart to read FIT (read
+   from offset 0 again).
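
To make the exchange concrete, a hypothetical read of a 6000-byte FIT
through a 4096-byte DSM page (8-byte output header, hence at most 4088
bytes of fit data per call) would proceed roughly like this:

   call 1: offset=0    -> status=0x0, length=8+4088, fit bytes 0..4087
   call 2: offset=4088 -> status=0x0, length=8+1912, fit bytes 4088..5999
   call 3: offset=6000 -> status=0x0, length=8 (zero fit data: done)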

hw/acpi/ich9.c:
@@ -490,8 +490,12 @@ void ich9_pm_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,

     if (lpc->pm.acpi_memory_hotplug.is_enabled &&
         object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug,
-                            dev, errp);
+        if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+            nvdimm_acpi_plug_cb(hotplug_dev, dev);
+        } else {
+            acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug,
+                                dev, errp);
+        }
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         if (lpc->pm.cpu_hotplug_legacy) {
             legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp);

hw/acpi/memory_hotplug.c:
@@ -2,7 +2,6 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
 #include "hw/mem/pc-dimm.h"
-#include "hw/mem/nvdimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"

@@ -233,8 +232,11 @@ void acpi_memory_plug_cb(HotplugHandler *hotplug_dev, MemHotplugState *mem_st,
                          DeviceState *dev, Error **errp)
 {
     MemStatus *mdev;
-    AcpiEventStatusBits event;
-    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
     DeviceClass *dc = DEVICE_GET_CLASS(dev);

     if (!dc->hotpluggable) {
         return;
     }

     mdev = acpi_memory_slot_status(mem_st, dev, errp);
     if (!mdev) {

@@ -242,23 +244,10 @@ void acpi_memory_plug_cb(HotplugHandler *hotplug_dev, MemHotplugState *mem_st,
     }

     mdev->dimm = dev;
-
-    /*
-     * do not set is_enabled and is_inserting if the slot is plugged with
-     * a nvdimm device to stop OSPM inquires memory region from the slot.
-     */
-    if (is_nvdimm) {
-        event = ACPI_NVDIMM_HOTPLUG_STATUS;
-    } else {
-        mdev->is_enabled = true;
-        event = ACPI_MEMORY_HOTPLUG_STATUS;
-    }
-
+    mdev->is_enabled = true;
     if (dev->hotplugged) {
-        if (!is_nvdimm) {
-            mdev->is_inserting = true;
-        }
-        acpi_send_event(DEVICE(hotplug_dev), event);
+        mdev->is_inserting = true;
+        acpi_send_event(DEVICE(hotplug_dev), ACPI_MEMORY_HOTPLUG_STATUS);
     }
 }

@@ -273,8 +262,6 @@ void acpi_memory_unplug_request_cb(HotplugHandler *hotplug_dev,
         return;
     }

-    /* nvdimm device hot unplug is not supported yet. */
-    assert(!object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM));
     mdev->is_removing = true;
     acpi_send_event(DEVICE(hotplug_dev), ACPI_MEMORY_HOTPLUG_STATUS);
 }

@@ -289,8 +276,6 @@ void acpi_memory_unplug_cb(MemHotplugState *mem_st,
         return;
     }

-    /* nvdimm device hot unplug is not supported yet. */
-    assert(!object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM));
     mdev->is_enabled = false;
     mdev->dimm = NULL;
 }

hw/acpi/nvdimm.c (166 lines changed):
@@ -33,35 +33,30 @@
 #include "hw/nvram/fw_cfg.h"
 #include "hw/mem/nvdimm.h"

-static int nvdimm_plugged_device_list(Object *obj, void *opaque)
+static int nvdimm_device_list(Object *obj, void *opaque)
 {
     GSList **list = opaque;

     if (object_dynamic_cast(obj, TYPE_NVDIMM)) {
-        DeviceState *dev = DEVICE(obj);
-
-        if (dev->realized) { /* only realized NVDIMMs matter */
-            *list = g_slist_append(*list, DEVICE(obj));
-        }
+        *list = g_slist_append(*list, DEVICE(obj));
     }

-    object_child_foreach(obj, nvdimm_plugged_device_list, opaque);
+    object_child_foreach(obj, nvdimm_device_list, opaque);
     return 0;
 }

 /*
- * inquire plugged NVDIMM devices and link them into the list which is
+ * inquire NVDIMM devices and link them into the list which is
  * returned to the caller.
  *
  * Note: it is the caller's responsibility to free the list to avoid
  * memory leak.
  */
-static GSList *nvdimm_get_plugged_device_list(void)
+static GSList *nvdimm_get_device_list(void)
 {
     GSList *list = NULL;

-    object_child_foreach(qdev_get_machine(), nvdimm_plugged_device_list,
-                         &list);
+    object_child_foreach(qdev_get_machine(), nvdimm_device_list, &list);
     return list;
 }

@@ -219,7 +214,7 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
 static NVDIMMDevice *nvdimm_get_device_by_handle(uint32_t handle)
 {
     NVDIMMDevice *nvdimm = NULL;
-    GSList *list, *device_list = nvdimm_get_plugged_device_list();
+    GSList *list, *device_list = nvdimm_get_device_list();

     for (list = device_list; list; list = list->next) {
         NVDIMMDevice *nvd = list->data;

@@ -350,7 +345,7 @@ static void nvdimm_build_structure_dcr(GArray *structures, DeviceState *dev)

 static GArray *nvdimm_build_device_structure(void)
 {
-    GSList *device_list = nvdimm_get_plugged_device_list();
+    GSList *device_list = nvdimm_get_device_list();
     GArray *structures = g_array_new(false, true /* clear */, 1);

     for (; device_list; device_list = device_list->next) {

@@ -375,20 +370,17 @@ static GArray *nvdimm_build_device_structure(void)

 static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
 {
-    qemu_mutex_init(&fit_buf->lock);
     fit_buf->fit = g_array_new(false, true /* clear */, 1);
 }

 static void nvdimm_build_fit_buffer(NvdimmFitBuffer *fit_buf)
 {
-    qemu_mutex_lock(&fit_buf->lock);
     g_array_free(fit_buf->fit, true);
     fit_buf->fit = nvdimm_build_device_structure();
     fit_buf->dirty = true;
-    qemu_mutex_unlock(&fit_buf->lock);
 }

-void nvdimm_acpi_hotplug(AcpiNVDIMMState *state)
+void nvdimm_plug(AcpiNVDIMMState *state)
 {
     nvdimm_build_fit_buffer(&state->fit_buf);
 }

@@ -399,13 +391,6 @@ static void nvdimm_build_nfit(AcpiNVDIMMState *state, GArray *table_offsets,
     NvdimmFitBuffer *fit_buf = &state->fit_buf;
     unsigned int header;

-    qemu_mutex_lock(&fit_buf->lock);
-
-    /* NVDIMM device is not plugged? */
-    if (!fit_buf->fit->len) {
-        goto exit;
-    }
-
     acpi_add_table(table_offsets, table_data);

     /* NFIT header. */

@@ -417,11 +402,10 @@ static void nvdimm_build_nfit(AcpiNVDIMMState *state, GArray *table_offsets,
     build_header(linker, table_data,
                  (void *)(table_data->data + header), "NFIT",
                  sizeof(NvdimmNfitHeader) + fit_buf->fit->len, 1, NULL, NULL);
-
-exit:
-    qemu_mutex_unlock(&fit_buf->lock);
 }

+#define NVDIMM_DSM_MEMORY_SIZE    4096
+
 struct NvdimmDsmIn {
     uint32_t handle;
     uint32_t revision;

@@ -432,7 +416,7 @@ struct NvdimmDsmIn {
     };
 } QEMU_PACKED;
 typedef struct NvdimmDsmIn NvdimmDsmIn;
-QEMU_BUILD_BUG_ON(sizeof(NvdimmDsmIn) != 4096);
+QEMU_BUILD_BUG_ON(sizeof(NvdimmDsmIn) != NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmDsmOut {
     /* the size of buffer filled by QEMU. */

@@ -440,7 +424,7 @@ struct NvdimmDsmOut {
     uint8_t data[4092];
 } QEMU_PACKED;
 typedef struct NvdimmDsmOut NvdimmDsmOut;
-QEMU_BUILD_BUG_ON(sizeof(NvdimmDsmOut) != 4096);
+QEMU_BUILD_BUG_ON(sizeof(NvdimmDsmOut) != NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmDsmFunc0Out {
     /* the size of buffer filled by QEMU. */

@@ -468,7 +452,7 @@ struct NvdimmFuncGetLabelSizeOut {
     uint32_t max_xfer;
 } QEMU_PACKED;
 typedef struct NvdimmFuncGetLabelSizeOut NvdimmFuncGetLabelSizeOut;
-QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelSizeOut) > 4096);
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelSizeOut) > NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmFuncGetLabelDataIn {
     uint32_t offset; /* the offset in the namespace label data area. */

@@ -476,7 +460,7 @@ struct NvdimmFuncGetLabelDataIn {
 } QEMU_PACKED;
 typedef struct NvdimmFuncGetLabelDataIn NvdimmFuncGetLabelDataIn;
 QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataIn) +
-                  offsetof(NvdimmDsmIn, arg3) > 4096);
+                  offsetof(NvdimmDsmIn, arg3) > NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmFuncGetLabelDataOut {
     /* the size of buffer filled by QEMU. */

@@ -485,7 +469,7 @@ struct NvdimmFuncGetLabelDataOut {
     uint8_t out_buf[0]; /* the data got via Get Namesapce Label function. */
 } QEMU_PACKED;
 typedef struct NvdimmFuncGetLabelDataOut NvdimmFuncGetLabelDataOut;
-QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataOut) > 4096);
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataOut) > NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmFuncSetLabelDataIn {
     uint32_t offset; /* the offset in the namespace label data area. */

@@ -494,14 +478,14 @@ struct NvdimmFuncSetLabelDataIn {
 } QEMU_PACKED;
 typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
 QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
-                  offsetof(NvdimmDsmIn, arg3) > 4096);
+                  offsetof(NvdimmDsmIn, arg3) > NVDIMM_DSM_MEMORY_SIZE);

 struct NvdimmFuncReadFITIn {
-    uint32_t offset; /* the offset of FIT buffer. */
+    uint32_t offset; /* the offset into FIT buffer. */
 } QEMU_PACKED;
 typedef struct NvdimmFuncReadFITIn NvdimmFuncReadFITIn;
 QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITIn) +
-                  offsetof(NvdimmDsmIn, arg3) > 4096);
+                  offsetof(NvdimmDsmIn, arg3) > NVDIMM_DSM_MEMORY_SIZE);

@@ -510,7 +494,7 @@ struct NvdimmFuncReadFITOut {
     uint8_t fit[0]; /* the FIT data. */
 } QEMU_PACKED;
 typedef struct NvdimmFuncReadFITOut NvdimmFuncReadFITOut;
-QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > 4096);
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > NVDIMM_DSM_MEMORY_SIZE);

 static void
 nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)

@@ -532,7 +516,13 @@ nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
     cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
 }

-#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000
+#define NVDIMM_DSM_RET_STATUS_SUCCESS     0     /* Success */
+#define NVDIMM_DSM_RET_STATUS_UNSUPPORT   1     /* Not Supported */
+#define NVDIMM_DSM_RET_STATUS_NOMEMDEV    2     /* Non-Existing Memory Device */
+#define NVDIMM_DSM_RET_STATUS_INVALID     3     /* Invalid Input Parameters */
+#define NVDIMM_DSM_RET_STATUS_FIT_CHANGED 0x100 /* FIT Changed */
+
+#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000

 /* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
 static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,

@@ -548,14 +538,13 @@ static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
     read_fit = (NvdimmFuncReadFITIn *)in->arg3;
     le32_to_cpus(&read_fit->offset);

-    qemu_mutex_lock(&fit_buf->lock);
     fit = fit_buf->fit;

     nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
                  read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");

     if (read_fit->offset > fit->len) {
-        func_ret_status = 3 /* Invalid Input Parameters */;
+        func_ret_status = NVDIMM_DSM_RET_STATUS_INVALID;
         goto exit;
     }

@@ -563,13 +552,13 @@ static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
     if (!read_fit->offset) {
         fit_buf->dirty = false;
     } else if (fit_buf->dirty) { /* FIT has been changed during RFIT. */
-        func_ret_status = 0x100 /* fit changed */;
+        func_ret_status = NVDIMM_DSM_RET_STATUS_FIT_CHANGED;
         goto exit;
     }

-    func_ret_status = 0 /* Success */;
+    func_ret_status = NVDIMM_DSM_RET_STATUS_SUCCESS;
     read_len = MIN(fit->len - read_fit->offset,
-                   4096 - sizeof(NvdimmFuncReadFITOut));
+                   NVDIMM_DSM_MEMORY_SIZE - sizeof(NvdimmFuncReadFITOut));

 exit:
     size = sizeof(NvdimmFuncReadFITOut) + read_len;

@@ -582,22 +571,22 @@ exit:
     cpu_physical_memory_write(dsm_mem_addr, read_fit_out, size);

     g_free(read_fit_out);
-    qemu_mutex_unlock(&fit_buf->lock);
 }

-static void nvdimm_dsm_reserved_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
-                                     hwaddr dsm_mem_addr)
+static void
+nvdimm_dsm_handle_reserved_root_method(AcpiNVDIMMState *state,
+                                       NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 {
     switch (in->function) {
     case 0x0:
         nvdimm_dsm_function0(0x1 | 1 << 1 /* Read FIT */, dsm_mem_addr);
         return;
-    case 0x1 /*Read FIT */:
+    case 0x1 /* Read FIT */:
         nvdimm_dsm_func_read_fit(state, in, dsm_mem_addr);
         return;
     }

-    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+    nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_UNSUPPORT, dsm_mem_addr);
 }

 static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)

@@ -613,7 +602,7 @@ static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
     }

     /* No function except function 0 is supported yet. */
-    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+    nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_UNSUPPORT, dsm_mem_addr);
 }

 /*

@@ -623,7 +612,9 @@ static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
  */
 static uint32_t nvdimm_get_max_xfer_label_size(void)
 {
-    uint32_t max_get_size, max_set_size, dsm_memory_size = 4096;
+    uint32_t max_get_size, max_set_size, dsm_memory_size;
+
+    dsm_memory_size = NVDIMM_DSM_MEMORY_SIZE;

     /*
      * the max data ACPI can read one time which is transferred by

@@ -659,7 +650,7 @@ static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr)

     nvdimm_debug("label_size %#x, max_xfer %#x.\n", label_size, mxfer);

-    label_size_out.func_ret_status = cpu_to_le32(0 /* Success */);
+    label_size_out.func_ret_status = cpu_to_le32(NVDIMM_DSM_RET_STATUS_SUCCESS);
     label_size_out.label_size = cpu_to_le32(label_size);
     label_size_out.max_xfer = cpu_to_le32(mxfer);

@@ -670,7 +661,7 @@ static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr)
 static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm,
                                            uint32_t offset, uint32_t length)
 {
-    uint32_t ret = 3 /* Invalid Input Parameters */;
+    uint32_t ret = NVDIMM_DSM_RET_STATUS_INVALID;

     if (offset + length < offset) {
         nvdimm_debug("offset %#x + length %#x is overflow.\n", offset,

@@ -690,7 +681,7 @@ static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm,
         return ret;
     }

-    return 0 /* Success */;
+    return NVDIMM_DSM_RET_STATUS_SUCCESS;
 }

 /*

@@ -714,17 +705,18 @@ static void nvdimm_dsm_get_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,

     status = nvdimm_rw_label_data_check(nvdimm, get_label_data->offset,
                                         get_label_data->length);
-    if (status != 0 /* Success */) {
+    if (status != NVDIMM_DSM_RET_STATUS_SUCCESS) {
         nvdimm_dsm_no_payload(status, dsm_mem_addr);
         return;
     }

     size = sizeof(*get_label_data_out) + get_label_data->length;
-    assert(size <= 4096);
+    assert(size <= NVDIMM_DSM_MEMORY_SIZE);
     get_label_data_out = g_malloc(size);

     get_label_data_out->len = cpu_to_le32(size);
-    get_label_data_out->func_ret_status = cpu_to_le32(0 /* Success */);
+    get_label_data_out->func_ret_status =
+                                    cpu_to_le32(NVDIMM_DSM_RET_STATUS_SUCCESS);
     nvc->read_label_data(nvdimm, get_label_data_out->out_buf,
                          get_label_data->length, get_label_data->offset);

@@ -752,17 +744,17 @@ static void nvdimm_dsm_set_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,

     status = nvdimm_rw_label_data_check(nvdimm, set_label_data->offset,
                                         set_label_data->length);
-    if (status != 0 /* Success */) {
+    if (status != NVDIMM_DSM_RET_STATUS_SUCCESS) {
         nvdimm_dsm_no_payload(status, dsm_mem_addr);
         return;
     }

-    assert(offsetof(NvdimmDsmIn, arg3) +
-           sizeof(*set_label_data) + set_label_data->length <= 4096);
+    assert(offsetof(NvdimmDsmIn, arg3) + sizeof(*set_label_data) +
+           set_label_data->length <= NVDIMM_DSM_MEMORY_SIZE);

     nvc->write_label_data(nvdimm, set_label_data->in_buf,
                           set_label_data->length, set_label_data->offset);
-    nvdimm_dsm_no_payload(0 /* Success */, dsm_mem_addr);
+    nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_SUCCESS, dsm_mem_addr);
 }

 static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)

@@ -786,7 +778,7 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
     }

     if (!nvdimm) {
-        nvdimm_dsm_no_payload(2 /* Non-Existing Memory Device */,
+        nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_NOMEMDEV,
                               dsm_mem_addr);
         return;
     }

@@ -813,7 +805,7 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
         break;
     }

-    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+    nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_UNSUPPORT, dsm_mem_addr);
 }

 static uint64_t

@@ -850,12 +842,12 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
     if (in->revision != 0x1 /* Currently we only support DSM Spec Rev1. */) {
         nvdimm_debug("Revision %#x is not supported, expect %#x.\n",
                      in->revision, 0x1);
-        nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+        nvdimm_dsm_no_payload(NVDIMM_DSM_RET_STATUS_UNSUPPORT, dsm_mem_addr);
         goto exit;
     }

     if (in->handle == NVDIMM_QEMU_RSVD_HANDLE_ROOT) {
-        nvdimm_dsm_reserved_root(state, in, dsm_mem_addr);
+        nvdimm_dsm_handle_reserved_root_method(state, in, dsm_mem_addr);
         goto exit;
     }

@@ -881,6 +873,13 @@ static const MemoryRegionOps nvdimm_dsm_ops = {
     },
 };

+void nvdimm_acpi_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev)
+{
+    if (dev->hotplugged) {
+        acpi_send_event(DEVICE(hotplug_dev), ACPI_NVDIMM_HOTPLUG_STATUS);
+    }
+}
+
 void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
                             FWCfgState *fw_cfg, Object *owner)
 {

@@ -1031,7 +1030,7 @@ static void nvdimm_build_common_dsm(Aml *dev)
     aml_append(unsupport, ifctx);

     /* No function is supported yet. */
-    byte_list[0] = 1 /* Not Supported */;
+    byte_list[0] = NVDIMM_DSM_RET_STATUS_UNSUPPORT;
     aml_append(unsupport, aml_return(aml_buffer(1, byte_list)));
     aml_append(method, unsupport);

@@ -1103,13 +1102,11 @@ static void nvdimm_build_fit(Aml *dev)
     buf_size = aml_local(1);
     fit = aml_local(2);

-    aml_append(dev, aml_create_dword_field(aml_buffer(4, NULL),
-               aml_int(0), NVDIMM_DSM_RFIT_STATUS));
+    aml_append(dev, aml_name_decl(NVDIMM_DSM_RFIT_STATUS, aml_int(0)));

     /* build helper function, RFIT. */
     method = aml_method("RFIT", 1, AML_SERIALIZED);
-    aml_append(method, aml_create_dword_field(aml_buffer(4, NULL),
-               aml_int(0), "OFST"));
+    aml_append(method, aml_name_decl("OFST", aml_int(0)));

     /* prepare input package. */
     pkg = aml_package(1);

@@ -1132,7 +1129,8 @@ static void nvdimm_build_fit(Aml *dev)
                                aml_name(NVDIMM_DSM_RFIT_STATUS)));

     /* if something is wrong during _DSM. */
-    ifcond = aml_equal(aml_int(0 /* Success */), aml_name("STAU"));
+    ifcond = aml_equal(aml_int(NVDIMM_DSM_RET_STATUS_SUCCESS),
+                       aml_name("STAU"));
     ifctx = aml_if(aml_lnot(ifcond));
     aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
     aml_append(method, ifctx);

@@ -1147,11 +1145,9 @@ static void nvdimm_build_fit(Aml *dev)
     aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
     aml_append(method, ifctx);

-    aml_append(method, aml_store(aml_shiftleft(buf_size, aml_int(3)),
-                                 buf_size));
     aml_append(method, aml_create_field(buf,
                            aml_int(4 * BITS_PER_BYTE), /* offset at byte 4.*/
-                           buf_size, "BUFF"));
+                           aml_shiftleft(buf_size, aml_int(3)), "BUFF"));
     aml_append(method, aml_return(aml_name("BUFF")));
     aml_append(dev, method);

@@ -1171,7 +1167,7 @@ static void nvdimm_build_fit(Aml *dev)
      * again.
      */
     ifctx = aml_if(aml_equal(aml_name(NVDIMM_DSM_RFIT_STATUS),
-                             aml_int(0x100 /* fit changed */)));
+                             aml_int(NVDIMM_DSM_RET_STATUS_FIT_CHANGED)));
     aml_append(ifctx, aml_store(aml_buffer(0, NULL), fit));
     aml_append(ifctx, aml_store(aml_int(0), offset));
     aml_append(whilectx, ifctx);

@@ -1281,14 +1277,22 @@ void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, AcpiNVDIMMState *state,
                        uint32_t ram_slots)
 {
-    nvdimm_build_nfit(state, table_offsets, table_data, linker);
+    GSList *device_list;

-    /*
-     * NVDIMM device is allowed to be plugged only if there is available
-     * slot.
-     */
-    if (ram_slots) {
-        nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
-                          ram_slots);
+    /* no nvdimm device can be plugged. */
+    if (!ram_slots) {
+        return;
     }
+
+    nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
+                      ram_slots);
+
+    device_list = nvdimm_get_device_list();
+    /* no NVDIMM device is plugged. */
+    if (!device_list) {
+        return;
+    }
+
+    nvdimm_build_nfit(state, table_offsets, table_data, linker);
+    g_slist_free(device_list);
 }

hw/acpi/piix4.c:
@@ -378,7 +378,12 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev,

     if (s->acpi_memory_hotplug.is_enabled &&
         object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        acpi_memory_plug_cb(hotplug_dev, &s->acpi_memory_hotplug, dev, errp);
+        if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+            nvdimm_acpi_plug_cb(hotplug_dev, dev);
+        } else {
+            acpi_memory_plug_cb(hotplug_dev, &s->acpi_memory_hotplug,
+                                dev, errp);
+        }
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
         acpi_pcihp_device_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {

hw/core/hotplug.c:
@@ -35,17 +35,6 @@ void hotplug_handler_plug(HotplugHandler *plug_handler,
     }
 }

-void hotplug_handler_post_plug(HotplugHandler *plug_handler,
-                               DeviceState *plugged_dev,
-                               Error **errp)
-{
-    HotplugHandlerClass *hdc = HOTPLUG_HANDLER_GET_CLASS(plug_handler);
-
-    if (hdc->post_plug) {
-        hdc->post_plug(plug_handler, plugged_dev, errp);
-    }
-}
-
 void hotplug_handler_unplug_request(HotplugHandler *plug_handler,
                                     DeviceState *plugged_dev,
                                     Error **errp)

hw/core/qdev.c:
@@ -945,21 +945,10 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
                 goto child_realize_fail;
             }
         }

         if (dev->hotplugged) {
             device_reset(dev);
         }
         dev->pending_deleted_event = false;
-        dev->realized = value;
-
-        if (hotplug_ctrl) {
-            hotplug_handler_post_plug(hotplug_ctrl, dev, &local_err);
-        }
-
-        if (local_err != NULL) {
-            dev->realized = value;
-            goto post_realize_fail;
-        }
     } else if (!value && dev->realized) {
         Error **local_errp = NULL;
         QLIST_FOREACH(bus, &dev->child_bus, sibling) {

@@ -976,14 +965,13 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
         }
         dev->pending_deleted_event = true;
         DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
-
-        if (local_err != NULL) {
-            goto fail;
-        }
-
-        dev->realized = value;
     }

+    if (local_err != NULL) {
+        goto fail;
+    }
+
+    dev->realized = value;
     return;

 child_realize_fail:

hw/i386/acpi-build.c:
@@ -2605,7 +2605,8 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
     scope->length = ioapic_scope_size;
     scope->enumeration_id = ACPI_BUILD_IOAPIC_ID;
     scope->bus = Q35_PSEUDO_BUS_PLATFORM;
-    scope->path[0] = cpu_to_le16(Q35_PSEUDO_DEVFN_IOAPIC);
+    scope->path[0].device = PCI_SLOT(Q35_PSEUDO_DEVFN_IOAPIC);
+    scope->path[0].function = PCI_FUNC(Q35_PSEUDO_DEVFN_IOAPIC);

     build_header(linker, table_data, (void *)(table_data->data + dmar_start),
                  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);

hw/i386/intel_iommu.c:
@@ -218,7 +218,7 @@ static void vtd_reset_iotlb(IntelIOMMUState *s)
     g_hash_table_remove_all(s->iotlb);
 }

-static uint64_t vtd_get_iotlb_key(uint64_t gfn, uint8_t source_id,
+static uint64_t vtd_get_iotlb_key(uint64_t gfn, uint16_t source_id,
                                   uint32_t level)
 {
     return gfn | ((uint64_t)(source_id) << VTD_IOTLB_SID_SHIFT) |

@@ -2180,7 +2180,7 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
     }

     addr.data = origin->address & VTD_MSI_ADDR_LO_MASK;
-    if (le16_to_cpu(addr.addr.__head) != 0xfee) {
+    if (addr.addr.__head != 0xfee) {
         VTD_DPRINTF(GENERAL, "error: MSI addr low 32 bits invalid: "
                     "0x%"PRIx32, addr.data);
         return -VTD_FR_IR_REQ_RSVD;

@@ -2463,7 +2463,7 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     IntelIOMMUState *s = opaque;
     VTDAddressSpace *vtd_as;

-    assert(0 <= devfn && devfn <= X86_IOMMU_PCI_DEVFN_MAX);
+    assert(0 <= devfn && devfn < X86_IOMMU_PCI_DEVFN_MAX);

     vtd_as = vtd_find_add_as(s, bus, devfn);
     return &vtd_as->as;

hw/i386/intel_iommu_internal.h:
@@ -115,7 +115,7 @@

 /* The shift of source_id in the key of IOTLB hash table */
 #define VTD_IOTLB_SID_SHIFT         36
-#define VTD_IOTLB_LVL_SHIFT         44
+#define VTD_IOTLB_LVL_SHIFT         52
 #define VTD_IOTLB_MAX_SIZE          1024    /* Max size of the hash table */

 /* IOTLB_REG */

hw/i386/pc.c (29 lines changed):
@@ -1715,22 +1715,16 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
         goto out;
     }

+    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+        nvdimm_plug(&pcms->acpi_nvdimm_state);
+    }
+
     hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev);
     hhc->plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &error_abort);
 out:
     error_propagate(errp, local_err);
 }

-static void pc_dimm_post_plug(HotplugHandler *hotplug_dev,
-                              DeviceState *dev, Error **errp)
-{
-    PCMachineState *pcms = PC_MACHINE(hotplug_dev);
-
-    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
-        nvdimm_acpi_hotplug(&pcms->acpi_nvdimm_state);
-    }
-}
-
 static void pc_dimm_unplug_request(HotplugHandler *hotplug_dev,
                                    DeviceState *dev, Error **errp)
 {

@@ -1767,12 +1761,6 @@ static void pc_dimm_unplug(HotplugHandler *hotplug_dev,
     HotplugHandlerClass *hhc;
     Error *local_err = NULL;

-    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
-        error_setg(&local_err,
-                   "nvdimm device hot unplug is not supported yet.");
-        goto out;
-    }
-
     hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev);
     hhc->unplug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);

@@ -2008,14 +1996,6 @@ static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
     }
 }

-static void pc_machine_device_post_plug_cb(HotplugHandler *hotplug_dev,
-                                           DeviceState *dev, Error **errp)
-{
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_post_plug(hotplug_dev, dev, errp);
-    }
-}
-
 static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
                                                 DeviceState *dev, Error **errp)
 {

@@ -2322,7 +2302,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->reset = pc_machine_reset;
     hc->pre_plug = pc_machine_device_pre_plug_cb;
     hc->plug = pc_machine_device_plug_cb;
-    hc->post_plug = pc_machine_device_post_plug_cb;
     hc->unplug_request = pc_machine_device_unplug_request_cb;
     hc->unplug = pc_machine_device_unplug_cb;
     nc->nmi_monitor_handler = x86_nmi;

hw/net/virtio-net.c:
@@ -1181,7 +1181,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
      * must have consumed the complete packet.
      * Otherwise, drop it. */
     if (!n->mergeable_rx_bufs && offset < size) {
-        virtqueue_discard(q->rx_vq, elem, total);
+        virtqueue_unpop(q->rx_vq, elem, total);
         g_free(elem);
         return size;
     }

@@ -1946,6 +1946,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
     vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
     vdc->load = virtio_net_load_device;
     vdc->save = virtio_net_save_device;
+    vdc->legacy_features |= (0x1 << VIRTIO_NET_F_GSO);
 }

 static const TypeInfo virtio_net_info = {

hw/s390x/virtio-ccw.c:
@@ -303,6 +303,8 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
         if (!ccw.cda) {
             ret = -EFAULT;
         } else {
+            VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+
             features.index = address_space_ldub(&address_space_memory,
                                                 ccw.cda
                                                 + sizeof(features.features),

@@ -312,7 +314,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
             if (dev->revision >= 1) {
                 /* Don't offer legacy features for modern devices. */
                 features.features = (uint32_t)
-                    (vdev->host_features & ~VIRTIO_LEGACY_FEATURES);
+                    (vdev->host_features & ~vdc->legacy_features);
             } else {
                 features.features = (uint32_t)vdev->host_features;
             }
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -421,32 +421,73 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
     dev->log_size = size;
 }
 
+static int vhost_verify_ring_part_mapping(void *part,
+                                          uint64_t part_addr,
+                                          uint64_t part_size,
+                                          uint64_t start_addr,
+                                          uint64_t size)
+{
+    hwaddr l;
+    void *p;
+    int r = 0;
+
+    if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
+        return 0;
+    }
+    l = part_size;
+    p = cpu_physical_memory_map(part_addr, &l, 1);
+    if (!p || l != part_size) {
+        r = -ENOMEM;
+    }
+    if (p != part) {
+        r = -EBUSY;
+    }
+    cpu_physical_memory_unmap(p, l, 0, 0);
+    return r;
+}
+
 static int vhost_verify_ring_mappings(struct vhost_dev *dev,
                                       uint64_t start_addr,
                                       uint64_t size)
 {
-    int i;
+    int i, j;
     int r = 0;
+    const char *part_name[] = {
+        "descriptor table",
+        "available ring",
+        "used ring"
+    };
 
-    for (i = 0; !r && i < dev->nvqs; ++i) {
+    for (i = 0; i < dev->nvqs; ++i) {
         struct vhost_virtqueue *vq = dev->vqs + i;
-        hwaddr l;
-        void *p;
-
-        if (!ranges_overlap(start_addr, size, vq->ring_phys, vq->ring_size)) {
-            continue;
-        }
-        l = vq->ring_size;
-        p = cpu_physical_memory_map(vq->ring_phys, &l, 1);
-        if (!p || l != vq->ring_size) {
-            error_report("Unable to map ring buffer for ring %d", i);
-            r = -ENOMEM;
-        }
-        if (p != vq->ring) {
-            error_report("Ring buffer relocated for ring %d", i);
-            r = -EBUSY;
-        }
-        cpu_physical_memory_unmap(p, l, 0, 0);
+
+        j = 0;
+        r = vhost_verify_ring_part_mapping(vq->desc, vq->desc_phys,
+                                           vq->desc_size, start_addr, size);
+        if (!r) {
+            break;
+        }
+
+        j++;
+        r = vhost_verify_ring_part_mapping(vq->avail, vq->avail_phys,
+                                           vq->avail_size, start_addr, size);
+        if (!r) {
+            break;
+        }
+
+        j++;
+        r = vhost_verify_ring_part_mapping(vq->used, vq->used_phys,
+                                           vq->used_size, start_addr, size);
+        if (!r) {
+            break;
+        }
     }
+
+    if (r == -ENOMEM) {
+        error_report("Unable to map %s for ring %d", part_name[j], i);
+    } else if (r == -EBUSY) {
+        error_report("%s relocated for ring %d", part_name[j], i);
+    }
     return r;
 }
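With VIRTIO_F_VERSION_1 the driver may place the descriptor table, available ring and used ring at unrelated guest-physical addresses, so the old single [ring_phys, ring_phys + ring_size) overlap test no longer covers them; hence the per-part helper above. A self-contained sketch of the test it repeats (hypothetical types, same predicate as QEMU's ranges_overlap()):

    #include <stdbool.h>
    #include <stdint.h>

    struct part { uint64_t addr, size; };

    /* Does a memory update [start, start + size) touch any ring part? */
    static bool touches_any_part(const struct part p[3],
                                 uint64_t start, uint64_t size)
    {
        for (int i = 0; i < 3; i++) {
            if (start < p[i].addr + p[i].size && p[i].addr < start + size) {
                return true;
            }
        }
        return false;
    }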
@@ -860,15 +901,15 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         }
     }
 
-    s = l = virtio_queue_get_desc_size(vdev, idx);
-    a = virtio_queue_get_desc_addr(vdev, idx);
+    vq->desc_size = s = l = virtio_queue_get_desc_size(vdev, idx);
+    vq->desc_phys = a = virtio_queue_get_desc_addr(vdev, idx);
     vq->desc = cpu_physical_memory_map(a, &l, 0);
     if (!vq->desc || l != s) {
         r = -ENOMEM;
         goto fail_alloc_desc;
     }
-    s = l = virtio_queue_get_avail_size(vdev, idx);
-    a = virtio_queue_get_avail_addr(vdev, idx);
+    vq->avail_size = s = l = virtio_queue_get_avail_size(vdev, idx);
+    vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
     vq->avail = cpu_physical_memory_map(a, &l, 0);
     if (!vq->avail || l != s) {
         r = -ENOMEM;
@@ -882,14 +923,6 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         goto fail_alloc_used;
     }
 
-    vq->ring_size = s = l = virtio_queue_get_ring_size(vdev, idx);
-    vq->ring_phys = a = virtio_queue_get_ring_addr(vdev, idx);
-    vq->ring = cpu_physical_memory_map(a, &l, 1);
-    if (!vq->ring || l != s) {
-        r = -ENOMEM;
-        goto fail_alloc_ring;
-    }
-
     r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
     if (r < 0) {
         r = -errno;
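The sizes cached above come straight from the virtio 1.0 split-ring layout. For a queue of num entries (ignoring the optional event-idx trailing word), a sketch of the formulas:

    #include <stdint.h>

    /* split-ring part sizes for a queue of `num` entries */
    static inline uint64_t desc_size(unsigned num)  { return 16ULL * num; }     /* 16-byte descriptors */
    static inline uint64_t avail_size(unsigned num) { return 4 + 2ULL * num; }  /* flags, idx, ring[num] */
    static inline uint64_t used_size(unsigned num)  { return 4 + 8ULL * num; }  /* flags, idx, {id,len}[num] */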
@@ -930,9 +963,6 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
 fail_vector:
 fail_kick:
 fail_alloc:
-    cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
-                              0, 0);
-fail_alloc_ring:
     cpu_physical_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
                               0, 0);
 fail_alloc_used:
@@ -973,8 +1003,6 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
                                 vhost_vq_index);
     }
 
-    cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
-                              0, virtio_queue_get_ring_size(vdev, idx));
     cpu_physical_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
                               1, virtio_queue_get_used_size(vdev, idx));
     cpu_physical_memory_unmap(vq->avail, virtio_queue_get_avail_size(vdev, idx),
@@ -1122,7 +1150,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
         if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
-        } else if (!qemu_memfd_check()) {
+        } else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_check()) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: failed to allocate shared memory");
         }
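The reworked condition installs a migration blocker only when the backend cannot log dirty pages at all, or when it actually needs the log in shared memory and memfd is unavailable. An illustrative predicate, where log_is_shared and memfd_works stand in for vhost_dev_log_is_shared() and qemu_memfd_check():

    #include <stdbool.h>
    #include <stdint.h>

    #define VHOST_F_LOG_ALL 26    /* from the vhost UAPI */

    static bool vhost_blocks_migration(uint64_t features,
                                       bool log_is_shared, bool memfd_works)
    {
        if (!(features & (0x1ULL << VHOST_F_LOG_ALL))) {
            return true;    /* backend cannot log dirty pages at all */
        }
        return log_is_shared && !memfd_works;
    }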
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -456,7 +456,7 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
 
     if (s->stats_vq_elem != NULL) {
-        virtqueue_discard(s->svq, s->stats_vq_elem, 0);
+        virtqueue_unpop(s->svq, s->stats_vq_elem, 0);
         g_free(s->stats_vq_elem);
         s->stats_vq_elem = NULL;
     }
--- a/hw/virtio/virtio-crypto-pci.c
+++ b/hw/virtio/virtio-crypto-pci.c
@@ -48,7 +48,7 @@ static void virtio_crypto_pci_class_init(ObjectClass *klass, void *data)
     k->realize = virtio_crypto_pci_realize;
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     dc->props = virtio_crypto_pci_properties;
-
+    dc->hotpluggable = false;
     pcidev_k->class_id = PCI_CLASS_OTHERS;
 }
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -813,6 +813,7 @@ static void virtio_crypto_device_unrealize(DeviceState *dev, Error **errp)
 
 static const VMStateDescription vmstate_virtio_crypto = {
     .name = "virtio-crypto",
+    .unmigratable = 1,
     .minimum_version_id = VIRTIO_CRYPTO_VM_VERSION,
     .version_id = VIRTIO_CRYPTO_VM_VERSION,
     .fields = (VMStateField[]) {
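Marking the vmstate unmigratable keeps the device's state out of the migration stream and fails migration while the device is present. A minimal sketch of the pattern for a hypothetical device, following the usual QEMU vmstate conventions:

    static const VMStateDescription vmstate_mydev = {
        .name = "mydev",            /* hypothetical device */
        .unmigratable = 1,          /* block migration while present */
        .fields = (VMStateField[]) {
            VMSTATE_END_OF_LIST()
        },
    };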
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1175,7 +1175,9 @@ static uint64_t virtio_pci_common_read(void *opaque, hwaddr addr,
         break;
     case VIRTIO_PCI_COMMON_DF:
         if (proxy->dfselect <= 1) {
-            val = (vdev->host_features & ~VIRTIO_LEGACY_FEATURES) >>
+            VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+
+            val = (vdev->host_features & ~vdc->legacy_features) >>
                   (32 * proxy->dfselect);
         }
         break;
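The 64-bit feature space is exposed to the guest as two 32-bit windows selected by dfselect. A self-contained sketch of the read, with the helper itself being illustrative:

    #include <stdint.h>

    static uint32_t device_feature_window(uint64_t host_features,
                                          uint64_t legacy_features,
                                          unsigned dfselect)
    {
        if (dfselect > 1) {
            return 0;    /* windows other than 0 and 1 read as zero */
        }
        return (uint32_t)((host_features & ~legacy_features) >>
                          (32 * dfselect));
    }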
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -279,7 +279,7 @@ void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
     virtqueue_unmap_sg(vq, elem, len);
 }
 
-/* virtqueue_discard:
+/* virtqueue_unpop:
  * @vq: The #VirtQueue
  * @elem: The #VirtQueueElement
  * @len: number of bytes written
@@ -287,8 +287,8 @@ void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
  * Pretend the most recent element wasn't popped from the virtqueue. The next
  * call to virtqueue_pop() will refetch the element.
  */
-void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
-                       unsigned int len)
+void virtqueue_unpop(VirtQueue *vq, const VirtQueueElement *elem,
+                     unsigned int len)
 {
     vq->last_avail_idx--;
     virtqueue_detach_element(vq, elem, len);
@@ -301,7 +301,7 @@ void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
  * Pretend that elements weren't popped from the virtqueue. The next
  * virtqueue_pop() will refetch the oldest element.
  *
- * Use virtqueue_discard() instead if you have a VirtQueueElement.
+ * Use virtqueue_unpop() instead if you have a VirtQueueElement.
  *
  * Returns: true on success, false if @num is greater than the number of in use
  * elements.
@@ -632,7 +632,7 @@ void virtqueue_map(VirtQueueElement *elem)
                                                         VIRTQUEUE_MAX_SIZE, 0);
 }
 
-void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num)
+static void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num)
 {
     VirtQueueElement *elem;
     size_t in_addr_ofs = QEMU_ALIGN_UP(sz, __alignof__(elem->in_addr[0]));
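The renamed helper keeps its semantics: it rewinds last_avail_idx by one so the same element is handed out again by the next pop. The typical caller pattern, mirroring the virtio-net and virtio-balloon hunks above (can_process() is a hypothetical placeholder):

    static void poll_queue_once(VirtQueue *vq)
    {
        VirtQueueElement *elem = virtqueue_pop(vq, sizeof(VirtQueueElement));

        if (!elem) {
            return;                        /* queue empty */
        }
        if (!can_process(elem)) {
            virtqueue_unpop(vq, elem, 0);  /* next virtqueue_pop() refetches it */
            g_free(elem);                  /* frees only the host-side copy */
            return;
        }
        /* ... handle elem, then virtqueue_push() and notify ... */
    }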
@@ -1935,11 +1935,6 @@ hwaddr virtio_queue_get_used_addr(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.used;
 }
 
-hwaddr virtio_queue_get_ring_addr(VirtIODevice *vdev, int n)
-{
-    return vdev->vq[n].vring.desc;
-}
-
 hwaddr virtio_queue_get_desc_size(VirtIODevice *vdev, int n)
 {
     return sizeof(VRingDesc) * vdev->vq[n].vring.num;
@@ -1957,12 +1952,6 @@ hwaddr virtio_queue_get_used_size(VirtIODevice *vdev, int n)
            sizeof(VRingUsedElem) * vdev->vq[n].vring.num;
 }
 
-hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
-{
-    return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
-           virtio_queue_get_used_size(vdev, n);
-}
-
 uint16_t virtio_queue_get_last_avail_idx(VirtIODevice *vdev, int n)
 {
     return vdev->vq[n].last_avail_idx;
@@ -2214,6 +2203,8 @@ static void virtio_device_class_init(ObjectClass *klass, void *data)
     dc->props = virtio_properties;
     vdc->start_ioeventfd = virtio_device_start_ioeventfd_impl;
     vdc->stop_ioeventfd = virtio_device_stop_ioeventfd_impl;
+
+    vdc->legacy_features |= VIRTIO_LEGACY_FEATURES;
 }
 
 bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -619,7 +619,10 @@ struct AcpiDmarDeviceScope {
     uint16_t reserved;
     uint8_t enumeration_id;
     uint8_t bus;
-    uint16_t path[0];           /* list of dev:func pairs */
+    struct {
+        uint8_t device;
+        uint8_t function;
+    } path[0];
 } QEMU_PACKED;
 typedef struct AcpiDmarDeviceScope AcpiDmarDeviceScope;
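The struct-of-bytes form makes each hop's device and function explicit instead of packing them into one uint16_t. A sketch of emitting a one-hop scope path (PCI_SLOT()/PCI_FUNC() are QEMU's standard devfn accessors; error handling omitted):

    static AcpiDmarDeviceScope *make_scope(uint8_t bus, int devfn)
    {
        AcpiDmarDeviceScope *scope =
            g_malloc0(sizeof(*scope) + sizeof(scope->path[0]));

        scope->bus = bus;
        scope->path[0].device   = PCI_SLOT(devfn);
        scope->path[0].function = PCI_FUNC(devfn);
        return scope;
    }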
--- a/include/hw/hotplug.h
+++ b/include/hw/hotplug.h
@@ -47,7 +47,6 @@ typedef void (*hotplug_fn)(HotplugHandler *plug_handler,
  * @parent: Opaque parent interface.
  * @pre_plug: pre plug callback called at start of device.realize(true)
  * @plug: plug callback called at end of device.realize(true).
- * @post_pug: post plug callback called after device is successfully plugged.
  * @unplug_request: unplug request callback.
  *                  Used as a means to initiate device unplug for devices that
  *                  require asynchronous unplug handling.
@@ -62,7 +61,6 @@ typedef struct HotplugHandlerClass {
     /* <public> */
     hotplug_fn pre_plug;
     hotplug_fn plug;
-    hotplug_fn post_plug;
     hotplug_fn unplug_request;
     hotplug_fn unplug;
 } HotplugHandlerClass;
@@ -85,15 +83,6 @@ void hotplug_handler_pre_plug(HotplugHandler *plug_handler,
                               DeviceState *plugged_dev,
                               Error **errp);
 
-/**
- * hotplug_handler_post_plug:
- *
- * Call #HotplugHandlerClass.post_plug callback of @plug_handler.
- */
-void hotplug_handler_post_plug(HotplugHandler *plug_handler,
-                               DeviceState *plugged_dev,
-                               Error **errp);
-
 /**
  * hotplug_handler_unplug_request:
  *
|
|||
union VTD_IR_TableEntry {
|
||||
struct {
|
||||
#ifdef HOST_WORDS_BIGENDIAN
|
||||
uint32_t dest_id:32; /* Destination ID */
|
||||
uint32_t __reserved_1:8; /* Reserved 1 */
|
||||
uint32_t vector:8; /* Interrupt Vector */
|
||||
uint32_t irte_mode:1; /* IRTE Mode */
|
||||
|
@ -147,9 +146,9 @@ union VTD_IR_TableEntry {
|
|||
uint32_t irte_mode:1; /* IRTE Mode */
|
||||
uint32_t vector:8; /* Interrupt Vector */
|
||||
uint32_t __reserved_1:8; /* Reserved 1 */
|
||||
uint32_t dest_id:32; /* Destination ID */
|
||||
#endif
|
||||
uint16_t source_id:16; /* Source-ID */
|
||||
uint32_t dest_id; /* Destination ID */
|
||||
uint16_t source_id; /* Source-ID */
|
||||
#ifdef HOST_WORDS_BIGENDIAN
|
||||
uint64_t __reserved_2:44; /* Reserved 2 */
|
||||
uint64_t sid_vtype:2; /* Source-ID Validation Type */
|
||||
|
@ -220,7 +219,7 @@ struct VTD_MSIMessage {
|
|||
uint32_t dest:8;
|
||||
uint32_t __addr_head:12; /* 0xfee */
|
||||
#endif
|
||||
uint32_t __addr_hi:32;
|
||||
uint32_t __addr_hi;
|
||||
} QEMU_PACKED;
|
||||
uint64_t msi_addr;
|
||||
};
|
||||
|
@ -239,7 +238,7 @@ struct VTD_MSIMessage {
|
|||
uint16_t level:1;
|
||||
uint16_t trigger_mode:1;
|
||||
#endif
|
||||
uint16_t __resved1:16;
|
||||
uint16_t __resved1;
|
||||
} QEMU_PACKED;
|
||||
uint32_t msi_data;
|
||||
};
|
||||
|
|
|
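A bit-field that spans its entire storage unit buys nothing over a plain member, and where a bit-field is placed is implementation-defined (C11 6.7.2.1), so a plain member is the portable way to get a fixed offset in a packed struct. Illustrative before/after:

    #include <stdint.h>

    struct __attribute__((packed)) before { uint32_t dest_id:32; };  /* placement at compiler's discretion */
    struct __attribute__((packed)) after  { uint32_t dest_id;    };  /* plain 4-byte member, fixed offset */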
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -99,20 +99,13 @@ typedef struct NVDIMMClass NVDIMMClass;
 #define NVDIMM_ACPI_IO_LEN      4
 
 /*
- * The buffer, @fit, saves the FIT info for all the presented NVDIMM
- * devices which is updated after the NVDIMM device is plugged or
- * unplugged.
- *
- * Rules to use the buffer:
- * 1) the user should hold the @lock to access the buffer.
- * 2) mark @dirty whenever the buffer is updated.
- *
- * These rules preserve NVDIMM ACPI _FIT method to read incomplete
- * or obsolete fit info if fit update happens during multiple RFIT
- * calls.
+ * NvdimmFitBuffer:
+ * @fit: FIT structures for present NVDIMMs. It is updated when
+ *       the NVDIMM device is plugged or unplugged.
+ * @dirty: It allows OSPM to detect change and restart read in
+ *         progress if there is any.
  */
 struct NvdimmFitBuffer {
     QemuMutex lock;
     GArray *fit;
     bool dirty;
 };
@@ -137,5 +130,6 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, AcpiNVDIMMState *state,
                        uint32_t ram_slots);
-void nvdimm_acpi_hotplug(AcpiNVDIMMState *state);
+void nvdimm_plug(AcpiNVDIMMState *state);
+void nvdimm_acpi_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev);
 #endif
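The @dirty flag implements a simple restart protocol for FIT reads that span multiple method calls. An illustrative reader side, where fit_read_chunk() is a hypothetical stand-in for the RFIT handling:

    static void fit_read(NvdimmFitBuffer *buf, uint32_t *offset)
    {
        qemu_mutex_lock(&buf->lock);
        if (buf->dirty) {
            *offset = 0;           /* FIT changed mid-read: start over */
            buf->dirty = false;
        }
        fit_read_chunk(buf->fit, offset);    /* hypothetical */
        qemu_mutex_unlock(&buf->lock);
    }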
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -14,11 +14,14 @@ struct vhost_virtqueue {
     void *avail;
     void *used;
     int num;
-    void *ring;
-    unsigned long long ring_phys;
-    unsigned ring_size;
+    unsigned long long desc_phys;
+    unsigned desc_size;
+    unsigned long long avail_phys;
+    unsigned avail_size;
+    unsigned long long used_phys;
+    unsigned used_size;
     EventNotifier masked_notifier;
 };
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -113,6 +113,11 @@ typedef struct VirtioDeviceClass {
     void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
     void (*reset)(VirtIODevice *vdev);
     void (*set_status)(VirtIODevice *vdev, uint8_t val);
+    /* For transitional devices, this is a bitmap of features
+     * that are only exposed on the legacy interface but not
+     * the modern one.
+     */
+    uint64_t legacy_features;
     /* Test and clear event pending status.
      * Should be called after unmask to avoid losing events.
      * If backend does not support masking,
@@ -154,14 +159,13 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
-void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num);
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
 void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
                               unsigned int len);
-void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
-                       unsigned int len);
+void virtqueue_unpop(VirtQueue *vq, const VirtQueueElement *elem,
+                     unsigned int len);
 bool virtqueue_rewind(VirtQueue *vq, unsigned int num);
 void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len, unsigned int idx);
@@ -255,11 +259,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_used_addr(VirtIODevice *vdev, int n);
-hwaddr virtio_queue_get_ring_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_desc_size(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_size(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_used_size(VirtIODevice *vdev, int n);
-hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_get_last_avail_idx(VirtIODevice *vdev, int n);
 void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
 void virtio_queue_invalidate_signalled_used(VirtIODevice *vdev, int n);