Commit Graph

163 Commits

Author SHA1 Message Date
David Hildenbrand 309ebfa691 vhost: Remove vhost_backend_can_merge() callback
Checking whether the memory regions are equal is sufficient: if they are
equal, then most certainly the contained fd is equal.

The whole vhost-user memslot handling is suboptimal and overly
complicated. We shouldn't have to lookup a RAM memory regions we got
notified about in vhost_user_get_mr_data() using a host pointer. But that
requires a bigger rework -- especially an alternative vhost_set_mem_table()
backend call that simply consumes MemoryRegionSections.

For now, let's just drop vhost_backend_can_merge().

Message-ID: <20230926185738.277351-3-david@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
David Hildenbrand 552b25229c vhost: Rework memslot filtering and fix "used_memslot" tracking
Having multiple vhost devices, some filtering out fd-less memslots and
some not, can mess up the "used_memslot" accounting. Consequently our
"free memslot" checks become unreliable and we might run out of free
memslots at runtime later.

An example sequence which can trigger a potential issue that involves
different vhost backends (vhost-kernel and vhost-user) and hotplugged
memory devices can be found at [1].

Let's make the filtering mechanism less generic and distinguish between
backends that support private memslots (without a fd) and ones that only
support shared memslots (with a fd). Track the used_memslots for both
cases separately and use the corresponding value when required.

Note: Most probably we should filter out MAP_PRIVATE fd-based RAM regions
(for example, via memory-backend-memfd,...,shared=off or as default with
 memory-backend-file) as well. When not using MAP_SHARED, it might not work
as expected. Add a TODO for now.

[1] https://lkml.kernel.org/r/fad9136f-08d3-3fd9-71a1-502069c000cf@redhat.com

Message-ID: <20230926185738.277351-2-david@redhat.com>
Fixes: 988a27754b ("vhost: allow backends to filter memory sections")
Cc: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
Albert Esteve 1609476662 vhost-user: add shared_object msg
Add three new vhost-user protocol
`VHOST_USER_BACKEND_SHARED_OBJECT_* messages`.
These new messages are sent from vhost-user
back-ends to interact with the virtio-dmabuf
table in order to add or remove themselves as
virtio exporters, or lookup for virtio dma-buf
shared objects.

The action taken in the front-end depends
on the type stored in the virtio shared
object hash table.

When the table holds a pointer to a vhost
backend for a given UUID, the front-end sends
a VHOST_USER_GET_SHARED_OBJECT to the
backend holding the shared object.

The messages can only be sent after successfully
negotiating a new VHOST_USER_PROTOCOL_F_SHARED_OBJECT
vhost-user protocol feature bit.

Finally, refactor code to send response message so
that all common parts both for the common REPLY_ACK
case, and other data responses, can call it and
avoid code repetition.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-4-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Jonah Palmer 3d123a8b41 vhost-user: move VhostUserProtocolFeature definition to header file
Move the definition of VhostUserProtocolFeature to
include/hw/virtio/vhost-user.h.

Remove previous definitions in hw/scsi/vhost-user-scsi.c,
hw/virtio/vhost-user.c, and hw/virtio/virtio-qmp.c.

Previously there were 3 separate definitions of this over 3 different
files. Now only 1 definition of this will be present for these 3 files.

Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230926224107.2951144-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 04:54:28 -04:00
Tom Lonergan 667e58aef1 vhost-user: Make RESET_DEVICE a per device message
A device reset is issued per device, not per VQ. The legacy device reset
message, VHOST_USER_RESET_OWNER, is already a per device message. Therefore,
this change adds the proper message, VHOST_USER_RESET_DEVICE, to per device
messages.

Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-3-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-07-10 16:17:08 -04:00
Tom Lonergan 0dcb4172f2 vhost-user: Change one_time to per_device request
Some devices, like virtio-scsi, consist of one vhost_dev, while others, like
virtio-net, contain multiple vhost_devs. The QEMU vhost-user code has a
concept of one-time messages which is misleading. One-time messages are sent
once per operation on the device, not once for the lifetime of the device.
Therefore, as discussed in [1], vhost_user_one_time_request should be
renamed to vhost_user_per_device_request and the relevant comments updated
to match the real functionality.

[1] https://lore.kernel.org/qemu-devel/20230127083027-mutt-send-email-mst@kernel.org/

Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-2-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-07-10 16:17:08 -04:00
Manos Pitsidianakis f8ed3648b5 vhost-user: fully use new backend/frontend naming
Slave/master nomenclature was replaced with backend/frontend in commit
1fc19b6527 ("vhost-user: Adopt new backend naming")

This patch replaces all remaining uses of master and slave in the
codebase.

Signed-off-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230613080849.2115347-1-manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2023-06-26 09:50:00 -04:00
Gowrishankar Muthukrishnan 5c33f9783a cryptodev-vhost-user: add asymmetric crypto support
Add asymmetric crypto support in vhost_user backend.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Message-Id: <20230516083139.2349744-1-gmuthukrishn@marvell.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-23 02:54:44 -04:00
Alexander Graf 4b870dc4d0 hostmem-file: add offset option
Add an option for hostmem-file to start the memory object at an offset
into the target file. This is useful if multiple memory objects reside
inside the same target file, such as a device node.

In particular, it's useful to map guest memory directly into /dev/mem
for experimentation.

To make this work consistently, also fix up all places in QEMU that
expect fd offsets to be 0.

Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20230403221421.60877-1-graf@amazon.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-05-23 16:47:03 +02:00
David Hildenbrand bab105300b vhost-user: Remove acpi-specific memslot limit
Let's just support 512 memslots on x86-64 and aarch64 as well. The maximum
number of ACPI slots (256) is no longer completely expressive ever since
we supported virtio-based memory devices. Further, we're completely
ignoring other memslots used outside of memory device context, such as
memslots used for boot memory.

Note that the vhost memslot limit in the kernel is usually configured to
be 509. With this change, we prepare vhost-user on the QEMU side to be
closer to that limit, to eventually support ~512 memslots in most vhost
implementations and have less "surprises" when cold/hotplugging vhost
devices while also consuming more memslots than we're currently used to
by memory devices (e.g., once virtio-mem starts using multiple memslots).

Note that most vhost-user implementations only support a small number of
memslots so far, which we can hopefully improve in the near future.

We'll leave the PPC special-case as is for now.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503184144.808478-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19 10:30:46 -04:00
Stefan Hajnoczi 6f8be29ec1 vhost-user: send SET_STATUS 0 after GET_VRING_BASE
Setting the VIRTIO Device Status Field to 0 resets the device. The
device's state is lost, including the vring configuration.

vhost-user.c currently sends SET_STATUS 0 before GET_VRING_BASE. This
risks confusion about the lifetime of the vhost-user state (e.g. vring
last_avail_idx) across VIRTIO device reset.

Eugenio Pérez <eperezma@redhat.com> adjusted the order for vhost-vdpa.c
in commit c3716f260b ("vdpa: move vhost reset after get vring base")
and in that commit description suggested doing the same for vhost-user
in the future.

Go ahead and adjust vhost-user.c now. I ran various online code searches
to identify vhost-user backends implementing SET_STATUS. It seems only
DPDK implements SET_STATUS and Yajun Wu <yajunw@nvidia.com> has
confirmed that it is safe to make this change.

Fixes: commit 923b8921d2 ("vhost-user: Support vhost_dev_start")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
Cc: Yajun Wu <yajunw@nvidia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501230409.274178-1-stefanha@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19 10:30:46 -04:00
Albert Esteve 90e31232cf hw/virtio/vhost-user: avoid using unitialized errp
During protocol negotiation, when we the QEMU
stub does not support a backend with F_CONFIG,
it throws a warning and supresses the
VHOST_USER_PROTOCOL_F_CONFIG bit.

However, the warning uses warn_reportf_err macro
and passes an unitialized errp pointer. However,
the macro tries to edit the 'msg' member of the
unitialized Error and segfaults.

Instead, just use warn_report, which prints a
warning message directly to the output.

Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported")
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20230302121719.9390-1-aesteve@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 19:51:07 -05:00
Maxime Coquelin a84ec9935f vhost-user: Adopt new backend naming
The Vhost-user specification changed feature and request
naming from _SLAVE_ to _BACKEND_.

This patch adopts the new naming convention.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20230208203259.381326-4-maxime.coquelin@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02 03:10:48 -05:00
Greg Kurz 4382138f64 Revert "vhost-user: Introduce nested event loop in vhost_user_read()"
This reverts commit a7f523c7d1.

The nested event loop is broken by design. It's only user was removed.
Drop the code as well so that nobody ever tries to use it again.

I had to fix a couple of trivial conflicts around return values because
of 025faa872b ("vhost-user: stick to -errno error return convention").

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20230119172424.478268-3-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2023-01-28 06:21:30 -05:00
Greg Kurz f340a59d5a Revert "vhost-user: Monitor slave channel in vhost_user_read()"
This reverts commit db8a3772e3.

Motivation : this is breaking vhost-user with DPDK as reported in [0].

Received unexpected msg type. Expected 22 received 40
Fail to update device iotlb
Received unexpected msg type. Expected 40 received 22
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 1 ring restore failed: -71: Protocol error (71)
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 0 ring restore failed: -71: Protocol error (71)
unable to start vhost net: 71: falling back on userspace virtio

The failing sequence that leads to the first error is :
- QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master
  socket
- QEMU starts a nested event loop in order to wait for the
  VHOST_USER_GET_STATUS response and to be able to process messages from
  the slave channel
- DPDK sends a couple of legitimate IOTLB miss messages on the slave
  channel
- QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22)
  updates on the master socket
- QEMU assumes to receive a response for the latest VHOST_USER_IOTLB_MSG
  but it gets the response for the VHOST_USER_GET_STATUS instead

The subsequent errors have the same root cause : the nested event loop
breaks the order by design. It lures QEMU to expect responses to the
latest message sent on the master socket to arrive first.

Since this was only needed for DAX enablement which is still not merged
upstream, just drop the code for now. A working solution will have to
be merged later on. Likely protect the master socket with a mutex
and service the slave channel with a separate thread, as discussed with
Maxime in the mail thread below.

[0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/

Reported-by: Yanghang Liu <yanghliu@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20230119172424.478268-2-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2023-01-28 06:21:30 -05:00
Minghao Yuan 920c184fa9 vhost-user: Skip unnecessary duplicated VHOST_USER_ADD/REM_MEM_REG requests
The VHOST_USER_ADD/REM_MEM_REG requests should be categorized into
non-vring specific messages, and should be sent only once.

Signed-off-by: Minghao Yuan <yuanmh12@chinatelecom.cn>
Message-Id: <20230123122119.194347-1-yuanmh12@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-28 06:21:30 -05:00
Akihiko Odaki 744734ccc9 vhost-user: Correct a reference of TARGET_AARCH64
Presumably TARGET_ARM_64 should be a mistake of TARGET_AARCH64.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230109063130.81296-1-akihiko.odaki@daynix.com>
Fixes: 27598393a2 ("Lift max memory slots limit imposed by vhost-user")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-27 11:47:02 -05:00
Yajun Wu c98ac64cfb vhost-user: send set log base message only once
Vhost message VHOST_USER_SET_LOG_BASE is device wide. So only
send it once with the first queue pair.

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20221122051447.248462-1-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Markus Armbruster 66997c42e0 cleanup: Tweak and re-run return_directly.cocci
Tweak the semantic patch to drop redundant parenthesis around the
return expression.

Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored
manually.

Coccinelle messes up vmdk_co_create(), not sure why.  Change dropped,
will be done manually in the next commit.

Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up
manually.

Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually.

checkpatch.pl complains "return of an errno should typically be -ve"
two times for hw/9pfs/9p-synth.c.  Preexisting, the patch merely makes
it visible to checkpatch.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221122134917.1217307-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-12-14 16:19:35 +01:00
Alex Bennée 71e076a07d hw/virtio: generalise CHR_EVENT_CLOSED handling
..and use for both virtio-user-blk and virtio-user-gpio. This avoids
the circular close by deferring shutdown due to disconnection until a
later point. virtio-user-blk already had this mechanism in place so
generalise it as a vhost-user helper function and use for both blk and
gpio devices.

While we are at it we also fix up vhost-user-gpio to re-establish the
event handler after close down so we can reconnect later.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221130112439.2527228-5-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-01 02:30:13 -05:00
Yajun Wu 923b8921d2 vhost-user: Support vhost_dev_start
The motivation of adding vhost-user vhost_dev_start support is to
improve backend configuration speed and reduce live migration VM
downtime.

Today VQ configuration is issued one by one. For virtio net with
multi-queue support, backend needs to update RSS (Receive side
scaling) on every rx queue enable. Updating RSS is time-consuming
(typical time like 7ms).

Implement already defined vhost status and message in the vhost
specification [1].
(a) VHOST_USER_PROTOCOL_F_STATUS
(b) VHOST_USER_SET_STATUS
(c) VHOST_USER_GET_STATUS

Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK for
device start and reset(0) for device stop.

On reception of the DRIVER_OK message, backend can apply the needed setting
only once (instead of incremental) and also utilize parallelism on enabling
queues.

This improves QEMU's live migration downtime with vhost user backend
implementation by great margin, specially for the large number of VQs of 64
from 800 msec to 250 msec.

[1] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20221017064452.1226514-3-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-11-07 14:08:17 -05:00
Yajun Wu bd437c960f vhost-user: Fix out of order vring host notification handling
vhost backend sends host notification for every VQ. If backend creates
VQs in parallel, the VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG may
arrive to QEMU in different order than incremental queue index order.

For example VQ 1's message arrive earlier than VQ 0's:
After alloc VhostUserHostNotifier for VQ 1. GPtrArray becomes

    [ nil, VQ1 pointer ]

After alloc VhostUserHostNotifier for VQ 0. GPtrArray becomes

    [ VQ0 pointer, nil, VQ1 pointer ]

This is wrong. fetch_notifier will return NULL for VQ 1 in
vhost_user_get_vring_base, causes host notifier miss removal(leak).

The fix is to remove current element from GPtrArray, make the right
position for element to insert.

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>

Message-Id: <20221018023651.1359420-1-yajunw@nvidia.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-11-07 13:12:20 -05:00
Alex Bennée c97c76b3e7 hw/virtio: fix some coding style issues
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220802095010.3330793-8-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-10-07 09:41:51 -04:00
Alex Bennée 02b61f38d3 hw/virtio: incorporate backend features in features
There are some extra bits used over a vhost-user connection which are
hidden from the device itself. We need to set them here to ensure we
enable things like the protocol extensions.

Currently net/vhost-user.c has it's own inscrutable way of persisting
this data but it really should live in the core vhost_user code.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220726192150.2435175-7-alex.bennee@linaro.org>
Message-Id: <20220802095010.3330793-2-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-10-07 09:41:51 -04:00
Guoyi Tu 9cbda7b354 vhost-user: Call qemu_socketpair() instead of socketpair()
As the close-on-exec flags is not set on the file descriptors returned
by socketpair() at default, the fds will survive across exec' function.

In the case that exec' function get invoked, such as the live-update feature
which is been developing, it will cause fd leaks.

To address this problem, we should call qemu_socketpair() to create an pair of
connected sockets with the close-on-exec flag set.

Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <7002b12a5fb0a30cd878e14e07da61c36da72913.1661240709.git.tugy@chinatelecom.cn>
2022-09-29 14:38:05 +04:00
Alex Bennée 643a943554 hw/virtio: fix vhost_user_read tracepoint
As reads happen in the callback we were never seeing them. We only
really care about the header so move the tracepoint to when the header
is complete.

Fixes: 6ca6d8ee9d (hw/virtio: add vhost_user_[read|write] trace points)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220728135503.1060062-5-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-17 07:07:37 -04:00
Konstantin Khlebnikov 60dc3c5be9 vhost: add method vhost_set_vring_err
Kernel and user vhost may report virtqueue errors via eventfd.
This is only reliable way to get notification about protocol error.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Message-Id: <20220623161325.18813-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
2022-06-27 18:53:18 -04:00
Yajun Wu b595d6272e virtio/vhost-user: Fix wrong vhost notifier GPtrArray size
In fetch_or_create_notifier, idx begins with 0. So the GPtrArray size
should be idx + 1 and g_ptr_array_set_size should be called with idx + 1.

This wrong GPtrArray size causes fetch_or_create_notifier return an invalid
address. Passing this invalid pointer to vhost_user_host_notifier_remove
causes assert fail:

    qemu/include/qemu/int128.h:27: int128_get64: Assertion `r == a' failed.
	shutting down, reason=crashed

Backends like dpdk-vdpa which sends out vhost notifier requests almost always
hit qemu crash.

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Change-Id: I87e0f7591ca9a59d210879b260704a2d9e9d6bcd
Message-Id: <20220526034851.683258-1-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2022-06-16 12:54:58 -04:00
Changpeng Liu fb38d0c97d hw/virtio/vhost-user: don't use uninitialized variable
Variable `vdev` in `struct vhost_dev` will not be ready
until start the device, so let's not use it for the error
output here.

Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported")

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Message-Id: <20220525125540.50979-1-changpeng.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported")
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported")
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
2022-06-09 19:32:49 -04:00
Alex Bennée 503e355465 virtio/vhost-user: dynamically assign VhostUserHostNotifiers
At a couple of hundred bytes per notifier allocating one for every
potential queue is very wasteful as most devices only have a few
queues. Instead of having this handled statically dynamically assign
them and track in a GPtrArray.

[AJB: it's hard to trigger the vhost notifiers code, I assume as it
requires a KVM guest with appropriate backend]

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220321153037.3622127-14-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16 04:38:40 -04:00
Alex Bennée 56534930b5 hw/virtio/vhost-user: don't suppress F_CONFIG when supported
Previously we would silently suppress VHOST_USER_PROTOCOL_F_CONFIG
during the protocol negotiation if the QEMU stub hadn't implemented
the vhost_dev_config_notifier. However this isn't the only way we can
handle config messages, the existing vdc->get/set_config can do this
as well.

Lightly re-factor the code to check for both potential methods and
instead of silently squashing the feature error out. It is unlikely
that a vhost-user backend expecting to handle CONFIG messages will
behave correctly if they never get sent.

Fixes: 1c3e5a2617 ("vhost-user: back SET/GET_CONFIG requests with a protocol feature")
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Message-Id: <20220321153037.3622127-13-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16 04:38:40 -04:00
Alex Bennée 6ca6d8ee9d hw/virtio: add vhost_user_[read|write] trace points
These are useful when trying to debug the initial vhost-user
negotiation, especially when it hard to get logging from the low level
library on the other side.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Message-Id: <20220321153037.3622127-4-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-05-16 04:38:40 -04:00
Murilo Opsfelder Araujo 97252353c1 vhost-user: Use correct macro name TARGET_PPC64
The correct name of the macro is TARGET_PPC64.

Fixes: 27598393a2 ("Lift max memory slots limit imposed by vhost-user")
Reported-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Peter Turschmid <peter.turschm@nutanix.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220503180108.34506-1-muriloo@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:16 -03:00
Kevin Wolf a81d8d4a72 vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REG
The spec clarifies now that QEMU should not send a file descriptor in a
request to remove a memory region. Change it accordingly.

For libvhost-user, this is a bug fix that makes it compatible with
rust-vmm's implementation that doesn't send a file descriptor. Keep
accepting, but ignoring a file descriptor for compatibility with older
QEMU versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220407133657.155281-4-kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-04 15:55:23 +02:00
Marc-André Lureau ff5927baa7 util: rename qemu_*block() socket functions
The qemu_*block() functions are meant to be be used with sockets (the
win32 implementation expects SOCKET)

Over time, those functions where used with Win32 SOCKET or
file-descriptors interchangeably. But for portability, they must only be
used with socket-like file-descriptors. FDs can use
g_unix_set_fd_nonblocking() instead.

Rename the functions with "socket" in the name to prevent bad usages.

This is effectively reverting commit f9e8cacc55 ("oslib-posix:
rename socket_set_nonblock() to qemu_set_nonblock()").

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-05-03 15:53:20 +04:00
Marc-André Lureau 8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Xueming Li 0b0af4d62f vhost-user: fix VirtQ notifier cleanup
When vhost-user device cleanup, remove notifier MR and munmaps notifier
address in the event-handling thread, VM CPU thread writing the notifier
in concurrent fails with an error of accessing invalid address. It
happens because MR is still being referenced and accessed in another
thread while the underlying notifier mmap address is being freed and
becomes invalid.

This patch calls RCU and munmap notifiers in the callback after the
memory flatview update finish.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: qemu-stable@nongnu.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20220207071929.527149-3-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:30:52 -05:00
Xueming Li e867144b73 vhost-user: remove VirtQ notifier restore
Notifier set when vhost-user backend asks qemu to mmap an FD and
offset. When vhost-user backend restart or getting killed, VQ notifier
FD and mmap addresses become invalid. After backend restart, MR contains
the invalid address will be restored and fail on notifier access.

On the other hand, qemu should munmap the notifier, release underlying
hardware resources to enable backend restart and allocate hardware
notifier resources correctly.

Qemu shouldn't reference and use resources of disconnected backend.

This patch removes VQ notifier restore, uses the default vhost-user
notifier to avoid invalid address access.

After backend restart, the backend should ask qemu to install a hardware
notifier if needed.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: qemu-stable@nongnu.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20220207071929.527149-2-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:30:52 -05:00
Roman Kagan 025faa872b vhost-user: stick to -errno error return convention
VhostOps methods in user_ops are not very consistent in their error
returns: some return negated errno while others just -1.

Make sure all of them consistently return negated errno.  This also
helps error propagation from the functions being called inside.
Besides, this synchronizes the error return convention with the other
two vhost backends, kernel and vdpa, and will therefore allow for
consistent error propagation in the generic vhost code (in a followup
patch).

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Xueming Li a1ed9ef1de vhost-user: fix duplicated notifier MR init
In case of device resume after suspend, VQ notifier MR still valid.
Duplicated registrations explode memory block list and slow down device
resume.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: tiwei.bie@intel.com
Cc: qemu-stable@nongnu.org
Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20211008080215.590292-1-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Yuwei Zhang c6effa9cf5 hw/virtio: Add flatview update in vhost_user_cleanup()
Qemu will crash on vhost backend unexpected exit and re-connect                                                                          │
in some case due to access released memory.

Signed-off-by: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
Message-Id: <20210830123433.45727-1-zhangyuwei.9149@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 17:34:05 -04:00
Alyssa Ross edb40732bf vhost-user: add missing space in error message
This would previously give error messages like

> Received unexpected msg type.Expected 0 received 1

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Message-Id: <20210806143926.315725-1-hi@alyssa.is>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:46 -04:00
Denis Plotnikov 699f2e535d vhost: make SET_VRING_ADDR, SET_FEATURES send replies
On vhost-user-blk migration, qemu normally sends a number of commands
to enable logging if VHOST_USER_PROTOCOL_F_LOG_SHMFD is negotiated.
Qemu sends VHOST_USER_SET_FEATURES to enable buffers logging and
VHOST_USER_SET_VRING_ADDR per each started ring to enable "used ring"
data logging.
The issue is that qemu doesn't wait for reply from the vhost daemon
for these commands which may result in races between qemu expectation
of logging starting and actual login starting in vhost daemon.

The race can appear as follows: on migration setup, qemu enables dirty page
logging by sending VHOST_USER_SET_FEATURES. The command doesn't arrive to a
vhost-user-blk daemon immediately and the daemon needs some time to turn the
logging on internally. If qemu doesn't wait for reply, after sending the
command, qemu may start migrateing memory pages to a destination. At this time,
the logging may not be actually turned on in the daemon but some guest pages,
which the daemon is about to write to, may have already been transferred
without logging to the destination. Since the logging wasn't turned on,
those pages won't be transferred again as dirty. So we may end up with
corrupted data on the destination.
The same scenario is applicable for "used ring" data logging, which is
turned on with VHOST_USER_SET_VRING_ADDR command.

To resolve this issue, this patch makes qemu wait for the command result
explicitly if VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated and logging enabled.

Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>

Message-Id: <20210809104824.78830-1-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:45 -04:00
Yajun Wu 1f89d3b91e hw/virtio: Fix leak of host-notifier memory-region
If call virtio_queue_set_host_notifier_mr fails, should free
host-notifier memory-region.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Message-Id: <1629077555-19907-1-git-send-email-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:45 -04:00
Markus Armbruster 998647dc8f vhost: Clean up how VhostOpts method vhost_backend_init() fails
vhost_user_backend_init() can fail without setting an error.  Unclean.
Its caller vhost_dev_init() compensates by substituting a generic
error then.  Goes back to commit 28770ff935 "vhost: Distinguish errors
in vhost_backend_init()".

Clean up by moving the generic error from vhost_dev_init() to all the
failure paths that neglect to set an error.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-14-armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-08-26 17:15:28 +02:00
Markus Armbruster 66647ed459 vhost: Clean up how VhostOpts method vhost_get_config() fails
vhost_user_get_config() can fail without setting an error.  Unclean.
Its caller vhost_dev_get_config() compensates by substituting a
generic error then.  Goes back to commit 50de51387f "vhost:
Distinguish errors in vhost_dev_get_config()".

Clean up by moving the generic error from vhost_dev_get_config() to
all the failure paths that neglect to set an error.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-13-armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Sign of error_setg_errno()'s second argument fixed in both calls]
2021-08-26 17:15:28 +02:00
Marc-André Lureau bf7b1eab25 chardev: mark explicitly first argument as poisoned
Since commit 9894dc0cdc "char: convert
from GIOChannel to QIOChannel", the first argument to the watch callback
can actually be a QIOChannel, which is not a GIOChannel (but a QEMU
Object).

Even though we never used that pointer, change the callback type to warn
the users. Possibly a better fix later, we may want to store the
callback and call it from intermediary functions.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-08-05 16:15:33 +04:00
Kevin Wolf 84affad1fd vhost-user: Fix backends without multiqueue support
dev->max_queues was never initialised for backends that don't support
VHOST_USER_PROTOCOL_F_MQ, so it would use 0 as the maximum number of
queues to check against and consequently fail for any such backend.

Set it to 1 if the backend doesn't have multiqueue support.

Fixes: c90bd505a3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210705171429.29286-1-kwolf@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09 12:26:05 +02:00
Kevin Wolf 50de51387f vhost: Distinguish errors in vhost_dev_get_config()
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.

config_len in vhost_user_get_config() is defined by the device, so if
it's larger than VHOST_USER_MAX_CONFIG_SIZE, this is a programming
error. Turn the corresponding check into an assertion.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-6-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:18:42 +02:00
Kevin Wolf f2a6e6c4fa vhost: Return 0/-errno in vhost_dev_init()
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, switch to 0/-errno so that different kinds of
errors can be distinguished in the caller.

This involves changing a few more callbacks in VhostOps to return
0/-errno: .vhost_set_owner(), .vhost_get_features() and
.vhost_virtqueue_set_busyloop_timeout(). The implementations of these
functions are trivial as they generally just send a message to the
backend.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-4-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:16:05 +02:00