ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
Philippe Mathieu-Daudé	1dfd42c426	hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper The whole RDMA subsystem was deprecated in commit `e9a54265f5` ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") released in v8.2. Remove: - PVRDMA device - generated vmw_pvrdma/ directory from linux-headers - rdmacm-mux tool from contrib/ Cc: Yuval Shaia <yuval.shaia.ml@gmail.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240328130255.52257-2-philmd@linaro.org>	2024-04-24 16:03:38 +02:00
Ilya Maximets	cb039ef3d9	net: add initial support for AF_XDP network backend AF_XDP is a network socket family that allows communication directly with the network device driver in the kernel, bypassing most or all of the kernel networking stack. In the essence, the technology is pretty similar to netmap. But, unlike netmap, AF_XDP is Linux-native and works with any network interfaces without driver modifications. Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't require access to character devices or unix sockets. Only access to the network interface itself is necessary. This patch implements a network backend that communicates with the kernel by creating an AF_XDP socket. A chunk of userspace memory is shared between QEMU and the host kernel. 4 ring buffers (Tx, Rx, Fill and Completion) are placed in that memory along with a pool of memory buffers for the packet data. Data transmission is done by allocating one of the buffers, copying packet data into it and placing the pointer into Tx ring. After transmission, device will return the buffer via Completion ring. On Rx, device will take a buffer form a pre-populated Fill ring, write the packet data into it and place the buffer into Rx ring. AF_XDP network backend takes on the communication with the host kernel and the network interface and forwards packets to/from the peer device in QEMU. Usage example: -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1 XDP program bridges the socket with a network interface. It can be attached to the interface in 2 different modes: 1. skb - this mode should work for any interface and doesn't require driver support. With a caveat of lower performance. 2. native - this does require support from the driver and allows to bypass skb allocation in the kernel and potentially use zero-copy while getting packets in/out userspace. By default, QEMU will try to use native mode and fall back to skb. Mode can be forced via 'mode' option. To force 'copy' even in native mode, use 'force-copy=on' option. This might be useful if there is some issue with the driver. Option 'queues=N' allows to specify how many device queues should be open. Note that all the queues that are not open are still functional and can receive traffic, but it will not be delivered to QEMU. So, the number of device queues should generally match the QEMU configuration, unless the device is shared with something else and the traffic re-direction to appropriate queues is correctly configured on a device level (e.g. with ethtool -N). 'start-queue=M' option can be used to specify from which queue id QEMU should start configuring 'N' queues. It might also be necessary to use this option with certain NICs, e.g. MLX5 NICs. See the docs for examples. In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN or CAP_BPF capabilities in order to load default XSK/XDP programs to the network interface and configure BPF maps. It is possible, however, to run with no capabilities. For that to work, an external process with enough capabilities will need to pre-load default XSK program, create AF_XDP sockets and pass their file descriptors to QEMU process on startup via 'sock-fds' option. Network backend will need to be configured with 'inhibit=on' to avoid loading of the program. QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue or CAP_IPC_LOCK. There are few performance challenges with the current network backends. First is that they do not support IO threads. This means that data path is handled by the main thread in QEMU and may slow down other work or may be slowed down by some other work. This also means that taking advantage of multi-queue is generally not possible today. Another thing is that data path is going through the device emulation code, which is not really optimized for performance. The fastest "frontend" device is virtio-net. But it's not optimized for heavy traffic either, because it expects such use-cases to be handled via some implementation of vhost (user, kernel, vdpa). In practice, we have virtio notifications and rcu lock/unlock on a per-packet basis and not very efficient accesses to the guest memory. Communication channels between backend and frontend devices do not allow passing more than one packet at a time as well. Some of these challenges can be avoided in the future by adding better batching into device emulation or by implementing vhost-af-xdp variant. There are also a few kernel limitations. AF_XDP sockets do not support any kinds of checksum or segmentation offloading. Buffers are limited to a page size (4K), i.e. MTU is limited. Multi-buffer support implementation for AF_XDP is in progress, but not ready yet. Also, transmission in all non-zero-copy modes is synchronous, i.e. done in a syscall. That doesn't allow high packet rates on virtual interfaces. However, keeping in mind all of these challenges, current implementation of the AF_XDP backend shows a decent performance while running on top of a physical NIC with zero-copy support. Test setup: 2 VMs running on 2 physical hosts connected via ConnectX6-Dx card. Network backend is configured to open the NIC directly in native mode. The driver supports zero-copy. NIC is configured to use 1 queue. Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd for PPS testing. iperf3 result: TCP stream : 19.1 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 3.4 Mpps Rx only : 2.0 Mpps L2 FWD Loopback : 1.5 Mpps In skb mode the same setup shows much lower performance, similar to the setup where pair of physical NICs is replaced with veth pair: iperf3 result: TCP stream : 9 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 1.2 Mpps Rx only : 1.0 Mpps L2 FWD Loopback : 0.7 Mpps Results in skb mode or over the veth are close to results of a tap backend with vhost=on and disabled segmentation offloading bridged with a NIC. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool) Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Philippe Mathieu-Daudé	b91b0fc163	accel: Remove HAX accelerator HAX is deprecated since commits `73741fda6c` ("MAINTAINERS: Abort HAXM maintenance") and `90c167a1da` ("docs/about/deprecated: Mark HAXM in QEMU as deprecated"), released in v8.0.0. Per the latest HAXM release (v7.8 []), the latest QEMU supported is v7.2: Note: Up to this release, HAXM supports QEMU from 2.9.0 to 7.2.0. The next commit (https://github.com/intel/haxm/commit/da1b8ec072) added: HAXM v7.8.0 is our last release and we will not accept pull requests or respond to issues after this. It became very hard to build and test HAXM. Its previous maintainers made it clear they won't help. It doesn't seem to be a very good use of QEMU maintainers to spend their time in a dead project. Save our time by removing this orphan zombie code. [] https://github.com/intel/haxm/releases/tag/v7.8.0 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230831082016.60885-1-philmd@linaro.org>	2023-08-31 19:46:43 +02:00
Paolo Bonzini	6f3ae23b29	configure: remove --with-git-submodules= Reuse --enable/--disable-download to control git submodules as well. Adjust the error messages of git-submodule.sh to refer to the new option. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-06 16:30:01 +02:00
Paolo Bonzini	50cfed80ec	configure: remove --with-git= option The scenario for which --with-git= was introduced was to use a SOCKS proxy such as tsocks. However, this was back in 2017 when QEMU's submodules used the git:// protocol, and it is not as important when using the "smart HTTP" backend; for example, neither "meson subprojects download" nor scripts/checkpatch.pl obey the GIT environment variable. So remove the knob, but test for the presence of git in the configure and git-submodule.sh scripts, and suggest using --with-git-submodules=validate + a manual invocation of git-submodule.sh when git does not work. Hopefully in the future the GIT environment variable will be supported by Meson. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-06 16:30:01 +02:00
Dr. David Alan Gilbert	8ab5e8a503	virtiofsd: Remove build and docs glue Remove all the virtiofsd build and docs infrastructure. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-02-16 18:15:08 +00:00
Paolo Bonzini	11b4a4eeec	scripts/ci: bump CentOS Python to 3.8 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-02-06 13:21:28 +01:00
Paolo Bonzini	10229ec3b0	configure: remove backwards-compatibility and obsolete options Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-01-06 00:50:54 +01:00
Paolo Bonzini	d13b200253	build: move vhost-scsi configuration to Kconfig vhost-scsi and vhost-user-scsi are two devices of their own; it should be possible to enable/disable them with --without-default-devices, not --without-default-features. Compute their default value in Kconfig to obtain the more intuitive behavior. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Paolo Bonzini	9972ae314f	build: move vhost-vsock configuration to Kconfig vhost-vsock and vhost-user-vsock are two devices of their own; it should be possible to enable/disable them with --without-default-devices, not --without-default-features. Compute their default value in Kconfig to obtain the more intuitive behavior. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Kshitij Suri	95f8510ef4	Replacing CONFIG_VNC_PNG with CONFIG_PNG Libpng is only detected if VNC is enabled currently. This patch adds a generalised png option in the meson build which is aimed to replace use of CONFIG_VNC_PNG with CONFIG_PNG. Signed-off-by: Kshitij Suri <kshitij.suri@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220408071336.99839-2-kshitij.suri@nutanix.com> [ kraxel: add meson-buildoptions.sh updates ] [ kraxel: fix centos8 testcase ] [ kraxel: update --enable-vnc-png too ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --enable-vnc-png fixup Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2022-04-27 07:50:28 +02:00
Michael Tokarev	9e8be4c546	drop libxml2 checks since libxml is not actually used (for parallels) For a long time, we assumed that libxml2 is necessary for parallels block format support (block/parallels). However, this format actually does not use libxml []. Since this is the only user of libxml2 in whole QEMU tree, we can drop all libxml2 checks and dependencies too. It is even more: --enable-parallels configure option was the only option which was silently ignored when it's (fake) dependency (libxml2) isn't installed. Drop all mentions of libxml2. [*] Actually the basis for libxml use were introduced in commit `ed279a06c5` ("configure: add dependency") but the implementation was never merged: https://lore.kernel.org/qemu-devel/70227bbd-a517-70e9-714f-e6e0ec431be9@openvz.org/ Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220119090423.149315-1-mjt@msgid.tls.msk.ru> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> [PMD: Updated description and adapted to use lcitool] Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220121154134.315047-5-f4bug@amsat.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20220204204335.1689602-9-alex.bennee@linaro.org>	2022-02-09 12:08:42 +00:00
Thomas Huth	a5730b8bd3	block/file-posix: Simplify the XFS_IOC_DIOINFO handling The handling for the XFS_IOC_DIOINFO ioctl is currently quite excessive: This is not a "real" feature like the other features that we provide with the "--enable-xxx" and "--disable-xxx" switches for the configure script, since this does not influence lots of code (it's only about one call to xfsctl() in file-posix.c), so people don't gain much with the ability to disable this with "--disable-xfsctl". It's also unfortunate that the ioctl will be disabled on Linux in case the user did not install the right xfsprogs-devel package before running configure. Thus let's simplify this by providing the ioctl definition on our own, so we can completely get rid of the header dependency and thus the related code in the configure script. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20211215125824.250091-1-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:04 +01:00
Cleber Rosa	d7c2e2b3f4	Jobs based on custom runners: add CentOS Stream 8 This introduces three different parts of a job designed to run on a custom runner managed by Red Hat. The goals include: a) propose a model for other organizations that want to onboard their own runners, with their specific platforms, build configuration and tests. b) bring awareness to the differences between upstream QEMU and the version available under CentOS Stream, which is "A preview of upcoming Red Hat Enterprise Linux minor and major releases". c) because of b), it should be easier to identify and reduce the gap between Red Hat's downstream and upstream QEMU. The components of this custom job are: I) OS build environment setup code: - additions to the existing "build-environment.yml" playbook that can be used to set up CentOS/EL 8 systems. - a CentOS Stream 8 specific "build-environment.yml" playbook that adds to the generic one. II) QEMU build configuration: a script that will produce binaries with features as similar as possible to the ones built and packaged on CentOS stream 8. III) Scripts that define the minimum amount of testing that the binaries built with the given configuration (point II) under the given OS build environment (point I) should be subjected to. IV) Job definition: GitLab CI jobs that will dispatch the build/test jobs (see points #II and #III) to the machine specifically configured according to #I. Signed-off-by: Cleber Rosa <crosa@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Willian Rampazzo <willianr@redhat.com> Tested-by: Willian Rampazzo <willianr@redhat.com> Message-Id: <20211111160501.862396-2-crosa@redhat.com> Message-Id: <20211115142915.3797652-6-alex.bennee@linaro.org>	2021-11-16 16:19:53 +00:00

14 Commits