Commit Graph

108350 Commits

Author SHA1 Message Date
Fiona Ebner 76cb2f2491 mirror: return mirror-specific information upon query
To start out, only actively-synced is returned.

For example, this is useful for jobs that started out in background
mode and switched to active mode. Once actively-synced is true, it's
clear that the mode switch has been completed. Note that completion of
the switch might happen much earlier, e.g. if the switch happens
before the job is ready, once all background operations have finished.
It's assumed that whether the disks are actively-synced or not is more
interesting than whether the mode switch completed. That information
can still be added if required in the future.

In presence of an iothread, the actively_synced member is now shared
between the iothread and the main thread, so turn accesses to it
atomic.

Requires to adapt the output for iotest 109.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-10-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 59fd82544d blockjob: query driver-specific info via a new 'query' driver method
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-9-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 701efc9f2d qapi/block-core: turn BlockJobInfo into a union
In preparation to additionally return job-type-specific information.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-8-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner d67c54d05f qapi/block-core: use JobType for BlockJobInfo's type
In preparation to turn BlockJobInfo into a union with @type as the
discriminator. That requires it to be an enum. Even without that
requirement, it's nicer to have an enum instead of a str here.

No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031135431.393137-7-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 2d400d15a0 mirror: implement mirror_change method
which allows switching the @copy-mode from 'background' to
'write-blocking'.

This is useful for management applications, so they can start out in
background mode to avoid limiting guest write speed and switch to
active mode when certain criteria are fulfilled.

In presence of an iothread, the copy_mode member is now shared between
the iothread and the main thread, so turn accesses to it atomic.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-6-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 7b32ad2242 block/mirror: determine copy_to_target only once
In preparation to allow changing the copy_mode via QMP. When running
in an iothread, it could be that copy_mode is changed from the main
thread in between reading copy_mode in bdrv_mirror_top_pwritev() and
reading copy_mode in bdrv_mirror_top_do_write(), so they might end up
disagreeing about whether copy_to_target is true or false. Avoid that
scenario by determining copy_to_target only once and passing it to
bdrv_mirror_top_do_write() as an argument.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-5-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 058cfca564 block/mirror: move dirty bitmap to filter
In preparation to allow switching to active mode without draining.
Initialization of the bitmap in mirror_dirty_init() still happens with
the original/backing BlockDriverState, which should be fine, because
the mirror top has the same length.

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner c45d0e1af0 block/mirror: set actively_synced even after the job is ready
In preparation to allow switching from background to active mode. This
ensures that setting actively_synced will not be missed when the
switch happens after the job is ready.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-3-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner 61a3a5a76a blockjob: introduce block-job-change QMP command
which will allow changing job-type-specific options after job
creation.

In the JobVerbTable, the same allow bits as for set-speed are used,
because set-speed can be considered an existing change command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:25 +01:00
Stefan Hajnoczi 073458da56 virtio-blk: remove batch notification BH
There is a batching mechanism for virtio-blk Used Buffer Notifications
that is no longer needed because the previous commit added batching to
virtio_notify_irqfd().

Note that this mechanism was rarely used in practice because it is only
enabled when EVENT_IDX is not negotiated by the driver. Modern drivers
enable EVENT_IDX.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:42:17 +01:00
Stefan Hajnoczi 84d61e5f36 virtio: use defer_call() in virtio_irqfd_notify()
virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used
Buffer Notifications from an IOThread. This involves an eventfd
write(2) syscall. Calling this repeatedly when completing multiple I/O
requests in a row is wasteful.

Use the defer_call() API to batch together virtio_irqfd_notify() calls
made during thread pool (aio=threads), Linux AIO (aio=native), and
io_uring (aio=io_uring) completion processing.

Behavior is unchanged for emulated devices that do not use
defer_call_begin()/defer_call_end() since defer_call() immediately
invokes the callback when called outside a
defer_call_begin()/defer_call_end() region.

fio rw=randread bs=4k iodepth=64 numjobs=8 IOPS increases by ~9% with a
single IOThread and 8 vCPUs. iodepth=1 decreases by ~1% but this could
be noise. Detailed performance data and configuration specifics are
available here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/blk_io_plug-irqfd

This duplicates the BH that virtio-blk uses for batching. The next
commit will remove it.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:42:14 +01:00
Stefan Hajnoczi 433fcea40c util/defer-call: move defer_call() to util/
The networking subsystem may wish to use defer_call(), so move the code
to util/ where it can be reused.

As a reminder of what defer_call() does:

This API defers a function call within a defer_call_begin()/defer_call_end()
section, allowing multiple calls to batch up. This is a performance
optimization that is used in the block layer to submit several I/O requests
at once instead of individually:

  defer_call_begin(); <-- start of section
  ...
  defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call
  defer_call(my_func, my_obj); <-- another
  defer_call(my_func, my_obj); <-- another
  ...
  defer_call_end(); <-- end of section, my_func(my_obj) is called once

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:41:42 +01:00
Stefan Hajnoczi ccee48aa73 block: rename blk_io_plug_call() API to defer_call()
Prepare to move the blk_io_plug_call() API out of the block layer so
that other subsystems call use this deferred call mechanism. Rename it
to defer_call() but leave the code in block/plug.c.

The next commit will move the code out of the block layer.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:41:24 +01:00
Fiona Ebner 302823854b blockdev: mirror: avoid potential deadlock when using iothread
The bdrv_getlength() function is a generated co-wrapper and uses
AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE()
expects the lock to be acquired exactly once.

Fix a case where it may be acquired twice. This can happen when the
source node is explicitly specified as the @replaces parameter or if the
source node is a filter node.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-4-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:36 +01:00
Fiona Ebner e462c6d27d block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close()
by passing the BlockDriverState along, so the held AioContext can be
dropped before polling. See commit 31b2ddfea3 ("graph-lock: Unlock the
AioContext while polling") which introduced this functionality for
more information.

The only way to reach bdrv_close() is via bdrv_unref() and for calling
that the BlockDriverState's AioContext lock is supposed to be held.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-3-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:36 +01:00
Fiona Ebner 67446e605d blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
Same rationale as in 31b2ddfea3 ("graph-lock: Unlock the AioContext
while polling"). Otherwise, a deadlock can happen.

The alternative would be to pass a BlockDriverState along to
bdrv_graph_wrlock(), but there is no BlockDriverState readily
available and it's also better conceptually, because the lock is held
for the job.

The function is always called with the job's AioContext lock held, via
one of the .abort, .clean, .free or .prepare job driver functions.
Thus, it's safe to drop it.

While mirror_exit_common() does hold a second AioContext lock while
calling block_job_remove_all_bdrv(), that is for the main thread's
AioContext and does not need to be dropped (bdrv_graph_wrlock(bs) also
skips dropping the lock if bdrv_get_aio_context(bs) ==
qemu_get_aio_context()).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:36 +01:00
Kevin Wolf 3be5762294 iotests: Test media change with iothreads
iotests case 118 already tests all relevant operations for media change
with multiple devices, however never with iothreads. This changes the
test so that the virtio-scsi tests run with an iothread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231013153302.39234-3-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:33 +01:00
Kevin Wolf fed8245015 block: Fix locking in media change monitor commands
blk_insert_bs() requires that the caller holds the AioContext lock for
the node to be inserted. Since commit c066e808e1, neglecting to do so
causes a crash when the child has to be moved to a different AioContext
to attach it to the BlockBackend.

This fixes qmp_blockdev_insert_anon_medium(), which is called for the
QMP commands 'blockdev-insert-medium' and 'blockdev-change-medium', to
correctly take the lock.

Cc: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-3922
Fixes: c066e808e1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231013153302.39234-2-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev 87fe52ceca iotests: add tests for "qemu-img rebase" with compression
The test cases considered so far:

314 (new test suite):

1. Check that compression mode isn't compatible with "-f raw" (raw
   format doesn't support compression).
2. Check that rebasing an image onto no backing file preserves the data
   and writes the copied clusters actually compressed.
3. Same as 2, but with a raw backing file (i.e. the clusters copied from the
   backing are originally uncompressed -- we check they end up compressed
   after being merged).
4. Remove a single delta from a backing chain, perform the same checks
   as in 2.
5. Check that even when backing and overlay are initially uncompressed,
   copied clusters end up compressed when rebase with compression is
   performed.

271:

1. Check that when target image has subclusters, rebase with compression
   will make an entire cluster containing the written subcluster
   compressed.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-9-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev 26ea27898b qemu-img: add compression option to rebase subcommand
If we rebase an image whose backing file has compressed clusters, we
might end up wasting disk space since the copied clusters are now
uncompressed.  In order to have better control over this, let's add
"--compress" option to the "qemu-img rebase" command.

Note that this option affects only the clusters which are actually being
copied from the original backing file.  The clusters which were
uncompressed in the target image will remain so.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-8-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev f93e65ee51 iotests/{024, 271}: add testcases for qemu-img rebase
As the previous commit changes the logic of "qemu-img rebase" (it's using
write alignment now), let's add a couple more test cases which would
ensure it works correctly.  In particular, the following scenarios:

024: add test case for rebase within one backing chain when the overlay
     cluster size > backings cluster size;
271: add test case for rebase images that contain subclusters.  Check
     that no extra allocations are being made.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-7-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev 12df580b3b qemu-img: rebase: avoid unnecessary COW operations
When rebasing an image from one backing file to another, we need to
compare data from old and new backings.  If the diff between that data
happens to be unaligned to the target cluster size, we might end up
doing partial writes, which would lead to copy-on-write and additional IO.

Consider the following simple case (virtual_size == cluster_size == 64K):

base <-- inc1 <-- inc2

qemu-io -c "write -P 0xaa 0 32K" base.qcow2
qemu-io -c "write -P 0xcc 32K 32K" base.qcow2
qemu-io -c "write -P 0xbb 0 32K" inc1.qcow2
qemu-io -c "write -P 0xcc 32K 32K" inc1.qcow2
qemu-img rebase -f qcow2 -b base.qcow2 -F qcow2 inc2.qcow2

While doing rebase, we'll write a half of the cluster to inc2, and block
layer will have to read the 2nd half of the same cluster from the base image
inc1 while doing this write operation, although the whole cluster is already
read earlier to perform data comparison.

In order to avoid these unnecessary IO cycles, let's make sure every
write request is aligned to the overlay subcluster boundaries.  Using
subcluster size is universal as for the images which don't have them
this size equals to the cluster size. so in any case we end up aligning
to the smallest unit of allocation.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230919165804.439110-6-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev c20c8ae700 qemu-img: add chunk size parameter to compare_buffers()
Add @chsize param to the function which, if non-zero, would represent
the chunk size to be used for comparison.  If it's zero, then
BDRV_SECTOR_SIZE is used as default chunk size, which is the previous
behaviour.

In particular, we're going to use this param in img_rebase() to make the
write requests aligned to a predefined alignment value.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-5-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev ce8b8f9fe7 qemu-img: rebase: use backing files' BlockBackend for buffer alignment
Since commit bb1c05973c ("qemu-img: Use qemu_blockalign"), buffers for
the data read from the old and new backing files are aligned using
BlockDriverState (or BlockBackend later on) referring to the target image.
However, this isn't quite right, because buf_new is only being used for
reading from the new backing, while buf_old is being used for both reading
from the old backing and writing to the target.  Let's take that into account
and use more appropriate values as alignments.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230919165804.439110-4-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev 827171c318 qemu-iotests: 024: add rebasing test case for overlay_size > backing_size
Before previous commit, rebase was getting infitely stuck in case of
rebasing within the same backing chain and when overlay_size > backing_size.
Let's add this case to the rebasing test 024 to make sure it doesn't
break again.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-3-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:28 +01:00
Andrey Drobyshev 8b097fd6b0 qemu-img: rebase: stop when reaching EOF of old backing file
In case when we're rebasing within one backing chain, and when target image
is larger than old backing file, bdrv_is_allocated_above() ends up setting
*pnum = 0.  As a result, target offset isn't getting incremented, and we
get stuck in an infinite for loop.  Let's detect this case and proceed
further down the loop body, as the offsets beyond the old backing size need
to be explicitly zeroed.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-2-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 13:51:27 +01:00
Stefan Hajnoczi 516fffc993 linux-user: Fix guest signal remapping after adjusting SIGABRT
linux-user: Implement VDSOs
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmVAHMsdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/GSgf/SiaCzl7FV2NsxA2h
 zHrgSYEf/4dyqjbgNhE9XSrIJ/cPEY47JrpMqJ0cK4BGc/d2IppUU0Zz3qZltXck
 CkTIPPXEWDvex+PSe5NXarxQtOazi21C+EySGFtFcCQ32C/LsdJBtNzrB+G/Tl/t
 QvPJBztXvS6FAdVci2TGBNk62nFq3NS/Uz477SD6Q/uSlczQQ5b1fu3YgZcCqM9D
 ncncHbuExUu+NMK02h8vyWwpxaTvUBSdRxx/6jnyctwVpWyMaIOfsrMooz0gBfoD
 Z7MqXhvtBYOnm4OjcQs4Nj1JBOdYoQS/y6dJ7ZP0kg10VSEwr48pduXZSvIypxbw
 hsaa8w==
 =wcWF
 -----END PGP SIGNATURE-----

Merge tag 'pull-lu-20231030' of https://gitlab.com/rth7680/qemu into staging

linux-user: Fix guest signal remapping after adjusting SIGABRT
linux-user: Implement VDSOs

* tag 'pull-lu-20231030' of https://gitlab.com/rth7680/qemu: (21 commits)
  build: Add update-linux-vdso makefile rule
  linux-user: Show vdso address in /proc/pid/maps
  linux-user/s390x: Add vdso
  linux-user/s390x: Rename __SIGNAL_FRAMESIZE to STACK_FRAME_OVERHEAD
  linux-user/ppc: Add vdso
  linux-user/loongarch64: Add vdso
  linux-user/riscv: Add vdso
  linux-user/hppa: Add vdso
  linux-user/arm: Add vdso
  linux-user/aarch64: Add vdso
  linux-user/x86_64: Add vdso
  linux-user/i386: Add vdso
  linux-user: Add gen-vdso tool
  linux-user: Load vdso image if available
  linux-user: Replace bprm->fd with bprm->src.fd
  linux-user: Use ImageSource in load_symbols
  linux-user: Use ImageSource in load_elf_image
  linux-user: Do not clobber bprm_buf swapping ehdr
  linux-user: Tidy loader_exec
  linux-user: Introduce imgsrc_read, imgsrc_read_alloc
  ...

Conflicts:
  linux-user/arm/signal.c
  Fix an #include context conflict.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-31 07:12:40 +09:00
Stefan Hajnoczi 235fe6d06e ufs-next-pull-request
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmU/DfoACgkQEuIgTA5d
 tgKZ3g/+J38LTaktLPgUb0Kg390anPkIAkqqA1QZC8lC/FRSEWpgsNBqcvAASNTl
 jj1c80k/+Dvf9Ti1lmDNkuYczCFvKNJZQ1iRHmv2wc79A01GV0Ue6xayQjjNjoKK
 SBMIsFpArmFQjR2wGlkRc8PXha1JyWrsD4iPY6ZqedEcyuueLx69XbLL37FfVbQt
 5IMnDqGkLCmrGowAjwurq2UM5IiYjeB4I5OwUgJC526zlyngXTFJimCWS6b2uUBk
 Yg1PnFffBsh11Pwmq4IZ1DAv3Bv/gFovenuatFqZrgqtfK7tEiARInIEsctu0U0a
 hPK/KojJAPF/cfMssRm1D1GCfsXM4tP2yFY/6q0wTRr9Dod8OSjlvfJR7+ez71/j
 aoY4N/nYYrZ6+pQNsPJcuBqQdtjdNUp4gUHx5qYxwwqZcHK4ubxpIvstmxceoLEX
 3PG4O1iAapc/aL12ww9bYJ2lrbKGx7ZJU/Ij8bud8tYzLheG3xaYUEhonk7DE6+e
 AXFSad5CJTIF9Duh1uAMe1sV9GxELV8MHZSalqfGOhWYp7LzUBgouEJ1gQdOQbTK
 VsLs48WQ23OjWNKyAMaXQXdFO4FVbsjIg9nQXEHNRPkUownVHNVL8zu6EsXvHfch
 u691ygt5pD100SYdcDv73xTSeqP/rxqyYdxJl4LRkv/hGWU4y78=
 =Oisg
 -----END PGP SIGNATURE-----

Merge tag 'pull-ufs-20231030' of https://gitlab.com/jeuk20.kim/qemu into staging

ufs-next-pull-request

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmU/DfoACgkQEuIgTA5d
# tgKZ3g/+J38LTaktLPgUb0Kg390anPkIAkqqA1QZC8lC/FRSEWpgsNBqcvAASNTl
# jj1c80k/+Dvf9Ti1lmDNkuYczCFvKNJZQ1iRHmv2wc79A01GV0Ue6xayQjjNjoKK
# SBMIsFpArmFQjR2wGlkRc8PXha1JyWrsD4iPY6ZqedEcyuueLx69XbLL37FfVbQt
# 5IMnDqGkLCmrGowAjwurq2UM5IiYjeB4I5OwUgJC526zlyngXTFJimCWS6b2uUBk
# Yg1PnFffBsh11Pwmq4IZ1DAv3Bv/gFovenuatFqZrgqtfK7tEiARInIEsctu0U0a
# hPK/KojJAPF/cfMssRm1D1GCfsXM4tP2yFY/6q0wTRr9Dod8OSjlvfJR7+ez71/j
# aoY4N/nYYrZ6+pQNsPJcuBqQdtjdNUp4gUHx5qYxwwqZcHK4ubxpIvstmxceoLEX
# 3PG4O1iAapc/aL12ww9bYJ2lrbKGx7ZJU/Ij8bud8tYzLheG3xaYUEhonk7DE6+e
# AXFSad5CJTIF9Duh1uAMe1sV9GxELV8MHZSalqfGOhWYp7LzUBgouEJ1gQdOQbTK
# VsLs48WQ23OjWNKyAMaXQXdFO4FVbsjIg9nQXEHNRPkUownVHNVL8zu6EsXvHfch
# u691ygt5pD100SYdcDv73xTSeqP/rxqyYdxJl4LRkv/hGWU4y78=
# =Oisg
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 30 Oct 2023 10:59:22 JST
# gpg:                using RSA key 5017D831597C78A3D907EEF712E2204C0E5DB602
# gpg: Good signature from "Jeuk Kim <jeuk20.kim@samsung.com>" [unknown]
# gpg:                 aka "Jeuk Kim <jeuk20.kim@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5017 D831 597C 78A3 D907  EEF7 12E2 204C 0E5D B602

* tag 'pull-ufs-20231030' of https://gitlab.com/jeuk20.kim/qemu:
  hw/ufs: Modify lu.c to share codes with SCSI subsystem

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-31 07:11:23 +09:00
Stefan Hajnoczi 850e874f1c target-arm queue:
* Correct minor errors in Cortex-A710 definition
  * Implement Neoverse N2 CPU model
  * Refactor feature test functions out into separate header
  * Fix syndrome for FGT traps on ERET
  * Remove 'hw/arm/boot.h' includes from various header files
  * pxa2xx: Refactoring/cleanup
  * Avoid using 'first_cpu' when first ARM CPU is reachable
  * misc/led: LED state is set opposite of what is expected
  * hw/net/cadence_gen: clean up to use FIELD macros
  * hw/net/cadence_gem: perform PHY access on write only
  * hw/net/cadence_gem: enforce 32 bits variable size for CRC
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmU7yz0ZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3n4xEACK4ti+PFSJHVCQ69NzLLBT
 ybFGFMsMhXJTSNS30Pzs+KWCKWPP59knYBD4qO43W1iV6pPUhy+skr+BFCCRvBow
 se74+Fm1l4LmnuHxgukJzTdvRffI3v37alLn6Y/ioWe8bDpf/IJj8WLj8B1IPoNg
 fswJSGDLpPMovaz8NBQRzglUWpfyzxH+uuW779qBS1nuFdPOfIHKrocvvdrfogBP
 aO8AeiBzz5STW9Naeq+BIKho8S9LinSB6FHa+rRPUDkWx03lvRIvkgGPzHpXYy8I
 zAZ8gUQZyXprHAHMpnoBv8Wcw3Bwc2f+8xx8hnRRki3iBroXKfJA9NkeN0StQmL1
 ZHhfYkiKSS5diIFW5pX6ZixKbXHE2a4aH4zPVUNQriNWOevhe7n82mAPNFIYjk97
 ciTtd4I2oew48sDLSodMiirGL987Mit7KC23itVGezcNfQ9FnVTDmuGy8Rq52BZm
 u4TZjVBrtjQOdMBUcD2hKvXhikQNAdOhArPwNfOr0esSQL44MMEe+6Q5/Cbp0BOE
 stAY/xwSP2cY5mIPnAbIBELseEZsV8ySA3M0y1iRCJptjwbyWM+s1TYz0iXcqeOn
 l6LfiI6r1BqUeoWLGP4042R4FLyLNh6gU/TiFNLu7JJQjXl/EkRgqVXWYfzy2n51
 KKY6iGFi5r41sAU6GIXOkQ==
 =szC7
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20231027' of https://git-us.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * Correct minor errors in Cortex-A710 definition
 * Implement Neoverse N2 CPU model
 * Refactor feature test functions out into separate header
 * Fix syndrome for FGT traps on ERET
 * Remove 'hw/arm/boot.h' includes from various header files
 * pxa2xx: Refactoring/cleanup
 * Avoid using 'first_cpu' when first ARM CPU is reachable
 * misc/led: LED state is set opposite of what is expected
 * hw/net/cadence_gen: clean up to use FIELD macros
 * hw/net/cadence_gem: perform PHY access on write only
 * hw/net/cadence_gem: enforce 32 bits variable size for CRC

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmU7yz0ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3n4xEACK4ti+PFSJHVCQ69NzLLBT
# ybFGFMsMhXJTSNS30Pzs+KWCKWPP59knYBD4qO43W1iV6pPUhy+skr+BFCCRvBow
# se74+Fm1l4LmnuHxgukJzTdvRffI3v37alLn6Y/ioWe8bDpf/IJj8WLj8B1IPoNg
# fswJSGDLpPMovaz8NBQRzglUWpfyzxH+uuW779qBS1nuFdPOfIHKrocvvdrfogBP
# aO8AeiBzz5STW9Naeq+BIKho8S9LinSB6FHa+rRPUDkWx03lvRIvkgGPzHpXYy8I
# zAZ8gUQZyXprHAHMpnoBv8Wcw3Bwc2f+8xx8hnRRki3iBroXKfJA9NkeN0StQmL1
# ZHhfYkiKSS5diIFW5pX6ZixKbXHE2a4aH4zPVUNQriNWOevhe7n82mAPNFIYjk97
# ciTtd4I2oew48sDLSodMiirGL987Mit7KC23itVGezcNfQ9FnVTDmuGy8Rq52BZm
# u4TZjVBrtjQOdMBUcD2hKvXhikQNAdOhArPwNfOr0esSQL44MMEe+6Q5/Cbp0BOE
# stAY/xwSP2cY5mIPnAbIBELseEZsV8ySA3M0y1iRCJptjwbyWM+s1TYz0iXcqeOn
# l6LfiI6r1BqUeoWLGP4042R4FLyLNh6gU/TiFNLu7JJQjXl/EkRgqVXWYfzy2n51
# KKY6iGFi5r41sAU6GIXOkQ==
# =szC7
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 27 Oct 2023 23:37:49 JST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20231027' of https://git-us.linaro.org/people/pmaydell/qemu-arm: (41 commits)
  hw/net/cadence_gem: enforce 32 bits variable size for CRC
  hw/net/cadence_gem: perform PHY access on write only
  hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields
  hw/net/cadence_gem: use FIELD to describe DESCONF6 register fields
  hw/net/cadence_gem: use FIELD to describe IRQ register fields
  hw/net/cadence_gem: use FIELD to describe [TX|RX]STATUS register fields
  hw/net/cadence_gem: use FIELD to describe DMACFG register fields
  hw/net/cadence_gem: use FIELD to describe NWCFG register fields
  hw/net/cadence_gem: use FIELD to describe NWCTRL register fields
  hw/net/cadence_gem: use FIELD for screening registers
  hw/net/cadence_gem: use REG32 macro for register definitions
  misc/led: LED state is set opposite of what is expected
  hw/arm: Avoid using 'first_cpu' when first ARM CPU is reachable
  hw/arm/pxa2xx: Realize PXA2XX_I2C device before accessing it
  hw/intc/pxa2xx: Factor pxa2xx_pic_realize() out of pxa2xx_pic_init()
  hw/intc/pxa2xx: Pass CPU reference using QOM link property
  hw/intc/pxa2xx: Convert to Resettable interface
  hw/pcmcia/pxa2xx: Inline pxa2xx_pcmcia_init()
  hw/pcmcia/pxa2xx: Do not open-code sysbus_create_simple()
  hw/pcmcia/pxa2xx: Realize sysbus device before accessing it
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-31 07:07:42 +09:00
Richard Henderson 335b8f700c build: Add update-linux-vdso makefile rule
This is not ideal, since it requires all cross-compilers
to be present rather than a simple subset.  But since it
is only run manually, should be good enough for now.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:56 -07:00
Richard Henderson 5d94c2ffa8 linux-user: Show vdso address in /proc/pid/maps
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:56 -07:00
Richard Henderson b63c6b97f8 linux-user/s390x: Add vdso
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:56 -07:00
Richard Henderson ba02f1ea63 linux-user/s390x: Rename __SIGNAL_FRAMESIZE to STACK_FRAME_OVERHEAD
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:56 -07:00
Richard Henderson e34136d930 linux-user/ppc: Add vdso
Add support in gen-vdso-elfn.c.inc for the DT_PPC64_OPT
dynamic tag: this is an integer, so does not need relocation.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:56 -07:00
Richard Henderson 00cc2934b2 linux-user/loongarch64: Add vdso
Requires a relatively recent binutils version in order to avoid
spurious R_LARCH_NONE relocations.  The presence of these relocs
are diagnosed by our gen-vdso tool.

Tested-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 468c1bb5ca linux-user/riscv: Add vdso
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson c7bc2a8fb1 linux-user/hppa: Add vdso
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson a9f495b93f linux-user/arm: Add vdso
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson ee95fae075 linux-user/aarch64: Add vdso
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 6b1a9d38b5 linux-user/x86_64: Add vdso
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson a1367443ba linux-user/i386: Add vdso
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1267
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 2fa536d107 linux-user: Add gen-vdso tool
This tool will be used for post-processing the linked vdso image,
turning it into something that is easy to include into elfload.c.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson c40f621a19 linux-user: Load vdso image if available
The vdso image will be pre-processed into a C data array, with
a simple list of relocations to perform, and identifying the
location of signal trampolines.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson d0b6b79323 linux-user: Replace bprm->fd with bprm->src.fd
There are only a couple of uses of bprm->fd remaining.
Migrate to the other field.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 86cf82dc9f linux-user: Use ImageSource in load_symbols
Aside from the section headers, we're unlikely to hit the
ImageSource cache on guest executables.  But the interface
for imgsrc_read_* is better.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 3bd0238638 linux-user: Use ImageSource in load_elf_image
Change parse_elf_properties as well, as the bprm_buf argument
ties the two functions closely.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 40d487eecf linux-user: Do not clobber bprm_buf swapping ehdr
Rearrange the allocation of storage for ehdr between load_elf_image
and load_elf_binary.  The same set of copies are done, but we don't
modify bprm_buf, which will be important later.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson f485be725d linux-user: Tidy loader_exec
Reorg the if cases to reduce indentation.
Test for 4 bytes in the file before checking the signatures.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 7d2c5526ed linux-user: Introduce imgsrc_read, imgsrc_read_alloc
Introduced and initialized, but not yet really used.
These will tidy the current tests vs BPRM_BUF_SIZE.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:41:55 -07:00
Richard Henderson 02d9f5b6ac linux-user: Fix guest signal remapping after adjusting SIGABRT
The arithmetic within the loop was not adjusted properly after SIGRTMIN
was stolen for the guest SIGABRT.  The effect was that the guest libc
could not send itself __SIGRTMIN to wake sleeping threads.

Fixes: 38ee0a7dfb ("linux-user: Remap guest SIGABRT")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1967
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-30 13:40:35 -07:00