Right now, each CPU has its own TOD. Especially, the TOD will differ
based on creation time of a CPU - e.g. when hotplugging a CPU the times
will differ quite a lot, resulting in stall warnings in the guest.
Let's use a single TOD by implementing our new TOD device. Prepare it
for TOD-clock epoch extension.
Most importantly, whenever we set the TOD, we have to update the CKC
timer.
Introduce "tcg_s390x.h" just like "kvm_s390x.h" for tcg specific
function declarations that should not go into cpu.h.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's treat this like a separate device. TCG will have to store the
actual state/time later on.
Include cpu-qom.h in kvm_s390x.h (due to S390CPU) to compile tod-kvm.c.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The rom_ptr() function allows direct access to the ROM blobs that we
load during startup. However, there are currently no checks for the
size of the accesses, so it's currently possible to crash QEMU for
example with:
$ echo "Insane in the mainframe" > /tmp/test.txt
$ s390x-softmmu/qemu-system-s390x -kernel /tmp/test.txt -append xyz
Segmentation fault (core dumped)
$ s390x-softmmu/qemu-system-s390x -kernel /tmp/test.txt -initrd /tmp/test.txt
Segmentation fault (core dumped)
$ echo -n HdrS > /tmp/hdr.txt
$ sparc64-softmmu/qemu-system-sparc64 -kernel /tmp/hdr.txt -initrd /tmp/hdr.txt
Segmentation fault (core dumped)
We need a possibility to check the size of the ROM area that we want
to access, thus let's add a size parameter to the rom_ptr() function
to avoid these problems.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1530005740-25254-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
kexec/kdump as well as the bootloader use a subcode of diagnose 308
that is supposed to reset the I/O subsystem but not comprise a full
"reboot". With the latest refactoring this is now broken when
-no-reboot is used or when libvirt acts on a reboot QMP event, for
example a virt-install from iso images.
We need to mark these "subsystem resets" as special.
Fixes: a30fb811cb (s390x: refactor reset/reipl handling)
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180622102928.173420-1-borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It was unclear before on what does the CLOSED event mean. Meanwhile we
add a TODO to fix up the CLOSED event in the future when the in/out
ports are different for a chardev.
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "Marc-André Lureau" <marcandre.lureau@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-2-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
* last of the SVE patches; SVE is now enabled for aarch64 linux-user
* sd: Don't trace SDRequest crc field (coverity bugfix)
* target/arm: Mark PMINTENSET accesses as possibly doing IO
* clean up v7VE feature bit handling
* i.mx7d: minor cleanups
* target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space
* target/arm: Implement ARMv8.2-DotProd
* virt: add addresses to dt node names (which stops dtc from
complaining that they're not correctly named)
* cleanups: replace error_setg(&error_fatal) by error_report() + exit()
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbNkelAAoJEDwlJe0UNgzeB/wP/1WLIC2HD6jbYCPm/qW6fpXx
cUIRjvmodD8gVdJ8m5MVPQl1AdWVdVOGTStMdKNo4nWRuZyRKl3WvnZKoDQqj+fq
PahPUMEXGC14djhmMQ7XdDHFYFCtKOi/leZSSLw3nI1QVq+SSYOiWhqrUExLeMWb
xA25kzHhsNrTbwE7WxBz0dhtkX+QBy2CY11e1o3ONC5vJQhuXxLxPyEZZrQpj2yZ
9twZTcWwJ1FRlTNsGo26wNl65gC7RNAbts+fNa2gxFcwUfN2ioKKdI1iyQVvW7Mz
CwHoa7ghppyQ6hBT2U2O5nBJx72pdQaLae6mQ+FdIFTaoE05aoAQ+pUKZiv+JwTH
blZQZld03c80Rt6SBSVS7AU8l3A0T+cePqVMA2lv7NyGSyq4G8RjqtLIe++QGTIj
Lq9PDVh0evnFQY11RBvJYOit3l/gB6XtfnYo2+ePpCapr/cGmXzerr9kwcCr/D7I
jCQamPO0oxAIbvmqfc0feppWwFdI9qFbyKIRcQtb7uglN29z95ImNCq1tGYweQvy
/ApGLMrrcdtgF2Ok13ajM+Ge60QPZ0uB8EJoStZwOwFeIuEThdvS919q0BxmR/d3
eSwMj7ysf+UqPU9vkjtiXUyXl7r2iCbFoXcXIBsYE1Xpx1YYZKSThzMl7hqDWRGo
W+wtYk3PvdlAPTH/lg5D
=KZLT
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180629' into staging
target-arm queue:
* last of the SVE patches; SVE is now enabled for aarch64 linux-user
* sd: Don't trace SDRequest crc field (coverity bugfix)
* target/arm: Mark PMINTENSET accesses as possibly doing IO
* clean up v7VE feature bit handling
* i.mx7d: minor cleanups
* target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space
* target/arm: Implement ARMv8.2-DotProd
* virt: add addresses to dt node names (which stops dtc from
complaining that they're not correctly named)
* cleanups: replace error_setg(&error_fatal) by error_report() + exit()
# gpg: Signature made Fri 29 Jun 2018 15:52:21 BST
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180629: (55 commits)
target/arm: Add ID_ISAR6
target/arm: Prune a15 features from max
target/arm: Prune a57 features from max
target/arm: Fix SVE system register access checks
target/arm: Fix SVE signed division vs x86 overflow exception
sdcard: Use the ldst API
sd: Don't trace SDRequest crc field
target/arm: Mark PMINTENSET accesses as possibly doing IO
target/arm: Remove redundant DIV detection for KVM
target/arm: Add ARM_FEATURE_V7VE for v7 Virtualization Extensions
i.mx7d: Change IRQ number type from hwaddr to int
i.mx7d: Change SRC unimplemented device name from sdma to src
i.mx7d: Remove unused header files
target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space
target/arm: Implement ARMv8.2-DotProd
target/arm: Enable SVE for aarch64-linux-user
target/arm: Implement SVE dot product (indexed)
target/arm: Implement SVE dot product (vectors)
target/arm: Implement SVE fp complex multiply add (indexed)
target/arm: Pass index to AdvSIMD FCMLA (indexed)
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
soft-freeze, but I'd like these preparatory patches to be merged anyway.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEtIKLr5QxQM7yo0kQcdTV5YIvc9YFAls2DEgACgkQcdTV5YIv
c9ZShw/6AuDMeWpzFHAdf4lBuUdDFyAC7QFln8xdb+MDus5whAvtt7vDeY0OKbgo
w5VDTkstO7h4jQJDHkjzHK91ZdUbgu0Tj7C09x4oQpUueNWsTZGcPBvNGadjjOBt
70LdwyV2nSER3+QNjTNznrh0faxay4xuSTIY/mW6iudeWGobwXmseEeOE8gGM+w0
s1GwxMVKIfllKUmW2vx0mGfn02pKtTnan+Si+sp/AnY9xSquFfHWpZhXZlkZrfYd
mgtJOTY9IpSekr9jBBKgUlZ/QVYiliDzuh3ePDYKtsuHZZ7z2ype3DkXqYOnblOs
C+2gWUE/TC5BStjRX3RmPv21dpfkEdlxOZpgXbpP1VgKqbtnbnvcgTL89IPv9afl
Aj+q5uYR494kOL5rSDynVRdWhnUmMnkqHCZpKG+IRMHv6GlrXpxOQWenwCS/vYWK
swKqRwGj0CFugdt7qVZ+4XjXbbWEI21dHHG7nAXinfakKVOfJYIeGIQC7WfpIrxy
ApV0mHSceK0AMBJvlf1Zf0Qm0lJ7Ay7MRT/5XWDFV9Bogf+wxtGvf9Ukc2qQhwd8
mR9iN7rlWz3VSu5vS3bEdsiBXKibxIRfv7HhF5fa+mwkZA9gMbj33vVds1zA4ta1
Qw4doRq4xWui3uNO9jvtcXtW5Bq7N4p6wVFK76dLVHk5axLCIec=
=vVJr
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
The Darwin host support still needs some more work. It won't make it for
soft-freeze, but I'd like these preparatory patches to be merged anyway.
# gpg: Signature made Fri 29 Jun 2018 11:39:04 BST
# gpg: using RSA key 71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
9p: darwin: Explicitly cast comparisons of mode_t with -1
cutils: Provide strchrnul
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This helper allows to retrieve the paths of nodes whose name
match node-name or node-name@unit-address patterns.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1530044492-24921-2-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This updates the minimum required glib version to 2.40
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJbNhcKAAoJEL6G67QVEE/f1gMP+wV4J+V9WJOxcXbNjP9bb1K4
gsvZrv2pKCVbcJW2o3Mc3POXExX1E2GsrAl639zpDHbNVlVFvQC51GaRO1Fc6lwh
7NRFHen0Ee6I7TvWlVXMy1YPXPZ/8EJ/1KuZerYUUyKO/R8ojEt/TbELwfl3P1LF
VZGWgs+GpwUYwiG4ZmYPQ0VsT/munZPlaK1mRQSDbGqX0KG1Vy8Q+mgbGAm+S6xh
39HtM7ecdfXbVEAnoMsp9+kRi3zXpD5zSAyVBZN2RDktMt/EHxF0pPJCqX7oq0Rn
DehXDkzaqF2ghzrSAIT3rbKBhzYIY/ny/vxnyQZ/Px0GrBdKwuUJ+pYcF2akS2hV
s98VWRw/tqt8wQKYLF/wmeL7eE/tKAHSUqFc3Ta0Rvgfq0z9Mp3b73vuswBpfjWL
+GiASRvg96n/1yqmywtCd7KtF5riOo1iyvfMFCdQ+nBHQok/6K/oWfZPyoZsCAIa
2GJBfmHzkkc98QeNO0Dp/+ckSyKIBuyTV1bDyq/8Yz7IwEd1PfRWooEgVFSHO7MU
6ddCECvKVvP0S03r5dlw4Imio39gPpeZBMmSE4ZOh/hRa89T4UqIUvSLwzat7OhA
DH1NhJsj7G1jaIxENfzBhpggg++rHCVcfc2cv28Tl7i7twyRsb7CveYA6TwD+UIg
7JT+KswUo7W9MrkKBHC+
=8tKK
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/min-glib-pull-request' into staging
glib: update the min required version
This updates the minimum required glib version to 2.40
# gpg: Signature made Fri 29 Jun 2018 12:24:58 BST
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/min-glib-pull-request:
glib: enforce the minimum required version and warn about old APIs
glib: bump min required glib library version to 2.40
util: remove redundant include of glib.h and add osdep.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We are gradually moving away from sector-based interfaces, towards
byte-based. Now that all callers of vectored I/O have been converted
to use our preferred byte-based bdrv_co_p{read,write}v(), we can
delete the unused bdrv_co_{read,write}v().
Furthermore, this gets rid of the signature difference between the
public bdrv_co_writev() and the callback .bdrv_co_writev (the
latter still exists, because some drivers still need more work
before they are fully byte-based).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This moves the code to resize an image file to the thread pool to avoid
blocking.
Creating large images with preallocation with blockdev-create is now
actually a background job instead of blocking the monitor (and most
other things) until the preallocation has completed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
When growing an image, block drivers (especially protocol drivers) may
initialise the newly added area. I/O requests to the same area need to
wait for this initialisation to be completed so that data writes don't
get overwritten and reads don't read uninitialised data.
To avoid overhead in the fast I/O path by adding new locking in the
protocol drivers and to restrict the impact to requests that actually
touch the new area, reuse the existing tracked request infrastructure in
block/io.c and mark all discard requests as serialising.
With this change, it is safe for protocol drivers to make
.bdrv_co_truncate actually asynchronous.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This moves the bdrv_truncate() implementation from block.c to block/io.c
so it can have access to the tracked requests infrastructure.
This involves making refresh_total_sectors() public (in block_int.h).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_truncate() is an operation that can block (even for a quite long
time, depending on the PreallocMode) in I/O paths that shouldn't block.
Convert it to a coroutine_fn so that we have the infrastructure for
drivers to make their .bdrv_co_truncate implementation asynchronous.
This change could potentially introduce new race conditions because
bdrv_truncate() isn't necessarily executed atomically any more. Whether
this is a problem needs to be evaluated for each block driver that
supports truncate:
* file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
protocol drivers are trivially safe because they don't actually yield
yet, so there is no change in behaviour.
* copy-on-read, crypto, raw-format: Essentially just filter drivers that
pass the request to a child node, no problem.
* qcow2: The implementation modifies metadata, so it needs to hold
s->lock to be safe with concurrent I/O requests. In order to avoid
double locking, this requires pulling the locking out into
preallocate_co() and using qcow2_write_caches() instead of
bdrv_flush().
* qed: Does a single header update, this is fine without locking.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The error handling policy was traditionally set with -drive, but with
-blockdev it is no longer possible to set frontend options. scsi-disk
(and other block devices) have long supported qdev properties to
configure the error handling policy, so let's add these options to
usb-storage as well and just forward them to the internal scsi-disk
instance.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
There are two useful macros that can be defined before including
glib.h that are related to the min required glib version
- GLIB_VERSION_MIN_REQUIRED
When this is defined, if code uses an API that was deprecated
in this version, or older, a compiler warning will be emitted.
This alerts maintainers to update their code to whatever new
replacement API is now recommended best practice.
- GLIB_VERSION_MAX_ALLOWED
When this is defined, if code uses an API that was introduced
in a version that is newer than the declared version, a compiler
warning will be emitted. This alerts maintainers if new code
accidentally uses functionality that won't be available on some
supported platforms.
The GLIB_VERSION_MAX_ALLOWED constant makes it a bit harder to opt
in to using specific new APIs with a GLIB_CHECK_VERSION conditional.
To workaround this Pragmas can be used to temporarily turn off the
-Wdeprecated-declarations compiler warning, while a static inline
compat function is implemented. This workaround is illustrated with the
implementation of the g_strv_contains method to satisfy the test suite.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Per supported platforms doc[1], the various min glib on relevant distros is:
RHEL-7: 2.50.3
Debian (Stretch): 2.50.3
Debian (Jessie): 2.42.1
OpenBSD (Ports): 2.54.3
FreeBSD (Ports): 2.50.3
OpenSUSE Leap 15: 2.54.3
SLE12-SP2: 2.48.2
Ubuntu (Xenial): 2.48.0
macOS (Homebrew): 2.56.0
This suggests that a minimum glib of 2.42 is a reasonable target.
The GLibC compile farm, however, uses Ubuntu 14.04 (Trusty) which only
has glib 2.40.0, and this is needed for testing during merge. Thus an
exception is made to the documented platform support policy to allow for
all three current LTS releases to be supported.
Docker jobs that not longer satisfy this new min version are removed.
[1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Code must only ever include glib.h indirectly via the glib-compat.h
header file, because we will need some macros set before glib.h is
pulled in. Adding extra includes of glib.h will (soon) cause compile
failures such as:
In file included from /home/berrange/src/virt/qemu/include/qemu/osdep.h:107,
from /home/berrange/src/virt/qemu/include/qemu/iova-tree.h:26,
from util/iova-tree.c:13:
/home/berrange/src/virt/qemu/include/glib-compat.h:22: error: "GLIB_VERSION_MIN_REQUIRED" redefined [-Werror]
#define GLIB_VERSION_MIN_REQUIRED GLIB_VERSION_2_40
In file included from /usr/include/glib-2.0/glib/gtypes.h:34,
from /usr/include/glib-2.0/glib/galloca.h:32,
from /usr/include/glib-2.0/glib.h:30,
from util/iova-tree.c:12:
/usr/include/glib-2.0/glib/gversionmacros.h:237: note: this is the location of the previous definition
# define GLIB_VERSION_MIN_REQUIRED (GLIB_VERSION_CUR_STABLE)
Furthermore, the osdep.h include should always be done directly from the
.c file rather than indirectly via any .h file.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The VPD Block Limits Inquiry page is optional, allowing SCSI devices
to not implement it. This is the case for devices like the MegaRAID
SAS 9361-8i and Microsemi PM8069.
In case of SCSI passthrough, the response of this request is used by
the QEMU SCSI layer to set the max_io_sectors that the guest
device will support, based on the value of the max_sectors_kb that
the device has set in the host at that time. Without this response,
the guest kernel is free to assume any value of max_io_sectors
for the SCSI device. If this value is greater than the value from
the host, SCSI Sense errors will occur because the guest will send
read/write requests that are larger than the underlying host device
is configured to support. An example of this behavior can be seen
in [1].
A workaround is to set the max_sectors_kb host value back in the guest
kernel (a process that can be automated using rc.local startup scripts
and the like), but this has several drawbacks:
- it can be troublesome if the guest has many passthrough devices that
needs this tuning;
- if a change in max_sectors_kb is made in the host side, manual change
in the guests will also be required;
- during an OS install it is difficult, and sometimes not possible, to
go to a terminal and change the max_sectors_kb prior to the installation.
This means that the disk can't be used during the install process. The
easiest alternative here is to roll back to scsi-hd, install the guest
and then go back to SCSI passthrough when the installation is done and
max_sectors_kb can be set.
An easier way would be to QEMU handle the absence of the Block Limits
VPD device response, setting max_io_sectors accordingly and allowing
the guest to use the device without the hassle.
This patch adds emulation of the Block Limits VPD response for
SCSI passthrough devices of type TYPE_DISK that doesn't support
it. The following changes were made:
- scsi_handle_inquiry_reply will now check the available VPD
pages from the Inquiry EVPD reply. In case the device does not
- a new function called scsi_generic_set_vpd_bl_emulation,
that is called during device realize, was created to set a
new flag 'needs_vpd_bl_emulation' of the device. This function
retrieves the Inquiry EVPD response of the device to check for
VPD BL support.
- scsi_handle_inquiry_reply will now check the available VPD
pages from the Inquiry EVPD reply in case the device needs
VPD BL emulation, adding the Block Limits page (0xb0) to
the list. This will make the guest kernel aware of the
support that we're now providing by emulation.
- a new function scsi_emulate_block_limits creates the
emulated Block Limits response. This function is called
inside scsi_read_complete in case the device requires
Block Limits VPD emulation and we detected a SCSI Sense
error in the VPD Block Limits reply that was issued
from the guest kernel to the device. This error is
expected: we're reporting support from our side, but
the device isn't aware of it.
With this patch, the guest now queries the Block Limits
page during the device configuration because it is being
advertised in the Supported Pages response. It will either
receive the Block Limits page from the hardware, if it supports
it, or will receive an emulated response from QEMU. At any rate,
the guest now has the information to set the max_sectors_kb
parameter accordingly, sparing the user of SCSI sense errors
that would happen without the emulated response and in the
absence of Block Limits support from the hardware.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1566195
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1566195
Reported-by: Dac Nguyen <dacng@us.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-4-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the VPD Block Limits emulation with SCSI passthrough,
we'll issue an Inquiry request with EVPD set to retrieve
the available VPD pages of the device. This would be done in
a way similar of what scsi_generic_read_device_identification
does: create a SCSI command and a reply buffer, fill in the
sg_io_hdr_t structure, call blk_ioctl, check if an error
occurred, process the response.
This same process is done in other 2 functions, get_device_type
and get_stream_blocksize. They differ in the command/reply
buffer and post-processing, everything else is almost a
copy/paste.
Instead of adding a forth copy/pasted-ish code when adding
the passthrough VPD BL emulation, this patch extirpates
this repetition of those 3 functions and put it into
a new one called scsi_SG_IO_FROM_DEV. Any future code that
wants to execute an SG_DXFER_FROM_DEV to the device can
use it, avoiding filling sg_io_hdr_t again and et cetera.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-3-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To add support for the emulation of Block Limits VPD page
for passthrough devices, a few adjustments in the current code
base is required to avoid repetition and improve clarity.
In scsi-generic.c, detach the Inquiry handling from
scsi_read_complete and put it into a new function called
scsi_handle_inquiry_reply. This change aims to avoid
cluttering of scsi_read_complete when we more logic in the
Inquiry response handling is added in the next patches,
centralizing the changes in the new function.
In scsi-disk.c, take the build of all emulated VPD pages
from scsi_disk_emulate_inquiry and make it available to
other files into a non-static function called
scsi_disk_emulate_vpd_page. Making it public will allow
the future VPD BL emulation code for passthrough devices
to use it from scsi-generic.c, avoiding copy/pasting this
code solely for that purpose. It also has the advantage of
providing emulation of all VPD pages in case we need to
emulate other pages in other scenarios. As a bonus,
scsi_disk_emulate_inquiry got tidier.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-2-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
strchrnul is a GNU extension and thus unavailable on a number of targets.
In the review for a commit removing strchrnul from 9p, I was asked to
create a qemu_strchrnul helper to factor out this functionality.
Do so, and use it in a number of other places in the code base that inlined
the replacement pattern in a place where strchrnul could be used.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
With this flag, kvm allows guest to control host CPU power state. This
increases latency for other processes using same host CPU in an
unpredictable way, but if decreases idle entry/exit times for the
running VCPU, so to use it QEMU needs a hint about whether host CPU is
overcommitted, hence the flag name.
Follow-up patches will expose this capability to guest
(using mwait leaf).
Based on a patch by Wanpeng Li <kernellwp@gmail.com> .
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20180622192148.178309-2-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's start to use "info pic" just like other platforms. For now we
keep the command for a while so that old users can know what is the new
command to use.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171229073104.3810-6-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This include both userspace and in-kernel ioapic. Note that the numbers
can be inaccurate for kvm-ioapic. One reason is the same with
kvm-i8259, that when irqfd is used, irqs can be delivered all inside
kernel without our notice. Meanwhile, kvm-ioapic is specially treated
when irq numbers <ISA_NUM_IRQS, those irqs will be delivered in kernel
too via kvm-i8259 (please refer to kvm_pc_gsi_handler).
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171229073104.3810-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds owners/parents (which are the same, just occasionally
owner==NULL) printing for memory regions; a new '-o' flag
enabled new output.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20180604032511.6980-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the legacy esp_init() function now that there are no more remaining
users.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20180613094727.11326-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Coverity does not like the new _Float* types that are used by
recent glibc, and croaks on every single file that includes
stdlib.h. Add dummy typedefs to please it.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's try to reduce error handling a bit. In the plug/unplug case, the
device was realized and therefore we can assume that getting access to
the memory region will not fail.
For get_vmstate_memory_region() this is already handled that way.
Document both cases.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-13-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This way we can easily check if the region has already been inititalized
without having to rely on the size of an uninitialized region being 0.
Free the region in nvdimm_finalize() and not in unrealize() as we will
allow to create the region before realization in following patches.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Importantly, get_vmstate_memory_region() should also fail with a proper
error if called before the device is realized. For a PCDIMM, both functions
are to return the same thing, so share the implementation.
All current users are called after the device has been realized, so we
can expect the calls to succeed.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unused, so let's remove it.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Not used outside of pc-dimm.c and there shouldn't be other users. If
other devices (e.g. memory devices) ever have to also use slots, then we
will have to factor this out.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's rename it to make it look more consistent.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have had some tracing tools for mutex but it's not easy to use them
for e.g. dead locks. Let's provide "--enable-debug-mutex" parameter
when configure to allow QemuMutex to store the last owner that took
specific lock. It will be easy to use this tool to debug deadlocks
since we can directly know who took the lock then as long as we can have
a debugger attached to the process.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180425025459.5258-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to KVM commit 75d61fbc, it needs to delete the slot before
changing the KVM_MEM_READONLY flag. But QEMU commit 235e8982 only check
whether KVM_MEM_READONLY flag is set instead of changing. It doesn't
need to delete the slot if the KVM_MEM_READONLY flag is not changed.
This fixes a issue that migrating a VM at the OVMF startup stage and
VM is executing the codes in rom. Between the deleting and adding the
slot in kvm_set_user_memory_region, there is a chance that guest access
rom and trap to KVM, then KVM can't find the corresponding memslot.
While KVM (on ARM) injects an abort to guest due to the broken hva, then
guest will get stuck.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Message-Id: <1526462314-19720-1-git-send-email-zhaoshenglong@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180602085259.17853-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Place them in exec.c, exec-all.h and ram_addr.h. This removes
knowledge of translate-all.h (which is an internal header) from
several files outside accel/tcg and removes knowledge of
AddressSpace from translate-all.c (as it only operates on ram_addr_t).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
laio_init() can fail for a couple of reasons, which will lead to a NULL
pointer dereference in laio_attach_aio_context().
To solve this, add a aio_setup_linux_aio() function which is called
early in raw_open_common. If this fails, propagate the error up. The
signature of aio_get_linux_aio() was not modified, because it seems
preferable to return the actual errno from the possible failing
initialization calls.
Additionally, when the AioContext changes, we need to associate a
LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context
callback and call the new aio_setup_linux_aio(), which will allocate a
new AioContext if needed, and return errors on failures. If it fails for
any reason, fallback to threaded AIO with an error message, as the
device is already in-use by the guest.
Add an assert that aio_get_linux_aio() cannot return NULL.
Signed-off-by: Nishanth Aravamudan <naravamudan@digitalocean.com>
Message-id: 20180622193700.6523-1-naravamudan@digitalocean.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Determining the size of a field is useful when you don't have a struct
variable handy. Open-coding this is ugly.
This patch adds the sizeof_field() macro, which is similar to
typeof_field(). Existing instances are updated to use the macro.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180614164431.29305-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Not needed. Don't expose last_ram_page().
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180620202736.21399-1-david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
trace_mem_build_info expects a size_shift for its first argument. Fix it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 1527028012-21888-2-git-send-email-cota@braap.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aspeed: set APB clocks correctly (fixes slowdown on palmetto)
* smmuv3: cache config data and TLB entries
* v7m/v8m: support read/write from MPU regions smaller than 1K
* various: clean up logging/debug messages
* xilinx_spips: Make dma transactions as per dma_burst_size
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbMnASAAoJEDwlJe0UNgzesJQQAIUYSTN+jrvl7CE+6Eo2LDp4
XX7Q1oO/wVSWqID2ESm9yhpM8+xNa/IPqHy23qBzNClLxCdqYwr+hIsUp9NhIkBX
JpHeoJWr8CI4MO5AyQYx72bcASSodR35sp+rw4aX6LL6tqL6UuZ5e8p66vh9W0wI
umSUB+QfhR8BRpZOk79uA8o3sHwslx3Z9N+Lr4TrprV53hvEs3xe31D7VLTTE1f9
+yYxtlIW2E0zoy8+Ptstn6wemwTKyEUmJDptBJVDMj89kHsDuKSvOSBcqsiNbfM2
jniKQXPFvZa8FRFfO258bS+hHNQ7CI4kFR7Mli2hNNZZVUZLyq0TI9xh+kYBORNM
nC8rTJgNzQz7iqNWtxro0/oQWjPdEOBnZAdN0+ozc3+i1l1mTE/fbPeqAZbpnYj0
U/bca/MT7KkhVFZumidPoWn6oeByLhSB4duX2OgrkE802r1BGdNFZxkyv1Tx8PMd
FT5al3md6usytXIeVC1oA3u61VnnfcaBRgjYA57c8ygIA7U7IsCtdMNWZXHtVyBF
CF+n/lQ+VgqnDF1Ns2bW0cRC9wTIbkKQZA8UNEiK6WIWG8BiiIag7G/3BEIFxb5z
QjyxU2VmHE4+sk1T25e54O/329E36nmeWCI4cLGUobVe+Kzbo/dHY3lFq/QtkEJo
xCI2niNzT8PLpWOPNPeH
=ZiOR
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180626' into staging
target-arm queue:
* aspeed: set APB clocks correctly (fixes slowdown on palmetto)
* smmuv3: cache config data and TLB entries
* v7m/v8m: support read/write from MPU regions smaller than 1K
* various: clean up logging/debug messages
* xilinx_spips: Make dma transactions as per dma_burst_size
# gpg: Signature made Tue 26 Jun 2018 17:55:46 BST
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180626: (32 commits)
aspeed/timer: use the APB frequency from the SCU
aspeed: initialize the SCU controller first
aspeed/scu: introduce clock frequencies
hw/arm/smmuv3: Add notifications on invalidation
hw/arm/smmuv3: IOTLB emulation
hw/arm/smmuv3: Cache/invalidate config data
hw/arm/smmuv3: Fix translate error handling
target/arm: Handle small regions in get_phys_addr_pmsav8()
target/arm: Set page (region) size in get_phys_addr_pmsav7()
tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE
hw/arm/stellaris: Use HWADDR_PRIx to display register address
hw/arm/stellaris: Fix gptm_write() error message
hw/net/smc91c111: Use qemu_log_mask(UNIMP) instead of fprintf
hw/net/smc91c111: Use qemu_log_mask(GUEST_ERROR) instead of hw_error
hw/net/stellaris_enet: Use qemu_log_mask(GUEST_ERROR) instead of hw_error
hw/net/stellaris_enet: Fix a typo
hw/arm/stellaris: Use qemu_log_mask(UNIMP) instead of fprintf
hw/arm/omap: Use qemu_log_mask(GUEST_ERROR) instead of fprintf
hw/arm/omap1: Use qemu_log_mask(GUEST_ERROR) instead of fprintf
hw/i2c/omap_i2c: Use qemu_log_mask(UNIMP) instead of fprintf
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The timer controller can be driven by either an external 1MHz clock or
by the APB clock. Today, the model makes the assumption that the APB
frequency is always set to 24MHz but this is incorrect.
The AST2400 SoC on the palmetto machines uses a 48MHz input clock
source and the APB can be set to 48MHz. The consequence is a general
system slowdown. The QEMU machines using the AST2500 SoC do not seem
impacted today because the APB frequency is still set to 24MHz.
We fix the timer frequency for all SoCs by linking the Timer model to
the SCU model. The APB frequency driving the timers is now the one
configured for the SoC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 20180622075700.5923-4-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All Aspeed SoC clocks are driven by an input source clock which can
have different frequencies : 24MHz or 25MHz, and also, on the Aspeed
AST2400 SoC, 48MHz. The H-PLL (CPU) clock is defined from a
calculation using parameters in the H-PLL Parameter register or from a
predefined set of frequencies if the setting is strapped by hardware
(Aspeed AST2400 SoC). The other clocks of the SoC are then defined
from the H-PLL using dividers.
We introduce first the APB clock because it should be used to drive
the Aspeed timer model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 20180622075700.5923-2-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On TLB invalidation commands, let's call registered
IOMMU notifiers. Those can only be UNMAP notifiers.
SMMUv3 does not support notification on MAP (VFIO).
This patch allows vhost use case where IOTLB API is notified
on each guest IOTLB invalidation.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1529653501-15358-5-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We emulate a TLB cache of size SMMU_IOTLB_MAX_SIZE=256.
It is implemented as a hash table whose key is a combination
of the 16b asid and 48b IOVA (Jenkins hash).
Entries are invalidated on TLB invalidation commands, either
globally, or per asid, or per asid/iova.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1529653501-15358-4-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's cache config data to avoid fetching and parsing STE/CD
structures on each translation. We invalidate them on data structure
invalidation commands.
We put in place a per-smmu mutex to protect the config cache. This
will be useful too to protect the IOTLB cache. The caches can be
accessed without BQL, ie. in IO dataplane. The same kind of mutex was
put in place in the intel viommu.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1529653501-15358-3-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add support for MMU protection regions that are smaller than
TARGET_PAGE_SIZE. We do this by marking the TLB entry for those
pages with a flag TLB_RECHECK. This flag causes us to always
take the slow-path for accesses. In the slow path we can then
special case them to always call tlb_fill() again, so we have
the correct information for the exact address being accessed.
This change allows us to handle reading and writing from small
regions; we cannot deal with execution from the small region.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180620130619.11362-2-peter.maydell@linaro.org
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180624040609.17572-10-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
TCMI_VERBOSE is no more used, drop the OMAP_8/16/32B_REG macros.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180624040609.17572-9-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Qspi dma has a burst length of 64 bytes, So limit the transactions w.r.t
dma-burst-size property.
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1529660880-30376-1-git-send-email-sai.pavan.boddu@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit a8bff79e9f we added a definition to hw/virtio/virtio-gpu.h
for VIRTIO_GPU_CAPSET_VIRGL2, as a workaround for it not yet being
in the Linux kernel headers. In commit 77d361b13c we updated our
kernel headers to a version which does define the macro, so we can
now remove our workaround.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180622173249.29963-1-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add support for OpenGL ES to egl-helpers. Wire up the new option for
egl-headless and gtk UIs. egl-headless actually works fine. gtk hits a
not-yet implemented code path in libEGL when trying to use gles mode:
libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering.
(This is mesa 17.2.3).
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Message-id: 20180618112141.23398-1-kraxel@redhat.com
The oldest machine type which is still used in a still maintained distro
is a pc-0.12 based machine type in RHEL6, so everything that is older
than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
and pc-0.11 so that we can finally remove them in a future release.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1529917512-10528-1-git-send-email-thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Also add a compat property to disable it for old machine types,
needed for live migration compatibility.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180622111200.30561-6-kraxel@redhat.com
Enable TOPOEXT feature on EPYC CPU. This is required to support
hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
Disable topoext on PC_COMPAT_2_12 and keep xlevel 0x8000000a.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1529443919-67509-3-git-send-email-babu.moger@amd.com>
[ehabkost: Added EPYC-IBPB.xlevel to PC_COMPAT_2_12]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* hw/intc/arm_gicv3: fix wrong values when reading IPRIORITYR
* target/arm: fix read of freed memory in kvm_arm_machine_init_done()
* virt: support up to 512 CPUs
* virt: support 256MB ECAM PCI region (for more PCI devices)
* xlnx-zynqmp: Use Cortex-R5F, not Cortex-R5
* mps2-tz: Implement and use the TrustZone Memory Protection Controller
* target/arm: enforce alignment checking for v6M cores
* xen: Don't use memory_region_init_ram_nomigrate() in pci_assign_dev_load_option_rom()
* vl.c: Don't zero-initialize statics for serial_hds
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbLPHgAAoJEDwlJe0UNgzeCugP/0yFWNkIQqgPX1D2fDLaea8z
fj2QCkMHVDC/iM9s2Q7PpQDwfxMfE5MbvcavyfcVWe6lthyWQvvQ+5j4JHipgEup
vdGX+ASA9sbC3kJ1u/OUekz6JliEWaxOyImnj2gyQfBw8zuMfiF+eTmtoKXcJC3u
KNSNvArPIaFLNKsaQgTQE19Bu8CxIzfGEzsIgeAoD4gE7wv48EWqpOaYdIkbDo+0
gJylBVCa46pmf56ESCuLTDZC+2FxBcw+uCFtD5zzt6YVV0Fli1ja9FNwLP17YHqg
GOfNQ6melPeNUF+ByIEaPLWrq+Sy2P6wlnVlvKcKis8nXq497VjJdvv1txbbnyrn
s4dKgVHjQP6EocvaVKCxXsLfjvPUCF2+f/uIdA8IR4WRilgTEfV3IdOYNuKjOFXb
FcYap5UrX4ikEBkDBIBkh1BQXqOAU+lf8JjW1O6kn8PbfkEu2oRqGybWtEOjr+mz
+iNsJ+4PydJ34WlvPKkzDG7GjEJcleStmgD3/DdL3r+jtjBYBT97xOAqWVnyVRYb
NEUQEGmG894THftieuMw6r8vi4u24ZkI/3vGvBeSZ1od8IFEjzENRWkhYoDhLO5r
LH16da19pVgWnLSXJnhxLTRaYNNAphNSIW30GDAwMbuXFK+LpCBufT8X7b6RerNx
tOYET9DwlQRYEiuBVeSk
=1T5q
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180622' into staging
target-arm queue:
* hw/intc/arm_gicv3: fix wrong values when reading IPRIORITYR
* target/arm: fix read of freed memory in kvm_arm_machine_init_done()
* virt: support up to 512 CPUs
* virt: support 256MB ECAM PCI region (for more PCI devices)
* xlnx-zynqmp: Use Cortex-R5F, not Cortex-R5
* mps2-tz: Implement and use the TrustZone Memory Protection Controller
* target/arm: enforce alignment checking for v6M cores
* xen: Don't use memory_region_init_ram_nomigrate() in pci_assign_dev_load_option_rom()
* vl.c: Don't zero-initialize statics for serial_hds
# gpg: Signature made Fri 22 Jun 2018 13:56:00 BST
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180622: (28 commits)
xen: Don't use memory_region_init_ram_nomigrate() in pci_assign_dev_load_option_rom()
vl.c: Don't zero-initialize statics for serial_hds
target/arm: Strict alignment for ARMv6-M and ARMv8-M Baseline
target/arm: Introduce ARM_FEATURE_M_MAIN
hw/arm/mps2-tz.c: Instantiate MPCs
hw/arm/iotkit: Wire up MPC interrupt lines
hw/arm/iotkit: Instantiate MPC
hw/misc/iotkit-secctl.c: Implement SECMPCINTSTATUS
hw/misc/tz_mpc.c: Honour the BLK_LUT settings in translate
hw/misc/tz-mpc.c: Implement correct blocked-access behaviour
hw/misc/tz-mpc.c: Implement registers
hw/misc/tz-mpc.c: Implement the Arm TrustZone Memory Protection Controller
xlnx-zynqmp: Swap Cortex-R5 for Cortex-R5F
target-arm: Add the Cortex-R5F
hw/arm/virt: Increase max_cpus to 512
hw/arm/virt: Use 256MB ECAM region by default
hw/arm/virt: Add virt-3.0 machine type
hw/arm/virt: Add a new 256MB ECAM region
hw/arm/virt: Register two redistributor regions when necessary
hw/arm/virt-acpi-build: Advertise one or two GICR structures
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Another assorted patch of patches for ppc and spapr.
* Rework of guest pagesize handling for ppc, which avoids guest
visibly different behaviour between accelerators
* A number of Pnv cleanups, working towards more complete POWER9
support
* Migration of VPA data, a significant bugfix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlssebQACgkQbDjKyiDZ
s5IMUA/+MUejq1eIAE3z4POijm0nNS5r7gce4b6M4G0vV1cC3eY9GA/kir4r3XU5
LoBGJx96pSXph69tzI7WIdn8vHpE2AkvWr4DT36ndxW0H7V630BCHk7vWMuBrfHf
5l+bKgKKnfM5MaptnKDsosvfSk7NaH+7cqSa97h2PUGFaTKmD2XyTXQqBkReBzoC
lBYVUVLjc6l9Y3P+DT+3uF+5lR7QOuqJ+2q31DvuNcTFoYhHc4DPXmDb5PfjY/DG
3oBIcFC6dqjkRzWAAT65Zk1P/hrKcYmmE382hwsFg76GTGR893MhE0vx0EjqrFbr
Tdkwj8v6Uto9/iatRz/nBSEg96oDmvSmz97Omh4JwHPKOpHRZ+QjCxccK63Jp1Yv
xgwV52YIw/5LIQYLw6eArDKfU9iLZ/Nh+rtAZxaCb/Bvsmv2vljbDpHbuhk1uJ8i
NaJ4Pe9aDt7tew39Sl/ohdUmgOYBal3WwVBNE5JNn5cO6GxzEG8U5JuFDY+RzpUV
Xh2ksy1n6F7Q3XOqf4CgF/0Gpy5mrUXSRyCVIx47E+DNwFvkexNwX8PqANJqKJFo
cxBBZbvaHRjpUGUN/yjixPqx7xWWfC698Urgdzg7LMcn3SjaRztT834DhlH46gh6
1KBf2RrkPXEJD+O4YMb5CowDXrPL4yYXSLN0GxGUknW0nQF3v2g=
=vjK1
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.0-20180622' into staging
ppc patch queue 2018-06-22
Another assorted patch of patches for ppc and spapr.
* Rework of guest pagesize handling for ppc, which avoids guest
visibly different behaviour between accelerators
* A number of Pnv cleanups, working towards more complete POWER9
support
* Migration of VPA data, a significant bugfix
# gpg: Signature made Fri 22 Jun 2018 05:23:16 BST
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-3.0-20180622: (23 commits)
spapr: Don't rewrite mmu capabilities in KVM mode
spapr: Limit available pagesizes to provide a consistent guest environment
target/ppc: Add ppc_hash64_filter_pagesizes()
spapr: Use maximum page size capability to simplify memory backend checking
spapr: Maximum (HPT) pagesize property
pseries: Update SLOF firmware image to qemu-slof-20180621
target/ppc: Add missing opcode for icbt on PPC440
ppc4xx_i2c: Implement directcntl register
ppc4xx_i2c: Remove unimplemented sdata and intr registers
sm501: Fix hardware cursor color conversion
fpu_helper.c: fix helper_fpscr_clrbit() function
spapr: remove unused spapr_irq routines
spapr: split the IRQ allocation sequence
target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
spapr: Add cpu_apply hook to capabilities
spapr: Compute effective capability values earlier
target/ppc: Allow cpu compatiblity checks based on type, not instance
ppc/pnv: consolidate the creation of the ISA bus device tree
ppc/pnv: introduce Pnv8Chip and Pnv9Chip models
spapr_cpu_core: migrate VPA related state
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The interrupt outputs from the MPC in the IoTKit and the expansion
MPCs in the board must be wired up to the security controller, and
also all ORed together to produce a single line to the NVIC.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180620132032.28865-8-peter.maydell@linaro.org
Wire up the one MPC that is part of the IoTKit itself. For the
moment we don't wire up its interrupt line.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180620132032.28865-7-peter.maydell@linaro.org
Implement the SECMPCINTSTATUS register. This is the only register
in the security controller that deals with Memory Protection
Controllers, and it simply provides a read-only view of the
interrupt lines from the various MPCs in the system.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180620132032.28865-6-peter.maydell@linaro.org
Implement the missing registers for the TZ MPC.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180620132032.28865-3-peter.maydell@linaro.org
Implement the Arm TrustZone Memory Protection Controller, which sits
in front of RAM and allows secure software to configure it to either
pass through or reject transactions.
We implement the MPC as a QEMU IOMMU, which will direct transactions
either through to the devices and memory behind it or to a special
"never works" AddressSpace if they are blocked.
This initial commit implements the skeleton of the device:
* it always permits accesses
* it doesn't implement most of the registers
* it doesn't implement the interrupt or other behaviour
for blocked transactions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180620132032.28865-2-peter.maydell@linaro.org
With this patch, virt-3.0 machine uses a new 256MB ECAM region
by default instead of the legacy 16MB one, if highmem is set
(LPAE supported by the guest) and (!firmware_loaded || aarch64).
Indeed aarch32 mode FW may not support this high ECAM region.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1529072910-16156-11-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch defines a new ECAM region located after the 256GB limit.
The virt machine state is augmented with a new highmem_ecam field
which guards the usage of this new ECAM region instead of the legacy
16MB one. With the highmem ECAM region, up to 256 PCIe buses can be
used.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1529072910-16156-9-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch allows the creation of a GICv3 node with 1 or 2
redistributor regions depending on the number of smu_cpus.
The second redistributor region is located just after the
existing RAM region, at 256GB and contains up to up to 512 vcpus.
Please refer to kernel documentation for further node details:
Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1529072910-16156-6-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To prepare for multiple redistributor regions, we introduce
an array of uint32_t properties that stores the redistributor
count of each redistributor region.
Non accelerated VGICv3 only supports a single redistributor region.
The capacity of all redist regions is checked against the number of
vcpus.
Machvirt is updated to set those properties, ie. a single
redistributor region with count set to the number of vcpus
capped by 123.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1529072910-16156-4-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add experimental x-nbd-server-add-bitmap to expose a disabled
bitmap over NBD, in preparation for a pull model incremental
backup scheme. Also fix a corner case protocol issue with
NBD_CMD_BLOCK_STATUS, and add new NBD_CMD_CACHE.
- Eric Blake: tests: Simplify .gitignore
- Eric Blake: nbd/server: Reject 0-length block status request
- Vladimir Sementsov-Ogievskiy: 0/6 NBD export bitmaps
- Vladimir Sementsov-Ogievskiy: nbd/server: introduce NBD_CMD_CACHE
-----BEGIN PGP SIGNATURE-----
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
iQEcBAABCAAGBQJbK7wDAAoJEKeha0olJ0NqJuMH/0o7wzP3HWIO1pJZvhRA81AP
ZZqZy/8xO0B++keCxzgTbdr9yf6RDtYGjDrNuOcVzqfA1k9E1Cl2pPjS93TzPlzZ
PEgtG6bLXL2jCUi/8/EbZ/TLuoQx5YJc7eXTNREgXDRMLiTEg5Kcmg1n0efhr8Sl
5/bRMDmW1+atO4dJS8bEW5WLuy4IoB823cSHjWoPFRHeNCwmFREwTx2xVULVje84
JZzHYxyaBRquYLKIAew1WidhKVC4OiYtm4F5PxGBB/ZCVkeoDBX5uqZSjOUVt8y7
uOhGYQGW/qSj/s3x74NEL6IPo57Iay/Wq065DbolDx7s0y5t38TocrBnakzOBFU=
=VK6J
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-06-20-v2' into staging
nbd patches for 2018-06-20
Add experimental x-nbd-server-add-bitmap to expose a disabled
bitmap over NBD, in preparation for a pull model incremental
backup scheme. Also fix a corner case protocol issue with
NBD_CMD_BLOCK_STATUS, and add new NBD_CMD_CACHE.
- Eric Blake: tests: Simplify .gitignore
- Eric Blake: nbd/server: Reject 0-length block status request
- Vladimir Sementsov-Ogievskiy: 0/6 NBD export bitmaps
- Vladimir Sementsov-Ogievskiy: nbd/server: introduce NBD_CMD_CACHE
# gpg: Signature made Thu 21 Jun 2018 15:53:55 BST
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2018-06-20-v2:
nbd/server: introduce NBD_CMD_CACHE
docs/interop: add nbd.txt
qapi: new qmp command nbd-server-add-bitmap
nbd/server: implement dirty bitmap export
nbd/server: add nbd_meta_empty_or_pattern helper
nbd/server: refactor NBDExportMetaContexts
nbd/server: fix trace
nbd/server: Reject 0-length block status request
tests: Simplify .gitignore
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.
The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.
Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this. We just check all memory backends
against that declared pagesize. We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous. In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.
At present we handle this by changing what we advertise to the guest based
on the backing pagesizes. This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.
As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest. For now we just create and validate the parameter without
making it do anything.
For backwards compatibility, on older machine types we set it to the max
available page size for the host. For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Handle nbd CACHE command. Just do read, without sending read data back.
Cache mechanism should be done by exported node driver chain.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180413143156.11409-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix two missing case labels in switch statements]
Signed-off-by: Eric Blake <eblake@redhat.com>
Handle a new NBD meta namespace: "qemu", and corresponding queries:
"qemu:dirty-bitmap:<export bitmap name>".
With the new metadata context negotiated, BLOCK_STATUS query will reply
with dirty-bitmap data, converted to extents. The new public function
nbd_export_bitmap selects which bitmap to export. For now, only one bitmap
may be exported.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: wording tweaks, minor cleanups, additional tracing]
Signed-off-by: Eric Blake <eblake@redhat.com>
As well as being able to generate its own i2c transactions, the ppc4xx
i2c controller has a DIRECTCNTL register which allows explicit control
of the i2c lines.
Using this register an OS can directly bitbang i2c operations. In
order to let emulated i2c devices respond to this, we need to wire up
the DIRECTCNTL register to qemu's bitbanged i2c handling code.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We don't emulate slave mode so related registers are not needed.
[lh]sadr are only retained to avoid too many warnings and simplify
debugging but sdata is not even correct because device has a 4 byte
FIFO instead so just remove this unimplemented register for now.
The intr register is also not implemented correctly, it is for
diagnostics and normally not even visible on device without explicitly
enabling it. As no guests are known to need this remove it as well.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.
This will ease the introduction of a static layout of IRQ numbers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time. However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication. To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time. That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.
But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it. That lets us move the resolution of the
capability defaults much earlier.
This is going to be necessary for some future capabilities.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
It introduces a base PnvChip class from which the specific processor
chip classes, Pnv8Chip and Pnv9Chip, inherit. Each of them needs to
define an init and a realize routine which will create the controllers
of the target processor. For the moment, the base PnvChip class
handles the XSCOM bus and the cores.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.
This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This moves the details of the ISA bus creation under the LPC model but
more important, the new PnvChip operation will let us choose the chip
class to use when we introduce the different chip classes for Power9
and Power8. It hides away the processor chip controllers from the
machine.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On Power9, the thread interrupt presenter has a different type and is
linked to the chip owning the cores.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch allows the user to specify whether to use active or only
background mode for mirror block jobs. Currently, this setting will
remain constant for the duration of the entire block job.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180613181823.13618-12-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
This new function allows to look for a consecutively dirty area in a
dirty bitmap.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This new parameter allows the caller to just query the next dirty
position without moving the iterator.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and
did a subtree drain for each of them. This works fine as long as the
graph is static, but sadly, reality looks different.
If the graph changes so that root nodes are added or removed, we would
have to compensate for this. bdrv_next() returns each root node only
once even if it's the root node for multiple BlockBackends or for a
monitor-owned block driver tree, which would only complicate things.
The much easier and more obviously correct way is to fundamentally
change the way the functions work: Iterate over all BlockDriverStates,
no matter who owns them, and drain them individually. Compensation is
only necessary when a new BDS is created inside a drain_all section.
Removal of a BDS doesn't require any action because it's gone afterwards
anyway.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the future, bdrv_drained_all_begin/end() will drain all invidiual
nodes separately rather than whole subtrees. This means that we don't
want to propagate the drain to all parents any more: If the parent is a
BDS, it will already be drained separately. Recursing to all parents is
unnecessary work and would make it an O(n²) operation.
Prepare the drain function for the changed drain_all by adding an
ignore_bds_parents parameter to the internal implementation that
prevents the propagation of the drain to BDS parents. We still (have to)
propagate it to non-BDS parents like BlockBackends or Jobs because those
are not drained separately.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drain_all() wants to have a single polling loop for draining the
in-flight requests of all nodes. This means that the AIO_WAIT_WHILE()
condition relies on activity in multiple AioContexts, which is polled
from the mainloop context. We must therefore call AIO_WAIT_WHILE() from
the mainloop thread and use the AioWait notification mechanism.
Just randomly picking the AioContext of any non-mainloop thread would
work, but instead of bothering to find such a context in the caller, we
can just as well accept NULL for ctx.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_do_drained_begin() is only safe if we have a single
BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow
that parent callbacks introduce a nested polling loop that could cause
graph changes while we're traversing the graph.
Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single
node without waiting for its requests to complete. These requests will
be waited for in the BDRV_POLL_WHILE() call down the call chain.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Anything can happen inside BDRV_POLL_WHILE(), including graph
changes that may interfere with its callers (e.g. child list iteration
in recursive callers of bdrv_do_drained_begin).
Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the
end of bdrv_do_drained_begin() to avoid such effects. The recursion
happens now inside the loop condition. As the graph can only change
between bdrv_drain_poll() calls, but not inside of it, doing the
recursion here is safe.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We already requested that block jobs be paused in .bdrv_drained_begin,
but no guarantee was made that the job was actually inactive at the
point where bdrv_drained_begin() returned.
This introduces a new callback BdrvChildRole.bdrv_drained_poll() and
uses it to make bdrv_drain_poll() consider block jobs using the node to
be drained.
For the test case to work as expected, we have to switch from
block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even
considered active and must be waited for when draining the node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 91af091f92 added an additional aio_poll() to BDRV_POLL_WHILE()
in order to make sure that all pending BHs are executed on drain. This
was the wrong place to make the fix, as it is useless overhead for all
other users of the macro and unnecessarily complicates the mechanism.
This patch effectively reverts said commit (the context has changed a
bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the
loop condition for drain.
The effect is probably hard to measure in any real-world use case
because actual I/O will dominate, but if I run only the initialisation
part of 'qemu-img convert' where it calls bdrv_block_status() for the
whole image to find out how much data there is copy, this phase actually
needs only roughly half the time after this patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The boot framebuffer is expected to be configured by the firmware, so it
uses fw_cfg as interface. Initialization goes as follows:
(1) Check whenever etc/ramfb is present.
(2) Allocate framebuffer from RAM.
(3) Fill struct RAMFBCfg, write it to etc/ramfb.
Done. You can write stuff to the framebuffer now, and it should appear
automagically on the screen.
Note that this isn't very efficient because it does a full display
update on each refresh. No dirty tracking. Dirty tracking would have
to be active for the whole ram slot, so that wouldn't be very efficient
either. For a boot display which is active for a short time only this
isn't a big deal. As permanent guest display something better should be
used (if possible).
This is the ramfb core code. Some windup is needed for display devices
which want have a ramfb boot display.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 20180613122948.18149-2-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
CPUPPCState currently contains a number of fields containing the state of
the VPA. The VPA is a PAPR specific concept covering several guest/host
shared memory areas used to communicate some information with the
hypervisor.
As a PAPR concept this is really machine specific information, although it
is per-cpu, so it doesn't really belong in the core CPU state structure.
There's also other information that's per-cpu, but platform/machine
specific. So create a (void *)machine_data in PowerPCCPU which can be
used by the machine to locate per-cpu data. Intialization, lifetime and
cleanup of machine_data is entirely up to the machine type.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Currently, we allocate space for all the cpu objects within a single core
in one big block. This was copied from an older version of the spapr code
and requires some ugly pointer manipulation to extract the individual
objects.
This design was due to a misunderstanding of qemu lifetime conventions and
has already been changed in spapr (in 94ad93bd "spapr_cpu_core: instantiate
CPUs separately".
Make an equivalent change in pnv_core to get rid of the nasty pointer
arithmetic.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
In the case where we have an interrupt generated externally from inputs to
bits 1 and 2 of port A and/or port B, it is necessary to expose
mos6522_update_irq() so it can be called by the interrupt source.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PMU device supercedes the CUDA device found on older New World Macs and
is supported by a larger number of guest OSs from OS 9 to OS X 10.5.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
MacOS 9 has a bug in its PMU driver whereby after configuring the ADB bus
devices it sends another write to reg 3 on both devices resetting them
both back to the same address.
Add a new disable_direct_reg3_writes property to ADBDevice to disable these
direct writes which can enabled just for the upcoming pmu-adb support.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PMU-enabled New World Macs expose their GPIOs via a separate memory region
within the macio device.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This option allows the VIA configuration to be controlled between 3
different possible setups: cuda, pmu-adb and pmu with USB rather than ADB
keyboard/mouse.
For the moment we don't do anything with the configuration except to pass
it to the macio device (the via-cuda parent) and also to the firmware via
the fw_cfg interface so that it can present the correct device tree.
The default is cuda which is the current default and so will have no
change in behaviour.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.
Per-TB locks are used in both modes to protect TB jumps.
Some notes:
- tb_lock is removed from notdirty_mem_write by passing a
locked page_collection to tb_invalidate_phys_page_fast.
- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
so there is no need to further serialize access to them.
- do_tb_flush is run in a safe async context, meaning no other
vCPU threads are running. Therefore acquiring mmap_lock there
is just to please tools such as thread sanitizer.
- Not visible in the diff, but tb_invalidate_phys_page already
has an assert_memory_lock.
- cpu_io_recompile is !user-only, so no mmap_lock there.
- Added mmap_unlock()'s before all siglongjmp's that could
be called in user-mode while mmap_lock is held.
+ Added an assert for !have_mmap_lock() after returning from
the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.
Performance numbers before/after:
Host: AMD Opteron(tm) Processor 6376
ubuntu 17.04 ppc64 bootup+shutdown time
700 +-+--+----+------+------------+-----------+------------*--+-+
| + + + + + *B |
| before ***B*** ** * |
|tb lock removal ###D### *** |
600 +-+ *** +-+
| ** # |
| *B* #D |
| *** * ## |
500 +-+ *** ### +-+
| * *** ### |
| *B* # ## |
| ** * #D# |
400 +-+ ** ## +-+
| ** ### |
| ** ## |
| ** # ## |
300 +-+ * B* #D# +-+
| B *** ### |
| * ** #### |
| * *** ### |
200 +-+ B *B #D# +-+
| #B* * ## # |
| #* ## |
| + D##D# + + + + |
100 +-+--+----+------+------------+-----------+------------+--+-+
1 8 16 Guest CPUs 48 64
png: https://imgur.com/HwmBHXe
debian jessie aarch64 bootup+shutdown time
90 +-+--+-----+-----+------------+------------+------------+--+-+
| + + + + + + |
| before ***B*** B |
80 +tb lock removal ###D### **D +-+
| **### |
| **## |
70 +-+ ** # +-+
| ** ## |
| ** # |
60 +-+ *B ## +-+
| ** ## |
| *** #D |
50 +-+ *** ## +-+
| * ** ### |
| **B* ### |
40 +-+ **** # ## +-+
| **** #D# |
| ***B** ### |
30 +-+ B***B** #### +-+
| B * * # ### |
| B ###D# |
20 +-+ D ##D## +-+
| D# |
| + + + + + + |
10 +-+--+-----+-----+------------+------------+------------+--+-+
1 8 16 Guest CPUs 48 64
png: https://imgur.com/iGpGFtv
The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This applies to both user-mode and !user-mode emulation.
Instead of relying on a global lock, protect the list of incoming
jumps with tb->jmp_lock. This lock also protects tb->cflags,
so update all tb->cflags readers outside tb->jmp_lock to use
atomic reads via tb_cflags().
In order to find the destination TB (and therefore its jmp_lock)
from the origin TB, we introduce tb->jmp_dest[].
I considered not using a linked list of jumps, which simplifies
code and makes the struct smaller. However, it unnecessarily increases
memory usage, which results in a performance decrease. See for
instance these numbers booting+shutting down debian-arm:
Time (s) Rel. err (%) Abs. err (s) Rel. slowdown (%)
------------------------------------------------------------------------------
before 20.88 0.74 0.154512 0.
after 20.81 0.38 0.079078 -0.33524904
GTree 21.02 0.28 0.058856 0.67049808
GHashTable + xxhash 21.63 1.08 0.233604 3.5919540
Using a hash table or a binary tree to keep track of the jumps
doesn't really pay off, not only due to the increased memory usage,
but also because most TBs have only 0 or 1 jumps to them. The maximum
number of jumps when booting debian-arm that I measured is 35, but
as we can see in the histogram below a TB with that many incoming jumps
is extremely rare; the average TB has 0.80 incoming jumps.
n_jumps: 379208; avg jumps/tb: 0.801099
dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁|[34.0,35.0]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The appended adds assertions to make sure we do not longjmp with page
locks held. Note that user-mode has nothing to check, since page_locks
are !user-mode only.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Groundwork for supporting parallel TCG generation.
Instead of using a global lock (tb_lock) to protect changes
to pages, use fine-grained, per-page locks in !user-mode.
User-mode stays with mmap_lock.
Sometimes changes need to happen atomically on more than one
page (e.g. when a TB that spans across two pages is
added/invalidated, or when a range of pages is invalidated).
We therefore introduce struct page_collection, which helps
us keep track of a set of pages that have been locked in
the appropriate locking order (i.e. by ascending page index).
This commit first introduces the structs and the function helpers,
to then convert the calling code to use per-page locking. Note
that tb_lock is not removed yet.
While at it, rename tb_alloc_page to tb_page_add, which pairs with
tb_page_remove.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This commit does several things, but to avoid churn I merged them all
into the same commit. To wit:
- Use uintptr_t instead of TranslationBlock * for the list of TBs in a page.
Just like we did in (c37e6d7e "tcg: Use uintptr_t type for
jmp_list_{next|first} fields of TB"), the rationale is the same: these
are tagged pointers, not pointers. So use a more appropriate type.
- Only check the least significant bit of the tagged pointers. Masking
with 3/~3 is unnecessary and confusing.
- Introduce the TB_FOR_EACH_TAGGED macro, and use it to define
PAGE_FOR_EACH_TB, which improves readability. Note that
TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.
- Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there
is a bug and we attempt to remove a TB that is not in the list, instead
of segfaulting (since the list is NULL-terminated) we will reach
g_assert_not_reached().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby making it per-TCGContext. Once we remove tb_lock, this will
avoid an atomic increment every time a TB is invalidated.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This paves the way for enabling scalable parallel generation of TCG code.
Instead of tracking TBs with a single binary search tree (BST), use a
BST for each TCG region, protecting it with a lock. This is as scalable
as it gets, since each TCG thread operates on a separate region.
The core of this change is the introduction of struct tcg_region_tree,
which contains a pointer to a GTree and an associated lock to serialize
accesses to it. We then allocate an array of tcg_region_tree's, adding
the appropriate padding to avoid false sharing based on
qemu_dcache_linesize.
Given a tc_ptr, we first find the corresponding region_tree. This
is done by special-casing the first and last regions first, since they
might be of size != region.size; otherwise we just divide the offset
by region.stride. I was worried about this division (several dozen
cycles of latency), but profiling shows that this is not a fast path.
Note that region.stride is not required to be a power of two; it
is only required to be a multiple of the host's page size.
Note that with this design we can also provide consistent snapshots
about all region trees at once; for instance, tcg_tb_foreach
acquires/releases all region_tree locks before/after iterating over them.
For this reason we now drop tb_lock in dump_exec_info().
As an alternative I considered implementing a concurrent BST, but this
can be tricky to get right, offers no consistent snapshots of the BST,
and performance and scalability-wise I don't think it could ever beat
having separate GTrees, given that our workload is insert-mostly (all
concurrent BST designs I've seen focus, understandably, on making
lookups fast, which comes at the expense of convoluted, non-wait-free
insertions/removals).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The meaning of "existing" is now changed to "matches in hash and
ht->cmp result". This is saner than just checking the pointer value.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qht_lookup now uses the default cmp function. qht_lookup_custom is defined
to retain the old behaviour, that is a cmp function is explicitly provided.
qht_insert will gain use of the default cmp in the next patch.
Note that we move qht_lookup_custom's @func to be the last argument,
which makes the new qht_lookup as simple as possible.
Instead of this (i.e. keeping @func 2nd):
0000000000010750 <qht_lookup>:
10750: 89 d1 mov %edx,%ecx
10752: 48 89 f2 mov %rsi,%rdx
10755: 48 8b 77 08 mov 0x8(%rdi),%rsi
10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom>
1075e: 66 90 xchg %ax,%ax
We get:
0000000000010740 <qht_lookup>:
10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx
10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom>
10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- Fix options that work only with -drive or -blockdev, but not with
both, because of QDict type confusion
- rbd: Add options 'auth-client-required' and 'key-secret'
- Remove deprecated -drive options serial/addr/cyls/heads/secs/trans
- rbd, iscsi: Remove deprecated 'filename' option
- Fix 'qemu-img map' crash with unaligned image size
- Improve QMP documentation for jobs
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbI8sTAAoJEH8JsnLIjy/WeIsP/Ak8ealnjx8GRUfWwi69DX5z
LPzGXjlFJ1DqVb1SCuc7mptlgQwvHSoeYk6i+2xzJy9WmaAJsT2Mxz+EkviGzlcU
Xna6t/a9dgMxvHQMKbN1j4Rm3tDJzm2x6ZnBLYlwFhecmkEz4i3TEsceFL7BrN4F
C21jm+yC2wKPyggKXRRLRP4a1UfuM7OH8OEE3aWCZwEz1+6ucZGs3xsmWJfXxlCK
adigVec3gNWgQK+gWFlYKXRJKalK50BnZ7+2LlQIcC1kLAx22Wr2t2zmlROa351w
JEBIQlLkEN8hpB44Xk3Wla3YmwumW8EANIdxkeAXDgYgn8KEpN1nwWyxGUsfoOqz
WhZ+pT0yerg/kSovKgTyxsF9YoeptJEUmmR0nZX4+xpnctbU6EHuspymMb1Xl95h
/7lirJbBVb8o9rzcCNOzTc0f9lot9pDglLjxcWq3RqvpJ2iQmxxY2UUidYl9BmRg
SpkkA9E613bZ9zuK+17FDbnGq/N9h77HSBSVbLgEoajFLXYUtkuc0egi/ZVLWo0W
wo/XvXTZqy7n9/M/j4yVS+Xv7WotNuyHajk6N73KdL0NodXE776CPxvMlUEHEHB9
Pizo/fuxmgatjDFf1WU3Qt5pZlU1Pao4Aakcq3z/Ag7ucMK03QQIMBcrJC24+DD2
pjT5HPIeS3vklkg9Jqdi
=LxgF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Fix options that work only with -drive or -blockdev, but not with
both, because of QDict type confusion
- rbd: Add options 'auth-client-required' and 'key-secret'
- Remove deprecated -drive options serial/addr/cyls/heads/secs/trans
- rbd, iscsi: Remove deprecated 'filename' option
- Fix 'qemu-img map' crash with unaligned image size
- Improve QMP documentation for jobs
# gpg: Signature made Fri 15 Jun 2018 15:20:03 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (26 commits)
block: Remove dead deprecation warning code
block: Remove deprecated -drive option serial
block: Remove deprecated -drive option addr
block: Remove deprecated -drive geometry options
rbd: New parameter key-secret
rbd: New parameter auth-client-required
block: Fix -blockdev / blockdev-add for empty objects and arrays
check-block-qdict: Cover flattening of empty lists and dictionaries
check-block-qdict: Rename qdict_flatten()'s variables for clarity
block-qdict: Simplify qdict_is_list() some
block-qdict: Clean up qdict_crumple() a bit
block-qdict: Tweak qdict_flatten_qdict(), qdict_flatten_qlist()
block-qdict: Simplify qdict_flatten_qdict()
block: Make remaining uses of qobject input visitor more robust
block: Factor out qobject_input_visitor_new_flat_confused()
block: Clean up a misuse of qobject_to() in .bdrv_co_create_opts()
block: Fix -drive for certain non-string scalars
block: Fix -blockdev for certain non-string scalars
qobject: Move block-specific qdict code to block-qdict.c
block: Add block-specific QDict header
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently we don't support board configurations that put an IOMMU
in the path of the CPU's memory transactions, and instead just
assert() if the memory region fonud in address_space_translate_for_iotlb()
is an IOMMUMemoryRegion.
Remove this limitation by having the function handle IOMMUs.
This is mostly straightforward, but we must make sure we have
a notifier registered for every IOMMU that a transaction has
passed through, so that we can flush the TLB appropriately
when any of the IOMMUs change their mappings.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
When initializing a notifier with iommu_notifier_init(), the caller
must pass the IOMMU index that it is interested in. When a change
happens, the IOMMU implementation must pass
memory_region_notify_iommu() the IOMMU index that has changed and
that notifiers must be called for.
IOMMUs which support only a single index don't need to change.
Callers which only really support working with IOMMUs with a single
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
memory_region_iommu_attrs_to_index().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
If an IOMMU supports mappings that care about the memory
transaction attributes, then it no longer has a unique
address -> output mapping, but more than one. We can
represent these using an IOMMU index, analogous to TCG's
mmu indexes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-2-peter.maydell@linaro.org
There's a common pattern in QEMU where a function needs to perform
a data load or store of an N byte integer in a particular endianness.
At the moment this is handled by doing a switch() on the size and
calling the appropriate ld*_p or st*_p function for each size.
Provide a new family of functions ldn_*_p() and stn_*_p() which
take the size as an argument and do the switch() themselves.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-2-peter.maydell@linaro.org
The API for cpu_transaction_failed() says that it takes the physical
address for the failed transaction. However we were actually passing
it the offset within the target MemoryRegion. We don't currently
have any target CPU implementations of this hook that require the
physical address; fix this bug so we don't get confused if we ever
do add one.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-3-peter.maydell@linaro.org
The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious
use; add a comment documenting it (reverse-engineered from what
the code that sets it is doing).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-2-peter.maydell@linaro.org
For the IoTKit MPC support, we need to wire together the
interrupt outputs of 17 MPCs; this exceeds the current
value of MAX_OR_LINES. Increase MAX_OR_LINES to 32 (which
should be enough for anyone).
The tricky part is retaining the migration compatibility for
existing OR gates; we add a subsection which is only used
for larger OR gates, and define it such that we can freely
increase MAX_OR_LINES in future (or even move to a dynamically
allocated levels[] array without an upper size limit) without
breaking compatibility.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-10-peter.maydell@linaro.org
Remove the now-unused armv7m_init() function. This was a legacy from
before we properly QOMified ARMv7M, and it has some flaws:
* it combines work that needs to be done by an SoC object (creating
and initializing the TYPE_ARMV7M object) with work that needs to
be done by the board model (setting the system up to load the ELF
file specified with -kernel)
* TYPE_ARMV7M creation failure is fatal, but an SoC object wants to
arrange to propagate the failure outward
* it uses allocate-and-create via qdev_create() whereas the current
preferred style for SoC objects is to do creation in-place
Board and SoC models can instead do the two jobs this function
was doing themselves, in the right places and with whatever their
preferred style/error handling is.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601144328.23817-3-peter.maydell@linaro.org
The migration code should be using the
RAMBLOCK_FOREACH_MIGRATABLE and qemu_ram_foreach_block_migratable
not the all-block versions; poison them so that we can't accidentally
use them.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180605162545.80778-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Since commit 83ee768d62, we now have two places that define the
QJSON type:
$ git grep 'typedef struct QJSON QJSON'
include/migration/vmstate.h:typedef struct QJSON QJSON;
migration/qjson.h:typedef struct QJSON QJSON;
This breaks docker-test-build@centos6:
In file included from /tmp/qemu-test/src/migration/savevm.c:59:
/tmp/qemu-test/src/migration/qjson.h:16: error: redefinition of typedef
'QJSON'
/tmp/qemu-test/src/include/migration/vmstate.h:30: note: previous
declaration of 'QJSON' was here
make: *** [migration/savevm.o] Error 1
This happens because CentOS 6 has an old GCC 4.4.7. Even if redefining
a typedef with the same type is permitted since GCC 4.6, unless -pedantic
is passed, we don't really need to do that on purpose. Let's have a
single definition in <qemu/typedefs.h> instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152844714981.11789.3657734445739553287.stgit@bahia.lan>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.
Tests need to be updated to set the serial number with -global instead
of using the -drive option.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
The -drive option addr was deprecated in QEMU 2.10. It's time to remove
it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
The -drive options cyls, heads, secs and trans were deprecated in
QEMU 2.10. It's time to remove them.
hd-geo-test tested both the old version with geometry options in -drive
and the new one with -device. Therefore the code using -drive doesn't
have to be replaced there, we just need to remove the -drive test cases.
This in turn allows some simplification of the code.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Configuration flows through the block subsystem in a rather peculiar
way. Configuration made with -drive enters it as QemuOpts.
Configuration made with -blockdev / blockdev-add enters it as QAPI
type BlockdevOptions. The block subsystem uses QDict, QemuOpts and
QAPI types internally. The precise flow is next to impossible to
explain (I tried for this commit message, but gave up after wasting
several hours). What I can explain is a flaw in the BlockDriver
interface that leads to this bug:
$ qemu-system-x86_64 -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234
qemu-system-x86_64: -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234: Internal error: parameter user invalid
QMP blockdev-add is broken the same way.
Here's what happens. The block layer passes configuration represented
as flat QDict (with dotted keys) to BlockDriver methods
.bdrv_file_open(). The QDict's members are typed according to the
QAPI schema.
nfs_file_open() converts it to QAPI type BlockdevOptionsNfs, with
qdict_crumple() and a qobject input visitor.
This visitor comes in two flavors. The plain flavor requires scalars
to be typed according to the QAPI schema. That's the case here. The
keyval flavor requires string scalars. That's not the case here.
nfs_file_open() uses the latter, and promptly falls apart for members
@user, @group, @tcp-syn-count, @readahead-size, @page-cache-size,
@debug.
Switching to the plain flavor would fix -blockdev, but break -drive,
because there the scalars arrive in nfs_file_open() as strings.
The proper fix would be to replace the QDict by QAPI type
BlockdevOptions in the BlockDriver interface. Sadly, that's beyond my
reach right now.
Next best would be to fix the block layer to always pass correctly
typed QDicts to the BlockDriver methods. Also beyond my reach.
What I can do is throw another hack onto the pile: have
nfs_file_open() convert all members to string, so use of the keyval
flavor actually works, by replacing qdict_crumple() by new function
qdict_crumple_for_keyval_qiv().
The pattern "pass result of qdict_crumple() to
qobject_input_visitor_new_keyval()" occurs several times more:
* qemu_rbd_open()
Same issue as nfs_file_open(), but since BlockdevOptionsRbd has only
string members, its only a latent bug. Fix it anyway.
* parallels_co_create_opts(), qcow_co_create_opts(),
qcow2_co_create_opts(), bdrv_qed_co_create_opts(),
sd_co_create_opts(), vhdx_co_create_opts(), vpc_co_create_opts()
These work, because they create the QDict with
qemu_opts_to_qdict_filtered(), which creates only string scalars.
The function sports a TODO comment asking for better typing; that's
going to be fun. Use qdict_crumple_for_keyval_qiv() to be safe.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are numerous QDict functions that have been introduced for and are
used only by the block layer. Move their declarations into an own
header file to reflect that.
While qdict_extract_subqdict() is in fact used outside of the block
layer (in util/qemu-config.c), it is still a function related very
closely to how the block layer works with nested QDicts, namely by
sometimes flattening them. Therefore, its declaration is put into this
header as well and util/qemu-config.c includes it with a comment stating
exactly which function it needs.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180509165530.29561-7-mreitz@redhat.com>
[Copyright note tweaked, superfluous includes dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In appears the input keymap for osx was forgotten in the commit that
converted the gtk frontend to keycodemapdb. Add it.
Fixes: 2ec78706 ("ui: convert GTK and SDL1 frontends to keycodemapdb")
CC: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Message-id: 1528933916-40670-1-git-send-email-keno@juliacomputing.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Here's another batch of ppc patches towards the 3.0 release. There's
a fair bit here, because I've been working through my mail backlog
after a holiday. There's not much of a central theme, amongst other
things we have:
* ppc440 / sam460ex improvements
* logging and error cleanups
* 40p (PReP) bugfixes
* Macintosh fixes and cleanups
* Add emulation of the new POWER9 store-forwarding barrier
instruction variant
* Hotplug cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlsfa4kACgkQbDjKyiDZ
s5IdhxAA1ZDgEMyQkMPlnjYBNSpw9ICo/v+Lj1ZvlVjIbtjP6ZWdmaNHbBLi1N/U
jbY7Yb+a43xj96WigN2kxu16aL7AvgrGayscht+P+ZXGWaEAt0R1EhiqhheKjkOH
n0gAxe5c9wN18tJ7ZMtq7HmWw/MyWVjPQ4Pr38npyDU037LG4xY55DigcFu9D/6N
rSYtB3gWIEvbGhx2Sm+gzbkleKufUNhSJMJI9diRS2KvyocH8ENQUqcMqK//Y6w7
/Lk/j+qcAOAfCS+QWH9/XQvYdOGR9Gaw4oWLViOCD+EpTJHAYFMzVsp5ZLZZByw/
S51W3e9mfxzgtIjoDxaetrW95yx2bAcFekP+H8KQesWpydSprsnAlh3SE3dfO2Jp
Z1yVPHZv2fYZQjNZ0t62Z248iJ4YiBDttDpQYzsiGTiGVfx1IKTNz+gfNwV78CCJ
SkyIktky2TYAemAB+U90JL3gYSn0GHwgBFLTJ4Rrrf3kC0wNslgosGTEB9v3D+Mb
zsWxaBftFwZsRnNggBtE2/35zccJ2YURASlDu9ccmczyCUrRZbKE57/GftRXdIcY
Oqjf48/NSwBm98bIEQGTOw7H1NgWuTazAc2+VS4mAHtUEEYIunuRmC+eTKNM01Yu
qSMCg6tPADEXwRg6z062sb6T0vvpicv+X1U++tbHNqV3gprOlts=
=g3bP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.0-20180612' into staging
ppc patch queue 2018-06-12
Here's another batch of ppc patches towards the 3.0 release. There's
a fair bit here, because I've been working through my mail backlog
after a holiday. There's not much of a central theme, amongst other
things we have:
* ppc440 / sam460ex improvements
* logging and error cleanups
* 40p (PReP) bugfixes
* Macintosh fixes and cleanups
* Add emulation of the new POWER9 store-forwarding barrier
instruction variant
* Hotplug cleanups
# gpg: Signature made Tue 12 Jun 2018 07:43:21 BST
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-3.0-20180612: (33 commits)
spapr_pci: Remove unhelpful pagesize warning
xics_kvm: use KVM helpers
ppc/pnv: fix LPC HC firmware address space
spapr: handle cpu core unplug via hotplug handler chain
spapr: handle pc-dimm unplug via hotplug handler chain
spapr: introduce machine unplug handler
spapr: move memory hotplug support check into spapr_memory_pre_plug()
spapr: move lookup of the node into spapr_memory_plug()
spapr: no need to verify the node
target/ppc: Allow PIR read in privileged mode
ppc4xx_i2c: Clean up and improve error logging
target/ppc: extend eieio for POWER9
mos6522: convert VMSTATE_TIMER_PTR_TEST to VMSTATE_TIMER_PTR
mos6522: move timer frequency initialisation to mos6522_reset
cuda: embed mos6522_cuda device directly rather than using QOM object link
mos6522: fix vmstate_mos6522_timer version in vmstate_mos6522
ppc: add missing FW_CFG_PPC_NVRAM_FLAT definition
ppc: remove obsolete macio_init() definition from mac.h
ppc: remove obsolete pci_pmac_init() definitions from mac.h
hw/misc/mos6522: Add trailing '\n' to qemu_log() calls
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A link property can be set during creation, with
object_property_add_link() and later with object_property_set_link().
add_link() doesn't add a reference to the target object, while
set_link() does.
Furthemore, OBJ_PROP_LINK_UNREF_ON_RELEASE flags, set during add_link,
says whether a reference must be released when the property is destroyed.
This can lead to leaks if the property was later set_link(), as the
added reference is never released.
Instead, rename OBJ_PROP_LINK_UNREF_ON_RELEASE to OBJ_PROP_LINK_STRONG
and use that has an indication on how the link handle reference
management in set_link().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180531195119.22021-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
A specific MemoryRegion is required for the LPC HC Firmware address
space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 6522 VIA timer frequency cannot be set by altering registers within the
device itself and hence it is a fixed property of the machine.
Move the initialisation of the timer frequency to the mos6522 reset function
and ensure that any subclasses always call the parent reset function so that
it isn't required to store the timer frequency within vmstate_mos6522_timer
itself.
By moving the frequency initialisation to the device reset function then we
find that the realize function for both mos6522 and mos6522_cuda becomes
obsolete and can simply be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Examining the migration stream it can be seen that the mos6522 device state is
being stored separately rather than as part of the CUDA device which is
incorrect (and likely to cause issues if another mos6522 device is added to
the machine).
Resolve this by embedding the mos6522_cuda device directly within the CUDA
device rather than using a QOM object link to reference the device separately.
Note that we also bump the version in vmstate_cuda to reflect this change: this
isn't particularly important for the moment as the Mac machine migration isn't
100% reliable due to issues migrating the timebase under TCG.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is used in OpenBIOS to define the memory layout of the NVRAM device. Whilst
currently left at its default value, add the missing definition to ensure it is
reserved.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows KVM with the Book3S radix MMU mode to take advantage of
THP and install larger pages in the partition scope page tables (the
host translation).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>From observation of various OS sources it can be seen that the token register
introduced in 4e46dcdbd3 "PPC: Newworld: Add uninorth token register" is not
required, since the only register currently implemented is the uninorth hardware
version which is read-only.
Remove the token register implementation and instead return the uninorth
version corresponding to the hardware.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Replace the "nvdimm-cap" option which took numeric arguments such as "2"
with a more user friendly "nvdimm-persistence" option which takes symbolic
arguments "cpu" or "mem-ctrl".
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20180606182449.1607-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
All this function is doing will be repeated by
bdrv_do_release_matching_dirty_bitmap_locked, except
resetting bm->persistent. But even that does not matter
because the bitmap will be freed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180323164254.26487-1-pbonzini@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This is a useful function for the whole block layer, so make it public.
At the same time, users outside of block.c probably do not need to make
use of the reopen functionality, so rename the current function to
bdrv_is_writable_after_reopen() create a new bdrv_is_writable() function
that just passes NULL to it for the reopen queue.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180606193702.7113-2-mreitz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
This is basically what everything else in the qemu code base does, so we
can do it here, too.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509194302.21585-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
For qemu-io, a function returns an integer with two possible values: 0
for "qemu-io may continue execution", or 1 for "qemu-io should exit".
However, there is only a single command that returns 1, and that is
"quit".
So let's turn this case into a global variable instead so we can make
better use of the return value in a later patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509194302.21585-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Looking at the qcow2 code that is riddled with error_report() calls,
this is really how it should have been from the start.
Along the way, turn the target_version/current_version comparisons at
the beginning of qcow2_downgrade() into assertions (the caller has to
make sure these conditions are met), and rephrase the error message on
using compat=1.1 to get refcount widths other than 16 bits.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180509210023.20283-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
For the case where the end_transfer_func is also the caller of
ide_transfer_start, the mutual recursion can lead to unlimited
stack usage. Introduce a new version that can be used to change
tail recursion into a loop, and use it in trace_ide_atapi_cmd_reply_end.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180606190955.20845-8-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Now that end_transfer_func is a tail call in ahci_start_transfer,
formalize the fact that the callback (of which ahci_start_transfer is
the sole implementation) takes care of the transfer too: rename it to
pio_transfer and, if it is present, call the end_transfer_func as soon
as it returns.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180606190955.20845-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20180607180641.874-6-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As of this commit, the Spec v1 is not working, and all controllers
expect the cards to be conformant to Spec v2.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20180607180641.874-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The maximum frame size includes the CRC and depends if a VLAN tag is
inserted or not. Adjust the frame size limit in the transmit handler
using on the FTGMAC100State buffer size and in the receive handler use
the packet protocol.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180530061711.23673-2-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Specs are available here :
https://www.nxp.com/docs/en/application-note/AN264.pdf
This is a simple model supporting the basic registers for led and GPIO
mode. The device also supports two blinking rates but not the model
yet.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180530064049.27976-7-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is an helper routine to add a single EEPROM on an I2C bus. It can
be directly used by smbus_eeprom_init() which adds a certain number of
EEPROMs on mips and x86 machines.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180530064049.27976-5-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
While we skip the GIC_INTERNAL irqs, we don't change the register offset
accordingly. This will overlap the GICR registers value and leave the
last GIC_INTERNAL irq's registers out of update.
Fix this by skipping the registers banked by GICR.
Also for migration compatibility if the migration source (old version
qemu) doesn't send gicd_no_migration_shift_bug = 1 to destination, then
we shift the data of PPI to get the right data for SPI.
Fixes: 367b9f527b
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Message-id: 1527816987-16108-1-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This macro isn't used by any VFIO code. And its name is
too generic. The vfio-common.h (in include/hw/vfio) can
be included by other modules in QEMU. It can introduce
conflicts.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* Copy offloading for qemu-img convert (iSCSI, raw, and qcow2)
If the underlying storage supports copy offloading, qemu-img convert will
use it instead of performing reads and writes. This avoids data transfers
and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY
and Linux copy_file_range(2) are used to implement this optimization.
* Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbFSBoAAoJEJykq7OBq3PISpEIAIcMao4/rzinAWXzS+ncK9LO
6FtRVpgutpHaWX2ayySaz5n2CdR3cNMrpCI7sjY2Kw0lrdkqxPgl5n0SWD+VCl4W
7+JLz/uF0iUV8X+99e7WGAjZbm9LSlxgn5AQKfrrwyPf0ZfzoYQ5nBMcQ6xjEeQP
48j2WqJqN9/u8RBD07o11yn0+CE5g56/f12xVjR5ASVodzsAmcZ2OQRMQbM01isU
1mBekJQkDxJkt5l13Rql8+t+vWz8/9BEW2c/eIDKvoayMqYJpdfKv4DqLloIuHnc
3RkquA0zUuKtl7xEnEkH/We7fi4QPGW/vyBN7ychS/zKzZFQrXmwqrAuFSw3dKU=
=vZp+
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
* Copy offloading for qemu-img convert (iSCSI, raw, and qcow2)
If the underlying storage supports copy offloading, qemu-img convert will
use it instead of performing reads and writes. This avoids data transfers
and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY
and Linux copy_file_range(2) are used to implement this optimization.
* Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning
# gpg: Signature made Mon 04 Jun 2018 12:20:08 BST
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
main-loop: drop spin_counter
qemu-img: Convert with copy offloading
block-backend: Add blk_co_copy_range
iscsi: Implement copy offloading
iscsi: Create and use iscsi_co_wait_for_task
iscsi: Query and save device designator when opening
file-posix: Implement bdrv_co_copy_range
qcow2: Implement copy offloading
raw: Implement copy offloading
raw: Check byte range uniformly
block: Introduce API for copy offloading
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment is also
controlled by MMIO in the presenter sub-engine.
These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.
But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.
To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.
This change impacts the boston, malta and jazz mips boards for which
migration compatibility is broken.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
QEMU 3.0 enables strict check for compression & decompression to
make the migration more robust, that depends on the source to fix
the internal design which triggers the unexpected error conditions
To make it work for migrating old version QEMU to 2.13 QEMU, we
introduce this parameter to disable the error check on the
destination which is the default behavior of the machine type
which is older than 2.13, alternately, the strict check can be
enabled explicitly as followings:
-M pc-q35-2.11 -global migration.decompress-error-check=true
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Do the cast to uintptr_t within the helper, so that the compiler
can type check the pointer argument. We can also do some more
sanity checking of the index argument.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Read only feature shouldn't be negotiable, because if the
backend device reported Read only feature supported, QEMU
host driver shouldn't change backend's RO attribute. While
here, also enable the vhost-user-blk test utility to test
RO feature.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a machine command line option to allow the user to control the Platform
Capabilities Structure in the virtualized NFIT. This Platform Capabilities
Structure was added in ACPI 6.2 Errata A.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the vmstate.h file, we just need a struct name. Use a forward
declaration instead of an include, then adjust the one affected .c file
to include the file that is no longer implicit from the header.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
It's a BlockBackend wrapper of the BDS interface.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601092648.24614-10-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Issue EXTENDED COPY (LID1) command to implement the copy_range API.
The parameter data construction code is modified from libiscsi's
iscsi-dd.c.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601092648.24614-9-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With copy_file_range(2), we can implement the bdrv_co_copy_range
semantics.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20180601092648.24614-6-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce the bdrv_co_copy_range() API for copy offloading. Block
drivers implementing this API support efficient copy operations that
avoid reading each block from the source device and writing it to the
destination devices. Examples of copy offload primitives are SCSI
EXTENDED COPY and Linux copy_file_range(2).
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601092648.24614-2-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since no devices use it, we can safely remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180419212727.26095-5-f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Removal of DeviceClass::init() moved from previous patch, missing
documentation updates supplied]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180528144509.15812-5-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I2CSlaveClass::init is no more used, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180419212727.26095-3-f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180528144509.15812-3-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SMBusDeviceClass::init is no more used, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180419212727.26095-2-f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180528144509.15812-2-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update our copy of the Linux headers to upstream 4.17-rc6
(kernel commit 771c577c23bac90597c68).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180525132755.21839-6-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 5643cc94ac ("virtio-gpu-3d: add support for second capability
set (v4)") updated virtio_gpu.h with a define that does not yet(?)
exist upstream resulting in build breakage every time Linux headers
are updated via the standard update script. Conditionally define this
within QEMU code instead to avoid future breakage.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: 5643cc94ac ("virtio-gpu-3d: add support for second capability set (v4)")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180525132755.21839-2-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VMS_STRUCT has no way to specify which version of a structure
to use. Add a type and a new field to allow the specific version
of a structure to be used.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Message-Id: <1524670052-28373-2-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function has been deprecated for 2.5 years, and there are just a handful
of users. Convert them to memory_region_init_io with NULL callbacks,
and while at it pass the right device as the owner.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515134835.3409-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-22-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-17-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If CONFIG_SECCOMP is undefined, the option 'elevatedprivileges' remains
compiled. This would make libvirt set the corresponding capability and
then trigger failure during guest startup. This patch moves the code
regarding seccomp command line options to qemu-seccomp.c file and
wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
Because parse_sandbox() is moved into qemu-seccomp.c file, change
seccomp_start() to static function.
Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Xen 4.11 has a new API to directly map guest resources. Among the resources
that can be mapped using this API are ioreq pages.
This patch modifies QEMU to attempt to use the new API should it exist,
falling back to the previous mechanism if it is unavailable.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
When global_log_dirty is enabled VRAM modification tracking never
worked correctly. The address that is passed to xen_hvm_modified_memory()
is not the effective PFN but RAM block address which is not the same
for VRAM.
We need to make a translation for this address into PFN using
physmap. Since there is no way to access physmap properly inside
xen_hvm_modified_memory() let's make it a global structure.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
No declaration of "hw/vfio/vfio-common.h" directly requires to include
the "exec/address-spaces.h" header. To simplify dependencies and
ease the upcoming cleanup of "exec/address-spaces.h", directly include
it in the source file where the declaration are used.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-2-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If CONFIG_SECCOMP is undefined, the option 'elevateprivileges' remains
compiled. This would make libvirt set the corresponding capability and
then trigger failure during guest startup. This patch moves the code
regarding seccomp command line options to qemu-seccomp.c file and
wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
Because parse_sandbox() is moved into qemu-seccomp.c file, change
seccomp_start() to static function.
Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Message-Id: <20180531032937.1925-1-zyimin@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Provide a VMSTATE_BOOL_SUB_ARRAY to go with VMSTATE_UINT8_SUB_ARRAY
and friends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-23-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_get_iotlb_entry().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-12-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_translate(); all its
callers now have attrs available.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-11-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().
We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to memory_region_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
The callsite in flatview_access_valid() is part of a recursive
loop flatview_access_valid() -> memory_region_access_valid() ->
subpage_accepts() -> flatview_access_valid(); we make it pass
MEMTXATTRS_UNSPECIFIED for now, until the next several commits
have plumbed an attrs parameter through the rest of the loop
and we can add an attrs parameter to flatview_access_valid().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-8-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-6-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_map().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-5-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate()
and address_space_translate_cached(). Callers either have an
attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to tb_invalidate_phys_addr().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-3-peter.maydell@linaro.org
Add more detail to the documentation for memory_region_init_iommu()
and other IOMMU-related functions and data structures.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180521140402.23318-2-peter.maydell@linaro.org
Depending on the host abi, float16, aka uint16_t, values are
passed and returned either zero-extended in the host register
or with garbage at the top of the host register.
The tcg code generator has so far been assuming garbage, as that
matches the x86 abi, but this is incorrect for other host abis.
Further, target/arm has so far been assuming zero-extended results,
so that it may store the 16-bit value into a 32-bit slot with the
high 16-bits already clear.
Rectify both problems by mapping "f16" in the helper definition
to uint32_t instead of (a typedef for) uint16_t. This forces
the host compiler to assume garbage in the upper 16 bits on input
and to zero-extend the result on output.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180522175629.24932-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* New command-line option: --preconfig
This option allows pausing QEMU and allow the configuration
using QMP commands before running board initialization code.
* New QMP set-numa-node, now made possible because of --preconfig
* Small update on -numa error messages
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJbDy2jAAoJECgHk2+YTcWm8OgP/As0judZ7JwT5F5fR4nxtQPr
EODeEI+e/IahGH58Bafx24Fvxy0HZIgwqBDnPgclA8d3ebwmDQIejgk5/00zq2xv
cktRpzXGSBGJCVNctVz8xqN91m0lPtVgeRvXUJPn3hthXRSLO4p0vbyOW8g/C+O2
+dEcGqifAQatyxs9gYmmWoUia8zle/v1bd2v/x/DwiFW9cd47yDr3+ChHh6+nx0W
uh2zymD+ykVeNJ9WSxc3k8zTzQnuxbDK1fNbZsztk9KQDMWG3+u7KRNoJ86/7hKL
RMUfRCUMBgemuZsrFF8wSsha27e9VgCw4oR8dQ5AnTgMjK6nch/839XuFWh7j5qu
ntS4vt1v6IczdflKXX7IUILAgvVffLh2SCGWlKq9Q1eKo+CZTCea+iSbb2QobAE+
9TyuqXVKCsGKLoqkbK39d1UjZFhDQJhMsyuOHreKEINmJRmLI5qrNbEaaysb+mMB
DnFE1NiWjE+6X4w9U1l8lnKNUakOPNDG8cbgtuiEKDC9e3OZeiS8yyPLaQajELSR
Ig7N2GxrLMSfS3LNMSwkACILzVeu03aFoHyCPrpUqD8mpxidN0mmFVup8dawtkJ1
A5dHrKDmzlOT6DivJOgT8/Os/wMo3ErPcvyu69rTUO9PKSbwcFPwi6bMMDzgb4xO
kAles9X7HrHxSeJiggRW
=RHIj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/numa-next-pull-request' into staging
NUMA queue, 2018-05-30
* New command-line option: --preconfig
This option allows pausing QEMU and allow the configuration
using QMP commands before running board initialization code.
* New QMP set-numa-node, now made possible because of --preconfig
* Small update on -numa error messages
# gpg: Signature made Thu 31 May 2018 00:02:59 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/numa-next-pull-request:
tests: functional tests for QMP command set-numa-node
qmp: add set-numa-node command
qmp: permit query-hotpluggable-cpus in preconfig state
tests: extend qmp test with preconfig checks
cli: add --preconfig option
tests: qapi-schema tests for allow-preconfig
qapi: introduce new cmd option "allow-preconfig"
hmp: disable monitor in preconfig state
qapi: introduce preconfig runstate
numa: split out NumaOptions parsing into set_numa_options()
numa: postpone options post-processing till machine_run_board_init()
numa: clarify error message when node index is out of range in -numa dist, ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This option allows pausing QEMU in the new RUN_STATE_PRECONFIG state,
allowing the configuration of QEMU from QMP before the machine jumps
into board initialization code of machine_run_board_init()
The intent is to allow management to query machine state and additionally
configure it using previous query results within one QEMU instance
(i.e. eliminate the need to start QEMU twice, 1st to query board specific
parameters and 2nd for actual VM start using query results for
additional parameters).
The new option complements -S option and could be used with or without
it. The difference is that -S pauses QEMU when the machine is completely
initialized with all devices wired up and ready to execute guest code
(QEMU needs only to unpause VCPUs to let guest execute its code),
while the "preconfig" option pauses QEMU early before board specific init
callback (machine_run_board_init) is executed and allows the configuration
of machine parameters which will be used by board init code.
When early introspection/configuration is done, command 'exit-preconfig'
should be used to exit RUN_STATE_PRECONFIG and transition to the next
requested state (i.e. if -S is used then QEMU will pause the second
time when board/device initialization is completed or start guest
execution if -S isn't provided on CLI)
PS:
Initially 'preconfig' is planned to be used for configuring numa
topology depending on board specified possible cpus layout.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1526059483-42847-1-git-send-email-imammedo@redhat.com>
[ehabkost: Changed "since 2.13" to "since 3.0"]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New option will be used to allow commands, which are prepared/need
to run, during preconfig state. Other commands that should be able
to run in preconfig state, should be amended to not expect machine
in initialized state or deal with it.
For compatibility reasons, commands that don't use new flag
'allow-preconfig' explicitly are not permitted to run in
preconfig state but allowed in all other states like they used
to be.
Within this patch allow following commands in preconfig state:
qmp_capabilities
query-qmp-schema
query-commands
query-command-line-options
query-status
exit-preconfig
to allow qmp connection, basic introspection and moving to the next
state.
PS:
set-numa-node and query-hotpluggable-cpus will be enabled later in
a separate patches.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1526057503-39287-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[ehabkost: Changed "since 2.13" to "since 3.0"]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will allow to reuse set_numa_options() for parsing
configuration commands received via QMP interface
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1525423069-61903-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
in preparation for numa options to being handled via QMP before
machine_run_board_init(), move final numa configuration checks
and processing to machine_run_board_init() so it could take into
account both CLI (via parse_numa_opts()) and QMP input
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1525423069-61903-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
So far we relied on job->ret and strerror() to produce an error message
for failed jobs. Not surprisingly, this tends to result in completely
useless messages.
This adds a Job.error field that can contain an error string for a
failing job, and a parameter to job_completed() that sets the field. As
a default, if NULL is passed, we continue to use strerror(job->ret).
All existing callers are changed to pass NULL. They can be improved in
separate patches.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
gdb_handlesig()'s behaviour is not entirely obvious at first
glance. Add a doc comment for it, and also add a comment
explaining why it's ok for gdb_do_syscallv() to ignore
gdb_handlesig()'s return value. (Coverity complains about
this: CID 1390850.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515181958.25837-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
In thunk_type_align() and thunk_type_size() we currently return
-1 if the value at the type_ptr isn't one of the TYPE_* values
we understand. However, this should never happen, and if it does
then the calling code will go confusingly wrong because none
of the callsites try to handle an error return. Switch to an
assertion instead, so that if this does somehow happen we'll have
a nice clear backtrace of what happened rather than a weird crash
or misbehaviour.
This also silences various Coverity complaints about not handling
the negative return value (CID 1005735, 1005736, 1005738, 1390582).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180514174616.19601-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Man page for WCOREDUMP says:
WCOREDUMP(wstatus) returns true if the child produced a core dump.
This macro should be employed only if WIFSIGNALED returned true.
This macro is not specified in POSIX.1-2001 and is not
available on some UNIX implementations (e.g., AIX, SunOS). Therefore,
enclose its use inside #ifdef WCOREDUMP ... #endif.
Let's do exactly this.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
With this feature negotiated, vhost-user backend can register
memory region based host notifiers. And it will allow the guest
driver in the VM to notify the hardware accelerator at the
vhost-user backend directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When multi queue is enabled e.g. for a virtio-net device,
each queue pair will have a vhost_dev, and the only thing
shared between vhost devs currently is the chardev. This
patch introduces a vhost-user state structure which will
be shared by all vhost devs of the same virtio device.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a vhost op for vhost backends to allow
them to filter the memory sections that they can handle.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
iommu rework fixing a couple of security problems (no CVEs yet), fixes
all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbBX2cAAoJECgfDbjSjVRpOEYIAIR6KGkwbAJ9SnO9B71DQHl1
yYYgM7i2HwyZ1YPnXOYWnI1lzQ1bARTf2krQJFGmfjlDaueFf9KnXdNByoVCmG8m
UhF/rQp3DcJ4wTABktPtME8gWdQxKPmDxlN5W3f29Zrm3g9S+Hshi+sfPZUkBxL4
gQMFRctb2SxvQXG+lusHVwo1oF6pzGZMmX35906he3m4xS/cfoeCP7Qj6nSvHZq7
lsLoOeYxHtXWA9gTYxpd7zW+hhUxkspoOqcXySHfO7e5enJANaulTxKuC0T+6HL4
O2iUM+1wjUYE0tQcNJ6x7emA82k5OdG2OMD6gbR1oSdquttJo7+4R+goqpb44rc=
=NUoY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio, vhost: fixes, features
Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
iommu rework fixing a couple of security problems (no CVEs yet), fixes
all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 23 May 2018 15:41:32 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (28 commits)
intel-iommu: rework the page walk logic
util: implement simple iova tree
intel-iommu: trace domain id during page walk
intel-iommu: pass in address space when page walk
intel-iommu: introduce vtd_page_walk_info
intel-iommu: only do page walk for MAP notifiers
intel-iommu: add iommu lock
intel-iommu: remove IntelIOMMUNotifierNode
intel-iommu: send PSI always even if across PDEs
nvdimm: fix typo in label-size definition
contrib/vhost-user-blk: enable protocol feature for vhost-user-blk
hw/virtio: Fix brace Werror with clang 6.0.0
libvhost-user: Send messages with no data
vhost-user+postcopy: Use qemu_set_nonblock
virtio: support setting memory region based host notifier
vhost-user: support receiving file descriptors in slave_read
vhost-user: add Net prefix to internal state structure
linux-headers: add kvm header for mips
linux-headers: add unistd.h on all arches
update-linux-headers.sh: unistd.h, kvm consistency
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbBGT2AAoJEIlPj0hw4a6Q0BIQAIciEH80ZLPSQPJciRoBCB/s
1BUsbFelfeHcTXAOxiahmaQbzzffosifK0uJAH5Quy5RotnKknguRgfgAl8nWX34
vXBbYKOmdudNwk3wIhcnnLfiRvNcdF4yDVVArsnvEbCWUBV2mpTrhuqmepZMiY7s
GQU42ly09WQePBPFW19RS/kxsaR1lSuGIuSxtc+0QvDQpqMRK3HslHrjD0jgUISh
pitqL3ztPHpJN1d9KI9fNT066magjnAkWV1gSYH9MAFvWV6JVuXqyJPY/x+YcZEU
uPgZdEmONIoFqLD7yEHf/kXtCoRZavnvX3t0OPvUOPrN8eeui/UrtOyElY0NxmS/
RGKggOAjN4VYdAfXHqLIWHkkE8CicFkukr6hk2nyfDKT63ZTR6TfEESrweL3clej
J2dQ1bPvmAhlPU7tFeo8drAFgeIjLl9QAXEtp2+1FXGYBOT4avwtnt4rrGpnSbpL
wV6SObd4d8yJR1JrA42jXuL2qlubainlceg4dSFGf84iUx68Sq+DXM+ssbVxs4Tc
VEJ2nIGK2OF0jafFUV4sqZ7iDQoor0A/SGiOHk8j3kr/kWbidid6opl+MRcr/II3
CYHyR8IH3LLY/CY18ppVbuCNxovPTilwRsleNy6ZxM1uFeWIOGWUy0WutPTi8wq8
Ksa1MVPftDaMOLL2WE5C
=e+YI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/sstabellini-http/tags/xen-20180522-tag' into staging
Xen 2018/05/22
# gpg: Signature made Tue 22 May 2018 19:44:06 BST
# gpg: using RSA key 894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg: aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini-http/tags/xen-20180522-tag:
xen_disk: be consistent with use of xendev and blkdev->xendev
xen_disk: use a single entry iovec
xen_backend: make the xen_feature_grant_copy flag private
xen_disk: remove use of grant map/unmap
xen_backend: add an emulation of grant copy
xen: remove other open-coded use of libxengnttab
xen_disk: remove open-coded use of libxengnttab
xen_backend: add grant table helpers
xen: add a meaningful declaration of grant_copy_segment into xen_common.h
checkpatch: generalize xen handle matching in the list of types
xen-hvm: create separate function for ioreq server initialization
xen_pt: Present the size of 64 bit BARs correctly
configure: Add explanation for --enable-xen-pci-passthrough
xen/pt: use address_space_memory object for memory region hooks
xen-pvdevice: Introduce a simplistic xen-pvdevice save state
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Create a new header file, move the bochs vbe dispi interface
defines to it, so they can be used outside vga code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180522165058.15404-2-kraxel@redhat.com
This patch fixes a potential small window that the DMA page table might
be incomplete or invalid when the guest sends domain/context
invalidations to a device. This can cause random DMA errors for
assigned devices.
This is a major change to the VT-d shadow page walking logic. It
includes but is not limited to:
- For each VTDAddressSpace, now we maintain what IOVA ranges we have
mapped and what we have not. With that information, now we only send
MAP or UNMAP when necessary. Say, we don't send MAP notifies if we
know we have already mapped the range, meanwhile we don't send UNMAP
notifies if we know we never mapped the range at all.
- Introduce vtd_sync_shadow_page_table[_range] APIs so that we can call
in any places to resync the shadow page table for a device.
- When we receive domain/context invalidation, we should not really run
the replay logic, instead we use the new sync shadow page table API to
resync the whole shadow page table without unmapping the whole
region. After this change, we'll only do the page walk once for each
domain invalidations (before this, it can be multiple, depending on
number of notifiers per address space).
While at it, the page walking logic is also refactored to be simpler.
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Tested-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a simplest iova tree implementation based on GTree.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For UNMAP-only IOMMU notifiers, we don't need to walk the page tables.
Fasten that procedure by skipping the page table walk. That should
boost performance for UNMAP-only notifiers like vhost.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
SECURITY IMPLICATION: this patch fixes a potential race when multiple
threads access the IOMMU IOTLB cache.
Add a per-iommu big lock to protect IOMMU status. Currently the only
thing to be protected is the IOTLB/context cache, since that can be
accessed even without BQL, e.g., in IO dataplane.
Note that we don't need to protect device page tables since that's fully
controlled by the guest kernel. However there is still possibility that
malicious drivers will program the device to not obey the rule. In that
case QEMU can't really do anything useful, instead the guest itself will
be responsible for all uncertainties.
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
That is not really necessary. Removing that node struct and put the
list entry directly into VTDAddressSpace. It simplfies the code a lot.
Since at it, rename the old notifiers_list into vtd_as_with_notifiers.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit da6789c27c ("nvdimm: add a macro for property "label-size"")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds a minimal query-jobs implementation that shouldn't pose many
design questions. It can later be extended to expose more information,
and especially job-specific information.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockJob has fields .offset and .len, which are actually misnomers today
because they are no longer tied to block device sizes, but just progress
counters. As such they make a lot of sense in generic Jobs.
This patch moves the fields to Job and renames them to .progress_current
and .progress_total to describe their function better.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The transition to the READY state was still performed in the BlockJob
layer, in the same function that sent the BLOCK_JOB_READY QMP event.
This patch brings the state transition to the Job layer and implements
the QMP event using a notifier called from the Job layer, like we
already do for other events related to state transitions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of having a 'bool ready' in BlockJob, add a function that
derives its value from the job status.
At the same time, this fixes the behaviour to match what the QAPI
documentation promises for query-block-job: 'true if the job may be
completed'. When the ready flag was introduced in commit ef6dbf1e46,
the flag never had to be reset to match the description because after
being ready, the jobs would immediately complete and disappear.
Job transactions and manual job finalisation were introduced only later.
With these changes, jobs may stay around even after having completed
(and they are not ready to be completed a second time), however their
patches forgot to reset the ready flag.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the logic that implements job transactions from BlockJob to
Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This doesn't actually move any transaction code to Job yet, but it
renames the type for transactions from BlockJobTxn to JobTxn and makes
them contain Jobs rather than BlockJobs
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_finish_sync() doesn't contain anything block job specific any
more, so it can be moved to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the .complete callback that tells a READY job to complete
from BlockJobDriver to JobDriver. The wrapper function job_complete()
doesn't require anything block job specific any more and can be moved
to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_drain() contains a blk_drain() call which cannot be moved to
Job, so add a new JobDriver callback JobDriver.drain which has a common
implementation for all BlockJobs. In addition to this we keep the
existing BlockJobDriver.drain callback that is called by the common
drain implementation for all block jobs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_cancel_async() did two things that were still block job
specific:
* Setting job->force. This field makes sense on the Job level, so we can
just move it. While at it, rename it to job->force_cancel to make its
purpose more obvious.
* Resetting the I/O status. This can't be moved because generic Jobs
don't have an I/O status. What the function really implements is a
user resume, except without entering the coroutine. Consequently, it
makes sense to call the .user_resume driver callback here which
already resets the I/O status.
The old block_job_cancel_async() has two separate if statements that
check job->iostatus != BLOCK_DEVICE_IO_STATUS_OK and job->user_paused.
However, the former condition always implies the latter (as is
asserted in block_job_iostatus_reset()), so changing the explicit call
of block_job_iostatus_reset() on the former condition with the
.user_resume callback on the latter condition is equivalent and
doesn't need to access any BlockJob specific state.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the finalisation of a single job from BlockJob to Job.
Some part of this code depends on job transactions, and job transactions
call this code, we introduce some temporary calls from Job functions to
BlockJob ones. This will be fixed once transactions move to Job, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Go through the Job layer in order to send QMP events. For the moment,
these functions only call a notifier in the BlockJob layer that sends
the existing commands.
This uses notifiers rather than JobDriver callbacks because internal
users of jobs won't receive QMP events, but might still be interested
in getting notified for the events.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This renames the BlockJobCreateFlags constants, moves a few JOB_INTERNAL
checks to job_create() and the auto_{finalize,dismiss} fields from
BlockJob to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Since we introduced an explicit status to block job, BlockJob.completed
is redundant because it can be derived from the status. Remove the field
from BlockJob and add a function to derive it from the status at the Job
level.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
While we already moved the state related to job pausing to Job, the
functions to do were still BlockJob only. This commit moves them over to
Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
There is nothing block layer specific about block_job_sleep_ns(), so
move the function to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This commit moves some core functions for dealing with the job coroutine
from BlockJob to Job. This includes primarily entering the coroutine
(both for the first and reentering) and yielding explicitly and at pause
points.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Move the defer_to_main_loop functionality from BlockJob to Job.
The code can be simplified because we can use job->aio_context in
job_defer_to_main_loop_bh() now, instead of having to access the
BlockDriverState.
Probably taking the data->aio_context lock in addition was already
unnecessary in the old code because we didn't actually make use of
anything protected by the old AioContext except getting the new
AioContext, in case it changed between scheduling the BH and running it.
But it's certainly unnecessary now that the BDS isn't accessed at all
any more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
When block jobs need an AioContext, they just take it from their main
block node. Generic jobs don't have a main block node, so we need to
assign them an AioContext explicitly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
We cannot yet move the whole logic around job cancelling to Job because
it depends on quite a few other things that are still only in BlockJob,
but we can move the cancelled field at least.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This moves reference counting from BlockJob to Job.
In order to keep calling the BlockJob cleanup code when the job is
deleted via job_unref(), introduce a new JobDriver.free callback. Every
block job must use block_job_free() for this callback, this is asserted
in block_job_create().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This moves BlockJob.status and the closely related functions
(block_)job_state_transition() and (block_)job_apply_verb to Job. The
two QAPI enums are renamed to JobStatus and JobVerb.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This moves the job list from BlockJob to Job. Now we can check for
duplicate IDs in job_create().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This moves freeing the Job object and its fields from block_job_unref()
to job_delete().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This moves the job_type field from BlockJobDriver to JobDriver.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
QAPI types aren't externally visible, so we can rename them without
causing problems. Before we add a job type to Job, rename the enum
so it can be used for more than just block jobs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This is the first step towards creating an infrastructure for generic
background jobs that aren't tied to a block device. For now, Job only
stores its ID and JobDriver, the rest stays in BlockJob.
The following patches will move over more parts of BlockJob to Job if
they are meaningful outside the context of a block job.
BlockJob.driver is now redundant, but this patch leaves it around to
avoid unnecessary churn. The next patches will get rid of almost all of
its uses anyway so that it can be removed later with much less churn.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Switch to the header we imported from Linux,
this allows us to drop a hack in kvm_i386.h.
More code will be dropped in the next patch.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu should read and report hugetlb page allocation
counts exported in the following kernel patch:
commit 4c3ca37c4a4394978fd0f005625f6064ed2b9a64
Author: Jonathan Helman <jonathan.helman@oracle.com>
Date: Mon Mar 19 11:00:35 2018 -0700
virtio_balloon: export hugetlb page allocation counts
Export the number of successful and failed hugetlb page
allocations via the virtio balloon driver. These 2 counts
come directly from the vm_events HTLB_BUDDY_PGALLOC and
HTLB_BUDDY_PGALLOC_FAIL.
Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
There is no longer any use of this flag outside of the xen_backend code.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
This patch adds grant table helper functions to the xen_backend code to
localize error reporting and use of xen_domid.
The patch also defers the call to xengnttab_open() until just before the
initialise method in XenDevOps is invoked. This method is responsible for
mapping the shared ring. No prior method requires access to the grant table.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Currently the xen_disk source has to carry #ifdef exclusions to compile
against Xen older then 4.8. This is a bit messy so this patch lifts the
definition of struct xengnttab_grant_copy_segment and adds it into the
pre-4.8 compat area in xen_common.h, which allows xen_disk to be cleaned
up.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
It is long gone since e4e8ba04c2 ...
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is no need to include pci.h in these files.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The ZynqMP contains two instances of a generic DMA, the GDMA, located in the
FPD (full power domain), and the ADMA, located in LPD (low power domain). This
patch adds these two DMAs to the ZynqMP board.
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20180503214201.29082-3-frasse.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a model of the generic DMA found on Xilinx ZynqMP.
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20180503214201.29082-2-frasse.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Generate an XML description for the cp-regs.
Register these regs with the gdb_register_coprocessor().
Add arm_gdb_get_sysreg() to use it as a callback to read those regs.
Add a dummy arm_gdb_set_sysreg().
Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1524153386-3550-4-git-send-email-abdallah.bouassida@lauterbach.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we call addIOThread, the epollfd created in aio_context_setup,
but not close it in the process of delIOThread, so the epollfd will leak.
Reorder the code in aio_epoll_disable and reuse it.
Signed-off-by: Jie Wang <wangjie88@huawei.com>
Message-Id: <1526517763-11108-1-git-send-email-wangjie88@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[Mention change to aio_epoll_disable in commit message. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Only MIPS requires snan_bit_is_one to be variable. While we are
specializing softfloat behaviour, allow other targets to eliminate
this runtime check.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@mips.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions are now unused.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows us to delete a lot of additional boilerplate
code which is no longer needed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The new function assumes that the input is an SNaN and
does not double-check.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* KnightsMill CPU model
* CLDEMOTE(Demote Cache Line) cpu feature
* pc-i440fx-2.13 and pc-q35-2.13 machine-types
* Add model-specific cache information to EPYC CPU model
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJa+1bIAAoJECgHk2+YTcWmebgP/2337b3L2z5K6YRBOjaH/Kmw
xuBmns8B2ta4/KFX1IjX+7TRvl8qJT39TadLs6OGaPKDHi/OIK7vhXZZdgqm3o2H
WKpsPYp6MPYx/Ff8eOesHGUyxDTqP49rMYTTbP1GdgoulqjUq0XqgmdF4+uKesei
1v4NG7M6H2VxlAdkF1PeZUyuuEEZT6F1T2O43FDTQHOQRWmb1XAHiSgiqC4431J9
9SVe4E0v8i96zrMVeaezvQrMVCCGIo3zC3JZBfbX205Yehl/eCK5WktO03zIn9yS
IK7Gqwh4RfWmRg2wef+5qEQ1fl11XlH895/F0wMcZ8sRrSm04jNcBs9O4kc59Zz2
kyvaGkCG14aTC2Y35H0BsNd1AszKaWfknIJKNMnPkeXYLYAbPm1bwgBmXVXAa0Ux
mI6dk4ArwHALNl2srNOnJRvtmgohm+1HCZtlstSzHVWzNFS9nfGWK3kpgDqXft8w
TKe2r928uQtympCK7s43wIstT1KRwaVn7ivNpcotJgojCuUg0tK4PQvP82Z/ZXUh
pUxhW03al6Dz2Khh8vpz88rryCh72i5BQ9S7D8kAcFpyQDd6DfbLv6ebu/mWgjLR
EwMdJkLQjXGs1SNcpA085KVksGW5U1S2avjeakkp3ceSA/fBt4UdftTmwG6oORaS
pCxNIzozXx4QdDK81aaE
=uP+F
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2018-05-15
* KnightsMill CPU model
* CLDEMOTE(Demote Cache Line) cpu feature
* pc-i440fx-2.13 and pc-q35-2.13 machine-types
* Add model-specific cache information to EPYC CPU model
# gpg: Signature made Tue 15 May 2018 22:53:12 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-next-pull-request:
i386: Add new property to control cache info
pc: add 2.13 machine types
i386: Initialize cache information for EPYC family processors
i386: Add cache information in X86CPUDefinition
i386: Helpers to encode cache information consistently
x86/cpu: Enable CLDEMOTE(Demote Cache Line) cpu feature
i386: add KnightsMill cpu model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is hook function to be called when a postcopy migration wants to
resume from a failure. For each module, it should provide its own
recovery logic before we switch to the postcopy-active state.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-16-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
- Switch AIO/callback based block drivers to a byte-based interface
- Block jobs: Expose error string via query-block-jobs
- Block job cleanups and fixes
- hmp: Allow using a qdev id in block_set_io_throttle
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJa+v22AAoJEH8JsnLIjy/Wy5kP/21fPSvmyxMsSQ5lF5hlkodr
qNFIV6EuQ46YXSr4KzK2Dw88YR18nI5SlMd6mzWF9qx7WhjTMeHhARG9G497cQty
yDb3Y6dwiuhVndWLMzj/590miqk5TnJvFx5ii88oEnsrbcjKmTY78KMkl/q1bHSp
qL7sBhI3zPol+y28mvILPXgKsqnabvS/cmsQJCISUfSdFsnsxXVABUPI/WKe1ecs
UE3tl3cDA/0F8TYDerPUX1RZLJcr7yoXc91ieVWrzug4SStY3HZqWT8SShe2igN0
w5eWshUBhccHKeiqKx8vdXN7MzNP4v5H2lstTdqV/zDQEXO9vqF1X91zaRBJNb+o
A4M3lZU/U3xideBo7Hvp4euJ5f6ZKNswpeIC3Ppky788Q+HU/d+cQF+eZUxm+t8y
vVtixTToSt52dIaAPMsssWCtVwkS4IFO9RLXJeRs94XR3ocrsSdJNOQPTodWafjF
BXIF4wyKlPvVMzisvpj6jsjVR4Oq7J7P+EXq+hAMbio9WlsXssWu0ZedH8oV+mKl
UkEB5TgRAaz8MpYNCCEhUwdmHLwlc/PaPN+BAlvQCnDWMtfiFTys6Sh9N9vv/c16
HISDkR8sf7gPm3VLnt2fPoVOQMU3Gjtany+RySXfn1AzCRG3tPzdS/Ay+s1WpoYv
7F2Pm9QLepP7xRB48yD3
=JFBS
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Switch AIO/callback based block drivers to a byte-based interface
- Block jobs: Expose error string via query-block-jobs
- Block job cleanups and fixes
- hmp: Allow using a qdev id in block_set_io_throttle
# gpg: Signature made Tue 15 May 2018 16:33:10 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (37 commits)
iotests: Add test for -U/force-share conflicts
qemu-img: Use only string options in img_open_opts
qemu-io: Use purely string blockdev options
block: Document BDRV_REQ_WRITE_UNCHANGED support
qemu-img: Check post-truncation size
iotests: Add test for COR across nodes
iotests: Copy 197 for COR filter driver
iotests: Clean up wrap image in 197
block: Support BDRV_REQ_WRITE_UNCHANGED in filters
block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
block: Add BDRV_REQ_WRITE_UNCHANGED flag
block: BLK_PERM_WRITE includes ..._UNCHANGED
block: Add COR filter driver
iotests: Skip 181 and 201 without userfaultfd
iotests: Add failure matching to common.qemu
docs: Document the new default sizes of the qcow2 caches
qcow2: Give the refcount cache the minimum possible size by default
specs/qcow2: Clarify that compressed clusters have the COPIED bit reset
Fix error message about compressed clusters with OFLAG_COPIED
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The property legacy-cache will be used to control the cache information.
If user passes "-cpu legacy-cache" then older information will
be displayed even if the hardware supports new information. Otherwise
use the statically loaded cache definitions if available.
Renamed the previous cache structures to legacy_*. If there is any change in
the cache information, then it needs to be initialized in builtin_x86_defs.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Message-Id: <20180514164156.27034-3-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add BDRV_REQ_WRITE_UNCHANGED to the list of flags honored during pwrite
and pwrite_zeroes, and also add a note on when you absolutely need to
support it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502140359.18222-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
This flag signifies that a write request will not change the visible
disk content. With this flag set, it is sufficient to have the
BLK_PERM_WRITE_UNCHANGED permission instead of BLK_PERM_WRITE.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-4-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently we never actually check whether the WRITE_UNCHANGED
permission has been taken for unchanging writes. But the one check that
is commented out checks both WRITE and WRITE_UNCHANGED; and considering
that WRITE_UNCHANGED is already documented as being weaker than WRITE,
we should probably explicitly document WRITE to include WRITE_UNCHANGED.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-3-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The backup block job directly accesses the driver field in BlockJob. Add
a wrapper for getting it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This gets us rid of more direct accesses to BlockJob fields from the
job drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
All block job drivers support .set_speed and all of them duplicate the
same code to implement it. Move that code to blockjob.c and remove the
now useless callback.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Every block job has a RateLimit, and they all do the exact same thing
with it, so it should be common infrastructure. Move the struct field
for a start.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Block job drivers are not expected to mess with the internals of the
BlockJob object, so provide wrapper functions for one of the cases where
they still do it: Updating the progress counter.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
We have too many driver callback interfaces; simplify the mess
somewhat by merging the flags parameter of .bdrv_co_writev_flags()
into .bdrv_co_writev(). Note that as long as a driver doesn't set
.supported_write_flags, the flags argument will be 0 and behavior is
identical. Also note that the public function bdrv_co_writev() still
lacks a flags argument; so the driver signature is thus intentionally
slightly different. But that's not the end of the world, nor the first
time that the driver interface differs slightly from the public
interface.
Ideally, we should be rewriting all of these drivers to use modern
byte-based interfaces. But that's a more invasive patch to write
and audit, compared to the simplification done here.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Now that all drivers with aio callbacks are using the
byte-based interfaces, we can remove the sector-based versions.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Make the change for the last few sector-based callbacks
in the file-win32 driver.
Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, as I don't know if Windows is tolerant of
non-sector AIO operations, I went with the conservative approach
of modifying .bdrv_refresh_limits to override the block layer
defaults back to the pre-patch value of 512.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Add new sector-based aio callbacks for read and write,
to match the fact that bdrv_aio_pdiscard is already byte-based.
Ideally, drivers should be converted to use coroutine callbacks
rather than aio; but that is not quite as trivial (and if we were
to do that conversion, the null-aio driver would disappear), so for
the short term, converting the signature but keeping things with
aio is easier. However, we CAN declare that a driver that uses
the byte-based aio interfaces now defaults to byte-based
operations, and must explicitly provide a refresh_limits override
to stick with larger alignments (making the alignment issues more
obvious directly in the drivers touched in the next few patches).
Once all drivers are converted, the sector-based aio callbacks will
be removed; in the meantime, a FIXME comment is added due to a
slight inefficiency that will be touched up as part of that later
cleanup.
Simplify some instances of 'bs->drv' into 'drv' while touching this,
since the local variable already exists to reduce typing.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Usually the logging of the CPU state produced by -d cpu is sufficient
to diagnose problems, but sometimes you want to see the state of
the floating point registers as well. We don't want to enable that
by default as it adds a lot of extra data to the log; instead,
allow it to be optionally enabled via -d fpu.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
This fixes an issue by adding bounds checking to multi-byte packets
where the PS/2 mouse data stream may become corrupted due to data being
discarded when the PS/2 ringbuffer is full.
Interrupts for Multi-byte responses are postponed until the final byte
has been queued.
These changes fix a bug where windows guests drop the mouse device
entirely requring the guest to be restarted.
Signed-off-by: Geoffrey McRae <geoff@hostfission.com>
Message-Id: <20180507150310.2FEA0381924@moya.office.hostfission.com>
[ kraxel: codestyle fixes ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Calling pause_all_vcpus()/resume_all_vcpus() from a VCPU thread might
not be the best idea. As pause_all_vcpus() temporarily drops the qemu
mutex, two parallel calls to pause_all_vcpus() can be active at a time,
resulting in a deadlock. (either by two VCPUs or by the main thread and a
VCPU)
Let's handle it via the main loop instead, as suggested by Paolo. If we
would have two parallel reset requests by two different VCPUs at the
same time, the last one would win.
We use the existing ipl device to handle it. The nice side effect is
that we can get rid of reipl_requested.
This change implies that all reset handling now goes via the common
path, so "no-reboot" handling is now active for all kinds of reboots.
Let's execute any CPU initialization code on the target CPU using
run_on_cpu.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180424101859.10239-1-david@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Version: GnuPG v1
iQEcBAABAgAGBQJa+UAZAAoJEO8Ells5jWIRTJIIAIcpNROupxHEfcAQKU7lIqys
qx/FxKp+lknzzQMwUfmZwT3PuBD+tWuE7ugXgzjVulvE11F+Z3QBPTDBtOObaMa1
qpgIF3zzrNxtuWMc/72Q8/wEE1wtBUo+WTAGw9Xp1dVomYOOsg1wa7dsKdZhRfz7
nIwDW2ftw3/mx+uTW2/a163v+IDDL9L+HOLibQHWUxOMM39ASchiqAXLF4mfhpwH
xr0OPd7wtcmrDsD/CLbdkGCJ/+vsXnY8pzNmy1RjJuDpWpqlgYpJJPtLBfXBo9VA
91sz5+KryzjpXXzKcQuhiU020O1dIKIe3PWqK6z0x0UjtZ1Yox5adZ3eFomHHwU=
=bIws
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 14 May 2018 08:51:53 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net: Get rid of 'vlan' terminology and use 'hub' instead in the doc files
net: Get rid of 'vlan' terminology and use 'hub' instead in the source files
net: Remove the deprecated "vlan" parameter
net: Fix memory leak in net_param_nic()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* dtc configure fixes
* MemoryRegionCache second try
* Deprecated option removal
* add support for Hyper-V reenlightenment MSRs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJa9Y2qAAoJEL/70l94x66Df8EIAI4pi+zf1mTlH0Koi+oqOg+d
geBC6N9IA+n1p90XERnPbuiT19NjON2R1Z907SbzDkijxdNRoYUoQf7Z+ZBTENjn
dYsVvgLYzajGLWWtJetPPaNFAqeF2z8B3lbVQnGVLzH5pQQ2NS1NJsvXQA2LslLs
2ll1CJ2EEBhayoBSbHK+0cY85f+DUgK/T1imIV2T/rwcef9Rw218nvPfGhPBSoL6
tI2xIOxz8bBOvZNg2wdxpaoPuDipBFu6koVVbaGSgXORg8k5CEcKNxInztufdELW
KZK5ORa3T0uqu5T/GGPAfm/NbYVQ4aTB5mddshsXtKbBhnbSfRYvpVsR4kQB/Hc=
=oC1r
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Don't silently truncate extremely long words in the command line
* dtc configure fixes
* MemoryRegionCache second try
* Deprecated option removal
* add support for Hyper-V reenlightenment MSRs
# gpg: Signature made Fri 11 May 2018 13:33:46 BST
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (29 commits)
rename included C files to foo.inc.c, remove osdep.h
pc-dimm: fix error messages if no slots were defined
build: Silence dtc directory creation
shippable: Remove Debian 8 libfdt kludge
configure: Display if libfdt is from system or git
configure: Really use local libfdt if the system one is too old
i386/kvm: add support for Hyper-V reenlightenment MSRs
qemu-doc: provide details of supported build platforms
qemu-options: Remove deprecated -no-kvm-irqchip
qemu-options: Remove deprecated -no-kvm-pit-reinjection
qemu-options: Bail out on unsupported options instead of silently ignoring them
qemu-options: Remove remainders of the -tdf option
qemu-options: Mark -virtioconsole as deprecated
target/i386: sev: fix memory leaks
opts: don't silently truncate long option values
opts: don't silently truncate long parameter keys
accel: use g_strsplit for parsing accelerator names
update-linux-headers: drop hyperv.h
qemu-thread: always keep the posix wrapper layer
exec: reintroduce MemoryRegion caching
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It's been marked as deprecated since QEMU v2.9.0, so that should have
been enough time for everybody to either just drop unnecessary "vlan=0"
parameters, to switch to the modern -device + -netdev syntax for connecting
guest NICs with host network backends, or to switch to the "hubport" netdev
in case hubs are really wanted instead.
Buglink: https://bugs.launchpad.net/qemu/+bug/658904
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
* hw/arm/iotkit.c: fix minor memory leak
* softfloat: fix wrong-exception-flags bug for multiply-add corner case
* arm: isolate and clean up DTB generation
* implement Arm v8.1-Atomics extension
* Fix some bugs and missing instructions in the v8.2-FP16 extension
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJa9IUCAAoJEDwlJe0UNgzeEGMQAKKjVRzZ7MBgvxQj0FJSWhSP
BZkATf3ktid255PRpIssBZiY9oM+uY6n+/IRozAGvfDBp9eQOkrZczZjfW5hpe0B
YsQadtk5cUOXqQzRTegSMPOoMmz8f5GaGOk4R6AEXJEX+Rug/zbOn9Q8Yx7JTd7o
yBvU1+fys3galSiB88cffA95B9fwGfLsM7rP6OC4yNdUBYwjHf3wtY53WsxtWqX9
oX4keEiROQkrOfbSy9wYPZzu/0iRo8v35+7wIZhvNSlf02k6yJ7a+w0C4EQIRhWm
5zciE+aMYr7nOGpj7AEJLrRekhwnD6Ppje6aUd15yrxfNRZkpk/FeECWnaOPDis7
QNijx5Zqg6+GyItQKi5U4vFVReMj09OB7xDyAq77xDeBj4l3lg2DNkRfRhqQZAcv
2r4EW+pfLNj76Ah1qtQ410fprw462Sopb6bHmeuFbf1QFbQvJ4CL1+7Jl3ExrDX4
2+iQb4sQghWDxhDLfRSLxQ7K+bX+mNfGdFW8h+jPShD/+JY42dTKkFZEl4ghNgMD
mpj8FrQuIkSMqnDmPfoTG5MVTMERacqPU7GGM7/fxudIkByO3zTiLxJ/E+Iy8HvX
29xKoOBjKT5FJrwJABsN6VpA3EuyAARgQIZ/dd6N5GZdgn2KAIHuaI+RHFOesKFd
dJGM6sdksnsAAz28aUEJ
=uXY+
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180510' into staging
target-arm queue:
* hw/arm/iotkit.c: fix minor memory leak
* softfloat: fix wrong-exception-flags bug for multiply-add corner case
* arm: isolate and clean up DTB generation
* implement Arm v8.1-Atomics extension
* Fix some bugs and missing instructions in the v8.2-FP16 extension
# gpg: Signature made Thu 10 May 2018 18:44:34 BST
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180510: (21 commits)
target/arm: Clear SVE high bits for FMOV
target/arm: Fix float16 to/from int16
target/arm: Implement vector shifted FCVT for fp16
target/arm: Implement vector shifted SCVF/UCVF for fp16
target/arm: Enable ARM_FEATURE_V8_ATOMICS for user-only
target/arm: Implement CAS and CASP
target/arm: Fill in disas_ldst_atomic
target/arm: Introduce ARM_FEATURE_V8_ATOMICS and initial decode
target/riscv: Use new atomic min/max expanders
tcg: Use GEN_ATOMIC_HELPER_FN for opposite endian atomic add
tcg: Introduce atomic helpers for integer min/max
target/xtensa: Use new min/max expanders
target/arm: Use new min/max expanders
tcg: Introduce helpers for integer min/max
atomic.h: Work around gcc spurious "unused value" warning
make sure that we aren't overwriting mc->get_hotplug_handler by accident
arm/boot: split load_dtb() from arm_load_kernel()
platform-bus-device: use device plug callback instead of machine_done notifier
pc: simplify MachineClass::get_hotplug_handler handling
softfloat: Handle default NaN mode after pickNaNMulAdd, not before
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# target/riscv/translate.c
Some versions of gcc produce a spurious warning if the result of
__atomic_compare_echange_n() is not used and the type involved
is a signed 8 bit value:
error: value computed is not used [-Werror=unused-value]
This has been seen on at least
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Work around this by using an explicit cast to void to indicate
that we don't care about the return value.
We don't currently use our atomic_cmpxchg() macro on any signed
8 bit types, but the upcoming support for the Arm v8.1-Atomics
will require it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
load_dtb() depends on arm_load_kernel() to figure out place
in RAM where it should be loaded, but it's not required for
arm_load_kernel() to work. Sometimes it's neccesary for
devices added with -device/device_add to be enumerated in
DTB as well, which's lead to [1] and surrounding commits to
add 2 more machine_done notifiers with non obvious ordering
to make dynamic sysbus devices initialization happen in
the right order.
However instead of moving whole arm_load_kernel() in to
machine_done, it's sufficient to move only load_dtb() into
virt_machine_done() notifier and remove ArmLoadKernelNotifier/
/PlatformBusFDTNotifierParams notifiers, which saves us ~90LOC
and simplifies code flow quite a bit.
Later would allow to consolidate DTB generation within one
function for 'mach-virt' board and make it reentrant so it
could generate updated DTB in device hotplug secenarios.
While at it rename load_dtb() to arm_load_dtb() since it's
public now.
Add additional field skip_dtb_autoload to struct arm_boot_info
to allow manual DTB load later in mach-virt and to avoid touching
all other boards to explicitly call arm_load_dtb().
1) (ac9d32e hw/arm/boot: arm_load_kernel implemented as a machine init done notifier)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1525691524-32265-4-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
platform-bus were using machine_done notifier to get and map
(assign irq/mmio resources) dynamically added sysbus devices
after all '-device' options had been processed.
That however creates non obvious dependencies on ordering of
machine_done notifiers and requires carefull line juggling
to keep it working. For example see comment above
create_platform_bus() and 'straitforward' arm_load_kernel()
had to converted to machine_done notifier and that lead to
yet another machine_done notifier to keep it working
arm_register_platform_bus_fdt_creator().
Instead of hiding resource assignment in platform-bus-device
to magically initialize sysbus devices, use device plug
callback and assign resources explicitly at board level
at the moment each -device option is being processed.
That adds a bunch of machine declaration boiler plate to
e500plat board, similar to ARM/x86 but gets rid of hidden
machine_done notifier and would allow to remove the dependent
notifiers in ARM code simplifying it and making code flow
easier to follow.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1525691524-32265-3-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
By default MachineClass::get_hotplug_handler is NULL and concrete board
should set it to it's own handler.
Considering there isn't any default handler, drop saving empty
MachineClass::get_hotplug_handler in child class and make PC code
consistent with spapr/s390x boards.
We can bring this back when actual usecase surfaces and do it
consistently across boards that use get_hotplug_handler().
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1525691524-32265-2-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 8119334918 ("block: Don't
block_job_pause_all() in bdrv_drain_all()") removed the only callers of
block_job_pause/resume_all().
Pausing and resuming now happens in child_job_drained_begin/end() so
it's no longer necessary to globally pause/resume jobs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180424085240.5798-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
While at it, use int for both num_insns and max_insns to make
sure we have same-type comparisons.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The dangling remainder of the -tdf option revealed a deficiency in our
option parsing: Options that have been declared, but are not supported
in the switch-case statement in vl.c and not handled in the OS-specifc
os_parse_cmd_args() functions are currently silently ignored. We should
rather tell the users that they specified something that we can not
handle, so let's print an error message and exit instead.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1525453270-23074-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs. Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.
The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before. The smaller fast path causes
a little code size reduction in MemoryRegionCache users:
hw/virtio/virtio.o text size before: 32373
hw/virtio/virtio.o text size after: 31941
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For now, this reduces the text size very slightly due to the newly-added
inlining:
text size before: 9301965
text size after: 9300645
Later, however, the declarations in include/exec/memory_ldst.inc.h will be
reused for the MemoryRegionCache slow path functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "id" property is unnecessary and can be replaced simply with
object_get_canonical_path_component. This patch mostly undoes commit
e1ff3c67e8 ("monitor: fix qmp/hmp query-memdev not reporting IDs of
memory backends", 2017-01-12).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just return NULL; any callers that cause a change in behavior
would have caused an assertion failure before, so this is safe.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* pc-dimm: factor out MemoryDevice
(virtio-pmem and virtio-mem will make use of the new abstraction later)
* scripts/device-crash-test: Removed fixed CAN entries
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJa8IZ2AAoJECgHk2+YTcWmmD0P/2Lddw+ilGhGS/CWarq4uLSF
ILtEMwNgbJeJAEza6IQx/IIuUER3H5UcxgZhO49nELpurobhl5yW9JKP1qjH9z9i
7hVPORGioiyGkjgjbm8jWtljePAloTIwEiIcrqYkVHpWDCUJaZ7SES2VQL7ltY/W
AU3uSFQQMDfVqr/MXDxZq084wFK3Jm2aIE+p8a0MF7B+29RSHdFU9iKysCC1Wu/1
AllXCkQ4yWHCGoSRBfzFz9EWBb4VlzM+VNj9nhHu75zdF3hm7J05yIiGuZLiOjmB
MDOkvKhSeXNj+21mXVLmSxkfI65z6jrq3aI7iTp4+orrd2SCXoHsOZoj4Q2cRSnw
kJlY62+p85H9NYIKTgMCM/oURpL2ZnqPKmCto1NRFywSBGLXll2weyKpX9ByvXe2
gL8hqra/K8eUPW4zSsPYbbN1b16EnK4MY2nkYvG0Y/aAXGZF6V9zQwKNT4/F5GyY
SRMC4c2OtQOgZNDSuPdgZ5Lu5PXfetvvcqWCj0tXNdaScOp6Omsc/i/YCUtu6r/3
IbBIclJ+K5aD+U4QP4DKZ+DJbEkIGMU4pSHgR2i8bK7MmoJpJcAIB1mL5nA/TknP
/RVgtnP7gVbfGIVVwjUw9bMurvOti4PBp0/DxC/VqUqGs9e8avE1yb9grVJdj/jA
oEGJ6EIsmO1URbk1+f93
=Hhge
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine queue, 2018-05-07
* pc-dimm: factor out MemoryDevice
(virtio-pmem and virtio-mem will make use of the new abstraction later)
* scripts/device-crash-test: Removed fixed CAN entries
# gpg: Signature made Mon 07 May 2018 18:01:42 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
scripts/device-crash-test: Removed fixed CAN entries
vl: allow 'maxmem' without 'slot'
spapr: rename "hotplug memory" terminology to "device memory"
pc: rename "hotplug memory" terminology to "device memory"
machine: rename MemoryHotplugState to DeviceMemoryState
pc-dimm: move actual plug/unplug of a memory region to MemoryDevice
pc-dimm: factor out capacity and slot checks into MemoryDevice
pc-dimm: factor out address search into MemoryDevice code
pc-dimm: pass in the machine and to the MemoryHotplugState
pc-dimm: no need to pass the memory region
machine: make MemoryHotplugState accessible via the machine
pc-dimm: factor out MemoryDevice interface
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Several code cleanups, minor specification conformance changes,
fixes to make ROM read-only and add device-tree size checks.
* Honour privileged ISA v1.10 counter enable CSRs.
* Implements WARL behavior for CSRs that don't support writes
* Past behavior of raising traps was non-conformant
with the RISC-V Privileged ISA Specification v1.10.
* Allow S-mode access to sstatus.MXR when priv ISA >= v1.10
* Sets mtval/stval to zero on exceptions without addresses
* Past behavior of leaving the last value was non-conformant
with the RISC-V Privileged ISA Specition v1.10. mtval/stval
must be set on all exceptions; to zero if not supported.
* Make ROMs read-only and implement device-tree size checks
* Uses memory_region_init_rom and rom_add_blob_fixed_as
* Adds hexidecimal instruction bytes to disassembly output.
* Fixes missing break statement for rv128 disassembly.
* Several code cleanups
* Replacing hard-coded constants with enums
* Dead-code elimination
This is an incremental pull that contains 20 reviewed changes out
of 38 changes currently queued in the qemu-2.13-for-upstream branch.
-----BEGIN PGP SIGNATURE-----
iF0EABECAB0WIQR8mZMOsXzYugc9Xvpr8dezV+8+TwUCWu496QAKCRBr8dezV+8+
T1ZEAJ4wQRHZtn4suN5yMEHQMA2FkX1iNACgiYWLtcNcgoa88eaTcJJJu4QZryY=
=I2wf
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/riscv/tags/riscv-qemu-2.13-pull-20180506' into staging
RISC-V: QEMU 2.13 Privileged ISA emulation updates
Several code cleanups, minor specification conformance changes,
fixes to make ROM read-only and add device-tree size checks.
* Honour privileged ISA v1.10 counter enable CSRs.
* Implements WARL behavior for CSRs that don't support writes
* Past behavior of raising traps was non-conformant
with the RISC-V Privileged ISA Specification v1.10.
* Allow S-mode access to sstatus.MXR when priv ISA >= v1.10
* Sets mtval/stval to zero on exceptions without addresses
* Past behavior of leaving the last value was non-conformant
with the RISC-V Privileged ISA Specition v1.10. mtval/stval
must be set on all exceptions; to zero if not supported.
* Make ROMs read-only and implement device-tree size checks
* Uses memory_region_init_rom and rom_add_blob_fixed_as
* Adds hexidecimal instruction bytes to disassembly output.
* Fixes missing break statement for rv128 disassembly.
* Several code cleanups
* Replacing hard-coded constants with enums
* Dead-code elimination
This is an incremental pull that contains 20 reviewed changes out
of 38 changes currently queued in the qemu-2.13-for-upstream branch.
# gpg: Signature made Sun 06 May 2018 00:27:37 BST
# gpg: using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <michaeljclark@mac.com>"
# gpg: aka "Michael Clark <mjc@sifive.com>"
# gpg: aka "Michael Clark <michael@metaparadigm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D 5EFA 6BF1 D7B3 57EF 3E4F
* remotes/riscv/tags/riscv-qemu-2.13-pull-20180506:
RISC-V: Mark ROM read-only after copying in code
RISC-V: No traps on writes to misa,minstret,mcycle
RISC-V: Make mtvec/stvec ignore vectored traps
RISC-V: Add mcycle/minstret support for -icount auto
RISC-V: Use [ms]counteren CSRs when priv ISA >= v1.10
RISC-V: Allow S-mode mxr access when priv ISA >= v1.10
RISC-V: Clear mtval/stval on exceptions without info
RISC-V: Hardwire satp to 0 for no-mmu case
RISC-V: Update E and I extension order
RISC-V: Remove erroneous comment from translate.c
RISC-V: Remove EM_RISCV ELF_MACHINE indirection
RISC-V: Make virt header comment title consistent
RISC-V: Make some header guards more specific
RISC-V: Fix missing break statement in disassembler
RISC-V: Include instruction hex in disassembly
RISC-V: Remove unused class definitions
RISC-V: Remove identity_translate from load_elf
RISC-V: Use ROM base address and size from memmap
RISC-V: Make virt board description match spike
RISC-V: Replace hardcoded constants with enum values
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's make it clear at relevant places that we are dealing with device
memory. That it can be used for memory hotplug is just a special case.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's make it clear that we are dealing with device memory. That it can
be used for memory hotplug is just a special case.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-10-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename it to better match the new terminology.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Registering the memory region for migration has do be done by the owner.
There could be cases, where we don't want to migrate the memory.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-8-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Move the checks into memory_device_get_free_addr(). This will check
before doing any calculations if we have KVM/vhost slots left and if
the total region size would be exceeded.
Of course, while at it, make it independent of pc-dimm code.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This mainly moves code, but does a handfull of optimizations:
- We pass the machine instead of the address space properties
- We check the hinted address directly and handle fragmented memory
better
- We make the search independent of pc-dimm
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We use the machine internally either way, so let's just pass it in then.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.
This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).
We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
[ehabkost: squashed fix to use g_malloc0()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
On the qmp level, we already have the concept of memory devices:
"query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.
We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.
Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.
A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
now.
- get_region_size(): Because this can later on be bigger than the
plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Removes a whole lot of unnecessary boilerplate code. Machines
don't need to be objects. The expansion of the SOC object model
for the RISC-V machines will happen in the future as SiFive
plans to add their FE310 and FU540 SOCs to QEMU. However, it
seems that this present boilerplate is complete unnecessary.
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Another case of replacing hard coded constants, this time
referring to the definition in the virt machine's memmap.
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
The RISC-V device-tree code has a number of hard-coded
constants and this change moves them into header enums.
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
This patch builds the smmuv3 node in the ACPI IORT table.
The RID space of the root complex, which spans 0x0-0x10000
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
The guest must feature the IOMMU probe deferral series
(https://lkml.org/lkml/2017/4/10/214) which fixes streamid
multiple lookup. This bug is not related to the SMMU emulation.
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shannon Zhao <zhaoshenglong@huawei.com>
Message-id: 1524665762-31355-14-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add code to instantiate an smmuv3 in virt machine. A new iommu
integer member is introduced in VirtMachineState to store the type
of the iommu in use.
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1524665762-31355-13-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>