Since pc_madt_cpu_entry() is only used by:
- hw/i386/acpi-build.c // single call
- hw/i386/acpi-common.c // definition
there is no need to expose it outside of hw/i386/.
Declare it in "acpi-common.h".
acpi-build.c doesn't need "hw/i386/pc.h" anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
HPET_INTCAP is specific to TYPE_HPET, so define it there.
hpet.c doesn't need to include "hw/i386/pc.h" anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Luc Michel <luc.michel@amd.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
pc_machine_kvm_type() was introduced by commit e21be724ea ("i386/xen:
add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen
specific initialization by utilizing kvm_type method.
commit eeedfe6c63 ("hw/xen: Simplify emulated Xen platform init")
moves the Xen specific initialization to pc_basic_device_init().
There is no need to keep the PC specific kvm_type() implementation
anymore. So we'll fallback to kvm_arch_get_default_type(), which
simply returns 0.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231007065819.27498-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU populates the apic_state attribute of x86 CPUs if supported by real
hardware or if SMP is active. When handling interrupts, it just checks whether
apic_state is populated to route the interrupt to the PIC or to the APIC.
However, chapter 10.4.3 of [1] requires that:
When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an
IA-32 processor without an on-chip APIC.
This means that when apic_state is populated, QEMU needs to check for the
MSR_IA32_APICBASE_ENABLE flag in addition. Implement this which fixes some
real-world BIOSes.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 3A:
System Programming Guide, Part 1
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240106132546.21248-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit adds support for x2APIC transitions when writing to
MSR_IA32_APICBASE register and finally adds CPUID_EXT_X2APIC to
TCG_EXT_FEATURES.
The set_base in APICCommonClass now returns an integer to indicate error in
execution. apic_set_base return -1 on invalid APIC state transition,
accelerator can use this to raise appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-4-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit extends the APIC ID to 32-bit long and remove the 255 max APIC
ID limit in userspace APIC. The array that manages local APICs is now
dynamically allocated based on the max APIC ID of created x86 machine.
Also, new x2APIC IPI destination determination scheme, self IPI and x2APIC
mode register access are supported.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-3-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit creates apic_register_read/write which are used by both
apic_mem_read/write for MMIO access and apic_msr_read/write for MSR access.
The apic_msr_read/write returns -1 on error, accelerator can use this to
raise the appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-Id: <20240111154404.5333-2-minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.
Now we can revert the previous more hackish version which relied on the
platform code explicitly registering the NICs on its own XenBus, having
returned the BusState* from xen_bus_init() itself.
This also fixes the setup for Xen PV guests, which was previously broken
in various ways and never actually managed to peer with the netdev.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
For function comments in this file, keep the comment style consistent
with other files in the directory.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231024090323.1859210-2-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The default NIC creation seems a bit hackish to me. I don't understand
why each platform has to call pci_nic_init_nofail() from a point in the
code where it actually has a pointer to the PCI bus, and then we have
the special cases for things like ne2k_isa.
If qmp_device_add() can *find* the appropriate bus and instantiate
the device on it, why can't we just do that from generic code for
creating the default NICs too?
But that isn't a yak I want to shave today. Add a xenbus field to the
PCMachineState so that it can make its way from pc_basic_device_init()
to pc_nic_init() and be handled as a special case like ne2k_isa is.
Now we can launch emulated Xen guests with '-nic user'.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
QEMU's PIIX3 implementation actually models the real PIIX4, but with different
PCI IDs. Usually, guests deal just fine with it. Still, in order to provide a
more consistent illusion to guests, allow QEMU's PIIX4 implementation to be used
in the PC machine.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20231007123843.127151-30-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
32-bit x86 systems do not have a reserved memory for hole64. On those 32-bit
systems without PSE36 or PAE CPU features, hotplugging memory devices are not
supported by QEMU as QEMU always places hotplugged memory above 4 GiB boundary
which is beyond the physical address space of the processor. Linux guests also
does not support memory hotplug on those systems. Please see Linux
kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.
Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, after properly accounting for the
additional device memory region to find the maximum value of the guest
physical address, the address will be outside the range of the processor's
physical address space.
This change adds improvements to take above into consideration.
For example, previously this was allowed:
$ ./qemu-system-x86_64 -cpu pentium -m size=10G
With this change now it is no longer allowed:
$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0xffffffff < 0x2bfffffff phys-bits too low (32)
However, the following are allowed since on both cases physical address
space of the processor is 36 bits:
$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G
For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer allowed.
$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)
A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to support
compatibility. Hence, the following still works:
$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2
Further, following is also allowed as with PSE36, the processor has 36-bit
address space:
$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2
After calling CPUID with EAX=0x80000001, all AMD64 compliant processors
have the longmode-capable-bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, please see x86_cpu_realizefn()), With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.
Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230922160413.165702-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.
Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-7-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
data[2] array, but the struct is a mixture between 32 bit values
(the first 8 bytes) and 64 bit values (the second 8 bytes), so this
cannot work as expected.
Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-3-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
There are no remaining users in the tree. Libvirt never used that
property and a quick internet search revealed no other users.
Further, we renamed that property already in commit f2ffbe2b7d
("pc: rename "hotplug memory" terminology to "device memory"") without
anybody complaining.
So let's just get rid of it.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-9-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, pc-q35 and pc-i44fx machine models are default to use SMBIOS 2.8
(32-bit entry point). Since SMBIOS 3.0 (64-bit entry point) is now fully
supported since QEMU 7.0, default to use SMBIOS 3.0 for newer machine
models. This is necessary to avoid the following message when launching
a VM with large number of vcpus.
"SMBIOS 2.1 table length 66822 exceeds 65535"
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230607205717.737749-2-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
This patch does following:
1. creates arch_handle_ioreq() and arch_xen_set_memory(). This is done in
preparation for moving most of xen-hvm code to an arch-neutral location,
move the x86-specific portion of xen_set_memory to arch_xen_set_memory.
Also, move handle_vmport_ioreq to arch_handle_ioreq.
2. Pure code movement: move common functions to hw/xen/xen-hvm-common.c
Extract common functionalities from hw/i386/xen/xen-hvm.c and move them to
hw/xen/xen-hvm-common.c. These common functions are useful for creating
an IOREQ server.
xen_hvm_init_pc() contains the architecture independent code for creating
and mapping a IOREQ server, connecting memory and IO listeners, initializing
a xen bus and registering backends. Moved this common xen code to a new
function xen_register_ioreq() which can be used by both x86 and ARM machines.
Following functions are moved to hw/xen/xen-hvm-common.c:
xen_vcpu_eport(), xen_vcpu_ioreq(), xen_ram_alloc(), xen_set_memory(),
xen_region_add(), xen_region_del(), xen_io_add(), xen_io_del(),
xen_device_realize(), xen_device_unrealize(),
cpu_get_ioreq_from_shared_memory(), cpu_get_ioreq(), do_inp(),
do_outp(), rw_phys_req_item(), read_phys_req_item(),
write_phys_req_item(), cpu_ioreq_pio(), cpu_ioreq_move(),
cpu_ioreq_config(), handle_ioreq(), handle_buffered_iopage(),
handle_buffered_io(), cpu_handle_ioreq(), xen_main_loop_prepare(),
xen_hvm_change_state_handler(), xen_exit_notifier(),
xen_map_ioreq_server(), destroy_hvm_domain() and
xen_shutdown_fatal_error()
3. Removed static type from below functions:
1. xen_region_add()
2. xen_region_del()
3. xen_io_add()
4. xen_io_del()
5. xen_device_realize()
6. xen_device_unrealize()
7. xen_hvm_change_state_handler()
8. cpu_ioreq_pio()
9. xen_exit_notifier()
4. Replace TARGET_PAGE_SIZE with XC_PAGE_SIZE to match the page side with Xen.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
We are going to re-use this setting for other targets, so let's
move this to the main MachineClass.
Message-Id: <20230512124033.502654-4-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Now that the RTC is created as part of the southbridges it doesn't need
to be an out-parameter any longer.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230519084734.220480-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Going through pc_memory_init() seems quite complicated for a simple
assignment.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213162004.2797-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
i440fx machine versions 2.3 and newer supports dynamic ram
resizing. See commit a1666142db ("acpi-build: make ROMs RAM blocks resizeable") .
Currently supported all q35 machine types (versions 2.4 and newer) supports
resizable RAM/ROM blocks.Therefore the warning generated when the ACPI table
size exceeds a pre-defined value does not apply to those machine versions.
Add a check limiting the warning message to only those machines that does not
support expandable ram blocks (that is, i440fx machines with version 2.2
and older).
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230329045726.14028-1-anisinha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add 8.1 machine types for arm/i440fx/m68k/q35/s390x/spapr.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20230314173009.152667-1-cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
nmi.h and notify.h are not needed here, drop them to speed up
the compiling a little bit.
Message-Id: <20230210111438.1114600-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
vhost-user support without ioeventfd
word replacements in vhost user spec
shpc improvements
cleanups, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQBO8QPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpMUMH/3/FVp4qaF4CDwCHn7xWFRJpOREIhX/iWfUu
lGkwxnB7Lfyqdg7i4CAfgMf2emWKZchEE2DamfCo5bIX0IgRU3DWcOdR9ePvJ29J
cKwIYpxZcB4RYSoWL5OUakQLCT3JOu4XWaXeVjyHABjQhf3lGpwN4KmIOBGOy/N6
0YHOQScW2eW62wIOwhAEuYQceMt6KU32Uw3tLnMbJliiBf3a/hPctVNM9TFY9pcd
UYHGfBx/zD45owf1lTVEQFDg0eqPZKWW29g5haiOd5oAyXHHolzu+bt3bU7lH46b
f7iP12LqDudyrgoF5YWv3NJ4HaGm5V3kPqNqLLF/mjF7alxG+N8=
=hN3h
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
vhost-user support without ioeventfd
word replacements in vhost user spec
shpc improvements
cleanups, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQBO8QPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpMUMH/3/FVp4qaF4CDwCHn7xWFRJpOREIhX/iWfUu
# lGkwxnB7Lfyqdg7i4CAfgMf2emWKZchEE2DamfCo5bIX0IgRU3DWcOdR9ePvJ29J
# cKwIYpxZcB4RYSoWL5OUakQLCT3JOu4XWaXeVjyHABjQhf3lGpwN4KmIOBGOy/N6
# 0YHOQScW2eW62wIOwhAEuYQceMt6KU32Uw3tLnMbJliiBf3a/hPctVNM9TFY9pcd
# UYHGfBx/zD45owf1lTVEQFDg0eqPZKWW29g5haiOd5oAyXHHolzu+bt3bU7lH46b
# f7iP12LqDudyrgoF5YWv3NJ4HaGm5V3kPqNqLLF/mjF7alxG+N8=
# =hN3h
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 Mar 2023 00:13:56 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
tests/data/acpi/virt: drop (most) duplicate files.
hw/cxl/mailbox: Use new UUID network order define for cel_uuid
qemu/uuid: Add UUID static initializer
qemu/bswap: Add const_le64()
tests: acpi: Update q35/DSDT.cxl for removed duplicate UID
hw/i386/acpi: Drop duplicate _UID entry for CXL root bridge
tests/acpi: Allow update of q35/DSDT.cxl
hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition
hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL
hw/pci-bridge/cxl_downstream: Fix type naming mismatch
hw/mem/cxl_type3: Improve error handling in realize()
MAINTAINERS: Add Fan Ni as Compute eXpress Link QEMU reviewer
intel-iommu: send UNMAP notifications for domain or global inv desc
smmu: switch to use memory_region_unmap_iommu_notifier_range()
memory: introduce memory_region_unmap_iommu_notifier_range()
intel-iommu: fail DEVIOTLB_UNMAP without dt mode
intel-iommu: fail MAP notifier without caching mode
memory: Optimize replay of guest mapping
chardev/char-socket: set s->listener = NULL in char_socket_finalize
hw/pci: Trace IRQ routing on PCI topology
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit 67f7e426e5.
Additionally to the automatic revert, I went over the code
and dropped all mentions of legacy_no_rng_seed manually,
effectively reverting a combination of 2 additional commits:
commit ffe2d2382e
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date: Wed Sep 21 11:31:34 2022 +0200
x86: re-enable rng seeding via SetupData
commit 3824e25db1
Author: Gerd Hoffmann <kraxel@redhat.com>
Date: Wed Aug 17 10:39:40 2022 +0200
x86: disable rng seeding via setup_data
Fixes: 67f7e426e5 ("hw/i386: pass RNG seed via setup_data entry")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This reverts commit eac7a7791b.
Fixes: eac7a7791b ("x86: don't let decompressed kernel image clobber setup_data")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The xen_overlay device (and later similar devices for event channels and
grant tables) need to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is only way to support Xen emulation
for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
ICH9 is a south bridge which doesn't necessarily depend on x86, so move
it into the southbridge folder, analoguous to PIIX.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-13-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-12-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The ioapic sources reside in hw/intc already. Move the headers there
as well.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-11-shentey@gmail.com>
[PMD: Keep ioapic_internal.h in hw/intc/, not under include/]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Most code uses IOAPIC_NUM_PINS. The only place where GSI_NUM_PINS defines
the size of an array is ICH9LPCState::gsi which needs to match
IOAPIC_NUM_PINS. Remove GSI_NUM_PINS for consistency.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-10-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Make TYPE_ICH9_LPC_DEVICE more self-contained by moving the call to
ich9_lpc_pm_init() from board code to its realize function. In order
to propagate x86_machine_is_smm_enabled(), introduce an "smm-enabled"
property like we have in piix4.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-8-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
ich9_smb_init() is a legacy init function, so modernize the code.
Note that the smb_io_base parameter was unused.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
By using qdev_get_child_bus() we can eliminate ICH9LPCState::isa_bus and
spare the ich9_lpc variable in pc_q35, too.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
No need to rely on the board to wire up the ICH9 PCI IRQs. All functions
access private state of the LPC device which suggests that it should
wire up the IRQs.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-3-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The Q35_MASK macro is already defined by TYPE_Q35_HOST_DEVICE, so let
TYPE_ICH9_LPC_DEVICE have its own one to prevent potential name clash.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This function is not used anywhere outside this file, so
we can delete the prototype from include/hw/i386/x86.h and
make the function "static void".
This fixes when building with -Wall and using Clang
("Apple clang version 14.0.0 (clang-1400.0.29.202)"):
../hw/i386/x86.c:70:24: error: static function 'MACHINE' is used in an inline function with external linkage [-Werror,-Wstatic-in-inline]
MachineState *ms = MACHINE(x86ms);
^
include/hw/i386/x86.h:101:1: note: use 'static' to give inline function 'init_topo_info' internal linkage
void init_topo_info(X86CPUTopoInfo *topo_info, const X86MachineState *x86ms);
^
static
include/hw/boards.h:24:49: note: 'MACHINE' declared here
OBJECT_DECLARE_TYPE(MachineState, MachineClass, MACHINE)
^
Reported-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220158.6317-6-philmd@linaro.org>
The setup_data links are appended to the compressed kernel image. Since
the kernel image is typically loaded at 0x100000, setup_data lives at
`0x100000 + compressed_size`, which does not get relocated during the
kernel's boot process.
The kernel typically decompresses the image starting at address
0x1000000 (note: there's one more zero there than the compressed image
above). This usually is fine for most kernels.
However, if the compressed image is actually quite large, then
setup_data will live at a `0x100000 + compressed_size` that extends into
the decompressed zone at 0x1000000. In other words, if compressed_size
is larger than `0x1000000 - 0x100000`, then the decompression step will
clobber setup_data, resulting in crashes.
Visually, what happens now is that QEMU appends setup_data to the kernel
image:
kernel image setup_data
|--------------------------||----------------|
0x100000 0x100000+l1 0x100000+l1+l2
The problem is that this decompresses to 0x1000000 (one more zero). So
if l1 is > (0x1000000-0x100000), then this winds up looking like:
kernel image setup_data
|--------------------------||----------------|
0x100000 0x100000+l1 0x100000+l1+l2
d e c o m p r e s s e d k e r n e l
|-------------------------------------------------------------|
0x1000000 0x1000000+l3
The decompressed kernel seemingly overwriting the compressed kernel
image isn't a problem, because that gets relocated to a higher address
early on in the boot process, at the end of startup_64. setup_data,
however, stays in the same place, since those links are self referential
and nothing fixes them up. So the decompressed kernel clobbers it.
Fix this by appending setup_data to the cmdline blob rather than the
kernel image blob, which remains at a lower address that won't get
clobbered.
This could have been done by overwriting the initrd blob instead, but
that poses big difficulties, such as no longer being able to use memory
mapped files for initrd, hurting performance, and, more importantly, the
initrd address calculation is hard coded in qboot, and it always grows
down rather than up, which means lots of brittle semantics would have to
be changed around, incurring more complexity. In contrast, using cmdline
is simple and doesn't interfere with anything.
The microvm machine has a gross hack where it fiddles with fw_cfg data
after the fact. So this hack is updated to account for this appending,
by reserving some bytes.
Fixup-by: Michael S. Tsirkin <mst@redhat.com>
Cc: x86@kernel.org
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-Id: <20221230220725.618763-1-Jason@zx2c4.com>
Message-ID: <20230128061015-mutt-send-email-mst@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Biggers <ebiggers@google.com>
Tested-by: Mathias Krause <minipli@grsecurity.net>
It seems not super clear on when iova_tree is used, and why. Add a rich
comment above iova_tree to track why we needed the iova_tree, and when we
need it.
Also comment for the map/unmap messages, on how they're used and
implications (e.g. unmap can be larger than the mapped ranges).
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230109193727.1360190-1-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The only function ever assigned to AcpiDeviceIfClass::madt_cpu is
pc_madt_cpu_entry() which doesn't use the AcpiDeviceIf parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230121151941.24120-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This argument was added 9 years ago in commit 83d08f2673
("pc: map PCI address space as catchall region for not mapped
addresses") and has never been used since, so remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230105173826.56748-1-philmd@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
These IRQ counting functions will soon be required in binaries that
do not include the APIC code, too, so let's extract them into a
separate file that can be linked independently of the APIC code.
While we're at it, change the apic_* prefix into kvm_* since the
functions are used from the i8259 PIC (i.e. not the APIC), too.
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20230110095351.611724-2-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
mostly vhost-vdpa:
guest announce feature emulation when using shadow virtqueue
support for configure interrupt
startup speed ups
an acpi change to only generate cluster node in PPTT when specified for arm
misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmO6eGMPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpoUIIALqC3UtJcK3AuAMbeqVokxl5CPwoeXMyi+rT
0QuN8m8dpBtJFpy3Vyq0afixOFmlwvORW5ye4QI97OyIhtLJq00buzQsgHjNoPo3
zN2L0BDyofDmfFHgCxcEbv2aAO8TaqRSHmKffEFmf8JDMDL9Ev1QvPTWHhfm2eJf
VKPHOtCA/3WXBD9JNfYJ0YuzCrrJaMhIO6/5tqv9yjMxWTfEFa1J2Sr2tWkRLuDk
FPfApy7afjI705Guv6PllZ3JdOMwf7iZaoBK6mSdCDSyi1xciYM0VeWi8SLD4qbM
N+9NkUQOIYS5ZC4BXrULy6HDUsECJ71I0pvHveX7nwbK6xPD4RQ=
=0tPe
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
mostly vhost-vdpa:
guest announce feature emulation when using shadow virtqueue
support for configure interrupt
startup speed ups
an acpi change to only generate cluster node in PPTT when specified for arm
misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 08 Jan 2023 08:01:39 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (50 commits)
vhost-scsi: fix memleak of vsc->inflight
acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
tests: acpi: aarch64: Add *.topology tables
tests: acpi: aarch64: Add topology test for aarch64
tests: acpi: Add and whitelist *.topology blobs
tests: virt: Update expected ACPI tables for virt test
hw/acpi/aml-build: Only generate cluster node in PPTT when specified
tests: virt: Allow changes to PPTT test table
virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
vdpa: commit all host notifier MRs in a single MR transaction
vhost: configure all host notifiers in a single MR transaction
vhost: simplify vhost_dev_enable_notifiers
vdpa: harden the error path if get_iova_range failed
vdpa-dev: get iova range explicitly
docs/devel: Rules on #include in headers
include: Include headers where needed
include/hw/virtio: Break inclusion loop
include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat_h
include/hw/pci: Include hw/pci/pci.h where needed
include/hw/pci: Split pci_device.h off pci.h
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A number of headers neglect to include everything they need. They
compile only if the headers they need are already included from
elsewhere. Fix that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221222120813.727830-3-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
hw/pci/pci_bridge.h and hw/cxl/cxl.h include each other.
Fortunately, breaking the loop is merely a matter of deleting
unnecessary includes from headers, and adding them back in places
where they are now missing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-machine kernel-irqchip=off is broken for many guest OSes; kernel-irqchip=split
is the replacement that works, so remove the deprecated support for the former.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add 8.0 machine types for arm/i440fx/m68k/q35/s390x/spapr.
Reviewed-by: Cédric Le Goater <clg@kaod.org> [ppc]
Reviewed-by: Thomas Huth <thuth@redhat.com> [s390x]
Reviewed-by: Greg Kurz <groug@kaod.org> [ppc]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20221212152145.124317-2-cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This patch introduce ECAP_PASID via "x-pasid-mode". Based on the
existing support for scalable mode, we need to implement the following
missing parts:
1) tag VTDAddressSpace with PASID and support IOMMU/DMA translation
with PASID
2) tag IOTLB with PASID
3) PASID cache and its flush
4) PASID based IOTLB invalidation
For simplicity PASID cache is not implemented so we can simply
implement the PASID cache flush as a no and leave it to be implemented
in the future. For PASID based IOTLB invalidation, since we haven't
had L1 stage support, the PASID based IOTLB invalidation is not
implemented yet. For PASID based device IOTLB invalidation, it
requires the support for vhost so we forbid enabling device IOTLB when
PASID is enabled now. Those work could be done in the future.
Note that though PASID based IOMMU translation is ready but no device
can issue PASID DMA right now. In this case, PCI_NO_PASID is used as
PASID to identify the address without PASID. vtd_find_add_as() has
been extended to provision address space with PASID which could be
utilized by the future extension of PCI core to allow device model to
use PASID based DMA translation.
This feature would be useful for:
1) prototyping PASID support for devices like virtio
2) future vPASID work
3) future PRS and vSVA work
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-5-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We introduce VTDBus structure as an intermediate step for searching
the address space. This works well with SID based matching/lookup. But
when we want to support SID plus PASID based address space lookup,
this intermediate steps turns out to be a burden. So the patch simply
drops the VTDBus structure and use the PCIBus and devfn as the key for
the g_hash_table(). This simplifies the codes and the future PASID
extension.
To prevent being slower for past vtd_find_as_from_bus_num() callers, a
vtd_as cache indexed by the bus number is introduced to store the last
recent search result of a vtd_as belongs to a specific bus.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
The added enforcing is only relevant in the case of AMD where the
range right before the 1TB is restricted and cannot be DMA mapped
by the kernel consequently leading to IOMMU INVALID_DEVICE_REQUEST
or possibly other kinds of IOMMU events in the AMD IOMMU.
Although, there's a case where it may make sense to disable the
IOVA relocation/validation when migrating from a
non-amd-1tb-aware qemu to one that supports it.
Relocating RAM regions to after the 1Tb hole has consequences for
guest ABI because we are changing the memory mapping, so make
sure that only new machine enforce but not older ones.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220719170014.27028-12-joao.m.martins@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the pre-initialized pci-host qdev and fetch the
pci-hole64-size into pc_memory_init() newly added argument.
Use PCI_HOST_PROP_PCI_HOLE64_SIZE pci-host property for
fetching pci-hole64-size.
This is in preparation to determine that host-phys-bits are
enough and for pci-hole64-size to be considered to relocate
ram-above-4g to be at 1T (on AMD platforms).
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220719170014.27028-4-joao.m.martins@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rather than hardcoding the 4G boundary everywhere, introduce a
X86MachineState field @above_4g_mem_start and use it
accordingly.
This is in preparation for relocating ram-above-4g to be
dynamically start at 1T on AMD platforms.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220719170014.27028-2-joao.m.martins@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tiny machines optimized for fast boot time generally don't use EFI,
which means a random seed has to be supplied some other way. For this
purpose, Linux (≥5.20) supports passing a seed in the setup_data table
with SETUP_RNG_SEED, specially intended for hypervisors, kexec, and
specialized bootloaders. The linked commit shows the upstream kernel
implementation.
At Paolo's request, we don't pass these to versioned machine types ≤7.0.
Link: https://git.kernel.org/tip/tip/c/68b8e9713c8
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-Id: <20220721125636.446842-1-Jason@zx2c4.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220520180109.8224-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220520180109.8224-5-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The macro seems to be used only internally, so remove it.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220520180109.8224-4-shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This removes the last of the CXL code from the MachineState where it
is visible to all Machines to only those that support CXL (currently i386/pc)
As i386/pc always support CXL now, stop allocating the state independently.
Note the pxb register hookup code runs even if cxl=off in order to detect
pxb_cxl host bridges and fail to start if any are present as they won't
have the control registers available.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Message-Id: <20220608145440.26106-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
by default we do not version ACPI AML as it's considered
a part of firmware. Drop do_not_add_smb_acpi that blocked
SMBUS AML description on 3.1 and older machine types without
providing justification.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20220608135340.3304695-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Legacy PIC (8259) cannot be supported for TDX guests since TDX module
doesn't allow directly interrupt injection. Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.
Make PIC the property of common x86 machine type. Hence all x86
machines, including microvm, can disable it.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20220310122811.807794-3-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Both pc and microvm have pit property individually. Let's just make it
the property of common x86 base machine type.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20220310122811.807794-2-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By setting none of the SAGAW bits we can indicate to a guest that DMA
translation isn't supported. Tested by booting Windows 10, as well as
Linux guests with the fix at https://git.kernel.org/torvalds/c/c40aaaac10
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220314142544.150555-2-dwmw2@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
igd-passthrough-isa-bridge is only requested in xen_pt but was
implemented in pc_piix.c. This caused xen_pt to dependend on i386/pc
which is hereby resolved.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20220326165825.30794-2-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
move sev firmware setup to separate function so it can be used from
other code paths. No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220425135051.551037-3-kraxel@redhat.com>
Replace a config-time define with a compile time condition
define (compatible with clang and gcc) that must be declared prior to
its usage. This avoids having a global configure time define, but also
prevents from bad usage, if the config header wasn't included before.
This can help to make some code independent from qemu too.
gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ For the s390x parts I'm involved in ]
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
commit
f862ddbb1a (hw/i386: Remove the deprecated pc-1.x machine types)
removed the last user of broken APIC ID compat knob,
but compat_apic_id_mode itself was forgotten.
Clean it up and simplify x86_cpu_apic_id_from_index()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220228131634.3389805-1-imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On some older software like Windows 7 installer, having both a PS/2
mouse and USB mouse results in only one device working property (which
might be a different device each boot). While the workaround to not use
a USB mouse with such software is valid, it creates an inconsistent
experience if the user wishes to always use a USB mouse.
This introduces a new machine property to inhibit the creation of the
i8042 PS/2 controller.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20220227210655.45592-1-j@getutm.app>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
SC is required for some kernel features like vhost-vDPA. So this patch
implements basic SC feature. The idea is pretty simple, for software
emulated DMA it would be always coherent. In this case we can simple
advertise ECAP_SC bit. For VFIO and vhost, thing will be more much
complicated, so this patch simply fail the IOMMU notifier
registration.
In the future, we may want to have a dedicated notifiers flag or
similar mechanism to demonstrate the coherency so VFIO could advertise
that if it has VFIO_DMA_CC_IOMMU, for vhost kernel backend we don't
need that since it's a software backend.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220214060346.72455-1-jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The functions are only used within their respective source files, so no
need for exporting.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20220116122327.73048-1-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The i440fx and Q35 machine types are both hardcoded to use the
legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible
conservative choice because SeaBIOS only supports SMBIOS 2.1
EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points,
and QEMU already uses this on the ARM virt machine type.
This adds a property to allow the choice of SMBIOS entry point
versions For example to opt in to 64-bit SMBIOS entry point:
$QEMU -machine q35,smbios-entry-point-type=64
Based on a patch submitted by Daniel Berrangé.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Add 7.0 machine types for arm/i440fx/q35/s390x/spapr.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211217143948.289995-1-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* add support for KVM_GUESTDBG_BLOCKIRQ (Maxim)
* update linux-headers to Linux 5.16 (myself)
* configure cleanups (myself)
* lsi53c895a assertion failure fix (Philippe)
* fix incorrect description for die-id (Yanan)
* support for NUMA in SGX enclave memory (Yang Zhong)
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmG5yEgUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOFkwf9Glt3tnHfO/gWZectiMAjjM8vg3cR
yEsWU0fa0iu8MO8NiOW0b6DT0yosZ8rl2SYSZBQGqwa/Lq2MZLO2v0JG4LblBom7
WVmzNOG4t0OFoE2gwMlZ2J+ppaHV5zN1Yc6lMXOVQcefY0pXoyze+0Fq/cwKf2+p
6WDZTuOpjtxMwt9n71b93kNAHagk422MpNi1xi+o37hNsm/Zh3BTtaSJ7WCkCmWq
YmwwQZgyB/V0Ah42wchuf5zxEnQPMbjaQSu8jQU4iyCB7+2biBuh3Cqw5eR8tqC1
cGNINGlo2Orl9ASyEunHyuDj8p0wN8J37wNuBo4kghhrTVsVKQYF0wmsqw==
=U1Xy
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* improve compatibility for macOS scripts/entitlement.sh (Evan)
* add support for KVM_GUESTDBG_BLOCKIRQ (Maxim)
* update linux-headers to Linux 5.16 (myself)
* configure cleanups (myself)
* lsi53c895a assertion failure fix (Philippe)
* fix incorrect description for die-id (Yanan)
* support for NUMA in SGX enclave memory (Yang Zhong)
# gpg: Signature made Wed 15 Dec 2021 02:49:44 AM PST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [unknown]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
configure: remove dead variables
doc: Add the SGX numa description
numa: Support SGX numa in the monitor and Libvirt interfaces
numa: Enable numa for SGX EPC sections
kvm: add support for KVM_GUESTDBG_BLOCKIRQ
gdbstub, kvm: let KVM report supported singlestep flags
gdbstub: reject unsupported flags in handle_set_qemu_sstep
linux-headers: update to 5.16-rc1
virtio-gpu: do not byteswap padding
scripts/entitlement.sh: Use backward-compatible cp flags
qapi/machine.json: Fix incorrect description for die-id
tests/qtest: Add fuzz-lsi53c895a-test
hw/scsi/lsi53c895a: Do not abort when DMA requested and no data queued
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The qemu-common.h header is not supposed to be included from any
other header files, only from .c files (as documented in a comment at
the start of it).
include/hw/i386/x86.h and include/hw/i386/microvm.h break this rule.
In fact, the include is not required at all, so we can just drop it
from both files.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20211129200510.1233037-2-peter.maydell@linaro.org
The basic SGX did not enable numa for SGX EPC sections, which
result in all EPC sections located in numa node 0. This patch
enable SGX numa function in the guest and the EPC section can
work with RAM as one numa node.
The Guest kernel related log:
[ 0.009981] ACPI: SRAT: Node 0 PXM 0 [mem 0x180000000-0x183ffffff]
[ 0.009982] ACPI: SRAT: Node 1 PXM 1 [mem 0x184000000-0x185bfffff]
The SRAT table can normally show SGX EPC sections menory info in different
numa nodes.
The SGX EPC numa related command:
......
-m 4G,maxmem=20G \
-smp sockets=2,cores=2 \
-cpu host,+sgx-provisionkey \
-object memory-backend-ram,size=2G,host-nodes=0,policy=bind,id=node0 \
-object memory-backend-epc,id=mem0,size=64M,prealloc=on,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-1,memdev=node0 \
-object memory-backend-ram,size=2G,host-nodes=1,policy=bind,id=node1 \
-object memory-backend-epc,id=mem1,size=28M,prealloc=on,host-nodes=1,policy=bind \
-numa node,nodeid=1,cpus=2-3,memdev=node1 \
-M sgx-epc.0.memdev=mem0,sgx-epc.0.node=0,sgx-epc.1.memdev=mem1,sgx-epc.1.node=1 \
......
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20211101162009.62161-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBUeEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOh+Qf+OMRhRiv6dYjbK/5zXrx81AgxYAY3
dBUSr8v16LyrMl1U3DZWzhD+MzQsC83m/Xsh4lGxlHDWtkK9QQA5xDG95JZdY26i
MGCbbjnFHISbyBQV9Y724gPfPjOOODuoFbzafSx6VLITOcyv1ye0cm7TOjOPB+tt
E4c3JqTZ7g8a5yMe8ItkVhz5pPY+oVw8dxMNRp6Sup5Dbfx0DjacIwLasLsHfPL7
qBADfqB20ovHUzLjXu7oWgEd4KxJ6kiSCaJJu/KD36hg0wB8+WVP1o43j4PkczHT
QjU7eZaeaTrN5Cf34ttPge6QReMi5SFNCaA9O9/HLqrQgdEtt/diZWuqjQ==
=a2mC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Build system fixes and cleanups
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
# gpg: Signature made Tue 02 Nov 2021 10:57:37 AM EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* remotes/bonzini/tags/for-upstream: (27 commits)
configure: fix --audio-drv-list help message
configure: Remove the check for the __thread keyword
Move the l2tpv3 test from configure to meson.build
meson: remove unnecessary coreaudio test program
meson: remove pointless warnings
meson.build: Allow to disable OSS again
meson: bump submodule to 0.59.3
qtest/am53c974-test: add test for cancelling in-flight requests
esp: ensure in-flight SCSI requests are always cancelled
KVM: SVM: add migration support for nested TSC scaling
hw/i386: fix vmmouse registration
watchdog: remove select_watchdog_action
vl: deprecate -watchdog
watchdog: add information from -watchdog help to -device help
hw/i386: Rename default_bus_bypass_iommu
hvf: Avoid mapping regions < PAGE_SIZE as ram
configure: do not duplicate CPU_CFLAGS into QEMU_LDFLAGS
configure: remove useless NPTL probe
target/i386: use DMA-enabled multiboot ROM for new-enough QEMU machine types
optionrom: add a DMA-enabled multiboot ROM
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allows edk2 detect virtio-mmio devices and pcie ecam.
See comment in hw/i386/microvm-dt.c for more details.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20211014193617.2475578-1-kraxel@redhat.com>
This removes a parameter from x86_load_linux, and will avoid code
duplication between the linux and multiboot cases once multiboot
starts to support DMA.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We're about to support a third vIOMMU for x86, virtio-iommu which
doesn't inherit X86IOMMUState. Move the IOMMU singleton into
PCMachineState, so it can be shared between all three vIOMMUs.
The x86_iommu_get_default() helper is still needed by KVM and IOAPIC to
fetch the default IRQ-remapping IOMMU. Since virtio-iommu doesn't
support IRQ remapping, this interface doesn't need to change for the
moment. We could later replace X86IOMMUState with an "IRQ remapping
IOMMU" interface if necessary.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211026182024.2642038-4-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To generate the IOMMU ACPI table, acpi-build.c can use base QEMU types
instead of a special IommuType value.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211026182024.2642038-3-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move qmp_query_sgx() and hmp_info_sgx() from target/i386/monitor.c
to hw/i386/sgx.c, removing the sgx_get_info() indirection and the
"hw/i386/sgx.h" header.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-5-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move qmp_query_sgx_capabilities() from target/i386/monitor.c to
hw/i386/sgx.c, removing the sgx_get_capabilities() indirection.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-4-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-3-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of composing disabled _MAT entry and then later on
patching it to enabled for hotpluggbale CPUs in DSDT,
set it to enabled at the time _MAT entry is built.
It will allow to drop usage of packed structures in
following patches when build_madt() is switched to use
build_append_int_noprefix() API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-24-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Libvirt can use query-sgx-capabilities to get the host
sgx capabilities to decide how to allocate SGX EPC size to VM.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QMP and HMP interfaces can be used by monitor or QMP tools to retrieve
the SGX information from VM side when SGX is enabled on Intel platform.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers to detect if SGX EPC exists above 4g, and if so, where SGX
EPC above 4g ends. Use the helpers to adjust the device memory range
if SGX EPC exists above 4g.
For multiple virtual EPC sections, we just put them together physically
contiguous for the simplicity because we don't support EPC NUMA affinity
now. Once the SGX EPC NUMA support in the kernel SGX driver, we will
support this in the future.
Note that SGX EPC is currently hardcoded to reside above 4g.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-18-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose SGX to the guest if and only if KVM is enabled and supports
virtualization of SGX. While the majority of ENCLS can be emulated to
some degree, because SGX uses a hardware-based root of trust, the
attestation aspects of SGX cannot be emulated in software, i.e.
ultimately emulation will fail as software cannot generate a valid
quote/report. The complexity of partially emulating SGX in Qemu far
outweighs the value added, e.g. an SGX specific simulator for userspace
applications can emulate SGX for development and testing purposes.
Note, access to the PROVISIONKEY is not yet advertised to the guest as
KVM blocks access to the PROVISIONKEY by default and requires userspace
to provide additional credentials (via ioctl()) to expose PROVISIONKEY.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because SGX EPC is enumerated through CPUID, EPC "devices" need to be
realized prior to realizing the vCPUs themselves, i.e. long before
generic devices are parsed and realized. From a virtualization
perspective, the CPUID aspect also means that EPC sections cannot be
hotplugged without paravirtualizing the guest kernel (hardware does
not support hotplugging as EPC sections must be locked down during
pre-boot to provide EPC's security properties).
So even though EPC sections could be realized through the generic
-devices command, they need to be created much earlier for them to
actually be usable by the guest. Place all EPC sections in a
contiguous block, somewhat arbitrarily starting after RAM above 4g.
Ensuring EPC is in a contiguous region simplifies calculations, e.g.
device memory base, PCI hole, etc..., allows dynamic calculation of the
total EPC size, e.g. exposing EPC to guests does not require -maxmem,
and last but not least allows all of EPC to be enumerated in a single
ACPI entry, which is expected by some kernels, e.g. Windows 7 and 8.
The new compound properties command for sgx like below:
......
-object memory-backend-epc,id=mem1,size=28M,prealloc=on \
-object memory-backend-epc,id=mem2,size=10M \
-M sgx-epc.0.memdev=mem1,sgx-epc.1.memdev=mem2
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-6-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SGX EPC is enumerated through CPUID, i.e. EPC "devices" need to be
realized prior to realizing the vCPUs themselves, which occurs long
before generic devices are parsed and realized. Because of this,
do not allow 'sgx-epc' devices to be instantiated after vCPUS have
been created.
The 'sgx-epc' device is essentially a placholder at this time, it will
be fully implemented in a future patch along with a dedicated command
to create 'sgx-epc' devices.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-5-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
EPC (Enclave Page Cahe) is a specialized type of memory used by Intel
SGX (Software Guard Extensions). The SDM desribes EPC as:
The Enclave Page Cache (EPC) is the secure storage used to store
enclave pages when they are a part of an executing enclave. For an
EPC page, hardware performs additional access control checks to
restrict access to the page. After the current page access checks
and translations are performed, the hardware checks that the EPC
page is accessible to the program currently executing. Generally an
EPC page is only accessed by the owner of the executing enclave or
an instruction which is setting up an EPC page.
Because of its unique requirements, Linux manages EPC separately from
normal memory. Similar to memfd, the device /dev/sgx_vepc can be
opened to obtain a file descriptor which can in turn be used to mmap()
EPC memory.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commits aa57020774 ("numa: move numa global variable
nb_numa_nodes into MachineState") and 7e721e7b10 ("numa: move
numa global variable numa_info into MachineState"), we can get
NUMA information completely from MachineState::numa_state.
Remove PCMachineState::numa_nodes and PCMachineState::node_mem,
since they are just copied from MachineState::numa_state.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <20210823011254.28506-1-jingqi.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add 6.2 machine types for arm/i440fx/q35/s390x/spapr.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a default_bus_bypass_iommu pc machine option to enable/disable
bypass_iommu for default root bus. The option is disabled by default
and can be enabled with:
$QEMU -machine q35,default_bus_bypass_iommu=true
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-5-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>