ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
Daniel P. Berrangé	b57151ac03	crypto: check that LUKS PBKDF2 iterations count is non-zero Both the master key and key slot passphrases are run through the PBKDF2 algorithm. The iterations count is expected to be generally very large (many 10's or 100's of 1000s). It is hard to define a low level cutoff, but we can certainly say that iterations count should be non-zero. A zero count likely indicates an initialization mistake so reject it. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	c5f6962801	crypto: strengthen the check for key slots overlapping with LUKS header The LUKS header data on disk is a fixed size, however, there's expected to be a gap between the end of the header and the first key slot to get alignment with the 2nd sector on 4k drives. This wasn't originally part of the LUKS spec, but was always part of the reference implementation, so it is worth validating this. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	d233fbc327	crypto: validate that LUKS payload doesn't overlap with header We already validate that LUKS keyslots don't overlap with the header, or with each other. This closes the remaining hole in validation of LUKS file regions. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	93569c3730	crypto: enforce that key material doesn't overlap with LUKS header We already check that key material doesn't overlap between key slots, and that it doesn't overlap with the payload. We didn't check for overlap with the LUKS header. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	f1195961f3	crypto: enforce that LUKS stripes is always a fixed value Although the LUKS stripes are encoded in the keyslot header and so potentially configurable, in pratice the cryptsetup impl mandates this has the fixed value 4000. To avoid incompatibility apply the same enforcement in QEMU too. This also caps the memory usage for key material when QEMU tries to open a LUKS volume. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	c1d8634c20	crypto: sanity check that LUKS header strings are NUL-terminated The LUKS spec requires that header strings are NUL-terminated, and our code relies on that. Protect against maliciously crafted headers by adding validation. Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 12:55:27 +01:00
Daniel P. Berrangé	f1018ea0a3	tests: avoid DOS line endings in PSK file Using FILE * APIs for writing the PSK file results in translation from UNIX to DOS line endings on Windows. When the crypto PSK code later loads the credentials the stray \r will result in failure to load the PSK credentials into GNUTLS. Rather than switching the FILE* APIs to open in binary format, just switch to the more concise g_file_set_contents API. Reviewed-by: Bin Meng <bmeng.cn@gmail.com> Tested-by: Bin Meng <bmeng.cn@gmail.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 11:55:41 +01:00
Daniel P. Berrangé	3983bf1b41	crypto: check for and report errors setting PSK credentials If setting credentials fails, the handshake will later fail to complete with an obscure error message which is hard to diagnose. Reviewed-by: Bin Meng <bmeng.cn@gmail.com> Tested-by: Bin Meng <bmeng.cn@gmail.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 11:55:41 +01:00
Daniel P. Berrangé	dd84a906e0	scripts: check if .git exists before checking submodule status Currently we check status of each submodule, before actually checking if we're in a git repo. These status commands will all fail, but we are hiding their output so we don't see it currently. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-27 11:54:37 +01:00
Jason A. Donenfeld	6233a13859	mips/malta: pass RNG seed via env var and re-randomize on reboot As of the kernel commit linked below, Linux ingests an RNG seed passed as part of the environment block by the bootloader or firmware. This mechanism works across all different environment block types, generically, which pass some block via the second firmware argument. On malta, this has been tested to work when passed as an argument from U-Boot's linux_env_set. As is the case on most other architectures (such as boston), when booting with `-kernel`, QEMU, acting as the bootloader, should pass the RNG seed, so that the machine has good entropy for Linux to consume. So this commit implements that quite simply by using the guest random API, which is what is used on nearly all other archs too. It also reinitializes the seed on reboot, so that it is always fresh. Link: https://git.kernel.org/torvalds/c/056a68cea01 Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:47:45 +01:00
Jason A. Donenfeld	a76b911c0d	rx: re-randomize rng-seed on reboot When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Since the FDT is in the ROM region at this point, we add a hook right after the ROM has been added, so that we have a pointer to that copy of the FDT. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-12-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	2db07d0506	openrisc: re-randomize rng-seed on reboot When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Since the FDT is in the ROM region at this point, we add a hook right after the ROM has been added, so that we have a pointer to that copy of the FDT. Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-11-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	4fbae24450	mips/boston: re-randomize rng-seed on reboot When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Since the FDT is in the ROM region at this point, we add a hook right after the ROM has been added, so that we have a pointer to that copy of the FDT. Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-9-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	fbbbe7eb23	m68k/q800: do not re-randomize RNG seed on snapshot load Snapshot loading is supposed to be deterministic, so we shouldn't re-randomize the various seeds used. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-8-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	1ffd007c9c	m68k/virt: do not re-randomize RNG seed on snapshot load Snapshot loading is supposed to be deterministic, so we shouldn't re-randomize the various seeds used. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-7-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	64c75db3c5	riscv: re-randomize rng-seed on reboot When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Since the FDT is in the ROM region at this point, we add a hook right after the ROM has been added, so that we have a pointer to that copy of the FDT. Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Alistair Francis <alistair.francis@wdc.com> Cc: Bin Meng <bin.meng@windriver.com> Cc: qemu-riscv@nongnu.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20221025004327.568476-6-Jason@zx2c4.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	98aa4c839d	arm: re-randomize rng-seed on reboot When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Since the FDT is in the ROM region at this point, we add a hook right after the ROM has been added, so that we have a pointer to that copy of the FDT. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-arm@nongnu.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-5-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	14b29fea74	x86: do not re-randomize RNG seed on snapshot load Snapshot loading is supposed to be deterministic, so we shouldn't re-randomize the various seeds used. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-4-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	e1e618b9a0	device-tree: add re-randomization helper function When the system reboots, the rng-seed that the FDT has should be re-randomized, so that the new boot gets a new seed. Several architectures require this functionality, so export a function for injecting a new seed into the given FDT. Cc: Alistair Francis <alistair.francis@wdc.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20221025004327.568476-3-Jason@zx2c4.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	7966d70f6f	reset: allow registering handlers that aren't called by snapshot loading Snapshot loading only expects to call deterministic handlers, not non-deterministic ones. So introduce a way of registering handlers that won't be called when reseting for snapshots. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-2-Jason@zx2c4.com [PMM: updated json doc comment with Markus' text; fixed checkpatch style nit] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Richard Henderson	c8d6c286ab	target/arm: Use the max page size in a 2-stage ptw We had only been reporting the stage2 page size. This causes problems if stage1 is using a larger page size (16k, 2M, etc), but stage2 is using a smaller page size, because cputlb does not set large_page_{addr,mask} properly. Fix by using the max of the two page sizes. Reported-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Richard Henderson	65c123fdf5	target/arm: Implement FEAT_HAFDBS, dirty bit portion Perform the atomic update for hardware management of the dirty bit. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-14-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Richard Henderson	71943a1e90	target/arm: Implement FEAT_HAFDBS, access flag portion Perform the atomic update for hardware management of the access flag. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-13-richard.henderson@linaro.org [PMM: Fix accidental PROT_WRITE to PAGE_WRITE; add missing main-loop.h include] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:24 +01:00
Richard Henderson	34a57faeab	target/arm: Tidy merging of attributes from descriptor and table Replace some gotos with some nested if statements. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-12-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:24 +01:00
Richard Henderson	0e8df0fe24	target/arm: Consider GP an attribute in get_phys_addr_lpae Both GP and DBM are in the upper attribute block. Extend the computation of attrs to include them, then simplify the setting of guarded. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-11-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:24 +01:00
Richard Henderson	4566609176	target/arm: Don't shift attrs in get_phys_addr_lpae Leave the upper and lower attributes in the place they originate from in the descriptor. Shifting them around is confusing, since one cannot read the bit numbers out of the manual. Also, new attributes have been added which would alter the shifts. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20221024051851.3074715-10-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:24 +01:00
Richard Henderson	27c1b81d61	target/arm: Fix fault reporting in get_phys_addr_lpae Always overriding fi->type was incorrect, as we would not properly propagate the fault type from S1_ptw_translate, or arm_ldq_ptw. Simplify things by providing a new label for a translation fault. For other faults, store into fi directly. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-9-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:24 +01:00
Richard Henderson	fe4ddc151b	target/arm: Remove loop from get_phys_addr_lpae The unconditional loop was used both to iterate over levels and to control parsing of attributes. Use an explicit goto in both cases. While this appears less clean for iterating over levels, we will need to jump back into the middle of this loop for atomic updates, which is even uglier. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-8-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	f0a398a249	target/arm: Add ARMFault_UnsuppAtomicUpdate This fault type is to be used with FEAT_HAFDBS when the guest enables hw updates, but places the tables in memory where atomic updates are unsupported. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-7-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	93e5b3a6f9	target/arm: Move S1_ptw_translate outside arm_ld[lq]_ptw Separate S1 translation from the actual lookup. Will enable lpae hardware updates. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-6-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	8973922783	target/arm: Extract HA and HD in aa64_va_parameters Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	980a68925c	target/arm: Add isar predicates for FEAT_HAFDBS The MMFR1 field may indicate support for hardware update of access flag alone, or access flag and dirty bit. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	48da29e485	target/arm: Add ptw_idx to S1Translate Hoist the computation of the mmu_idx for the ptw up to get_phys_addr_with_struct and get_phys_addr_twostage. This removes the duplicate check for stage2 disabled from the middle of the walk, performing it only once. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20221024051851.3074715-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Richard Henderson	edc05dd43a	target/arm: Introduce regime_is_stage2 Reduce the amount of typing required for this check. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221024051851.3074715-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Axel Heider	7719419deb	target/imx: reload cmp timer outside of the reload ptimer transaction When running seL4 tests (https://docs.sel4.systems/projects/sel4test) on the sabrelight platform, the timer tests fail. The arm/imx6 EPIT timer interrupt does not fire properly, instead of a e.g. second in can take up to a minute to finally see the interrupt. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1263 Signed-off-by: Axel Heider <axel.heider@hensoldt.net> Message-id: 166663118138.13362.1229967229046092876-0@git.sr.ht Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Peter Maydell	7764963b92	hw/hyperv/hyperv.c: Use device_cold_reset() instead of device_legacy_reset() The semantic difference between the deprecated device_legacy_reset() function and the newer device_cold_reset() function is that the new function resets both the device itself and any qbuses it owns, whereas the legacy function resets just the device itself and nothing else. In hyperv_synic_reset() we reset a SynICState, which has no qbuses, so for this purpose the two functions behave identically and we can stop using the deprecated one. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-id: 20221013171817.1447562-1-peter.maydell@linaro.org	2022-10-27 10:27:23 +01:00
Damien Hedde	310616d367	hw/core/resettable: fix reset level counting The code for handling the reset level count in the Resettable code has two issues: The reset count is only decremented for the 1->0 case. This means that if there's ever a nested reset that takes the count to 2 then it will never again be decremented. Eventually the count will exceed the '50' limit in resettable_phase_enter() and QEMU will trip over the assertion failure. The repro case in issue 1266 is an example of this that happens now the SCSI subsystem uses three-phase reset. Secondly, the count is decremented only after the exit phase handler is called. Moving the reset count decrement from "just after" to "just before" calling the exit phase handler allows resettable_is_in_reset() to return false during the handler execution. This simplifies reset handling in resettable devices. Typically, a function that updates the device state will just need to read the current reset state and not anymore treat the "in a reset-exit transition" as a special case. Note that the semantics change to the *_is_in_reset() functions will have no effect on the current codebase, because only two devices (hw/char/cadence_uart.c and hw/misc/zynq_sclr.c) currently call those functions, and in neither case do they do it from the device's exit phase methed. Fixes: `4a5fc890` ("scsi: Use device_cold_reset() and bus_cold_reset()") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1266 Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Michael Peter <michael.peter@hensoldt-cyber.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20221020142749.3357951-1-peter.maydell@linaro.org Buglink: https://bugs.launchpad.net/qemu/+bug/1905297 Reported-by: Michael Peter <michael.peter@hensoldt-cyber.com> [PMM: adjust the docs paragraph changed to get the name of the 'enter' phase right and to clarify exactly when the count is adjusted; rewrite the commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Ake Koomsin	c939a7c7b9	target/arm: honor HCR_E2H and HCR_TGE in arm_excp_unmasked() An exception targeting EL2 from lower EL is actually maskable when HCR_E2H and HCR_TGE are both set. This applies to both secure and non-secure Security state. We can remove the conditions that try to suppress masking of interrupts when we are Secure and the exception targets EL2 and Secure EL2 is disabled. This is OK because in that situation arm_phys_excp_target_el() will never return 2 as the target EL. The 'not if secure' check in this function was originally written before arm_hcr_el2_eff(), and back then the target EL returned by arm_phys_excp_target_el() could be 2 even if we were in Secure EL0/EL1; but it is no longer needed. Signed-off-by: Ake Koomsin <ake@igel.co.jp> Message-id: 20221017092432.546881-1-ake@igel.co.jp [PMM: Add commit message paragraph explaining why it's OK to remove the checks on secure and SCR_EEL2] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Jean-Philippe Brucker	7cd5d384bb	hw/arm/virt: Fix devicetree warnings about the virtio-iommu node The "PCI Bus Binding to: IEEE Std 1275-1994" defines the compatible string for a PCIe bus or endpoint as "pci<vendorid>,<deviceid>" or similar. Since the initial binding for PCI virtio-iommu didn't follow this rule, it was modified to accept both strings and ensure backward compatibility. Also, the unit-name for the node should be "device,function". Fix corresponding dt-validate and dtc warnings: pcie@10000000: virtio_iommu@16:compatible: ['virtio,pci-iommu'] does not contain items matching the given schema pcie@10000000: Unevaluated properties are not allowed (... 'virtio_iommu@16' were unexpected) From schema: linux/Documentation/devicetree/bindings/pci/host-generic-pci.yaml virtio_iommu@16: compatible: 'oneOf' conditional failed, one must be fixed: ['virtio,pci-iommu'] is too short 'pci1af4,1057' was expected From schema: dtschema/schemas/pci/pci-bus.yaml Warning (pci_device_reg): /pcie@10000000/virtio_iommu@16: PCI unit address format error, expected "2,0" Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 10:27:23 +01:00
Peter Maydell	e4c93e44ab	target/arm: Implement FEAT_E0PD FEAT_E0PD adds new bits E0PD0 and E0PD1 to TCR_EL1, which allow the OS to forbid EL0 access to half of the address space. Since this is an EL0-specific variation on the existing TCR_ELx.{EPD0,EPD1}, we can implement it entirely in aa64_va_parameters(). This requires moving the existing regime_is_user() to internals.h so that the code in helper.c can get at it. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20221021160131.3531787-1-peter.maydell@linaro.org	2022-10-27 10:27:23 +01:00
David Hildenbrand	bd77c30df9	vl: Allow ThreadContext objects to be created before the sandbox option Currently, there is no way to configure a CPU affinity inside QEMU when the sandbox option disables it for QEMU as a whole, for example, via: -sandbox enable=on,resourcecontrol=deny While ThreadContext objects can be created on the QEMU commandline and the CPU affinity can be configured externally via the thread-id, this is insufficient if a ThreadContext with a certain CPU affinity is already required during QEMU startup, before we can intercept QEMU and configure the CPU affinity. Blocking sched_setaffinity() was introduced in `24f8cdc572` ("seccomp: add resourcecontrol argument to command line"), "to avoid any bigger of the process". However, we only care about once QEMU is running, not when the instance starting QEMU explicitly requests a certain CPU affinity on the QEMU comandline. Right now, for NUMA-aware preallocation of memory backends used for initial machine RAM, one has to: 1) Start QEMU with the memory-backend with "prealloc=off" 2) Pause QEMU before it starts the guest (-S) 3) Create ThreadContext, configure the CPU affinity using the thread-id 4) Configure the ThreadContext as "prealloc-context" of the memory backend 5) Trigger preallocation by setting "prealloc=on" To simplify this handling especially for initial machine RAM, allow creation of ThreadContext objects before parsing sandbox options, such that the CPU affinity requested on the QEMU commandline alongside the sandbox option can be set. As ThreadContext objects essentially only create a persistent context thread and set the CPU affinity, this is easily possible. With this change, we can create a ThreadContext with a CPU affinity on the QEMU commandline and use it for preallocation of memory backends glued to the machine (simplified example): To make "-name debug-threads=on" keep working as expected for the context threads, perform earlier parsing of "-name". qemu-system-x86_64 -m 1G \ -object thread-context,id=tc1,cpu-affinity=3-4 \ -object memory-backend-ram,id=pc.ram,size=1G,prealloc=on,prealloc-threads=2,prealloc-context=tc1 \ -machine memory-backend=pc.ram \ -S -monitor stdio -sandbox enable=on,resourcecontrol=deny And while we can query the current CPU affinity: (qemu) qom-get tc1 cpu-affinity [ 3, 4 ] We can no longer change it from QEMU directly: (qemu) qom-set tc1 cpu-affinity 1-2 Error: Setting CPU affinity failed: Operation not permitted Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-8-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:01:09 +02:00
David Hildenbrand	e681645862	hostmem: Allow for specifying a ThreadContext for preallocation Let's allow for specifying a thread context via the "prealloc-context" property. When set, preallcoation threads will be crated via the thread context -- inheriting the same CPU affinity as the thread context. Pinning preallcoation threads to CPUs can heavily increase performance in NUMA setups, because, preallocation from a CPU close to the target NUMA node(s) is faster then preallocation from a CPU further remote, simply because of memory bandwidth for initializing memory with zeroes. This is especially relevant for very large VMs backed by huge/gigantic pages, whereby preallocation is mandatory. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-7-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:01:03 +02:00
David Hildenbrand	e04a34e55c	util: Make qemu_prealloc_mem() optionally consume a ThreadContext ... and implement it under POSIX. When a ThreadContext is provided, create new threads via the context such that these new threads obtain a properly configured CPU affinity. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-6-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:56 +02:00
David Hildenbrand	10218ae6d0	util: Add write-only "node-affinity" property for ThreadContext Let's make it easier to pin threads created via a ThreadContext to all host CPUs currently belonging to a given set of host NUMA nodes -- which is the common case. "node-affinity" is simply a shortcut for setting "cpu-affinity" manually to the list of host CPUs belonging to the set of host nodes. This property can only be written. A simple QEMU example to set the CPU affinity to host node 1 on a system with two nodes, 24 CPUs each, whereby odd-numbered host CPUs belong to host node 1: qemu-system-x86_64 -S \ -object thread-context,id=tc1,node-affinity=1 And we can query the cpu-affinity via HMP/QMP: (qemu) qom-get tc1 cpu-affinity [ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47 ] We cannot query the node-affinity: (qemu) qom-get tc1 node-affinity Error: Insufficient permission to perform this operation But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work. Note that if the host CPUs for a host node change due do CPU hot(un)plug CPU onlining/offlining (i.e., lscpu output changes) after the ThreadContext was started, the CPU affinity will not get updated. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221014134720.168738-5-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:50 +02:00
David Hildenbrand	e2de2c497e	util: Introduce ThreadContext user-creatable object Setting the CPU affinity of QEMU threads is a bit problematic, because QEMU doesn't always have permissions to set the CPU affinity itself, for example, with seccomp after initialized by QEMU: -sandbox enable=on,resourcecontrol=deny General information about CPU affinities can be found in the man page of taskset: CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. While upper layers are already aware of how to handle CPU affinities for long-lived threads like iothreads or vcpu threads, especially short-lived threads, as used for memory-backend preallocation, are more involved to handle. These threads are created on demand and upper layers are not even able to identify and configure them. Introduce the concept of a ThreadContext, that is essentially a thread used for creating new threads. All threads created via that context thread inherit the configured CPU affinity. Consequently, it's sufficient to create a ThreadContext and configure it once, and have all threads created via that ThreadContext inherit the same CPU affinity. The CPU affinity of a ThreadContext can be configured two ways: (1) Obtaining the thread id via the "thread-id" property and setting the CPU affinity manually (e.g., via taskset). (2) Setting the "cpu-affinity" property and letting QEMU try set the CPU affinity itself. This will fail if QEMU doesn't have permissions to do so anymore after seccomp was initialized. A simple QEMU example to set the CPU affinity to host CPU 0,1,6,7 would be: qemu-system-x86_64 -S \ -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7 And we can query it via HMP/QMP: (qemu) qom-get tc1 cpu-affinity [ 0, 1, 6, 7 ] But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work. In general, the interface behaves like pthread_setaffinity_np(): host CPU numbers that are currently not available are ignored; only host CPU numbers that are impossible with the current kernel will fail. If the list of host CPU numbers does not include a single CPU that is available, setting the CPU affinity will fail. A ThreadContext can be reused, simply by reconfiguring the CPU affinity. Note that the CPU affinity of previously created threads will not get adjusted. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221014134720.168738-4-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:43 +02:00
David Hildenbrand	7730f32c28	util: Introduce qemu_thread_set_affinity() and qemu_thread_get_affinity() Usually, we let upper layers handle CPU pinning, because pthread_setaffinity_np() (-> sched_setaffinity()) is blocked via seccomp when starting QEMU with -sandbox enable=on,resourcecontrol=deny However, we want to configure and observe the CPU affinity of threads from QEMU directly in some cases when the sandbox option is either not enabled or not active yet. So let's add a way to configure CPU pinning via qemu_thread_set_affinity() and obtain CPU affinity via qemu_thread_get_affinity() and implement them under POSIX using pthread_setaffinity_np() + pthread_getaffinity_np(). Implementation under Windows is possible using SetProcessAffinityMask() + GetProcessAffinityMask(), however, that is left as future work. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-3-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:36 +02:00
David Hildenbrand	6556aadc18	util: Cleanup and rename os_mem_prealloc() Let's * give the function a "qemu_" style name make sure the parameters in the implementation match the prototype * rename smp_cpus to max_threads, which makes the semantics of that parameter clearer ... and add a function documentation. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-2-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:28 +02:00
Thomas Huth	f7d81a351d	target/s390x: Fix emulation of the VISTR instruction The element size is encoded in the M3 field, not in the M4 field. Fixes: `be6324c6b7` ("s390x/tcg: Implement VECTOR ISOLATE STRING") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1248 Message-Id: <20221012182755.1014853-3-thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-10-27 09:09:50 +02:00
Thomas Huth	117ea96089	tests/tcg/s390x: Test compiler flags only once, not every time This is common practice, see the Makefile.target in the aarch64 folder for example. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20221012182755.1014853-2-thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-10-27 09:09:50 +02:00
Nico Boehr	38621181ae	s390x/tod-kvm: don't save/restore the TOD in PV guests Under PV, the guest's TOD clock is under control of the ultravisor and the hypervisor cannot change it. With upcoming kernel changes[1], the Linux kernel will reject QEMU's request to adjust the guest's clock in this case, so don't attempt to set the clock. This avoids the following warning message on save/restore of a PV guest: warning: Unable to set KVM guest TOD clock: Operation not supported [1] https://lore.kernel.org/all/20221011160712.928239-2-nrb@linux.ibm.com/ Fixes: `c3347ed0d2` ("s390x: protvirt: Support unpack facility") Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Message-Id: <20221012123229.1196007-1-nrb@linux.ibm.com> [thuth: Add curly braces] Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-10-27 09:09:50 +02:00

... 4 5 6 7 8 ...

99390 Commits All Branches Search

99390 Commits

All Branches