Just unfold its definition in only use.
Signed-off-by: Juan Quintela <quintela@redhat.com>
[peter.maydell@linaro.org: fixed typo in the debug code,
added parentheses to fix precedence issue]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Looking at the other architectures, we should be using "how" not "arg1".
Signed-off-by: Juan Quintela <quintela@redhat.com>
[peter.maydell@linaro.org: remove unnecessary initialisation of how]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
We assign ret with the error code, but then return 0 unconditionally.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
The dynamic linker from the GNU C library v2.10+ uses the ELF
auxiliary vector AT_RANDOM [1] as a pointer to 16 bytes with random
values to initialize the stack protection mechanism. Technically the
emulated GNU dynamic linker crashes due to a NULL pointer
derefencement if it is built with stack protection enabled and if
AT_RANDOM is not defined by the QEMU ELF loader.
[1] This ELF auxiliary vector was introduced in Linux v2.6.29.
This patch can be tested with the code above:
#include <elf.h> /* Elf*_auxv_t, AT_RANDOM, */
#include <stdio.h> /* printf(3), */
#include <stdlib.h> /* exit(3), EXIT_*, */
#include <stdint.h> /* uint8_t, */
#include <string.h> /* memcpy(3), */
#if defined(__LP64__) || defined(__ILP64__) || defined(__LLP64__)
# define Elf_auxv_t Elf64_auxv_t
#else
# define Elf_auxv_t Elf32_auxv_t
#endif
main(int argc, char* argv[], char* envp[])
{
Elf_auxv_t *auxv;
/* *envp = NULL marks end of envp. */
while (*envp++ != NULL);
/* auxv->a_type = AT_NULL marks the end of auxv. */
for (auxv = (Elf_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
if (auxv->a_type == AT_RANDOM) {
int i;
uint8_t rand_bytes[16];
printf("AT_RANDOM is: 0x%x\n", auxv->a_un.a_val);
memcpy(rand_bytes, (const uint8_t *)auxv->a_un.a_val, sizeof(rand_bytes));
printf("it points to: ");
for (i = 0; i < 16; i++) {
printf("0x%02x ", rand_bytes[i]);
}
printf("\n");
exit(EXIT_SUCCESS);
}
}
exit(EXIT_FAILURE);
}
Changes introduced in v2 and v3:
* Fix typos + thinko (AT_RANDOM is used for stack canary, not for
ASLR)
* AT_RANDOM points to 16 random bytes stored inside the user
stack.
* Add a small test program.
Signed-off-by: Cédric VINCENT <cedric.vincent@st.com>
Signed-off-by: Laurent ALFONSI <laurent.alfonsi@st.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Some architectures (like Blackfin) only implement pselect6 (and skip
select/newselect). So add support for it.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
There were several remaining bugs in the previous implementation of
do_brk():
1. the value of "new_alloc_size" was one page too large when the
requested brk was aligned on a host page boundary.
2. no new pages should be (re-)allocated when the requested brk is
in the range of the pages that were already allocated
previsouly (for the same purpose). Technically these pages are
never unmapped in the current implementation.
The problem/fix can be reproduced/validated with the test-suite above:
#include <unistd.h> /* syscall(2), */
#include <sys/syscall.h> /* SYS_brk, */
#include <stdio.h> /* puts(3), */
#include <stdlib.h> /* exit(3), EXIT_*, */
#include <stdint.h> /* uint*_t, */
#include <sys/mman.h> /* mmap(2), MAP_*, */
#include <string.h> /* memset(3), */
int main()
{
int exit_status = EXIT_SUCCESS;
uint8_t *current_brk = 0;
uint8_t *initial_brk;
uint8_t *new_brk;
uint8_t *old_brk;
int failure = 0;
int i;
void test_brk(int increment, int expected_result) {
new_brk = (uint8_t *)syscall(SYS_brk, current_brk + increment);
if ((new_brk == current_brk) == expected_result)
failure = 1;
current_brk = (uint8_t *)syscall(SYS_brk, 0);
}
void test_result() {
if (!failure)
puts("OK");
else {
puts("failure");
exit_status = EXIT_FAILURE;
}
}
void test_title(const char *title) {
failure = 0;
printf("%-45s : ", title);
fflush(stdout);
}
test_title("Initialization");
test_brk(0, 1);
initial_brk = current_brk;
test_result();
test_title("Don't overlap \"brk\" pages");
test_brk(HOST_PAGE_SIZE, 1);
test_brk(HOST_PAGE_SIZE, 1);
test_result();
/* Preparation for the test "Re-allocated heap is initialized". */
old_brk = current_brk - HOST_PAGE_SIZE;
memset(old_brk, 0xFF, HOST_PAGE_SIZE);
test_title("Don't allocate the same \"brk\" page twice");
test_brk(-HOST_PAGE_SIZE, 1);
test_brk(HOST_PAGE_SIZE, 1);
test_result();
test_title("Re-allocated \"brk\" pages are initialized");
for (i = 0; i < HOST_PAGE_SIZE; i++) {
if (old_brk[i] != 0) {
printf("(index = %d, value = 0x%x) ", i, old_brk[i]);
failure = 1;
break;
}
}
test_result();
test_title("Don't allocate \"brk\" pages over \"mmap\" pages");
new_brk = mmap(current_brk, HOST_PAGE_SIZE / 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
if (new_brk == (void *) -1)
puts("unknown");
else {
test_brk(HOST_PAGE_SIZE, 0);
test_result();
}
test_title("All \"brk\" pages are writable (please wait)");
if (munmap(current_brk, HOST_PAGE_SIZE / 2) != 0)
puts("unknown");
else {
while (current_brk - initial_brk < 2*1024*1024*1024UL) {
old_brk = current_brk;
test_brk(HOST_PAGE_SIZE, -1);
if (old_brk == current_brk)
break;
for (i = 0; i < HOST_PAGE_SIZE; i++)
old_brk[i] = 0xAA;
}
puts("OK");
}
test_title("Maximum size of the heap > 16MB");
failure = (current_brk - initial_brk) < 16*1024*1024;
test_result();
exit(exit_status);
}
Changes introduced in patch v2:
* extend the "brk" test-suite embedded within the commit message;
* heap contents have to be initialized to zero, this bug was
exposed by "tst-calloc.c" from the GNU C library;
* don't [try to] allocate a new host page if the new "brk" is
equal to the latest allocated host page ("brk_page"); and
* print some debug information when DEBUGF_BRK is defined.
Signed-off-by: Cédric VINCENT <cedric.vincent@st.com>
Reviewed-by: Christophe Guillon <christophe.guillon@st.com>
Cc: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
In the m68k semihosting implementation of HOSTED_INIT_SIM, use the correct
check for whether do_brk() has failed -- it does not return -1 but the
previous value of the break limit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
In the ARM semihosting implementation of SYS_HEAPINFO, use the correct
check for whether do_brk() has failed -- it does not return -1 but the
previous value of the break limit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Since mmap() with MAP_FIXED will map over the top of existing mappings,
it's a bad idea to use it to implement brk(), because brk() with a
large size is likely to overwrite important things like qemu itself
or the host libc. So we drop MAP_FIXED and handle "mapped but at
different address" as an error case instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Fix a bug in the linux-user ELF loader code where it was not correctly
handling images where the lowest vaddr to be loaded was not page aligned.
The problem was that the code to probe for a suitable guest base address
was changing the 'loaddr' variable (by rounding it to a page boundary),
which meant that the load bias would then be incorrectly calculated
unless loaddr happened to already be page-aligned.
Binaries generated by gcc with the default linker script do start with
a loadable segment at a page-aligned vaddr, so were unaffected. This
bug was noticed with a binary created by the Google Go toolchain for ARM.
We fix the bug by refactoring the "probe for guest base" code out into
its own self-contained function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
This patch fixes a "double free()" due to "realloc(syms, 0)" in the
loader when the ELF file has no "useful" symbol, as with the following
example (compiled with "sh4-linux-gcc -nostdlib"):
.text
.align 1
.global _start
_start:
mov #1, r3
trapa #40 // syscall(__NR_exit)
nop
The bug appears when the log (option "-d") is enabled.
Signed-off-by: Cédric VINCENT <cedric.vincent@st.com>
Signed-off-by: Yves JANIN <yves.janin@st.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Reviewed-by: Richard Henderson <rth@twiddle.net>
When iterating through the XSAVE feature enumeration CPUID leaf (0xD)
we should not stop at the first zero EAX, but instead keep scanning
since there are gaps in the enumeration (ECX=1 for instance).
This fixes the proper usage of AVX in KVM guests.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
kvm_arch_get_supported_cpuid checks for global cpuid restrictions, it
does not require any CPUState reference. Changing its interface allows
to call it before any VCPU is initialized.
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
No one references kvm_check_extension, kvm_has_vcpu_events, and
kvm_has_robust_singlestep outside KVM code.
kvm_update_guest_debug is never called, thus has no job besides
returning an error.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
No longer needed with accompanied kernel headers.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
No longer needed with accompanied kernel headers.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
No longer needed with accompanied kernel headers. We are only left with
build dependencies that are controlled by kvm arch headers.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Required header support is now unconditionally available.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This helps reducing our build-time checks for feature support in the
available Linux kernel headers. And it helps users that do not have
sufficiently recent headers installed on their build machine.
Consequently, the patch removes and build-time checks for kvm and vhost
in configure, the --kerneldir switch, and KVM_CFLAGS. Kernel headers are
supposed to be provided by QEMU only.
s390 needs some extra love as it carries redefinitions from kernel
headers.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
These kernel headers and the COPYING file were automatically imported
from current Linux git, cb0a02ecf9 (post 3.0-rc2).
Licensing:
asm-powerpc GPLv2
asm-s390 GPLv2
asm-x86 Linux top-level license (GPLv2 with exception)
linux/kvm*: Linux top-level license (GPLv2 with exception)
linux/vhost: Linux top-level license (GPLv2 with exception)
linux/virtio*: 3-clause BSB
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This helper pulls the required kernel headers for KVM and vhost into a
specified directory. The update is triggered via
scripts/update-linux-headers.sh LINUX_PATH
and will place the output under linux-headers/linux and linux-headers/asm-*.
It also imports the COPYING to care for headers without an explicit license.
CC: Alexander Graf <agraf@suse.de>
CC: Christoph Hellwig <hch@lst.de>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This warning is new in gcc 4.6.
Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Fixes crash in i386 when user emulation base address is non-zero.
21797 rt_sigreturn(8,1082124603,1,0,1082126048,1082126248)Exit reason and status: signal 11
Signed-off-by: Mike McCormack <mj.mccormack@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
These FPU states are properly maintained by KVM but not yet by TCG. So
far we unconditionally set them to 0 in the guest which may cause
state corruptions, though not with modern guests.
To avoid breaking backward migration, use a conditional subsection that
is only written if any of the three fields is non-zero. The guest's
FNINIT clears them frequently, and cleared IA32_MISC_ENABLE MSR[2]
reduces the probability of non-zero values further so that this
subsection is not expected to restrict migration in any common scenario.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce a new emulated PCI device, specific to fully virtualized Xen
guests. The device is necessary for PV on HVM drivers to work.
Signed-off-by: Steven Smith <ssmith@xensource.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Compared to the last version I only added a comment to the code.
- remove i440FX-xen and i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX anymore;
- introduce PIIX3-xen and piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;
- set the number of PIIX3-xen interrupts line to 128;
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Xen can only do dirty bit tracking for one memory region, so we should
explicitly avoid trying to track anything but the vga vram region.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
If the cirrus_vga PCI BAR is unmapped than we should not only reset
map_addr but also lfb_addr, otherwise we'll keep trying to map
the old lfb_addr in map_linear_vram.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Use qemu_invalidate_entry in cpu_physical_memory_unmap.
Do not lock mapcache entries in qemu_get_ram_ptr if the address falls in
the ramblock with offset == 0. We don't need to do that because the
callers of qemu_get_ram_ptr either try to map an entire block, other
from the main ramblock, or until the end of a page to implement a single
read or write in the main ramblock.
If we don't lock mapcache entries in qemu_get_ram_ptr we don't need to
call qemu_invalidate_entry in qemu_put_ram_ptr anymore because we can
leave with few long lived block mappings requested by devices.
Also move the call to qemu_ram_addr_from_mapcache at the beginning of
qemu_ram_addr_from_host.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Introduce qemu_ram_ptr_length that takes an address and a size as
parameters rather than just an address.
Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only
once rather than calling qemu_get_ram_ptr one time per page.
This is not only more efficient but also tries to simplify the logic of
the function.
Currently we are relying on the fact that all the pages are mapped
contiguously in qemu's address space: we have a check to make sure that
the virtual address returned by qemu_get_ram_ptr from the second call on
is consecutive. Now we are making this more explicit replacing all the
calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length
passing a size argument.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: agraf@suse.de
CC: anthony@codemonkey.ws
Signed-off-by: Alexander Graf <agraf@suse.de>
Replace xen_map_block with qemu_map_cache with the appropriate locking
and size parameters.
Replace xen_unmap_block with qemu_invalidate_entry.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
There is no need for qemu_map_cache_unlock, just use
qemu_invalidate_entry instead.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix the implementation of qemu_map_cache: correctly support size
arguments different from 0 or MCACHE_BUCKET_SIZE.
The new implementation supports locked mapcache entries with size
multiple of MCACHE_BUCKET_SIZE. qemu_invalidate_entry can correctly
find and unmap these "large" mapcache entries given that the virtual
address passed to qemu_invalidate_entry is the same returned by
qemu_map_cache when the locked mapcache entry was created.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch introduces phys memory client for Xen.
Only sync dirty_bitmap and set_memory are actually implemented.
migration_log will stay empty for the moment.
Xen can only log one range for bit change, so only the range in the
first call will be synced.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This function will be used to support sync dirty bitmap.
This come with a check against every Xen release, and special
implementation for Xen version that doesn't have this specific call.
This function will not be usable with Xen 3.3 because the behavior is
different.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Until now, we've created a union over multiple different TLB types and
allocated that union. While it's a waste of memory (and cache) to allocate
TLB information for a TLB type with much information when you only need
little, it also inflicts another issue.
With the new KVM API, we can now share the TLB between KVM and qemu, but
for that to work we need to have both be in the same layout. We can't just
stretch it over to fit some internal different TLB representation.
Hence this patch moves all TLB types to their own array, allowing us to only
address and allocate exactly the boundaries required for the specific TLB
type at hand.
Signed-off-by: Alexander Graf <agraf@suse.de>
We have some KVM interaction code in Qemu that tries to be clever and
ignore some capabilities when running on BookE style MMUs. Unfortunately,
the default CPU bamboo was defaulting to was not a BookE-style MMU,
resulting in the check to fail.
With this patch, guests can run again on 440 with -enable-kvm.
Signed-off-by: Alexander Graf <agraf@suse.de>
The natural format for e500 cores to do TLB manipulation with are the MAS
registers. Instead of converting them into some internal representation
and back again when the guest reads them, we can just keep the data
identical to the way the guest passed it to us.
The main advantage of this approach is that we're getting closer to being
able to share MMU data with KVM using shared memory, so that we don't need
to copy lots of MMU data back and forth all the time. For this to work
however, another patch is required that gets rid of the TLB union, as that
destroys our memory layout that needs to be identical with the kernel one.
Signed-off-by: Alexander Graf <agraf@suse.de>
As Nathan pointed out correctly, the mtmsr instruction does not modify
the high 32 bits of MSR. It also doesn't matter if SF is set or not,
the instruction always behaves the same.
This patch moves it a bit closer to the spec.
Reported-by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
There were some changes upstream to account for broken usage of mtmsr, so
before applying the mtmsr patch we need to update OpenBIOS, otherwise the
PPC target would break.
Signed-off-by: Alexander Graf <agraf@suse.de>
When running a PPC guest with KVM that can do PV operations, we need
to indicate the guest which instructions to use for a hypercall and
that it is running as KVM guest.
This logic was available on openbios based machines already. This patch
also adds said functionality to the mpc8544ds machine.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Scott Wood <scottwood@freescale.com>
During testing, I was generating a vmlinux binary that easily occupied
more than 20MB of RAM. Since the current -kernel code loads the initrd
at a fixed address behind the kernel, we were overwriting kernel data
when the kernel got too big.
To finally get rid of the issue, let's calculate the initrd and cmdline
addresses relative to the kernel size, so we can have kernels and initrds
that are as big as they want to - as long as they fit in RAM.
Signed-off-by: Alexander Graf <agraf@suse.de>
On at least the PowerPC 601, a direct-store (T=1) with bus unit ID 0x07F
is special-cased as memory-forced I/O controller access. It is supposed
to be checked immediately if T=1, bypassing all protection mechanisms
and acting cache-inhibited and global.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Simplified by avoiding reindentation. Added explanatory comments.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Alexander Graf <agraf@suse.de>