This patch addes a Bulldozer-based Opteron_G4 CPU model.
This version has the ffxsr bit actually disabled, to match what was
documented below. Thanks to Andre Przywara for spotting the bug.
I am trying to be conservative with the new model, so I am enabling only
features known to be useful to guests, and not enabling anything that
was not tested or found to be useful to a guest.
List of missing flags in comparison to real hardware:
- vme: host-specific feature.
- osxsave: it is not set here because it is set by the guest OS, not by KVM
- monitor: this is filtered out by the KVM module, so no point in
enabling it.
- mmxext: untested, so not enabled.
- Perf*, Topology*, lwp, ibs: not emulated by KVM.
- wdt, skinit, osvw, altmovcr8, extapicspace, cmplegacy: untested,
so not enabled.
List of new flags, in comparison to the Opteron_G3 model:
- xsave: xsave feature, already implemented by Qemu
- avx, aes, sse4.x, ssse3, pclmulqdq: all new state the new instructions
could use is handled by the xsave state loading/saving code on Qemu.
- pdpe1gb: 1GB pages, supported by the KVM kernel module.
- ffxsr: untested, so not enabled
- fma4, xop: all new state the new instructions could use is handled by
the xsave loading/saving code on Qemu.
- 3dnowprefetch: safe to pass through, though the flag is not used by
Linux guests, at least.
Below is the comparison between the current Opteron_G3 model
and the new model being added.
- The "full" line contains the flags found on actual hardware.
- The "missing" line shows the flags that are present on actual
hardware, but not on the added Opteron_G4 model.
- The "new" line shows the flags that were not on the Opteron_G3 model
but are on Opteron_G4.
feature_edx:
Opteron_G3: sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu
full: sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de vme fpu
Opteron_G4: sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu
missing: vme
feature_ecx:
Opteron_G3: popcnt cx16 monitor sse3
full: avx osxsave xsave aes popcnt sse4.2 sse4.1 cx16 ssse3 monitor pclmulqdq sse3
Opteron_G4: avx xsave aes popcnt sse4.2 sse4.1 cx16 ssse3 pclmulqdq sse3
missing: osxsave monitor
new: avx xsave aes sse4.2 sse4.1 ssse3 pclmulqdq
extfeature_edx:
Opteron_G3: lm rdtscp fxsr mmx nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de fpu
full: lm rdtscp pdpe1gb ffxsr fxsr mmx mmxext nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de vme fpu
Opteron_G4: lm rdtscp pdpe1gb fxsr mmx nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de fpu
missing: mmxext vme
new: pdpe1gb
extfeature_ecx:
Opteron_G3: misalignsse sse4a abm svm lahf_lm
full: Perf* Topology* fma4 lwp wdt skinit xop ibs osvw 3dnowprefetch misalignsse sse4a abm altmovcr8 extapicspace svm cmplegacy lahf_lm
Opteron_G4: fma4 xop 3dnowprefetch misalignsse sse4a abm svm lahf_lm
new: fma4 xop 3dnowprefetch
missing: Perf* Topology* lwp wdt skinit ibs osvw altmovcr8 extapicspace cmplegacy
Changes v1 -> v2:
- Actually disable ffxsr bit
Cc: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patches add the definition of a SandyBridge CPU model.
Summary of differences:
Flags present on actual hardware, but not on the added model definition:
- pbe, tm, ht, ss, acpi, vme, xTPR, tm2, eist, smx: host-specific
features, not exposed to guest.
- ds, ds-cpl, dtes64, pdcm: emulation not supported by KVM (although it
may be added in the future if implementing PMU virtualization)
- pcid, vmx, monitor: not emulated by Qemu/KVM right now.
- osxsave: set by the guest OS, not by Qemu.
Flags added, that were not present on Westmere model:
- xsave: already supported by Qemu
- avx, pclmulqdq: all new state the new instructions could use is
handled by xsave state loading/saving code.
- tsc-deadline, x2apic, rdtscp: already supported by Qemu/KVM.
Below there's a comparison of the features on the current Westmere CPU
model, and the SandyBridge CPU model.
- The "full" line contains the flags found on actual hardware.
- The "missing" line shows the flags that are present on actual
hardware, but not on the added SandyBridge model.
- The "new" line shows the flags that were not on the Westmere model,
but are on SandyBridge.
feature_edx:
Westmere: sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu
full: pbe tm ht ss sse2 sse fxsr mmx ds acpi clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pge msr tsc pse de vme fpu
SandyBridge: sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu
missing: pbe tm ht ss ds acpi vme
feature_ecx:
Westmere: aes popcnt sse4.2 sse4.1 cx16 ssse3 sse3
full: avx osxsave xsave aes tsc-deadline popcnt x2apic sse4.2 sse4.1 pcid pdcm xTPR cx16 ssse3 tm2 eist smx vmx ds-cpl monitor dtes64 pclmulqdq sse3
SandyBridge: avx xsave aes tsc-deadline popcnt x2apic sse4.2 sse4.1 cx16 ssse3 pclmulqdq sse3
missing: osxsave pcid pdcm xTPR tm2 eist smx vmx ds-cpl monitor dtes64
new: avx xsave tsc-deadline x2apic pclmulqdq
extfeature_edx:
Westmere: i64 nx syscall
full: i64 rdtscp nx syscall
SandyBridge: i64 rdtscp nx syscall
new: rdtscp
extfeature_ecx:
Westmere: lahf_lm
full: lahf_lm
SandyBridge: lahf_lm
Cc: "Dugger, Donald D" <donald.d.dugger@intel.com>
Cc: "Zhang, Xiantao" <xiantao.zhang@intel.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Property removal modifies the list, so it is not safe to continue
iteration. We know anyway that each object can have only one
parent (see object_property_add_child), so exit after finding
the requested object.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
These were stored as NULL due to wrong cut-and-paste from set_pointer.
Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The performance test will also check for nesting. It will do
a certain quantity of cycles, and each of one will do a depth
nesting process.
This is useful for benchmarking the creation of coroutines,
given that nesting is creation-intensive (and the other perf
test does not benchmark that).
Signed-off-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's possible to use sigaltstack backend with --with-coroutine=sigaltstack
v2: changed from enable/disable configure flags
Signed-off-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Configure tries, as a default, ucontext functions for the
coroutines. But now the user can force another backend by
--with-coroutine=BACKEND option
v2: Using --with-coroutine=BACKEND instead of enable
disable individual configure options
Signed-off-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This file is based in both coroutine-ucontext.c and
pth_mctx.c (from the GNU Portable Threads library).
The mechanism used to change stacks is the sigaltstack
function (variant 2 of the pth library).
v2: Some corrections. Moving global variables into
thread storage (CoroutineThreadState).
Signed-off-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the first part of a write request is allocated, but the second isn't
and it can be allocated so that the resulting area is contiguous, handle
it at once. This is a common case for sequential writes.
After this patch, alloc_cluster_offset() only checks if the clusters are
already allocated or how many new clusters can be allocated contigouosly.
The actual cluster allocation is split off into a new function
do_alloc_cluster_offset().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This function allows to allocate clusters at a given offset in the image
file. This is useful if you want to allocate the second part of an area
that must be contiguous.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Simplify the blockdev-snapshot-sync code and gain failsafe operation
by turning it into a wrapper around the new transaction command. A new
option is also added matching "mode".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The mode field lets a management application create the snapshot
destination outside QEMU.
Right now, the only modes are "existing" and "absolute-paths". Mirroring
introduces "no-backing-file". In the future "relative-paths" could be
implemented too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We will add other kinds of operation. Prepare for this by adjusting
the schema.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This creates a new test group 'quick' for some test case that take at
most a couple of seconds each, so that the group can be run during a
quick 'make check'
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-img resize has some limitations with qcow2, but the user is only
told that "this image format does not support resize". Quite confusing,
so add some more detailed error_report() calls and change "this image
format" into "this image".
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Monitor operations that manipulate image files must not execute while a
background job (like image streaming) is in progress. This prevents
corruptions from happening when two pieces of code are manipulating the
image file without knowledge of each other.
The monitor "commit" command raises QERR_DEVICE_IN_USE when
bdrv_commit() returns -EBUSY but "commit all" has no error handling.
This is easy to fix, although note that we do not deliver a detailed
error about which device was busy in the "commit all" case.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The L2 table cache reduces QED metadata reads that would be required
when translating LBAs to offsets into the image file. Since requests
execute in parallel it is possible to share an L2 table between multiple
requests.
There is a potential data corruption issue when an in-use L2 table is
evicted from the cache because the following situation occurs:
1. An allocating write performs an update to L2 table "A".
2. Another request needs L2 table "B" and causes table "A" to be
evicted.
3. A new read request needs L2 table "A" but it is not cached.
As a result the L2 update from #1 can overlap with the L2 fetch from #3.
We must avoid doing overlapping I/O requests here since the worst case
outcome is that the L2 fetch completes before the L2 update and yields
stale data. In that case we would effectively discard the L2 update and
lose data clusters!
Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
and debugging which lead to discovery of this bug.
Reported-by: Benoît Canet <benoit.canet@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Tested-by: Benoît Canet <benoit.canet@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The topic of whether and by whom docs/tracing.txt is maintained was
brought up. It currently does not have an official maintainer.
Add it to the tracing section so that Stefan gets cc'ed on patches.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch corrects the configure's trace option in docs/tracing.txt.
Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch makes trace_thread_create() to use its function arg to
initialize thread. The other choice is to make this a function to use
void arg, but i prefer this way.
Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
SystemTap provides a "semaphore" that can optionally be tested before
executing a trace event. The purpose of this mechanism is to skip
expensive tracing code when the trace event is disabled.
For example, some applications may have trace events that format or
convert strings for trace events. This expensive processing should only
be done in the case where the trace event is enabled.
Since QEMU's generated trace events never have such special-purpose
code, there is no reason to add the semaphore check.
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Adds a 'TRACE_${NAME}_ENABLED' preprocessor define for each tracing event in
"trace.h".
This lets the user conditionally compile code with a relatively high execution
cost that is only necessary when producing the tracing information for an event
that is enabled.
Note that events using this define will probably have the "disable" property by
default, in order to avoid such costs on regular builds.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Most MemoryRegionOps already had the const attribute.
This patch adds it to the remaining ones.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
tcg_out_label is always called with a third argument of pointer type
which was casted to tcg_target_long.
These casts can be avoided by changing the prototype of tcg_out_label.
There was also a cast to long. For most hosts with
sizeof(long) == sizeof(tcg_target_long) == sizeof(void *) this did not
matter, but for w64 it was wrong. This is fixed now.
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
MinGW-w64 and some versions of MinGW32 don't provide libiberty.a,
so add this library only if it was found.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
MinGW-w64 already defines lseek and ftruncate (and uses the 64 bit
variants). The conditional compilation avoids redefinitions
(which would be wrong) and compiler warnings.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit 021ecd8b9d breaks the build for
PPC hosts because it uses uintptr_t without the necessary include file.
uintptr_t is defined in stdint.h, so add this include.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Current code depends on variables defined in config-host.mak before it is
actually included.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Paul Brook <paul@codesourcery.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Too many VM kittens were killed since 7d03f82f81. Another one just died
under my fat fingers.
When you quit a kgdb session, does the Linux kernel power off? Or when
you terminate gdb attached to a hardware debugger, does your board
vanish in space? No.
So let's stop terminating QEMU when the gdbstub receives a kill commando
in system emulation mode. Real termination can still be achieved via
"monitor quit". We keep the behavior for user mode emulation which is
arguably more like a gdbserver scenario.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This was a long pending bug, now revealed by the assert in
phys_page_find that stumbled over the large page index returned by
cpu_get_phys_page_debug for NX-marked pages: We need to mask out NX and
all user-definable bits 52..62 from PDEs and the final PTE to avoid
corrupting physical addresses.
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* stefanha/trivial-patches:
configure: Quote the configure args printed in config.log
osdep: Remove local definition of macro offsetof
libcacard: Spelling and grammar fixes in documentation
Spelling fixes in comments (it's -> its)
vnc: Add break statement
libcacard: Use format specifier %u instead of %d for unsigned values
Fix sign of sscanf format specifiers
block/vmdk: Fix warning from splint (comparision of unsigned value)
qmp: Fix spelling fourty -> forty
qom: Fix spelling in documentation
sh7750: Remove redundant 'struct' from MemoryRegionOps
* qemu-kvm/uq/master:
kvm: fill in padding to help valgrind
kvm: x86: Add user space part for in-kernel i8254
kvm: Add kvm_has_pit_state2 helper
i8254: Open-code timer restore
i8254: Factor out base class for KVM reuse
* kraxel/usb.42:
xhci: fix port status
xhci: fix control xfers
usb: add shortcut for control transfers
usb-host: enable pipelineing for bulk endpoints.
usb: add pipelining option to usb endpoints
usb: queue can have async packets
uhci_fill_queue: zap debug printf
usb: add USB_RET_IOERROR
usb: return BABBLE rather then NAK when we receive too much data
usb-ehci: Cleanup itd error handling
usb-ehci: Fix and simplify nakcnt handling
usb-ehci: Remove dead nakcnt code
usb-ehci: Fix cerr tracking
usb-ehci: Any packet completion except for NAK should set the interrupt
usb-ehci: Rip the queues when the async or period schedule is halted
usb-ehci: Drop cached qhs when the doorbell gets rung
usb-ehci: always call ehci_queues_rip_unused for period queues
usb-ehci: split our qh queue into async and periodic queues
usb-ehci: Never follow table entries with the T-bit set
usb-redir: Set ep type and interface
VCARD_ATR_PREFIX is used as part of an array initializer so it should
not have () around it, so far this happened to work, but gcc-4.7 does
not like it.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().
Signed-off-by: Avi Kivity <avi@redhat.com>