Laurent Vivier
ff99b952c8
target-m68k: cmp manages word and byte operands
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
8a370c6cb7
target-m68k: add/sub manage word and byte operands
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
227de713e0
target-m68k: add addressing modes to neg
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
db3d7945ae
target-m68k: introduce byte and word cc_ops
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
3c980d2ef6
target-m68k: some bit ops cleanup
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
415f4b62eb
target-m68k: suba/adda can manage word operand
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
52dc23c595
target-m68k: and can manage word and byte operands
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
020a465920
target-m68k: or can manage word and byte operands
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
eec37aec85
target-m68k: eor can manage word and byte operands
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
ea4f2a8441
target-m68k: add addressing modes to not
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Richard Henderson
a665a820e5
target-m68k: Inline addx, subx, negx
...
Signed-off-by: Richard Henderson <rth@twiddle.net>
And add opcodes for 680x0
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-28 10:38:48 +02:00
Laurent Vivier
beff27ab3a
target-m68k: add dbcc
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
d5a3cf33f2
target-m68k: add addressing modes to scc
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
29cf437da4
target-m68k: add exg ops
...
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
c630e436c0
target-m68k: add linkl
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-28 10:38:48 +02:00
Laurent Vivier
71600eda7c
target-m68k: add bkpt instruction
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-28 10:38:48 +02:00
Peter Maydell
835f3d24b4
audio: intel-hda: check stream entry count during transfer
...
Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20161027-1' into staging
audio: intel-hda: check stream entry count during transfer
# gpg: Signature made Thu 27 Oct 2016 15:30:51 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/pull-audio-20161027-1:
audio: intel-hda: check stream entry count during transfer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-27 17:24:29 +01:00
Peter Maydell
5929d7e8a0
cmpxchg emulation of atomics, v8
...
Merge remote-tracking branch 'remotes/rth/tags/pull-atomic-20161026' into staging
cmpxchg emulation of atomics, v8
# gpg: Signature made Wed 26 Oct 2016 16:30:03 BST
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-atomic-20161026: (37 commits)
target-alpha: Emulate LL/SC using cmpxchg helpers
target-alpha: Introduce MMU_PHYS_IDX
target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}
linux-user: remove handling of aarch64's EXCP_STREX
linux-user: remove handling of ARM's EXCP_STREX
target-arm: emulate aarch64's LL/SC using cmpxchg helpers
target-arm: emulate SWP with atomic_xchg helper
target-arm: emulate LL/SC using cmpxchg helpers
target-arm: Rearrange aa32 load and store functions
tests: add atomic_add-bench
target-i386: remove helper_lock()
target-i386: emulate XCHG using atomic helper
target-i386: emulate LOCK'ed BTX ops using atomic helpers
target-i386: emulate LOCK'ed XADD using atomic helper
target-i386: emulate LOCK'ed NEG using cmpxchg helper
target-i386: emulate LOCK'ed NOT using atomic helper
target-i386: emulate LOCK'ed INC using atomic helper
target-i386: emulate LOCK'ed OP instructions using atomic helpers
target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers
tcg: Emit barriers with parallel_cpus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-27 14:06:34 +01:00
Peter Maydell
8f9d84df97
-----BEGIN PGP SIGNATURE-----
...
Version: GnuPG v1
iQEcBAABAgAGBQJYEBKaAAoJEO8Ells5jWIR4jQH/3HgiWHs9+iQrUjo8DXrbF1b
Dkdg8B66yYRirwR4KeCVJqOMnscPotISJc47MveoU+CxAwRcmhVtPuH+gZ7MLggp
IrFT9XNo4WhSBlOc1tr/qGyGGgzzkWbcKKBfD3dK049XDcXPm7A3hNshqitf6YJI
ILnlVk0ttKP7PKd6pvwaH+8yNDqcCr4+Rk6uSgOAB4N416+N/zk2AwQGWbMgLSzZ
zBRu95K/7UvRRoyyqR4kxTRGhfNdEqWeOXXISRmTBfBM+iK6W3uaeWSy5ka9QTdo
yXwcwxVe9iBxMuR3sZqNAbi5EbQIBtQSI2echG4bCQwvwjEAw9LUOhnJ44XkTfE=
=GoQg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Wed 26 Oct 2016 03:19:06 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
colo-proxy: fix memory leak
net: rtl8139: limit processing of ring descriptors
net: vmxnet: initialise local tx descriptor
e1000e: Don't zero out buffer address in rx descriptor
net: rocker: set limit to DMA buffer size
net: eepro100: fix memory leak in device uninit
tap-bsd: OpenBSD uses tap(4) now
net: pcnet: fix source formatting and indentation
net: pcnet: check rx/tx descriptor ring length
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-27 12:45:45 +01:00
Peter Maydell
991a97ac74
-----BEGIN PGP SIGNATURE-----
...
iQIcBAABAgAGBQJYD6tmAAoJEPMMOL0/L748ebQQAKxA2zQn2TPWoy/hzAC+QQIt
VVKm3so/4WwX1JVrfLm9jAVHBsOYvHgABSxHtdwRQK2UD6QP3zh8dZNFIAJgNtGH
SNMRm2HRyE7f16hnbz5Scqp7YGvYDp8XVolNvS/o5bBh3dS9j4V4W4DiC3bu7OZY
FNytzlFCQvhOXNRejhlKsusYvrRECEy5Zaa3LTbYRVX7K/sHtDCD01URQKYJWZFJ
m13juuus1rXNVuYxbs1YLwAJcN9yM4pjnZnO6meBH669+/JSbByjjXuhARwng4Z5
o/f8+ZpyCMlNXMTt7DFu6QPxrFKCHpQ2Rwyy55uVx7lEmtZ1s6n6mF+P/Wp+kXUZ
QzvbBSKCnNsLHGUf+0Us/U1v61WFhd4MZJF7dzWecVpLT8tCbXADFLbfgzFkz5MN
zVd7L2uO4F3CtkwW8GFxmiVqmnHyOl/+2kz8UkejbnJQxwmr8oijVQVexaXDKHCA
KytAz+PEW5qX6uGlfnEV2DnBpNOAzh8RspZF/mzDA5H0VM08bmeK7ySUe+BYRv2p
GZXqdVchQrf2fYmFB1Dn4hGvc/gxtLjxkFst3koItgGeUauY35AAkDZ2B5EJ03UZ
kpfA0grwhWTX92h+MYwXRHYaMoscCQPcjfYNbDGcOSvvwkElOf+fFMTaA9t0MGF2
gREdnWwhl8gr7K/Mh+3t
=uyQH
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vivier/tags/m68k-part1-pull-request' into staging
# gpg: Signature made Tue 25 Oct 2016 19:58:46 BST
# gpg: using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier/tags/m68k-part1-pull-request: (23 commits)
target-m68k: Optimize gen_flush_flags
target-m68k: Optimize some comparisons
target-m68k: Use setcond for scc
target-m68k: Introduce DisasCompare
target-m68k: Reorg flags handling
target-m68k: Remove incorrect clearing of cc_x
target-m68k: Some fixes to SR and flags management
target-m68k: Print flags properly
target-m68k: update CPU flags management
target-m68k: don't update cc_dest in helpers
target-m68k: update move to/from ccr/sr
target-m68k: remove m68k_cpu_exec_enter() and m68k_cpu_exec_exit()
target-m68k: Replace helper_xflag_lt with setcond
target-m68k: allow to update flags with operation on words and bytes
target-m68k: REG() macro cleanup
target-m68k: set PAGE_BITS to 12 for m68k
target-m68k: define operand sizes
target-m68k: set disassembler mode to 680x0 or coldfire
target-m68k: introduce read_imXX() functions
target-m68k: manage scaled index
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-27 11:58:43 +01:00
Richard Henderson
ed2839166c
target-alpha: Emulate LL/SC using cmpxchg helpers
...
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. However, portable parallel
code is written assuming only cmpxchg which means that in
practice this is a viable alternative.
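To make the ABA caveat concrete, here is a minimal sketch of LL/SC emulated with a compare-and-swap, written with C11 atomics rather than QEMU's TCG helpers. The `VCpu` structure and the `emu_ll`/`emu_sc` names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU state: address and value seen by the last LL. */
typedef struct {
    _Atomic uint32_t *excl_addr;
    uint32_t excl_val;
} VCpu;

/* Load-linked: remember where we loaded from and what we saw. */
static uint32_t emu_ll(VCpu *cpu, _Atomic uint32_t *addr)
{
    cpu->excl_addr = addr;
    cpu->excl_val = atomic_load(addr);
    return cpu->excl_val;
}

/* Store-conditional: succeeds iff memory still holds the value seen by LL.
 * It compares values only, so it cannot detect intermediate writes. */
static bool emu_sc(VCpu *cpu, _Atomic uint32_t *addr, uint32_t newval)
{
    assert(addr != NULL);
    if (cpu->excl_addr != addr) {
        return false;               /* no matching reservation */
    }
    uint32_t expected = cpu->excl_val;
    cpu->excl_addr = NULL;          /* the reservation is consumed either way */
    return atomic_compare_exchange_strong(addr, &expected, newval);
}
```

If another thread changes the location from A to B and back to A between `emu_ll` and `emu_sc`, the store still succeeds, whereas a hardware store-conditional would fail. That is the ABA case the message refers to.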
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Richard Henderson
6a73ecf5cf
target-alpha: Introduce MMU_PHYS_IDX
...
Rather than using helpers for physical accesses, use a mmu index.
The primary cleanup is with store-conditional on physical addresses.
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
05188cc72f
target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}
...
The exception is not emitted anymore; remove it and the associated
TCG variables.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-31-git-send-email-cota@braap.org>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
f4e6eb7ffe
linux-user: remove handling of aarch64's EXCP_STREX
...
The exception is not emitted anymore.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-30-git-send-email-cota@braap.org>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
b50b82fc48
linux-user: remove handling of ARM's EXCP_STREX
...
The exception is not emitted anymore.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-29-git-send-email-cota@braap.org>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
1dd089d0ee
target-arm: emulate aarch64's LL/SC using cmpxchg helpers
...
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
This works in both user and system mode. In user mode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.
Hi-res plots: http://imgur.com/a/JVc8Y
[Four ASCII plots not reproduced: atomic_add-bench, 1000000 ops/thread, for the ranges [0,1], [0,2], [0,128] and [0,1024]. X axis: number of threads (0 to 60). Y axis: throughput. Series: cmpxchg (E) and master (H).]
[rth: Rearrange 128-bit cmpxchg helper. Enforce alignment on LL.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
cf12bce088
target-arm: emulate SWP with atomic_xchg helper
...
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-25-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
354161b37c
target-arm: emulate LL/SC using cmpxchg helpers
...
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
The appended patch emulates LL/SC pairs in ARM with cmpxchg helpers.
This works in both user and system mode. In user mode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.
Hi-res plots: http://imgur.com/a/aNQpB
[Four ASCII plots not reproduced: atomic_add-bench, 1000000 ops/thread, for the ranges [0,1], [0,2], [0,128] and [0,1024]. X axis: number of threads (0 to 60). Y axis: throughput. Series: cmpxchg (E) and master (H).]
[rth: Enforce alignment for ldrexd.]
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-23-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Richard Henderson
7f5616f538
target-arm: Rearrange aa32 load and store functions
...
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:02 -07:00
Emilio G. Cota
070e3edcea
tests: add atomic_add-bench
...
With this microbenchmark we can measure the overhead of emulating atomic
instructions with a configurable degree of contention.
The benchmark spawns $n threads, each performing $o atomic ops (additions)
in a loop. Each atomic operation is performed on a different cache line
(assuming lines are 64 bytes long) that is randomly selected from a range [0, $r).
[ Note: each $foo corresponds to a -foo flag ]
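The per-thread loop described above might look roughly like the following sketch. It is not the code from tests/atomic_add-bench: the `Slot` and `WorkerArgs` types, the field names and the xorshift PRNG are assumptions made for illustration, and thread creation is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size, as in the commit message */

/* One counter per cache line, so distinct slots never share a line. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_ulong count;
} Slot;

/* Hypothetical per-thread arguments. */
typedef struct {
    Slot *slots;
    size_t range;    /* $r: number of slots to choose from */
    size_t n_ops;    /* $o: atomic additions to perform */
    uint32_t seed;   /* per-thread PRNG state, must be non-zero */
} WorkerArgs;

/* xorshift32: a small PRNG, good enough to spread accesses around. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Thread body; the signature matches what pthread_create() expects. */
static void *worker(void *opaque)
{
    WorkerArgs *a = opaque;
    assert(a->range > 0 && a->seed != 0);
    for (size_t i = 0; i < a->n_ops; i++) {
        size_t idx = xorshift32(&a->seed) % a->range;
        atomic_fetch_add(&a->slots[idx].count, 1);
    }
    return NULL;
}
```

A small range concentrates every thread on a few cache lines (high contention), while a large range spreads the additions out.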
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
37b995f6e7
target-i386: remove helper_lock()
...
It's been superseded by the atomic helpers.
The use of the atomic helpers provides a significant performance and scalability
improvement. Below is the result of running the atomic_add-bench microbenchmark with:
$ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n
where $n is the number of threads and $r is the allowed range for the additions.
The scenarios measured are:
- atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset
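The difference between the "atomic" and "cmpxchg" scenarios can be sketched in plain C11 as follows. This is an analogy on host atomics, not the TCG code the patchset generates, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* "atomic" scenario: a single host read-modify-write operation. */
static uint32_t add_atomic(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v) + v;
}

/* "cmpxchg" scenario: retry a compare-and-swap until no other thread
 * has modified *p between our load and our store. */
static uint32_t add_cmpxchg_loop(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load(p);
    /* On failure, atomic_compare_exchange_weak reloads 'old' from *p. */
    while (!atomic_compare_exchange_weak(p, &old, old + v)) {
    }
    return old + v;
}
```

Under contention the loop variant may spin several times per addition, which is one plausible reason for the "atomic" scenario to scale better.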
Results are sorted by ascending range, i.e. descending degree of contention.
The Y axis is throughput in Mops/s. Tests are run on an AMD machine with 64
Opteron 6376 cores.
[Five ASCII plots not reproduced: atomic_add-bench, 5000000 ops/thread, for the ranges [0,1], [0,2], [0,8], [0,128] and [0,1024]. X axis: number of threads (0 to 60). Y axis: throughput in Mops/s. Series: atomic (E), cmpxchg (H) and master (N).]
hi-res: http://imgur.com/a/fMRmq
I stopped measuring master after 8 threads, because there is little
point in measuring the well-known performance collapse of a contended lock.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
ea97ebe89f
target-i386: emulate XCHG using atomic helper
...
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-19-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
cfe819d309
target-i386: emulate LOCK'ed BTX ops using atomic helpers
...
[rth: Avoid redundant qemu_ld in locked case. Fix previously unnoticed
incorrect zero-extension of address in register-offset case.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-18-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
f53b01817f
target-i386: emulate LOCK'ed XADD using atomic helper
...
[rth: Move load of reg value to common location.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-17-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
8eb8c73856
target-i386: emulate LOCK'ed NEG using cmpxchg helper
...
[rth: Move redundant qemu_load out of cmpxchg loop.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-16-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
2a5fe8ae14
target-i386: emulate LOCK'ed NOT using atomic helper
...
[rth: Avoid qemu_load that's redundant with the atomic op.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
60e573462f
target-i386: emulate LOCK'ed INC using atomic helper
...
[rth: Merge gen_inc_locked back into gen_inc to share cc update.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-14-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
a7cee522f3
target-i386: emulate LOCK'ed OP instructions using atomic helpers
...
[rth: Eliminate some unnecessary temporaries.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-13-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Emilio G. Cota
ae03f8de45
target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers
...
The diff here is uglier than necessary. All this does is to turn
FOO
into:
if (s->prefix & PREFIX_LOCK) {
BAR
} else {
FOO
}
where FOO is the original implementation of an unlocked cmpxchg.
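As a rough illustration of that shape, the sketch below dispatches on the LOCK prefix at execution time. The real change lives in the translator and emits TCG ops, so this is only an interpreter-style analogy. `DisasCtx`, `emu_cmpxchg` and the numeric value of `PREFIX_LOCK` are assumptions. Only the `PREFIX_LOCK` name comes from the message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PREFIX_LOCK 0x04   /* illustrative value, assumed for this sketch */

/* Hypothetical stand-in for the translator's per-instruction context. */
typedef struct {
    int prefix;
} DisasCtx;

/* CMPXCHG semantics: if *dest == *acc, store src to *dest; otherwise
 * load *dest into *acc. Returns 1 when the store happened (ZF set). */
static int emu_cmpxchg(const DisasCtx *s, _Atomic uint32_t *dest,
                       uint32_t *acc, uint32_t src)
{
    if (s->prefix & PREFIX_LOCK) {
        /* BAR: one atomic compare-and-swap; on failure *acc is updated. */
        return atomic_compare_exchange_strong(dest, acc, src);
    } else {
        /* FOO: separate load, compare and store; not atomic as a whole. */
        uint32_t cur = atomic_load_explicit(dest, memory_order_relaxed);
        if (cur == *acc) {
            atomic_store_explicit(dest, src, memory_order_relaxed);
            return 1;
        }
        *acc = cur;
        return 0;
    }
}
```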
[rth: Adjust unlocked cmpxchg to use movcond instead of branches.
Adjust helpers to use atomic helpers.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
91682118aa
tcg: Emit barriers with parallel_cpus
...
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
df79b996a7
tcg: Add CONFIG_ATOMIC64
...
Allow qemu to build on 32-bit hosts without 64-bit atomic ops.
Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
guests, we still need some way to handle the 32-bit guest using a
64-bit atomic operation. Do so by dropping back to single-step.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
7ebee43ee3
tcg: Add atomic128 helpers
...
Force the use of cmpxchg16b on x86_64.
Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction. Further, it's required by Windows 8 so no new cpus
will ever omit it.
If we truly care about these CPUs, we could check for the instruction at
startup time and then avoid executing paths that use it.
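A startup-time check of the kind suggested here could be sketched as below, using the `<cpuid.h>` header shipped with GCC and Clang. The function name `host_has_cmpxchg16b` is hypothetical and is not something the patch adds. The sketch relies on CPUID leaf 1 reporting CMPXCHG16B in ECX bit 13.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Returns true when the host CPU reports CMPXCHG16B support
 * (CPUID leaf 1, ECX bit 13). Always false on non-x86 hosts. */
static bool host_has_cmpxchg16b(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;   /* leaf 1 not available */
    }
    return (ecx & (1u << 13)) != 0;
#else
    return false;
#endif
}
```

The result could be cached once at startup and consulted before choosing a code path that needs a 16-byte compare-and-swap.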
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
c482cb117c
tcg: Add atomic helpers
...
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endiannesses, and sizes up to 8 bytes.
Expand non-atomically when emulating serially.
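The four families differ mainly in what they return. The sketch below shows those semantics with C11 atomics on a 32-bit value. The function names are illustrative and are not the helper names introduced by the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* fetch_op flavour: returns the value held before the operation. */
static uint32_t fetch_or_u32(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_or(p, v);
}

/* op_fetch flavour: returns the value held after the operation. */
static uint32_t or_fetch_u32(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_or(p, v) | v;
}

/* xchg: unconditionally store v and return the previous value. */
static uint32_t xchg_u32(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_exchange(p, v);
}

/* cmpxchg: store newv only if *p == cmpv; always return the value found. */
static uint32_t cmpxchg_u32(_Atomic uint32_t *p, uint32_t cmpv, uint32_t newv)
{
    /* On failure cmpv is overwritten with the current contents of *p. */
    atomic_compare_exchange_strong(p, &cmpv, newv);
    return cmpv;
}
```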
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
c86c6e4c80
cputlb: Tidy some macros
...
TGT_LE and TGT_BE are not size dependent and do not need to be
redefined. The others are no longer used at all.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
82a45b96a2
cputlb: Move most of iotlb code out of line
...
Saves 2k code size off of a cold path.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
4097842885
cputlb: Remove includes from softmmu_template.h
...
We already include exec/address-spaces.h and exec/memory.h in
cputlb.c; the include of qemu/timer.h appears to be a fossil.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
3b08f0a925
cputlb: Move probe_write out of softmmu_template.h
...
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
dea2198201
cputlb: Replace SHIFT with DATA_SIZE
...
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Alex Bennée
b67cb68ba5
linux-user: enable parallel code generation on clone
...
The variable parallel_cpus controls the generation of thread aware
atomic code. We only need to set it once we clone our first thread.
At this point any existing translations need to be thrown away.
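The "set once, then flush" behaviour might be pictured like this. Only the `parallel_cpus` name comes from the message. The translation-cache counters and the `on_guest_clone` hook are hypothetical stand-ins for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Name taken from the commit message; everything else is illustrative. */
static bool parallel_cpus;

/* Hypothetical stand-ins for the translation cache. */
static int cached_blocks;
static int flush_count;

static void flush_translations(void)
{
    cached_blocks = 0;
    flush_count++;
}

/* Called when the guest creates a thread. */
static void on_guest_clone(void)
{
    if (!parallel_cpus) {
        /* Code translated so far assumed a single thread: discard it. */
        flush_translations();
        parallel_cpus = true;
    }
}
```

Later clones find the flag already set, so the flush happens only on the first one.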
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
fdbc2b5722
tcg: Add EXCP_ATOMIC
...
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.
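The retry flow can be sketched as a single-threaded simulation. The enum values, the `ExecCtx` type and the "wider than 8 bytes" condition are assumptions for illustration. Pausing the other vCPUs is represented here by simply clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { EXCP_NONE = 0, EXCP_ATOMIC = 1 };   /* values are illustrative */

typedef struct {
    bool parallel;        /* other vCPUs may be running */
    int serial_retries;   /* how often we fell back, for illustration */
} ExecCtx;

/* Pretend the host can only do atomic ops up to 8 bytes wide. */
static int try_atomic_add(ExecCtx *ctx, uint64_t *mem, uint64_t v, int size)
{
    if (ctx->parallel && size > 8) {
        return EXCP_ATOMIC;   /* cannot be done atomically here */
    }
    *mem += v;                /* serial context, or stand-in for a host atomic */
    return EXCP_NONE;
}

/* Main-loop fragment: on EXCP_ATOMIC, stop the world and retry serially. */
static void exec_atomic_add(ExecCtx *ctx, uint64_t *mem, uint64_t v, int size)
{
    if (try_atomic_add(ctx, mem, v, size) == EXCP_ATOMIC) {
        bool was_parallel = ctx->parallel;
        ctx->parallel = false;          /* stands in for pausing other vCPUs */
        ctx->serial_retries++;
        int rc = try_atomic_add(ctx, mem, v, size);
        assert(rc == EXCP_NONE);
        (void)rc;
        ctx->parallel = was_parallel;   /* resume the other vCPUs */
    }
}
```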
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00