Fiora
d2e004fa9e
Use CRC to output 64 bits instead of 32
...
A bit hacky, but should dramatically reduce the odds of hash collision.
2014-10-18 00:24:35 -07:00
Fiora
15a4bccb73
Hash: unroll CRC loop, since CRC32 typically has nontrivial latency
...
Seems to be about 20-30% faster texture cache hashing on my machine.
2014-10-17 15:39:08 -07:00
Stevoisiak
ed5e698511
Minor spelling fix
2014-10-17 15:51:19 -04:00
skidau
9ddbdeb39f
Merge pull request #995 from FioraAeterna/fma
...
Add FMA support to emitter and use it in the JIT
2014-10-12 13:56:18 +11:00
Henrik Rydgård
877081c7df
Be consistent with braces.
2014-10-10 22:34:03 +02:00
Henrik Rydgård
9bca1a00d7
x64 emitter: Add some more missing ops (MOVDQA, MOVDQU, PSHUFHW)
...
Also constify some pointers.
2014-10-10 18:30:05 +02:00
Henrik Rydgård
a2c46665c5
x64 emitter: Add a few missing instructions
2014-10-10 18:30:04 +02:00
Fiora
019657cd93
X64Emitter: add FMA3 support
2014-10-07 18:21:07 -07:00
skidau
b3b34d16e6
Merge pull request #1218 from hthh/trampolinecaching
...
JIT: reuse trampolines when possible
2014-10-07 13:26:23 +11:00
skidau
8fdf43109f
Merge pull request #1216 from FioraAeterna/movoptimizations
...
Add more AVX support, refactor emitter, reduce redundant XMM moves
2014-10-07 13:25:28 +11:00
hthh
c7208318fb
JIT: Reuse trampolines when possible
2014-10-05 15:03:11 +11:00
Fiora
7a2dd3a3c6
x64Emitter: refactor, add some new AVX instructions
2014-10-03 10:05:10 -07:00
Fiora
85547d94be
JIT: properly remove FIFO write addresses when code is invalidated
...
Fixes a bug caused by interaction with carry optimizations; might fix other
issues too.
2014-09-30 01:00:23 -07:00
comex
a9b4016cd3
Merge pull request #1166 from FioraAeterna/flaglocking
...
JIT+Emitter: support locking flags
2014-09-30 02:57:53 -04:00
comex
2eebdff01b
Remove useless STACKALIGN macro.
...
It only ever did anything on 32-bit OS X.
Anyway, it wasn't even on the right functions, and these days
ABI_PushRegistersAndAdjustStack should handle maintaining the ABI
correctly.
2014-09-30 01:42:47 -04:00
Fiora
c102fed36a
GekkoDisassembler: show W and I in psq_l/psq_st disassembly
2014-09-28 17:01:35 -07:00
Fiora
ac1fc9ad03
JIT+Emitter: support locking flags
...
This helps us avoid accidentally clobbering flags between two instructions
when the flags are expected to be maintained. Dolphin will of course crash
immediately, but at least it will crash loudly and alert us of the mistake,
instead of forcing hours of bisecting to find the subtle way in which the JIT
has managed to sneak a flag-modifying instruction where there shouldn't be one.
2014-09-26 20:47:06 -07:00
comex
fb3d9c9d58
Fix warning in x64CPUDetect.cpp in generic build by not building it.
2014-09-25 18:48:00 -04:00
Rohit Nirmal
3168361e32
Android: Silence some more warnings.
2014-09-22 17:45:42 -04:00
Ryan Houdek
9206dd016e
Merge pull request #1135 from FioraAeterna/twidisasmfix
...
Disassembler: fix disassembly of some twi instructions
2014-09-21 14:19:05 -05:00
Fiora
9c4407fb80
Disassembler: fix disassembly of some twi instructions
2014-09-21 08:17:41 -07:00
Tony Wasserka
6d4fd54683
ChunkFile: Add a DoArray overload which takes an std::array.
...
This is inconsistent with how other containers are used (i.e. with Do()), but making std::array be used with Do() seems rather confusing when there's also a DoArray available.
2014-09-21 10:38:22 +02:00
Ryan Houdek
eb23882398
Merge pull request #1120 from rohit-n/muh-precompiled-headers
...
Fix build failing when disabling precompiled headers.
2014-09-19 17:43:42 -05:00
Rohit Nirmal
46057db37d
Fix build failing when disabling precompiled headers.
2014-09-19 18:17:51 -04:00
Ryan Houdek
522d7eb275
Merge pull request #1109 from FioraAeterna/ps_cmp
...
JIT: add ps_cmp0/ps_cmp1/ps_res/ps_rsqrte
2014-09-19 14:41:05 -05:00
Fiora
3c49200b22
X64Emitter: add MOVHLPS/MOVLHPS
2014-09-18 17:57:27 -07:00
Ryan Houdek
7608e3f11e
Add AArch64 emitter aliases for MOV and MVN.
2014-09-18 16:30:40 -05:00
comex
7ad9027593
Be pedantic about stack overflow on Linux and OS X.
...
Add some magic to the fault handler to handle stack overflow due to BLR
optimization, and disable the optimization if fastmem is not enabled.
2014-09-17 20:08:09 -04:00
Fiora
d3dee1d7ed
GekkoDisassembler: fix some float opcodes
2014-09-16 02:06:40 -07:00
skidau
8361d2b1da
Merge pull request #805 from FioraAeterna/storerefactor
...
JIT: support immediate stores
2014-09-16 13:31:39 +10:00
Dolphin Bot
bef2016909
Merge pull request #1091 from FioraAeterna/fixdisasm
...
GekkoDisassembler: fix/improve disassembly for a few instructions
2014-09-16 03:53:18 +02:00
Fiora
7368c2ee9e
GekkoDisassembler: fix/improve disassembly for a few instructions
2014-09-15 18:48:54 -07:00
Fiora
d02b7c7755
JIT: support immediate stores
2014-09-15 07:25:32 -07:00
Fiora
02dce5dbbf
x64Emitter: fix silent failure if WriteNormalOp is passed two memory operands
...
Should now fail loudly and clearly instead.
2014-09-15 07:08:08 -07:00
Ryan Houdek
4e7f284a81
Merge pull request #1064 from Sonicadvance1/AArch64-Fix-MOVI2R
...
Fix AArch64 MOVI2R helper function.
2014-09-14 09:26:02 -05:00
Fiora
997c5c2d0e
x64Emitter: add LZCNT/TZCNT support and detection
...
Also add a unit test.
2014-09-14 05:31:22 -07:00
Pierre Bourdon
439068acae
Merge pull request #1055 from FioraAeterna/smallermov
...
X64Emitter: support shorter mov reg, imm opcodes
2014-09-14 01:57:36 +02:00
Lioncash
a92003c1ab
ARM64: Make getters within ArithOption const.
2014-09-12 20:55:26 -04:00
Ryan Houdek
17d31ecd6c
Fix AArch64 MOVI2R helper function.
...
In the case of a zero immediate, it wouldn't generate code at all.
Also in the case of max u32/u64, use ORN to optimize it.
2014-09-12 05:45:10 -05:00
Ryan Houdek
5061a33c29
Merge pull request #1051 from Sonicadvance1/ARM-Common
...
Include a missing include in the ARM emitter's common code.
2014-09-11 21:12:21 -05:00
Fiora
18d83a310e
X64Emitter: support shorter mov reg, imm opcodes
...
Also refactor WriteNormalOp a little bit and add comments.
2014-09-11 11:40:30 -07:00
Lioncash
b06ec302d1
Remove some unnecessary semicolons
2014-09-11 13:05:31 -04:00
Fiora
5726e0cdfb
JIT: use XCHG in MOVTwo
...
Roughly the same speed or slightly faster depending on CPU; mostly just cleaner
since we don't have to pass in a temp.
2014-09-10 22:17:38 -07:00
Ryan Houdek
44baab30cf
Include a missing include in the ARM emitter's common code.
2014-09-10 20:39:19 -05:00
Ryan Houdek
24f6c98a55
Add sign extending aliases to the ARM64Emitter.
2014-09-10 17:52:54 -05:00
Ryan Houdek
71cb09f1ca
Merge pull request #1027 from rohit-n/change-include
...
Include CommonTypes.h instead of Common.h.
2014-09-10 00:35:16 -05:00
Ryan Houdek
09c1ad1631
Merge pull request #753 from FioraAeterna/integeropts
...
JIT64: various integer optimizations
2014-09-09 04:10:30 -05:00
Ryan Houdek
f09cb723c5
Merge pull request #1044 from lioncash/pedantry
...
Common: Fix code styling in Arm64Emitter
2014-09-08 23:29:19 -05:00
Ryan Houdek
af732dea39
Merge pull request #1043 from lioncash/unused
...
Common: Remove unused variable in MemoryMap_Setup
2014-09-08 22:46:04 -05:00
Lioncash
bc331ee809
Common: Fix code styling in Arm64Emitter
2014-09-08 23:39:20 -04:00
Ryan Houdek
ed476c997c
Fix Generic build from AArch64 merge.
...
I had missed this file and hadn't tested the branch on my new build system.
2014-09-08 22:24:23 -05:00
Fiora
94c20db369
Rename Log2 and add IsPow2 to MathUtils for future use
...
Also remove unused pow2/pow2f functions.
2014-09-08 20:15:45 -07:00
skidau
0926f1d344
Merge pull request #897 from Sonicadvance1/AArch64-jit
...
Initial AArch64 JIT
2014-09-09 12:34:58 +10:00
Lioncash
22800dc711
Common: Remove unused variable in MemoryMap_Setup
2014-09-08 21:44:03 -04:00
Rohit Nirmal
fbc64984ca
Include CommonTypes.h instead of Common.h.
2014-09-08 15:39:58 -04:00
comex
7fb6628789
Merge pull request #1024 from comex/abi-cleanup
...
ABI cleanup
2014-09-08 01:03:36 -04:00
comex
4dc090643d
Remove ABI_AlignStack/ABI_RestoreStack and the noProlog option to ABI_CallFunctionRR.
...
The latter being true was the only case where the former would do
anything, and it was never true. They became obsolete with x86's
removal.
2014-09-08 01:00:10 -04:00
comex
c5c0b36046
Remove the inaccurately named ABI_PushAllCalleeSavedRegsAndAdjustStack (it didn't preserve FPRs!) and replace with ABI_PushRegistersAndAdjustStack.
...
To avoid FPRs being pushed unnecessarily, I checked the uses: DSPEmitter
doesn't use FPRs, and VertexLoader doesn't use anything but RAX, so I
specified the register list accordingly. The regular JIT, however, does
use FPRs, and as far as I can tell, it was incorrect not to save them in
the outer routine. Since the dispatcher loop is only exited when
pausing or stopping, this should have no noticeable performance impact.
2014-09-08 01:00:10 -04:00
comex
2dafbfb3ef
Improve code and clarify parameters to ABI_Push/PopRegistersAndAdjustStack.
...
- Factor common work into a helper function.
- Replace confusingly named "noProlog" with "rsp_alignment". Now that
x86 is not supported, we can just specify it explicitly as 8 for
clarity.
- Add the option to include more frame size, which I'll need later.
- Revert a change by magumagu in March which replaced MOVAPD with MOVUPD
on account of 32-bit Windows, since it's no longer supported. True,
apparently recent processors don't execute the former any faster if the
pointer is, in fact, aligned, but there's no point using MOVUPD for
something that's guaranteed to be aligned...
(I discovered that GenFrsqrte and GenFres were incorrectly passing false
to noProlog - they were, in fact, functions without prologs, the
original meaning of the parameter - which caused the previous change to
break. This is now fixed.)
2014-09-08 00:58:56 -04:00
Lioncash
a38093729e
Common: Inline declare some loop variables in ArmEmitter
2014-09-07 00:26:26 -04:00
Ryan Houdek
2b06257e16
Beginning of the AArch64 JIT branch.
...
This is the bare minimum required to run a few games on AArch64.
Was able to run starfield and Animal Crossing to the Nintendo logo.
QEmu emulation is literally the slowest thing in the world, it maxes out at around 12mhz on my Core i7-4930MX.
2014-09-06 20:14:52 -05:00
Ryan Houdek
f107b5e176
[AArch64-emitter] Initial work on a emitter for 64bit ARM.
...
I've tested a few instruction encodings and am expecting most to work as long as one stays away from VFP/SIMD.
This implements mostly instructions to bring up an initial JIT with integer support.
This can be improved to allow ease of use functions in the future, dealing with the raw imms/immr encodings is probably the worst thing ever.
2014-09-06 20:13:44 -05:00
shuffle2
9302218a19
Merge pull request #851 from lioncash/logg
...
Common: Kill off duplicate log warning definitions
2014-09-06 12:35:19 -07:00
Ryan Houdek
01b90c1007
Fix ArmEmitter's asserts from failing to compile.
...
Changed them all from debug asserts to regular asserts, since they shouldn't only be run at debug time.
2014-09-06 15:11:39 -04:00
Lioncash
690ed8580c
Common: Kill off duplicate log warning definitions
...
Also embed the log checks rather than using macros
2014-09-06 15:11:29 -04:00
shuffle2
85fd8c2bec
Merge pull request #983 from lioncash/lol-str
...
Common: Fix a potential infinite loop in ReplaceAll
2014-09-06 12:00:23 -07:00
shuffle2
1b23432d34
Merge pull request #990 from rohit-n/fix-formatting
...
Fix formatting
2014-09-06 11:54:17 -07:00
comex
6c382f6627
Merge pull request #926 from comex/ppcstate-reg
...
PowerPCState register (and rationalize register usage, and add some registers to replace it)
2014-09-06 13:24:38 -04:00
comex
6fd0333c14
Symbolicize explicit uses of x86 registers where possible (GPRs only for now).
...
Uses are split into three categories:
- Arbitrary (except for size savings) - constants like RSCRATCH are
used.
- ABI (i.e. RAX as return value) - ABI_RETURN is used.
- Fixed by architecture (RCX shifts, RDX/RAX for some instructions) -
explicit register is kept.
In theory this allows the assignments to be modified easily. I verified
that I was able to run Melee with all the registers changed, although
there may be issues if RSCRATCH[2] and ABI_PARAM{1,2} conflict.
2014-09-06 13:18:31 -04:00
comex
67cdb6e07a
Factor code from ABI_CallFunctionRR and GetWriteTrampoline into a helper, and fix a special case.
...
The special case is where the registers are actually to be swapped (i.e.
func(ABI_PARAM2, ABI_PARAM1); this was previously impossible but would
be ugly not to handle anyway.
2014-09-06 13:16:20 -04:00
Lioncash
1d66b1d3f4
Common: Remove HAVE_CXX11_SYNTAX define from Common.h
...
All the compilers we support have C++11 support now, so this isn't needed.
2014-09-06 11:32:19 -04:00
Rohit Nirmal
629ceaf2b1
Split some parts of UpdateBoundingBox into multiple lines. Also,
...
fix issues causing failure on Lint.
2014-09-06 09:49:27 -05:00
Rohit Nirmal
1ecb318bcc
Fix some formatting (new lines on collapsed single-line conditionals,
...
new lines for opening braces).
2014-09-06 01:23:05 -05:00
lioncash
3e0c04a83e
Common: Fix a potential infinite loop in ReplaceAll
...
Prior to this change, it was possible to cause an infinite loop by making the string to be replaced and the replacing string the same thing.
e.g.
std::string some_str = "test";
ReplaceAll(some_str, "test", "test");
This also changes the replacing in a way that doesn't require starting from the beginning of the string on each replacement iteration.
2014-09-05 15:12:17 -04:00
Fiora
07e0c917c6
Revert "JIT64: optimize CA calculations"
2014-09-05 10:26:30 -07:00
comex
97420c6ec6
Merge pull request #852 from FioraAeterna/optimizeca
...
JIT64: optimize CA calculations
2014-09-05 11:52:02 -04:00
comex
aa1df21bb6
Merge pull request #947 from FioraAeterna/rsqrte
...
JIT: implement frsqte
2014-09-05 11:48:00 -04:00
Lioncash
6369173981
DolphinWX: Simplify wiki link construction
2014-09-04 21:30:33 -04:00
lioncash
bd91e8b0c8
Common: Remove unused header from Thread.cpp
...
This define isn't even used in the Windows builds.
2014-09-04 09:15:18 -04:00
Rachel Bryk
345b608d64
Change IniFile::Section::Set() with default value to use a template.
2014-09-04 03:29:49 -04:00
shuffle2
4fcb633df5
Merge pull request #961 from RachelBryk/logs
...
Read the config file before enabling logs.
2014-09-03 17:20:11 -07:00
Rachel Bryk
22d2c7d053
Read the config file before enabling logs.
2014-09-03 19:50:02 -04:00
shuffle2
05cd06539b
Merge pull request #960 from lioncash/preproc-stuff
...
Common: Make TITLEID_SYSMENU a static const variable in NandPaths.h
2014-09-03 15:22:32 -07:00
lioncash
a687cc556b
Common: Make TITLEID_SYSMENU a static const variable in NandPaths.h
2014-09-03 18:03:23 -04:00
Fiora
1b50f9df14
JIT: implement fres
...
Mostly a straightforward translation of the interpreter code, with a few
tricksy optimizations and fallbacks for rare paths.
2014-09-03 12:15:30 -07:00
Fiora
c72a133206
JIT: implement frsqrte
...
Mostly a straightforward translation of the interpreter code, with a few
tricksy optimizations and fallbacks for rare paths.
2014-09-03 11:21:04 -07:00
Dolphin Bot
e1248599eb
Merge pull request #868 from FioraAeterna/bmi
...
x64Emitter: add BMI1/BMI2 support
2014-09-03 19:24:27 +02:00
lioncash
f69e6ef16f
Common: Remove unnecessary define check in Log2
2014-09-03 13:04:48 -04:00
Fiora
5088a2b4e2
x64Emitter: add BMI1/BMI2 support
...
TZCNT and LZCNT use a completely different encoding scheme, so they should
probably go in a separate patch.
Also add some tests.
2014-09-03 10:04:01 -07:00
shuffle2
db84a22109
Merge pull request #770 from lioncash/panic
...
Core: Fix case where a panic alert wouldn't be shown in MemoryUtil.cpp
2014-09-03 00:10:12 -07:00
comex
64575d565a
Merge pull request #923 from FioraAeterna/fixcallersave
...
JIT: Fix caller-save registers on WIN64
2014-09-03 02:27:44 -04:00
shuffle2
532b7bb7da
Merge pull request #893 from rohitnirmal/scan-build-fixes
...
Scan build fixes
2014-09-02 23:15:18 -07:00
Fiora
9e4419e786
x64Emitter: add support for shorter EAX forms of instructions
...
Should save a few bytes of code size here and there.
2014-09-02 21:52:41 -07:00
Fiora
6655c7775e
JIT: Fix callee-save registers on WIN64
2014-09-02 10:56:14 -07:00
Pierre Bourdon
e72146d19c
x64Emitter: Do not assert-fail on redundant MOVs, instead show an error log
2014-09-02 10:17:32 +02:00
Pierre Bourdon
a79ced2fc2
x64Emitter: Make it clear for both SSE to int conv that X64 regs are expected
2014-09-02 09:55:47 +02:00
Pierre Bourdon
c428c5999f
x64Emitter: UNPCKLPS/HPS are now tested
2014-09-02 09:53:00 +02:00
Pierre Bourdon
cc0b048c0b
x64Emitter: Support FLD/FSTP with 80 bits operands
2014-09-02 09:52:59 +02:00
Pierre Bourdon
f99f302c91
x64Emitter: assert instead of crashing when generating MOVZX with a wrong size
2014-09-02 09:52:04 +02:00
Pierre Bourdon
b1738b60fc
x64Emitter: Fix MUL with AH/BH/CH/DH registers.
2014-09-02 09:52:04 +02:00