flacs
d373dd372d
Merge pull request #2913 from Tilka/fix_warning_fix
...
AVIDump: fix -Wsign-compare warning
2015-08-27 23:50:34 +02:00
Ryan Houdek
d9b18862f3
Merge pull request #2908 from Sonicadvance1/gles_3_2
...
Support OpenGL ES 3.2.
2015-08-26 18:19:17 -05:00
Ryan Houdek
447b1b09e3
Support OpenGL ES 3.2.
...
OpenGL ES 3.2 adds a few things we care about supporting in core. In particular:
- GL_{ARB,EXT,OES}_draw_elements_base_vertex
- GL_KHR_debug
- Sample Shading
- GL_{ARB,EXT,OES,NV}_copy_image
- Geometry shaders
- Geometry shader instancing (If they support GL_{EXT,OES}_geometry_point_size)
Nvidia was the first to release an OpenGL ES 3.2 driver, which I used to test this.
This also enables GS Instancing on GLES 3.1 hardware if it supports all of the required extensions.
2015-08-26 17:57:51 -05:00
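The brace notation in the commit message above, such as GL_{ARB,EXT,OES}_draw_elements_base_vertex, stands for several vendor-prefixed variants of one extension. A minimal C++ sketch of checking whether any variant is advertised might look like the following. `HasAnyVariant` and the plain string set are hypothetical names used for illustration, not Dolphin's actual extension-loading code.

```cpp
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical helper (not Dolphin's API): returns true if the driver's
// extension set contains "GL_<vendor>_<suffix>" for at least one vendor.
bool HasAnyVariant(const std::set<std::string>& extensions,
                   std::initializer_list<const char*> vendors,
                   const std::string& suffix)
{
  for (const char* vendor : vendors)
  {
    if (extensions.count("GL_" + std::string(vendor) + "_" + suffix) != 0)
      return true;
  }
  return false;
}
```

A caller would pass the vendor list from the braces, for example `HasAnyVariant(exts, {"EXT", "OES"}, "geometry_point_size")`.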
Ryan Houdek
6d25c469cf
Merge pull request #2915 from degasus/arm
...
JitArm64: Implement rlwnmx
2015-08-26 15:52:37 -05:00
Markus Wick
54f882704a
Merge pull request #2914 from JosJuice/fix-volumedirectory
...
Fix VolumeDirectory
2015-08-26 22:12:23 +02:00
degasus
e516d4ef59
JitArm64: Implement rlwnmx
2015-08-26 21:59:10 +02:00
JosJuice
d276d1abbb
Fix VolumeDirectory
...
Fixes the regression from a225426
and clarifies a related comment.
2015-08-26 19:21:09 +02:00
Markus Wick
3e9dac3910
Merge pull request #2810 from Sonicadvance1/disassembler_improv
...
Have the disassembler show the PC next to host instructions.
2015-08-26 17:01:39 +02:00
flacs
99e88a7af7
Merge pull request #2887 from Tilka/swap
...
Jit64: some byte-swapping changes
2015-08-26 16:43:45 +02:00
flacs
eb6ac641be
Merge pull request #2906 from Tilka/fpscr
...
Jit64: fix bugs in the FPSCR instructions
2015-08-26 16:43:28 +02:00
Tillmann Karras
6ec4bdf862
CoreTiming: remove unused functions
2015-08-26 15:40:15 +02:00
Tillmann Karras
0f4861cac2
CoreTiming: make loops easier to read
2015-08-26 14:53:58 +02:00
Ryan Houdek
ca51f1a4f6
[AArch64] Optimize paired registers being used in double operations.
...
In particular, this optimizes the case where a 32-bit float is loaded via lfs and then used in double operations.
This happens very often in Gekko-based code, because the best way to load a 32-bit value as a double is lfs, which automatically converts it into a double value.
There are a few other implications of this in practice as well. For example, if both of the paired registers are loaded via psq_l and then used in double operations, that case is improved too.
Also, if we implement a double register, we have to be careful to track whether it is in the "lower" register or the full 128-bit register.
2015-08-26 05:50:04 -05:00
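As a rough illustration of why lfs yields a double, the sketch below reads a big-endian 32-bit float from memory and widens it. This only models the value conversion on a host with IEEE-754 floats. `LoadSingleAsDouble` is a hypothetical name, and the code ignores the NaN and denormal details of the real instruction as well as anything the AArch64 JIT actually emits.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative model only: read a big-endian 32-bit float from memory and
// widen it to a double, as a guest load like lfs conceptually does.
double LoadSingleAsDouble(const std::uint8_t* mem)
{
  // Assemble the four bytes in big-endian order.
  const std::uint32_t bits = (std::uint32_t{mem[0]} << 24) | (std::uint32_t{mem[1]} << 16) |
                             (std::uint32_t{mem[2]} << 8) | std::uint32_t{mem[3]};
  float single;
  std::memcpy(&single, &bits, sizeof(single));
  // Every float value is exactly representable as a double.
  return static_cast<double>(single);
}
```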
Markus Wick
5716d18d10
Merge pull request #2910 from Sonicadvance1/aarch64_regcache_fix
...
[AArch64] Fix a bug in the register cache.
2015-08-26 08:31:24 +02:00
Ryan Houdek
4f5f29a0fb
[AArch64] Fix a bug in the register cache.
...
If the register held only the lower pair and the full register was needed, we have to load the high 64 bits, which we weren't doing before.
2015-08-26 01:21:43 -05:00
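The rule described in the commit above can be sketched as a small decision function. The types `FPRState` and `LoadPlan` and the function `PlanLoad` are hypothetical and exist only for this illustration. They are not the names used in Dolphin's register cache.

```cpp
// Hypothetical model of what a cached guest FP register currently holds.
enum class FPRState { NotLoaded, LowerOnly, Full };

// Which 64-bit halves still have to be fetched from guest state.
struct LoadPlan
{
  bool load_low64;
  bool load_high64;
};

LoadPlan PlanLoad(FPRState current, bool need_full)
{
  switch (current)
  {
  case FPRState::NotLoaded:
    return {true, need_full};
  case FPRState::LowerOnly:
    // The case the fix addresses: the low half is present, but a request
    // for the full register must still load the high 64 bits.
    return {false, need_full};
  case FPRState::Full:
    break;
  }
  return {false, false};
}
```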
Markus Wick
43d17cb360
Merge pull request #2904 from Sonicadvance1/aarch64_more_inst
...
[AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
2015-08-26 07:48:24 +02:00
Tillmann Karras
ee4a12ffe2
Jit64: some byte-swapping changes
2015-08-26 05:41:18 +02:00
flacs
6015e2d812
Merge pull request #2900 from aroulin/x64emitter-rcp
...
x64Emitter: add RCPPS and RCPSS SSE instructions
2015-08-26 05:05:53 +02:00
Ryan Houdek
6729a36d8d
[AArch64] Set BindToRegister's to_load correctly for double FP ops.
2015-08-25 21:29:27 -05:00
Lioncash
db4f692482
GCMemcard: Clean up memcard logging messages.
2015-08-25 21:55:52 -04:00
Tillmann Karras
ee50a2ef28
Jit64: fix bugs in the FPSCR instructions
2015-08-25 23:48:14 +02:00
Markus Wick
bd08c1b01a
Merge pull request #2901 from Sonicadvance1/aarch64_stfiwx
...
[AArch64] Implement stfiwx
2015-08-25 22:47:39 +02:00
Markus Wick
24cb650078
Merge pull request #2663 from degasus/dcbx
...
Jit64: dcbf + dcbi
2015-08-25 12:16:56 +02:00
Ryan Houdek
0666c0750b
[AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
...
Brings the povray benchmark to better times than on the Wii.
2015-08-24 15:32:19 -05:00
Ryan Houdek
d96be9250c
Merge pull request #2899 from Sonicadvance1/aarch64_fctiwzx
...
[AArch64] Implement fctiwzx
2015-08-24 13:22:27 -05:00
Ryan Houdek
cd03b8baf6
Merge pull request #2895 from Sonicadvance1/qualcomm_workaround_gles31
...
Disable OpenGL ES 3.1 on all Qualcomm Adreno devices.
2015-08-24 13:22:12 -05:00
degasus
0d92c8fb89
Jit64: Optimize dcbx
2015-08-24 18:33:23 +02:00
Tillmann Karras
ac84d6d0fa
Jit64: some cache flush changes
...
- dynamically allocate third scratch register instead of forcing ECX
- use LEA as 3 operand add if possible
- use BT,JC instead of SHR,TEST,JNZ
- merge MOV,TEST
- use appropriate ABI function (no asm change)
2015-08-24 18:33:23 +02:00
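The "BT,JC instead of SHR,TEST,JNZ" item in the commit above replaces one way of testing a single bit with another. The C++ sketch below models the two predicates, assuming the old sequence shifted the value right and tested the low bit. It shows only that both give the same answer and does not reproduce the code the JIT emits. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models SHR + TEST + JNZ: shift the wanted bit down, then test the low bit.
bool BitSetViaShift(std::uint32_t value, unsigned bit)
{
  assert(bit < 32);
  return ((value >> bit) & 1u) != 0;
}

// Models BT + JC: BT copies the selected bit into the carry flag,
// and JC branches when that flag is set.
bool BitSetViaBT(std::uint32_t value, unsigned bit)
{
  assert(bit < 32);
  const bool carry = (value & (std::uint32_t{1} << bit)) != 0;
  return carry;
}
```

The BT form needs no scratch copy of the value, since it leaves its operand unmodified and writes only the flag.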
degasus
6f34b27323
Jit64: implement dcbf + dcbi
2015-08-24 18:33:19 +02:00
Markus Wick
0ad6fa8f62
Merge pull request #2903 from lioncash/cast
...
Memmap: Remove pointer casts
2015-08-24 15:42:56 +02:00
Lioncash
abd3b124be
Memmap: Remove pointer casts
2015-08-24 09:07:09 -04:00
Tillmann Karras
33eefc2d86
Jit64: quickfix for mtfsfx
2015-08-24 12:12:31 +02:00
Ryan Houdek
d3176fe22a
[AArch64] Implement stfiwx
...
Improves povray performance by ~4%
2015-08-24 01:10:55 -05:00
Ryan Houdek
80fa9af9b1
Merge pull request #2898 from degasus/linking
...
JitArm64: Faster linking of continuous blocks
2015-08-23 18:09:02 -05:00
degasus
7320d519b4
JitArm64: Implement srwx
2015-08-23 23:29:48 +02:00
degasus
4722a69fd0
JitArm64: Implement divwux
2015-08-23 23:29:18 +02:00
degasus
9e4366963c
JitArm64: Implement subfic
2015-08-23 23:29:07 +02:00
degasus
95be17772f
JitArm64: Implement addex
2015-08-23 23:29:02 +02:00
degasus
025e7c835a
JitArm64: Implement subfcx
2015-08-23 23:28:28 +02:00
degasus
550a90e691
JitArm64: Implement subfex
2015-08-23 23:28:24 +02:00
Ryan Houdek
561744819e
[AArch64] Implement fctiwzx
...
Improves the povray benchmark time by 5.6%
2015-08-23 15:35:18 -05:00
Ryan Houdek
4fa23abbe1
[AArch64] Implement MOVI and ORR(imm) in the NEON emitter.
2015-08-23 15:34:53 -05:00
aroulin
0a0e012fab
x64Emitter: add RCPPS and RCPSS SSE instructions
2015-08-23 16:59:27 +02:00
degasus
77a6798094
JitArm64: Faster linking of continuous blocks
2015-08-23 14:44:23 +02:00
Markus Wick
73067b1ef1
Merge pull request #2888 from degasus/jit64
...
Jit64: Faster linking of continuous blocks
2015-08-23 13:24:15 +02:00
Lioncash
2a1abf8dd6
Merge pull request #2896 from lioncash/using
...
Core: Minor CPU core typedef cleanup
2015-08-22 19:00:23 -04:00
Ryan Houdek
cc3fb7e7b4
Merge pull request #2883 from degasus/master
...
Profiler: Sort output by total time
2015-08-22 17:52:54 -05:00
Markus Wick
8b881a6c34
Merge pull request #2891 from Sonicadvance1/aarch64_implement_crxxx
...
[AArch64] Implement the cr instructions
2015-08-23 00:44:47 +02:00
Lioncash
fdafa5d063
Core: Move includes out of instruction table headers
...
These aren't necessary (and cause unnecessary indirect inclusions).
2015-08-22 14:15:02 -04:00
Lioncash
a248a4d2ce
Jit64/JitIL: Relocate instruction typedefs
2015-08-22 14:15:00 -04:00