Commit Graph

13575 Commits

Author SHA1 Message Date
James Cowgill 50d5a9a9bc HW: Fix spelling mistake 2015-09-08 21:11:28 +01:00
flacs 48031eaff7 Merge pull request #2974 from Tilka/fprf
Jit64: fix errors in FPRF calculation
2015-09-08 18:59:22 +00:00
flacs 0c381d6547 Merge pull request #2975 from lioncash/emit
x64Emitter: Simplify/compress some conditionals
2015-09-08 18:44:54 +00:00
Ryan Houdek 5d7f834cde Add run count to the JIT profile information 2015-09-08 11:09:52 -05:00
Markus Wick 81c07d4919 Merge pull request #2990 from lioncash/noncopy
Common: Alter semantics of the NonCopyable mixin
2015-09-08 11:08:58 +02:00
Ryan Houdek 2e02de6587 Merge pull request #2998 from Sonicadvance1/GLES_BBox
[GLES] Enable bounding box support.
2015-09-08 02:38:54 -05:00
Scott Mansell 332e81d2d7 Merge pull request #2984 from JosJuice/dvdinterface-round-down
DVDInterface: Use ROUND_DOWN
2015-09-08 12:43:11 +12:00
degasus 664beea538 OGL: reimplement SSAA based on ARB_gpu_shader5
So i965 shall support it again.
2015-09-07 22:21:11 +02:00
Ryan Houdek 2ad26ab3e9 [AArch64] Fix Test&Branch to relative location instructions.
Wasn't masking by the size of the offset encoding so negative values were killing the instruction
Missed commiting this in my integer gatherpipe PR.
Fixes crashing on AArch64.
2015-09-07 13:38:58 -05:00
Ryan Houdek bfb544e1fb [GLES] Enable bounding box support. 2015-09-07 12:07:27 -05:00
Markus Wick 5585c5adc2 Merge pull request #2994 from aserna3/master
Properly implemented confirm on stop CLI switch
2015-09-07 14:00:17 +02:00
Ryan Houdek a9a339a00c Merge pull request #2962 from Sonicadvance1/aarch64_integer_gatherpipe
[AArch64] Implement integer gatherpipe writes.
2015-09-07 06:20:01 -05:00
Scott Mansell 1f800b80dd Merge pull request #2960 from phire/improve_efb2tex
Make efb2tex behave much more like efb2ram.
2015-09-07 14:12:03 +12:00
Ryan Houdek 99a7dfaf5e Merge pull request #2965 from Sonicadvance1/Android_config_changes
[Android] Fix multi-gamecube controller input, config changes
2015-09-06 20:07:32 -05:00
Anthony Serna a5d6072a45 Properly implemented confirm on stop CLI switch 2015-09-06 14:35:26 -07:00
Lioncash 1b026364bf Merge pull request #2992 from aserna3/master
Implemented CLI switch to disable confirm on stop
2015-09-06 16:55:11 -04:00
Anthony Serna ad1a8a1b31 Implemented CLI switch to disable confirm on stop 2015-09-06 13:08:29 -07:00
Lioncash ec3d55b093 EXI_DeviceIPL: Get rid of a pointer cast
Also moves the RTC updating code to its own function.
2015-09-06 14:42:43 -04:00
Lioncash 53465e329a Common: Alter semantics of the NonCopyable mixin
Uses delete to make the unimplemented functions detectable at compile time
and not link time if they are used.

The move constructor and assignment operator are removed as moves are not
copies, but transfers of ownership, which isn't suited for this class.
2015-09-06 13:45:08 -04:00
degasus 1c0366993a VideoBackends: Reimplement SSAA, now for D3D + OGL 2015-09-06 19:40:00 +02:00
Lioncash f194ee6223 x64Emitter: Simplify/compress some conditionals 2015-09-06 13:28:36 -04:00
Lioncash 623df9b5ca DiscIO/VS: Remove an empty filter 2015-09-06 13:23:33 -04:00
Scott Mansell ac467d9fb9 FifoPlayer: Don't check efb copy hashes when plaing back a broken dff 2015-09-07 05:20:25 +12:00
Lioncash 8ce04f9a65 General: Replace GC_ALIGN macros with alignas
Standard supported alignment -> out with compiler-specific.
2015-09-06 12:53:51 -04:00
Scott Mansell bda964e0b9 Workaround to allow partial texture updates to keep working in NSMBWii 2015-09-07 02:32:01 +12:00
Scott Mansell ee649c6d9f Make efb2tex behave more like efb2ram.
Instead of having special case code for efb2tex that ignores hashes,
the only diffence between efb2tex and efb2ram now is that efb2tex
writes zeros to the memory instead of actual texture data.

Though keep in mind, all efb2tex copies will have hashes of zero as
their hash.
2015-09-07 02:32:01 +12:00
Scott Mansell c08a83a5aa Merge pull request #2957 from phire/unify_efbcopy
Cleanup and unify efb copy implemtations into VideoCommon
2015-09-07 00:10:42 +12:00
JosJuice 4716c8ecf6 DolphinWX: Little simplification for game right-click menu 2015-09-06 13:33:23 +02:00
JosJuice cb3b1b6f44 DolphinWX: Don't use IsElfOrDol outside of ISOFile
It's redundant because GetPlatform can do the same thing.
2015-09-06 13:33:19 +02:00
Scott Mansell d797b5d0b5 Use ROUND_UP instead of custom bittwiddling. 2015-09-06 23:00:17 +12:00
JosJuice 5c454379ae DVDInterface: Use ROUND_DOWN 2015-09-06 12:15:01 +02:00
Scott Mansell 137856bd00 Fix palette conversions for 4 bit efb copies.
Fixes purple shadow in THPS4 and many other things.
2015-09-06 21:16:52 +12:00
Scott Mansell e745748c3e Improve comments for texture formats enum. 2015-09-06 21:16:51 +12:00
Scott Mansell b9be3245e1 Move common EFB copy code into VideoCommon
Addded a few duplicated depth copy texture formats to the enum
in TextureDecoder.h. These texture formats were already implemented
in TextureCacheBase and the ogl/dx11 texture cache implementations.
2015-09-06 21:16:51 +12:00
Scott Mansell be4caa3dc0 Merge pull request #2961 from lioncash/printf
General: Toss out PRI macro usage
2015-09-06 21:02:22 +12:00
Scott Mansell d52d8bf935 Merge pull request #2982 from lioncash/unique
Common: Remove StdMakeUnique.h
2015-09-06 21:01:48 +12:00
Lioncash 8b027f6ed7 CMakeLists: Bump C++ compilation from gnu++0x to c++1y 2015-09-06 04:10:40 -04:00
Lioncash 4fc71e9708 Common: Remove StdMakeUnique.h 2015-09-06 04:09:53 -04:00
Scott Mansell a26cac87fc Merge pull request #2959 from rohit-n/build-pch
Fix building with PCH disabled.
2015-09-06 19:59:49 +12:00
Scott Mansell 61da83182e Merge pull request #2978 from lioncash/override
Add missing override specifiers
2015-09-06 19:21:59 +12:00
Scott Mansell 728d082bc7 Merge pull request #2983 from lioncash/delete
WII_Socket: Make the copy-assignment operator deleted
2015-09-06 19:20:47 +12:00
Lioncash 007939c4e9 WII_Socket: Make the copy-assignment operator deleted 2015-09-06 03:12:01 -04:00
Lioncash f64de2006b HLE_Misc: Remove unnecessary headers 2015-09-06 02:27:04 -04:00
Lioncash 4a2e680ed2 VolumeGC: Initialize a variable
Silences an uninitialized variable warning
2015-09-05 23:05:28 -04:00
Lioncash 22635c1800 Add missing override specifiers 2015-09-05 22:40:19 -04:00
comex 96e42dff52 Merge pull request #2977 from lioncash/unused
General: Remove unimplemented function prototypes
2015-09-05 22:20:47 -04:00
Lioncash 633be0387d General: Remove unimplemented function prototypes 2015-09-05 22:01:07 -04:00
Pierre Bourdon 6af8e00b66 Merge pull request #2813 from lioncash/updating_list
CheatSearchTab: Make the search results list auto update address values
2015-09-05 23:10:06 +02:00
Pierre Bourdon 8a993e1fbc Merge pull request #2958 from JosJuice/dol-elf-banners
DolphinWX: Support banners in Homebrew Channel format
2015-09-05 23:08:07 +02:00
Lioncash 8fdb013d54 General: Toss out PRI macro usage
Now that VS supports more printf specifiers, these aren't necessary
2015-09-05 16:02:35 -04:00
Tillmann Karras 8f777cd839 Jit64: fix errors in FPRF calculation 2015-09-05 20:17:53 +02:00
booto d4344abd89 Revert "Merge pull request #2943 from booto/vi-enb"
This reverts commit 8dd80b8e97, reversing
changes made to c5979b47be.
2015-09-06 01:21:23 +08:00
Markus Wick 60e0339ae5 Merge pull request #2963 from Sonicadvance1/AArch64_improved_vertexloader
[AArch64] Minor improves to the vertex loader JIT
2015-09-05 15:44:52 +02:00
Ryan Houdek de80b9e988 Merge pull request #2971 from degasus/arm
JitArm64: fix smaller issues
2015-09-05 08:43:44 -05:00
degasus 9187200de3 Android: Abort the screenshot after 2 seconds
If the emulation is crashed, this blocks forever. So there is no way to exit the emulation within a finite time.
2015-09-05 15:29:22 +02:00
degasus 24fec3ebca JitArm64: Fix float load & store 2015-09-05 13:48:29 +02:00
degasus 36902c58eb JitArm64: Fix lwbrx and lhbrx 2015-09-05 13:48:29 +02:00
degasus 696f95d5f9 JitArm64: Fix subfic 2015-09-05 13:48:29 +02:00
degasus baa28e13f4 JitArm64: Remove FLUSH_INTERPRETER
It seems to be broken for some instructions, and there is no need for it any more.
2015-09-05 13:48:29 +02:00
Scott Mansell 52948bb3ef Cleanup and unify handling of efb copy stride. 2015-09-05 23:37:24 +12:00
Tillmann Karras a8c8f52f20 OGL: remove unused variable 2015-09-05 12:40:14 +02:00
Tillmann Karras 405554e327 Jit64: remove unnecessary indirection 2015-09-05 12:40:14 +02:00
Tillmann Karras 72eed1aa82 JitCache: drop unused method 2015-09-05 12:40:14 +02:00
Ryan Houdek 7650117c26 Properly support MSAA and SSAA as separate features(+GLES)
SSAA relies on MSAA being active to work. We only supports 4x SSAA while in fact you can enable SSAA at any MSAA level.
I even managed to run 64xMSAA + SSAA on my Quadro which made some pretty sleek looking games. They were very cinematic though.

With this, it properly fixes up SSAA and MSAA support in GLES as well. Before they were broken when stereo rendering was enabled.
Now in GLES they can properly support MSAA and also stereo rendering with MSAA enabled(with proper extensions).
2015-09-05 05:23:29 -05:00
Ryan Houdek 9618738278 Remove all of our workarounds for Qualcomm devices we don't support anymore. 2015-09-04 23:45:35 -05:00
Markus Wick e7660325b4 Merge pull request #2967 from Sonicadvance1/GLES_blend_func_extended
Support EXT_blend_func_extended in GLES.
2015-09-05 03:43:28 +02:00
Ryan Houdek 5fa4c8d930 Support EXT_blend_func_extended in GLES.
This lets us get dual source blending on GLES targets.
2015-09-04 20:25:59 -05:00
Ryan Houdek 9bb63bf2eb [Android] Fix multi-gamecube controller input, config changes 2015-09-04 20:06:01 -05:00
Ryan Houdek 6cf7048423 Implement ClearCurrent on the EGL GLInterface
This fixes an error on GLInterface shutdown when using EGL.
2015-09-04 19:58:58 -05:00
Ryan Houdek c458c9d049 [AArch64] Minor improves to the vertex loader JIT
Just some minor improvements noticed by dumping the vertex loader blocks.
2015-09-04 19:57:08 -05:00
Ryan Houdek de051dac71 [AArch64] Implement integer gatherpipe writes. 2015-09-04 19:52:25 -05:00
Ryan Houdek 791c7d5a84 [AArch64] Clean up bogus vector FCVT{N,L} instruction usage.
Replace the instruction with the scalar variant FCVT instruction.
FCVT{N,L} 8 cycles latency on the Cortex A57
FCVT has five cycle latency and slightly higher throughput

On the A72 all three of these instructions will have three cycle latency,
While FCVT{N,L} will have half the throughput.
2015-09-04 19:41:54 -05:00
Ryan Houdek 2c68f6bfc5 [AArch64] Implement Fiora's preemptive paired loadstore optimization.
This provides a decent speed up in pretty much everything that touches pair loadstores because in most cases they are just regular non-quantizing
float loadstores that happen.
2015-09-04 19:20:33 -05:00
Lioncash e01428935f Merge pull request #2954 from lioncash/snprintf
CommonFuncs: Remove define for snprintf
2015-09-04 19:11:38 -04:00
Lioncash e90eb17aeb Merge pull request #2956 from JosJuice/extra-space
Remove extra space from 5a32c3f
2015-09-04 14:22:59 -04:00
JosJuice 41315b19f1 DolphinWX: Support banners in Homebrew Channel format
HBC uses files named icon.png for icons. This change makes Dolphin
support that file name, and also [executable file name].png
in case someone wants to have multiple files in one folder.

The HBC banner support is mainly intended for DOL and ELF files,
but it can also be used to override banners of disc images,
something that wasn't possible in the past.

There are currently issues with banner scaling not preserving
the aspect ratio and looking bad in general.
2015-09-04 19:08:30 +02:00
Rohit Nirmal 8aed7589ae Fix building with PCH disabled. 2015-09-04 10:34:45 -05:00
JosJuice 0af2bbcea3 Remove extra space from 5a32c3f 2015-09-04 15:32:30 +02:00
Markus Wick 7ada372ed9 Merge pull request #2944 from degasus/arm
JitArm64: Cleanup floating point regcache
2015-09-04 13:14:29 +02:00
booto 97f55c0cc9 VI: Less log spam in Release build 2015-09-04 17:08:19 +08:00
Lioncash 0d0dd075ef CommonFuncs: Remove define for snprintf
VS 2015 implements snprintf
2015-09-04 03:13:02 -04:00
Lioncash a11ae2cf30 CommonFuncs: Remove SLEEP macro
There's already a function in Thread for this.
2015-09-04 02:43:38 -04:00
shuffle2 4218fb4eea Merge pull request #2916 from lioncash/wx
DolphinWX: Minor changes to Main
2015-09-03 22:59:29 -07:00
shuffle2 9a92ff5238 Merge pull request #2926 from lioncash/wx-mc
MemcardManager: Remove explicit delete and new
2015-09-03 22:58:06 -07:00
shuffle2 272302be82 Merge pull request #2950 from lioncash/bf
BitField: Enable ifdef'd out code for Windows
2015-09-03 22:56:55 -07:00
shuffle2 a09b9bef8d Merge pull request #2952 from lioncash/constexpr
CommonFuncs: Replace ArraySize define with constexpr equivalent
2015-09-03 22:56:25 -07:00
Lioncash 3f1b488a12 CommonFuncs: Replace ArraySize define with constexpr equivalent 2015-09-03 23:47:14 -04:00
Lioncash 102a2a975d BitField: Enable ifdef'd out code for Windows 2015-09-03 22:06:15 -04:00
Pierre Bourdon 8dd80b8e97 Merge pull request #2943 from booto/vi-enb
VI: Respect DisplayControlRegister ENB bit
2015-09-04 03:50:39 +02:00
Lioncash 4fd060ba11 Core: Use constexpr for default pad and attachment radius 2015-09-03 19:44:42 -04:00
Rukario e939fba3e7 Updated terms in Netplay window. 2015-09-03 07:36:52 -07:00
Shawn Hoffman 399083ac8a Drop the old msvcrt files. 2015-09-03 06:10:01 -07:00
Shawn Hoffman 66a3951c3b [windows] Add workaround(HACK) for vs2015 implementating a conformant std::is_trivially_copyable...
see https://github.com/dolphin-emu/dolphin/pull/2218
2015-09-03 04:39:06 -07:00
Shawn Hoffman bea18eedc4 [windows] remove various workarounds which were required for vs2013 2015-09-03 04:39:05 -07:00
Shawn Hoffman 30702c17b6 [windows] When making scmrev.h, also look for msysgit explicitly. VS2015 packages it. 2015-09-03 04:39:04 -07:00
Shawn Hoffman aa7208e270 [windows] Update projects to vs2015. 2015-09-03 04:23:01 -07:00
Markus Wick ad978122d9 Merge pull request #2942 from booto/xfb_lines
VideoCommon: xfb height calculation adjusted
2015-09-03 09:40:24 +02:00
Scott Mansell a1538a30ef Merge pull request #2941 from lioncash/gp
GPFifo: Remove pointer casts
2015-09-03 13:47:26 +12:00
Lioncash 2d224bd3b1 ActionReplay: Remove an alloca call 2015-09-02 17:41:19 -04:00
degasus 5797111ef0 JitArm64: Optimize fpr.R() 2015-09-02 22:46:14 +02:00
degasus dfd44730c8 JitArm64: simplify fpr call 2015-09-02 22:46:14 +02:00
booto 28d788ba2c VI: Respect DisplayControlRegister ENB bit
When ENB is set to 0 (default), VI should not
generate clocks, and so shouldn't generate
output.
2015-09-03 04:13:32 +08:00
booto 5a32c3fba4 VideoCommon: xfb height calculation adjusted
Baten Kaitos allocates its XFBs from a tagged heap
structure. With the old calculation, too many lines
were being written so the tag of the allocation
after the XFB was being corrupted. Fixes crash
mentioned in this comment:
https://code.google.com/p/dolphin-emu/issues/detail?id=7734#c6
2015-09-03 03:57:03 +08:00
Lioncash f32b79e612 GPFifo: Get rid of pointer casts 2015-09-02 15:24:33 -04:00
Lioncash db98efdc98 GPFifo: Adjust parameter names 2015-09-02 15:20:02 -04:00
Ryan Houdek 3b65c4070c Merge pull request #2935 from Sonicadvance1/GLES_palette_conversion
[GLES] Support texture_buffer for palette texture conversion.
2015-09-02 10:11:41 -05:00
Scott Mansell ecbb83fa0f Merge pull request #2686 from booto/field-timing
VI: derive field timing from VI registers
2015-09-03 01:09:43 +12:00
Markus Wick 66e9395d5d Merge pull request #2937 from lioncash/enums
VideoCommon: Convert some defines into enums
2015-09-02 13:27:09 +02:00
Rohit Nirmal 9c26bb0e17 Fix building with PCH disabled. 2015-09-01 13:45:23 -05:00
flacs 3b134497dd Merge pull request #2774 from AdmiralCurtiss/wiimote-extension-reconnect-on-button-press
Wiimote: Extend emulated Wiimote reconnect-on-button-press to attachments.
2015-09-01 18:31:39 +02:00
Lioncash 71ef0a0245 PixelShaderGen: Use spaces instead of tabs for vertical alignment 2015-09-01 12:20:50 -04:00
Lioncash 91eff28699 PixelShaderGen: Move defines into the implementation file
These aren't used outside of it. This also reduces the amount of things in
the global namespace.
2015-09-01 12:18:18 -04:00
Lioncash c760ffbd28 BPMemory/XFMemory: Convert defines to enums
These actually convey a concrete type, as well as also providing a
symbolic constant during debugging.
2015-09-01 12:07:10 -04:00
booto f6e4a8e680 FifoPlayer: Use VI derived timing, not hardcoded 60Hz 2015-09-01 20:24:42 +08:00
booto 8d6c39a89d VI: Adjust forced-progressive hack per magumagu's suggestion 2015-09-01 20:24:41 +08:00
booto acc9a74174 VI: Restore forced-progressive hack with option
Bugfix: TargetRefreshRate uses rounded result
NTSC's 59.94 was becoming 59 with integer division.
2015-09-01 20:24:40 +08:00
booto 480dbb22f2 VI: derive field timing from VI registers 2015-09-01 20:24:40 +08:00
Ryan Houdek 7a35f9285b [GLES] Support texture_buffer for palette texture conversion.
OpenGL ES 3.2 adds this feature to core
It was available to GLES 3.1 as GL_{EXT, OES}_texture_buffer as well.
For the non-Nvidia vendors that implemented this is:
      - Qualcomm's Adreno 4xx
      - IMGTec's PowerVR Rogue
2015-09-01 05:41:03 -05:00
Lioncash 222b33f0a3 VolumeCreator: Fix a typo in VolumeKeyForPartition's name 2015-08-31 20:01:51 -04:00
Lioncash 1db1a8aacf VolumeCreator: Use a unique_ptr in CreateVolumeFromFilename 2015-08-31 20:01:44 -04:00
flacs b9ea9c05ad Merge pull request #2931 from lioncash/mem
VertexLoader_Color: Remove some pointer casts
2015-09-01 00:32:43 +02:00
Lioncash f7e22c8126 VertexLoader_Color: Mark translation-unit-local functions static 2015-08-31 17:31:23 -04:00
Lioncash ec42be79f3 VertexLoader_Color: Get rid of some pointer casts 2015-08-31 17:31:11 -04:00
Ryan Houdek ae0a06a018 [AArch64] Implement dcbz instruction 2015-08-31 15:39:47 -05:00
Ryan Houdek d495ad5104 [AArch64] Make TST reg, reg emitter alias 2015-08-31 14:03:32 -05:00
Ryan Houdek 0f54aa48b4 Merge pull request #2928 from Sonicadvance1/aarch64_improved_singles
[AArch64] Improve floating point single instructions.
2015-08-31 12:00:08 -05:00
Ryan Houdek bcde1aa8ff [AArch64] Improve floating point single instructions.
Instead of having an "INS" instruction after every single instruction to duplicate the bottom 64bits in to the top 64bits of the register,
create a new FPR register cache type to track when a register's lower 64bits is supposed to be duplicated in to the high 64bits.
Not necessarily actually having the lower bits duplicated in the host side register. This removes inefficient INS instructions from sequential single
float instructions.
In particular a very heavy single heavy block in Animal Crossing went from 712 instructions down to 520 instructions(~37% less instructions!)
2015-08-31 11:09:17 -05:00
Ryan Houdek d003934b8a Merge pull request #2929 from Sonicadvance1/aarch64_optimize_gpr_flush
Aarch64 optimize gpr flush
2015-08-31 10:55:45 -05:00
Ryan Houdek 8bf332cf08 [AArch64] Optimize GPR cache flushing.
If we are flushing multiple sequential guest GPRs then we can store two in a single STP instruction.
Ikaruga does this quite a bit in their blocks where they do an lmw at the very end and then we have to flush them all.
Typically cuts 16 STR instructions down to 8 STP instructions there.
2015-08-30 23:07:12 -05:00
Ryan Houdek f2c17436ab [AArch64] Fix issue in emitter.
Loadstore pairs support only signed offsets, not unsigned.
2015-08-30 23:05:59 -05:00
Scott Mansell 368867dba0 Merge pull request #2922 from aserna3/SDBlock
Implemented ability to block writes to the SD card
2015-08-31 04:51:50 +12:00
Ryan Houdek b907576510 [AArch64] Support profiling by cycle counters if they are available to EL0 2015-08-30 10:25:16 -05:00
Ryan Houdek 5110574c1f Merge pull request #2921 from Sonicadvance1/aarch64_optimize_lmw
[AArch64] Optimize lmw.
2015-08-30 10:23:57 -05:00
Anthony Serna 0390bd61df Fixed introduced compiler warning in Linux 2015-08-29 20:41:59 -07:00
Lioncash e0aabc5f6c MemcardManager: Remove trivial explicit delete and new
Also gets rid of pointer casting.
2015-08-29 22:46:18 -04:00
Lioncash d58550e874 MemcardManager: Minor cleanup of header code 2015-08-29 05:19:51 -04:00
Lioncash 0f3e4c50e1 MemcardManager: Correct class indentation 2015-08-29 05:13:20 -04:00
Lioncash 072150589e Merge pull request #2924 from lioncash/scope
Hash: Narrow define scope
2015-08-29 03:12:18 -04:00
Lioncash e7c7dcaa1f Merge pull request #2923 from lioncash/override
Jit_Util: Add missing override specifiers
2015-08-29 03:12:11 -04:00
Lioncash 310bb46967 Hash: Narrow define scope 2015-08-29 02:57:35 -04:00
Markus Wick a16669231a Merge pull request #2917 from Sonicadvance1/android_fix_sgs6
[Android] Workaround Mali driver issue on the Samsung Galaxy S6.
2015-08-29 08:56:32 +02:00
Lioncash df19f11cb9 Jit_Util: Add missing override specifiers 2015-08-29 00:30:18 -04:00
Anthony Serna db7fe9507e Implemented ability to block writes to the SD card
Renamed variable to be more accurate
2015-08-28 17:32:29 -07:00
Markus Wick 6004ecc521 Merge pull request #2920 from rohit-n/build-pch
Fix building with PCH disabled.
2015-08-28 23:08:24 +02:00
Ryan Houdek 8d61706440 [AArch64] Optimize lmw.
This instruction is fairly heavily used by Ikaruga to load a bunch of registers from the stack.
In particular at the start of the second stage is a block that takes up ~20% CPU time that includes a usage of lmw to load half of the guest
registers.

Basic thing optimized here is changing from a single 32bit LDR to potentially a single 128bit LDR.
a single 32bit LDR is fairly slow, so we can optimize a few ways.
If we have four or more registers to load, do a 64bit LDP in to two host registers, byteswap, and then move the high 32bits of the host registers in
to the correct mapped guest register locations.
If we have two registers to load then do a 32bit LDP which will load two guest registers in a single instruction.
and then if we have only one register left to load, load it as before.

This saves quite a bit of cycles since the Cortex-A57 and A72's LDR instruction takes a few cycles.

Each 32bit LDR takes 4 cycles latency, plus 1 cycle for post-index(which typically happens in parallel.
Both the 32bit and 64bit LDP take the same amount of latency.

So we are improving latencies and reducing code bloat here.
2015-08-28 14:40:30 -05:00
Ryan Houdek 2c3fa8da28 [AArch64] Fix a bug in the register caches.
This is a bug that crops if BindToRegister() is called multiple times in a row without a R() function call between them.
How to reproduce the bug:
1) Have a completely filled cache with no host register remaining
2) Call BindToRegister() with different guest registers
3) Don't call R() between the BindToRegister() calls.

This issue typically wouldn't be seen for a couple of reasons. Typically we have /plenty/ of registers in the cache, and in most cases we only call
BindToRegister() once per instruction. In the off chance that it is called multiple times, it wouldn't update the last used counts and would flush the
same register as the previous call to it.
2015-08-28 14:36:14 -05:00
Rohit Nirmal 6252d2d71a Fix building with PCH disabled. 2015-08-28 14:13:28 -05:00
Lioncash a6bd2fea28 Merge pull request #2919 from lioncash/vec
Vec3: Remove a memset call on the this pointer.
2015-08-28 15:05:02 -04:00
Lioncash e787501528 Vec3: Simplify operator== code 2015-08-28 14:46:40 -04:00
Markus Wick b11de5bddb Merge pull request #2918 from lioncash/memcpy
DataReader: Get rid of pointer casts
2015-08-28 20:45:15 +02:00