Commit Graph

12079 Commits

Author SHA1 Message Date
comex 683191b6c6 Merge pull request #892 from comex/oh-the-abstraction
Optimize PointerWrap.
2014-08-28 17:28:16 -04:00
comex faa2666393 PointerWrap currently checks its mode for every individual byte of everything it 'does', including all of RAM. Make it not do that.
Decreases total Wii state save time (not counting compression) from
~570ms to ~18ms.

The compiler can't remove this check because of potential aliasing; this
might be fixable (e.g. by making mode const), but there is no reason to
have the code work in such a braindead way in the first place.

- DoVoid now uses memcpy.
- DoArray now uses DoVoid on the whole rather than Doing each element
(would fail for an array of STL structures, but we don't have any of
those).
- Do also now uses DoVoid.  (In the previous version, it replicated
DoVoid's code in order to ensure each type gets its own implementation,
which for small types then becomes a simple load/store in any modern
compiler.  Now DoVoid is __forceinline, which addresses that issue and
shouldn't make a big difference otherwise - perhaps a few extra copies
of the code inlined into DoArray or whatever.)
2014-08-28 15:35:19 -04:00
Ryan Houdek ad8fe0fb52 Merge pull request #879 from FioraAeterna/frspx
JIT64: add frspx implementation
2014-08-28 14:12:21 -05:00
Fiora c359d65dfe JIT64: add frspx implementation 2014-08-28 11:40:31 -07:00
Ryan Houdek 4a78a8a72a Merge pull request #876 from FioraAeterna/floatloadstore
JIT64: clean up and unify float load/store code
2014-08-28 13:37:27 -05:00
Dolphin Bot 359aa664e1 Merge pull request #898 from FioraAeterna/fprffix
JIT: make fprf conditional in fcmp, just like the other instructions
2014-08-28 20:25:26 +02:00
Dolphin Bot 5e514dcfbc Merge pull request #881 from FioraAeterna/mulhwx
JIT64: add mulhwx implementation
2014-08-28 20:25:13 +02:00
Fiora 7929f2f033 JIT: make fprf conditional in fcmp, just like the other instructions
Missed in the FPRF merge (it didn't break anything, but it's probably a bit
slower and not consistent with the others).
2014-08-28 11:19:09 -07:00
Ryan Houdek 23bf8df0e2 Merge pull request #894 from FioraAeterna/missingcvt
x64Emitter: add support for some missing CVT instructions
2014-08-28 13:18:35 -05:00
Ryan Houdek 0217fb2008 Merge pull request #843 from FioraAeterna/fprf
JIT: Initial FPRF support
2014-08-28 13:15:50 -05:00
Dolphin Bot 1cf77c773b Merge pull request #758 from FioraAeterna/loadstoreopt
Jit64: some load/store optimizations
2014-08-28 19:30:26 +02:00
Fiora 043256449e Jit64: some load/store optimizations
Avoid extra ops during address calculation in loads; use LEAs or immediates
whenever possible.
2014-08-28 10:12:55 -07:00
Ryan Houdek c908a1e212 Merge pull request #882 from Sonicadvance1/fix-pp-blackness
Fix PostProcessing shader garbage on screen.
2014-08-28 10:29:03 -05:00
Ryan Houdek ca68526ec7 Clear the texture used by PP shaders prior to use.
We were generating a texture without ever setting the data to a known value.
This happened on the old code as well, just that PP shaders are receiving some love and people are using it and noticing some of its issues.
2014-08-28 10:16:39 -05:00
Ryan Houdek 8e7d7418af Merge pull request #890 from degasus/glx
glx: fix shutdown hang
2014-08-28 09:13:50 -05:00
Lioncash 6955e023a0 Merge pull request #877 from lioncash/voldir
DiscIO: Move VolumeDirectory off of raw pointers
2014-08-28 04:11:33 -04:00
comex a4a533e39f Re-enable the vertex loader JIT on OS X.
Why was it ever disabled?
2014-08-27 23:50:59 -04:00
Fiora f9d4ff0d5d x64Emitter: add support for some missing CVT instructions 2014-08-27 20:15:42 -07:00
degasus 8b0ad5daec glx: fix shutdown hang 2014-08-27 18:16:56 +02:00
Pierre Bourdon 7d05ebbc9b Merge pull request #888 from FioraAeterna/fmulinterp
Fix another absent-minded typo in the fmul interpreter patch
2014-08-27 10:57:09 +02:00
Fiora 7e07acbf3f Fix another absent-minded typo in the fmul interpreter patch 2014-08-26 23:00:11 -07:00
shuffle2 061f2058c2 Merge pull request #887 from FioraAeterna/fmulinterp
Bugfixes for fmul rounding
2014-08-26 21:50:43 -07:00
Fiora 1a0a33518b Bugfixes for fmul rounding
Fix the places I forgot to add Force25Bit, and fix an incredibly silly typo bug
2014-08-26 21:37:45 -07:00
Fiora 7dbc623dc0 JIT: Initial FPRF support
Doesn't support all the FPSCR flags, just the FPRF ones.
Add PPCAnalyzer support to remove unnecessary FPRF calculations.

POV-ray benchmark with enableFPRF forced on for an extreme comparison:
Before: 1500s
After, fmul/fmadd only: 728s
After, all float: 753s

In real games that use FPRF, like F-Zero GX, FPRF previously cost a few percent
of total runtime.

Since FPRF is so much faster now, if enableFPRF is set, just do it for every
float instruction, not just fmul/fmadd like before. I don't know if this will
fix any games, but there's little good reason not to.
2014-08-26 10:57:03 -07:00
comex e31d6feaa2 Unify three types of non-FIFO requests to the GPU thread around Common::Event and Common::Flag.
The only possible functionality change is that s_efbAccessRequested and
s_swapRequested are no longer reset at init and shutdown of the OGL
backend (only; this is the only interaction any files other than
MainBase.cpp have with them).  I am fairly certain this was entirely
vestigial.

Possible performance implications: efbAccessReady now uses an Event
rather than spinning, which might be slightly slower, but considering
the slow loop the flags are being checked in from the GPU thread, I
doubt it's noticeable.

Also, this uses sequentially consistent rather than release/acquire
memory order, which might be slightly slower, especially on ARM...
something to improve in Event/Flag, really.
2014-08-26 12:43:39 -04:00
comex de7294ecc1 Add Flag support to ChunkFile.h 2014-08-26 12:43:39 -04:00
comex 45a4236283 A tiny restructuring to allow inlining of FifoCommandRunnable. Probably useless. 2014-08-26 12:43:39 -04:00
comex 14125cf951 Refactor SetCpStatus into two functions for from-GPU and from-CPU mode rather than a boolean parameter.
This shouldn't affect functionality.  I'm not sure if the breakpoint
distinction is actually necessary (my commit messages from the old
dc-netplay last year claim that breakpoints are broken anyway, but I
don't remember why), but I don't actually need to change this part of
the code (yet), so I'll stick with the trimmings change for now.
2014-08-26 12:43:39 -04:00
Dolphin Bot f52888d3ec Merge pull request #884 from FioraAeterna/ppcfpopt
PPCFP: add comment
2014-08-26 18:28:26 +02:00
Fiora 288babf414 PPCFP: add comment 2014-08-26 09:08:22 -07:00
Fiora 90324f3809 JIT64: add mulhwx implementation 2014-08-26 01:09:04 -07:00
Lioncash 5082afa670 DiscIO: Move VolumeDirectory off of raw pointers 2014-08-26 00:23:16 -04:00
Fiora aaca1b01e5 JIT64: clean up and unify float load/store code
While we're at it, support a bunch of float load/store variants that weren't
implemented in the JIT. Might not have a big speed impact on typical games but
they're used at least a bit in povray and luabench.

694 -> 644 seconds on povray.
2014-08-25 19:51:40 -07:00
Lioncash f18fec81fe DiscIO: Make the unordered set in IsSoundFile static
Doesn't need to be instantiated every time the function is called.
2014-08-25 19:56:09 -04:00
Tillmann Karras 07c7e6f35e CommandProcessor: mark some functions as static 2014-08-25 21:09:42 +02:00
Lioncash 44ee2f20b9 Merge pull request #874 from FioraAeterna/fixidiocy
JIT: fix incredibly silly mistake in fmul rounding patch
2014-08-25 13:22:33 -04:00
Lioncash 8a77fe0539 Merge pull request #865 from lioncash/debugger-stuff
DolphinWX: Use wxGraphicsContext in the Code View for the debugger.
2014-08-25 13:21:32 -04:00
Fiora f04e362721 JIT: fix incredibly silly mistake in fmul rounding patch 2014-08-25 10:10:28 -07:00
Pierre Bourdon bf93920c05 Revert "Catch broken configurations inside of the Post Processing shaders." 2014-08-25 14:33:41 +02:00
Dolphin Bot 2f2f992bc7 Merge pull request #828 from Sonicadvance1/pp-shader-catch-broken-config
Catch broken configurations inside of the Post Processing shaders.
2014-08-25 09:17:30 +02:00
comex 6574682ff5 Remove unused variable m_zero. 2014-08-24 16:22:19 -04:00
comex d128795594 Merge pull request #862 from comex/registersinuse
Reduce my idiocy in register saving code.
2014-08-24 16:16:32 -04:00
comex a7752f49be Merge pull request #861 from comex/warnings
Fix warnings for OS X
2014-08-24 16:15:58 -04:00
comex 80da767576 Improve wording of a particularly atrocious message.
(Now without gettextize.)
2014-08-24 16:00:58 -04:00
comex cf01f47b52 Fix bloody printf specifiers.
In particular, even in code that only runs on x86-64, you can't use
PRIx64 for size_t because, on OS X, one is unsigned long and the other
is unsigned long long and clang whines about the difference.  I guess
you could make a size_t specifier macro, but those are horribly ugly, so
I just used casting.

Anyone want to make a nice (and slow) template-based printf?

Now without bare 'unsigned'.
2014-08-24 15:56:41 -04:00
Lioncash f239ea3853 DolphinWX: Parenthesize some expressions in CodeView.cpp 2014-08-24 15:40:19 -04:00
Lioncash c3e41809d9 DolphinWX: Move the CodeView debugger view over to wxGraphicsContext
This is a more advanced drawing 'backend' over the previous one and allows us to control things like transparency and anti-aliasing, etc.
2014-08-24 15:40:05 -04:00
Pierre Bourdon 9ff7125786 Merge pull request #810 from lioncash/controller-interface
InputCommon: Don't base default radius of analog sticks off of their name
2014-08-24 19:58:25 +02:00
Pierre Bourdon ebf1b98106 Merge pull request #834 from FioraAeterna/fixfmulrounding
JIT64: Fix fmul rounding issues
2014-08-24 19:49:56 +02:00
Fiora 4d7b1275c9 Interpreter: apply the same odd rounding to single multiplies as the JIT 2014-08-24 10:28:52 -07:00