Commit Graph

2315 Commits

Author SHA1 Message Date
comex 5c2a470b97 Fix 'sizeof' which broke in my reference-to-pointer conversion. 2014-10-25 15:02:12 -04:00
skidau bc26cb1b19 Merge pull request #1322 from degasus/ogl-pp
OGL: force enable postprocessing
2014-10-25 13:48:27 +11:00
booto 6afdff6023 VideoCommon: loop bug in ShaderGenCommon.h debug 2014-10-25 01:52:31 +08:00
skidau 716fe06289 Merge pull request #1349 from comex/good-job-dereferencing-null-on-purpose
Fix some warnings from Clang trunk in an overly aggressive manner
2014-10-24 13:03:09 +11:00
Sylvain Gadrat 3a12c50dc1 Fix timing of AVI files dumped on Linux
The timing information is set on s_scaled_frame->pts, giving precise
timing information to the encoder. Frames arriving too early (less than
one tick after the previous frame) are droped. The setting of packet's
timestamps and flags is done after the call to avcodec_encode_video2()
as this function resets these fields according to its documentation.
2014-10-23 23:34:38 +02:00
comex 1f5b1001ce Merge pull request #1342 from phire/lessGetPointer
Eliminate getPointers which are memcpyed or memset.
2014-10-23 14:42:37 -04:00
Scott Mansell 23832987b5 Revert changes preloading of RGBA8 tiles.
This path should probally be optimised, but it's out of the
scope of this PR.
2014-10-23 18:15:29 +13:00
degasus 7292ea6a04 OGL: force enable postprocessing 2014-10-23 00:21:52 +02:00
Yuriy O'Donnell db497cc55f Added projection matrix epsilon that fixes depth clipping issues in some games 2014-10-23 00:20:47 +02:00
comex 6e774f1b64 Add missing includes where headers depend on other headers having been included first.
This is good hygiene, and also happens to be required to build Dolphin
using Clang modules.

(Under this setup, each header file becomes a module, and each #include
is automatically translated to a module import.  Recursive includes
still leak through (by default), but modules are compiled independently,
and can't depend on defines or types having previously been set up.  The
main reason to retrofit it onto Dolphin is compilation performance - no
more textual includes whatsoever, rather than putting a few blessed
common headers into a PCH.  Unfortunately, I found multiple Clang bugs
while trying to build Dolphin this way, so it's not ready yet, but I can
start with this prerequisite.)
2014-10-21 21:22:16 -04:00
comex 8492d04dfa Use pointers instead of references in GetUidData to avoid the undefined behavior of *(T *)nullptr (ewwww) 2014-10-21 21:20:05 -04:00
Scott Mansell 3aa979d7d7 Remove another 3 getPointers.
Thanks neobrain for spotting these.
2014-10-21 12:18:54 +13:00
skidau f65bb10c93 Merge pull request #1308 from kayru/shader_generator_write_opt
Workaround for MSVC not optimizing away Write() in GeneratePixelShader
2014-10-20 17:15:53 +11:00
skidau 81efd0e87f Merge pull request #1315 from RisingFog/movie-menu-input-display
Moved Input Display to Movie Menu
2014-10-20 14:34:02 +11:00
Ryan Houdek b3cee80faa Merge pull request #1321 from Sonicadvance1/Qualcomm-v95
Change driver details to reflect Qualcomm's changes with their v95 driver
2014-10-19 08:48:42 -05:00
Ryan Houdek 9108a11af4 Change driver details to reflect Qualcomm's changes with their v95 driver.
They fixed their issues with dynamic UBO array member access.
There are many other issues though.
2014-10-18 02:50:57 -05:00
skidau a5674bbe84 Merge pull request #475 from kayru/d3d_state_cache
D3D: Implemented cache for dynamic render states
2014-10-18 13:11:39 +11:00
Fog 467ab1a629 Moved Input Display to Movie Menu 2014-10-17 21:08:34 -04:00
Yuriy O'Donnell d23da7dbef Added __forceinline to AlphaTest::TestResult() to make MSVC inline it 2014-10-17 21:50:41 +02:00
Yuriy O'Donnell 5fdda135d2 Workaround for MSVC not optimizing away Write() in GeneratePixelShader
ShaderConstantProfile and ShaderUid now have an empty implementation
of Write() that uses variadic templates instead of varargs. MSVC is now
able to inline and optimize away this when necessary.
2014-10-17 21:37:42 +02:00
comex 4134a0ad54 Make some variables static (should probably adjust for coding style too, but I'm not the one who merged code with bad style...) 2014-10-16 17:03:37 -04:00
comex 1ff86a4716 Merge pull request #1295 from crudelios/remove-bbox-settings
Remove setting to enable or disable Bounding Box calculation.
2014-10-16 17:00:37 -04:00
degasus 8f403696ea DriverDetails: mark intel buffer_storage bug as fixed 2014-10-16 22:51:32 +02:00
skidau 3023abc1b5 Merge pull request #1285 from degasus/master
PixelShaderGen: replace multiplication with shift
2014-10-16 14:04:25 +11:00
Yuriy O'Donnell 2e4667caaa D3D: Moved render state cache into separate source files.
Refactored  StateCache::Get() to early out for narrower indentation.
Added comments to clarify ownership of objects returned by  StateCache::Get().
2014-10-15 20:22:39 +02:00
crudelios d281b4d7e1 Remove setting to enable or disable Bounding Box calculation. 2014-10-15 19:02:54 +01:00
skidau 8ef21bc5e2 Merge pull request #1272 from RisingFog/sconfig-dump-frames
Move bDumpFrames to SConfig (and it's references)
2014-10-15 13:42:37 +11:00
Markus Wick 1227bd2ba6 PixelShaderGen: replace multiplication with shift
iirc both nvidia and i965 doesn't optimize this
2014-10-14 12:34:37 +02:00
skidau 9551650c42 Merge pull request #1095 from crudelios/sw-bbox
Reimplement Bounding Box calculation using the software renderer.
2014-10-13 15:57:11 +11:00
Fog 8d424b114a Move bDumpFrames to SConfig (and it's references) 2014-10-12 23:56:16 -04:00
Fog cd0c784d5a Changed Dump Frames References 2014-10-12 19:51:13 -04:00
skidau 18c81dbc33 Merge pull request #1261 from lioncash/mesa-resonance-cascade
AVIDump: Add missing CoreTiming header
2014-10-12 14:26:23 +11:00
Lioncash e283839b68 AVIDump: Add missing CoreTiming header
Fixes build on the mesa buildbot
2014-10-11 23:12:19 -04:00
skidau a00ad6871c Merge pull request #1220 from RisingFog/avsync
Proper Audio/Video Dumping
2014-10-12 14:04:45 +11:00
crudelios 1e3b9ecdc1 Fix compile errors after rebase. 2014-10-10 12:44:44 +01:00
crudelios 176ea06e82 Get buildbot to compile. 2014-10-10 12:28:15 +01:00
crudelios 47c67f014f Fix linux build and various warnings.
Increase savestate version.
2014-10-10 12:28:13 +01:00
crudelios 2d4b7e3f3f Reimplement Bounding Box calculation using the software renderer. 2014-10-10 12:27:06 +01:00
Fog fc4125cdd1 Proper Audio/Video Dumping 2014-10-09 00:06:04 -04:00
Jules Blok 39f421d45d Support the borderless fullscreen option in all backends. 2014-10-07 16:48:43 +02:00
Jules Blok 7344f752b7 Replace BorderlessFullscreenEnabled by ExclusiveFullscreenEnabled.
Special handling was associated with this function, which only applies to exclusive fullscreen.
2014-10-07 16:43:32 +02:00
Lioncash 16a74a9557 Fifo: Fix tab/space mismatches 2014-10-06 20:04:57 -04:00
Lioncash 7c05d029d3 Merge pull request #1085 from waddlesplash/refactoring
Migrate global init stuff into UICommon.
2014-10-05 21:25:44 -04:00
Augustin Cavalier 19109e2d01 Migrate global init stuff into UICommon.
This avoids code duplication in a bunch of places .
I also moved the NVIDIA Optimus export into VideoCommon.
2014-10-05 20:47:37 -04:00
comex 7f6284c2fc Change a bunch of reference function arguments to pointers.
Per the coding style and sanity.
2014-10-02 03:00:33 -04:00
Rohit Nirmal ce8a4f5cc5 VideoCommon: Silence -Wmaybe-uninitialized warnings. 2014-09-30 16:14:18 -04:00
Tony Wasserka 13fc8e7df1 Merge pull request #578 from RachelBryk/IR
Cleanup Renderer::CalculateTargetSize(), and allow IRs higher than 4x to be set via INI.
2014-09-30 19:21:21 +02:00
comex 2eebdff01b Remove useless STACKALIGN macro.
It only ever did anything on 32-bit OS X.

Anyway, it wasn't even on the right functions, and these days
ABI_PushRegistersAndAdjustStack should handle maintaining the ABI
correctly.
2014-09-30 01:42:47 -04:00
comex 87a95727cd ReadDataFromFifo is always called with len = 32. Remove the parameter to enable optimizations.
And rename some variables around it to be less confusing.
2014-09-29 22:07:16 -04:00
skidau 007ba13cfa Merge pull request #1144 from skidau/fifo-linked
Moved the linking of the FIFO CPWritePointer near where CPWritePointer gets updated
2014-09-29 13:52:33 +10:00
comex 6c0a68d507 Add the override config option.
I hate the config code, but now is not the time to fix it...
2014-09-28 21:34:31 -04:00
comex 3a2048ea57 Add a central variable g_want_determinism which controls whether to try to make things deterministic.
It now affects the GPU determinism mode as well as some miscellaneous
things that were calling IsNetPlayRunning.  Probably incomplete.

Notably, this can change while paused, if the user starts recording a
movie.  The movie code appears to have been missing locking between
setting g_playMode and doing other things, which probably had a small
chance of causing crashes or even desynced movies; fix that with
PauseAndLock.

The next commit will add a hidden config variable to override GPU
determinism mode.
2014-09-28 21:34:31 -04:00
comex 65af90669b Add the 'desynced GPU thread' mode.
It's a relatively big commit (less big with -w), but it's hard to test
any of this separately...

The basic problem is that in netplay or movies, the state of the CPU
must be deterministic, including when the game receives notification
that the GPU has processed FIFO data.  Dual core mode notifies the game
whenever the GPU thread actually gets around to doing the work, so it
isn't deterministic.  Single core mode is because it notifies the game
'instantly' (after processing the data synchronously), but it's too slow
for many systems and games.

My old dc-netplay branch worked as follows: everything worked as normal
except the state of the CP registers was a lie, and the CPU thread only
delivered results when idle detection triggered (waiting for the GPU if
they weren't ready at that point).  Usually, a game is idle iff all the
work for the frame has been done, except for a small amount of work
depending on the GPU result, so neither the CPU or the GPU waiting on
the other affected performance much.  However, it's possible that the
game could be waiting for some earlier interrupt, and any of several
games which, for whatever reason, never went into a detectable idle
(even when I tried to improve the detection) would never receive results
at all.  (The current method should have better compatibility, but it
also has slightly higher overhead and breaks some other things, so I
want to reimplement this, hopefully with less impact on the code, in the
future.)

With this commit, the basic idea is that the CPU thread acts as if the
work has been done instantly, like single core mode, but actually hands
it off asynchronously to the GPU thread (after backing up some data that
the game might change in memory before it's actually done).  Since the
work isn't done, any feedback from the GPU to the CPU, such as real
XFB/EFB copies (virtual are OK), EFB pokes, performance queries, etc. is
broken; but most games work with these options disabled, and there is no
need to try to detect what the CPU thread is doing.

Technically: when the flag g_use_deterministic_gpu_thread (currently
stuck on) is on, the CPU thread calls RunGpu like in single core mode.
This function synchronously copies the data from the FIFO to the
internal video buffer and updates the CP registers, interrupts, etc.
However, instead of the regular ReadDataFromFifo followed by running the
opcode decoder, it runs ReadDataFromFifoOnCPU ->
OpcodeDecoder_Preprocess, which relatively quickly scans through the
FIFO data, detects SetFinish calls etc., which are immediately fired,
and saves certain associated data from memory (e.g. display lists) in
AuxBuffers (a parallel stream to the main FIFO, which is a bit slow at
the moment), before handing the data off to the GPU thread to actually
render.  That makes up the bulk of this commit.

In various circumstances, including the aforementioned EFB pokes and
performance queries as well as swap requests (i.e. the end of a frame -
we don't want the CPU potentially pumping out frames too quickly and the
GPU falling behind*), SyncGPU is called to wait for actual completion.

The overhead mainly comes from OpcodeDecoder_Preprocess (which is,
again, synchronous), as well as the actual copying.

Currently, display lists and such are escrowed from main memory even
though they usually won't change over the course of a frame, and
textures are not even though they might, resulting in a small chance of
graphical glitches.  When the texture locking (i.e. fault on write) code
lands, I can make this all correct and maybe a little faster.

* This suggests an alternate determinism method of just delaying results
until a short time before the end of each frame.  For all I know this
might mostly work - I haven't tried it - but if any significant work
hinges on the competion of render to texture etc., the frame will be
missed.
2014-09-28 21:34:29 -04:00
comex 2d4b7c5900 Make ReadDataFromFifo static. 2014-09-28 21:25:12 -04:00
comex 0ae9e398c8 Rejigger some FIFO buffer variables to be more rational.
videoBuffer -> s_video_buffer
size -> s_video_buffer_write_ptr
g_pVideoData -> g_video_buffer_read_ptr (impl moved to Fifo.cpp)

This eradicates the wonderful use of 'size' as a global name, and makes
it clear that s_video_buffer_write_ptr and g_video_buffer_read_ptr are
the two ends of the FIFO buffer s_video_buffer.

Oh, and remove a useless namespace {}.
2014-09-28 21:25:12 -04:00
comex e86ddacb18 Changes to allow LoadCPReg to work in a preprocess mode which affects a separate state.
This state will be used to calculate sizes for skipping over commands on
a separate thread.  An alternative to having these state variables would
be to have the preprocessor stash "state as we go" somewhere, but I
think that would be much uglier.

GetVertexSize now takes an extra argument to determine which state to
use, as does FifoCommandRunnable, which calls it.  While I'm modifying
FifoCommandRunnable, I also change it to take a buffer and size as
parameters rather than using g_pVideoData, which will also be necessary
later.  I also get rid of an unused overload.
2014-09-28 21:25:06 -04:00
comex f0131c2e09 Mechanical changes to move most CP state to a struct rather than separate globals.
The next commit will add a separate copy of the struct and the ability
for LoadCPReg to work on it.
2014-09-28 21:23:29 -04:00
comex 90638c6806 Switch to an unordered_map as a micro-optimization. 2014-09-28 21:23:29 -04:00
comex f8452ff501 Fix threading issue with vertex loader JIT.
VertexLoader::VertexLoader was setting loop_counter, a *static*
variable, to 0.  This was nonsensical, but harmless until I started to
run it on a separate thread, where it had a chance of interfering with a
running vertex translator.

Switch to just using a register for the loop counter.
2014-09-28 21:23:28 -04:00
comex 63c62b277d Some changes to VertexLoaderManager:
- Lazily create the native vertex format (which involves GL calls) from
RunVertices rather than RefreshLoader itself, freeing the latter to be
run from the CPU thread (hopefully).

- In order to avoid useless allocations while doing so, store the native
format inside the VertexLoader rather than using a cache entry.

- Wrap the s_vertex_loader_map in a lock, for similar reasons.
2014-09-28 21:23:28 -04:00
skidau 275226c2b6 Merge pull request #1147 from RachelBryk/unicode-tex
Allow custom textures to load from unicode paths.
2014-09-28 14:54:56 +10:00
Rachel Bryk 4fe1119e52 Cleanup Renderer::CalculateTargetSize(), and allow IRs higher than 4x to be set via ini. 2014-09-25 19:50:25 -04:00
skidau 539f270c67 Added a xf.numtexgen != bp.numtextgen error log if there is a mismatch detected. 2014-09-24 10:46:09 +10:00
skidau b4399dbdf3 Fixed the "Undeclared identifier: uv0" OpenGL shader compile error that appears in NBA2K11. 2014-09-24 00:10:45 +10:00
Rachel Bryk 4ed9b561bd Allow custom textures to load from unicode paths. 2014-09-22 12:51:30 -04:00
skidau 8c5e12cf02 Moved the linking of the FIFO CPWritePointer near where CPWritePointer gets updated. The CPWritePointer was getting updated while it was in-flight causing Pac-man Party to flicker. Fixes issue 5223. 2014-09-22 16:49:09 +10:00
Tony Wasserka 1d23c2ca8b GPU: Only load the relevant color components upon writes to the tev color registers.
The other two components need not be valid upon write, hence loading them results in glitches.

Fixes issue 6783.
2014-09-21 10:38:22 +02:00
skidau 536582b2eb Merge pull request #1129 from lioncash/casing
VideoCommon: Fix function casing in FrameBufferManagerBase
2014-09-21 15:56:17 +10:00
Lioncash a6ffa55215 VideoCommon: Fix function casing in FrameBufferManagerBase 2014-09-20 14:54:59 -04:00
Lioncash 91438fa9e7 VideoCommon: Make zfreeze in GenMode 1 bit in size 2014-09-20 14:30:41 -04:00
Ryan Houdek eb23882398 Merge pull request #1120 from rohit-n/muh-precompiled-headers
Fix build failing when disabling precompiled headers.
2014-09-19 17:43:42 -05:00
Rohit Nirmal 46057db37d Fix build failing when disabling precompiled headers. 2014-09-19 18:17:51 -04:00
magumagu 32e5043b29 WIP XFB scaling.
Still an ugly mess.
2014-09-19 12:33:15 -05:00
Lioncash b06ec302d1 Remove some unnecessary semicolons 2014-09-11 13:05:31 -04:00
Ryan Houdek 71cb09f1ca Merge pull request #1027 from rohit-n/change-include
Include CommonTypes.h instead of Common.h.
2014-09-10 00:35:16 -05:00
skidau d1439bc1db Merge pull request #1041 from RachelBryk/kill-g_CoreStartupParameter
Kill Core::g_CoreStartupParameter.
2014-09-10 11:00:42 +10:00
Rachel Bryk f93aa7087c Kill Core::g_CoreStartupParameter. 2014-09-09 00:24:49 -04:00
Fiora 94c20db369 Rename Log2 and add IsPow2 to MathUtils for future use
Also remove unused pow2/pow2f functions.
2014-09-08 20:15:45 -07:00
Rohit Nirmal fbc64984ca Include CommonTypes.h instead of Common.h. 2014-09-08 15:39:58 -04:00
comex c5c0b36046 Remove the inaccurately named ABI_PushAllCalleeSavedRegsAndAdjustStack (it didn't preserve FPRs!) and replace with ABI_PushRegistersAndAdjustStack.
To avoid FPRs being pushed unnecessarily, I checked the uses: DSPEmitter
doesn't use FPRs, and VertexLoader doesn't use anything but RAX, so I
specified the register list accordingly.  The regular JIT, however, does
use FPRs, and as far as I can tell, it was incorrect not to save them in
the outer routine.  Since the dispatcher loop is only exited when
pausing or stopping, this should have no noticeable performance impact.
2014-09-08 01:00:10 -04:00
comex 2dafbfb3ef Improve code and clarify parameters to ABI_Push/PopRegistersAndAdjustStack.
- Factor common work into a helper function.
- Replace confusingly named "noProlog" with "rsp_alignment".  Now that
x86 is not supported, we can just specify it explicitly as 8 for
clarity.
- Add the option to include more frame size, which I'll need later.
- Revert a change by magumagu in March which replaced MOVAPD with MOVUPD
on account of 32-bit Windows, since it's no longer supported.  True,
apparently recent processors don't execute the former any faster if the
pointer is, in fact, aligned, but there's no point using MOVUPD for
something that's guaranteed to be aligned...

(I discovered that GenFrsqrte and GenFres were incorrectly passing false
to noProlog - they were, in fact, functions without prologs, the
original meaning of the parameter - which caused the previous change to
break.  This is now fixed.)
2014-09-08 00:58:56 -04:00
shuffle2 227b79bf84 Merge pull request #1004 from comex/warning-fixes-2
Two trivial warning fixes
2014-09-06 11:56:32 -07:00
comex 917c6d324a Remove unused functions in TextureDecoder. 2014-09-06 13:32:54 -04:00
Rohit Nirmal 629ceaf2b1 Split some parts of UpdateBoundingBox into multiple lines. Also,
fix issues causing failure on Lint.
2014-09-06 09:49:27 -05:00
Rohit Nirmal debe3999b5 Remove more dead and redundant code. 2014-09-05 23:22:48 -05:00
shuffle2 0576046fdd Merge pull request #972 from Sonicadvance1/fix-intel-windows
Work around Intel's failings with buffer_storage
2014-09-05 11:06:49 -07:00
Fiora 07e0c917c6 Revert "JIT64: optimize CA calculations" 2014-09-05 10:26:30 -07:00
comex 97420c6ec6 Merge pull request #852 from FioraAeterna/optimizeca
JIT64: optimize CA calculations
2014-09-05 11:52:02 -04:00
shuffle2 a9a6270982 Merge pull request #774 from magcius/texdecode-cleanup
Clean up the TextureDecoder and some related things
2014-09-04 19:37:49 -07:00
Jasper St. Pierre 76b4dbdf28 TextureDecoder: Clean up the code style
For a long time, we've had ugly and inconsistent function names here as
helpers, names like "decodebytesRGB5A3rgba" which are absolutely
incomprehensible to understand. Fix this by introducing a new consistent
naming scheme, where the above function now becomes "DecodeBytes_RGB5A3".
2014-09-04 18:36:57 -07:00
Jasper St. Pierre 0b7bed4a52 TextureDecoder: Simplify how the reference texture decoder works
Instead of having three separate functions and checking the tlutfmt in a
variety of places, just do it once in a helper method. This is already
for the slow path either in our Generic decoder or in our Software
renderer, so it doesn't matter that this is slower.

x64 will continue using the separate functions for speed.
2014-09-04 18:36:57 -07:00
Jasper St. Pierre ea1245d191 TextureDecoder: Pass the TLUT address straight into the texture decoder
This removes the requirement for the TextureDecoder to have access to
global texture memory.
2014-09-04 18:36:57 -07:00
Jasper St. Pierre fcd4ecc942 TextureDecoder: Add an enum for the TLUT formats
Quick code cleanup. The enum names and values come from libogc.
2014-09-04 18:36:56 -07:00
Jasper St. Pierre 32da01edec TextureDecoder: Rearrange header slightly
Put the two Decode APIs together.
2014-09-04 18:36:56 -07:00
Jasper St. Pierre f975307016 TextureDecoder: Add some statics to some of our helper functions
I know these are already inline, but this makes it more clear that
they're helper functions to be used in this file only.
2014-09-04 18:36:56 -07:00
Jasper St. Pierre a8e591dc73 VideoCommon: Remove support for decoding to ARGB textures
The D3D / OGL backends only ever used RGBA textures, and the Software
backend uses its own custom code for sampling. The ARGB path seems to
just be dead code.

Since ARGB and RGBA formats are similar, I don't think this will make
the code more difficult to read or unable to be used as
reference. Somebody who wants to use this code to output ARGB can simply
modify the MakeRGBA function to put the shift at the other end.
2014-09-04 18:36:56 -07:00
Jasper St. Pierre 9438a30384 VideoCommon: Start putting common texture decoding code in TextureDecoder_Common
This pulls all the duplicate code from TextureDecoder_Generic /
TextureDecoder_x64 out and puts it in a common file. Out custom font
used for debugging the texture cache is also pulled out and put in a
common "sfont.inc" file. At some point we should also combine this font
with the other six binary fonts we ship.
2014-09-04 18:36:53 -07:00
skidau 73fc45db68 Merge pull request #967 from skidau/SyncGPU-SaveState
Added the EmuRunningState check to the GPU thread's FIFO loop
2014-09-05 11:24:46 +10:00
Jasper St. Pierre bfb2c04ace TextureDecoder: Remove unused function
GetPC_TexFormat was never used. It was added in commit d02426a, with the
only user being commented out code. The commented out code was later
removed in 9893122, but the implementation stayed.
2014-09-04 17:32:06 -07:00
Jasper St. Pierre 6682a2fadd TextureDecoder: Fix a RGBA/BGRA copy/paste typo
We were decoding to BGRA32 textures in our RGBA32 texture decoder. Since
this is the same for the BGRA32 decoder implementation, this is most
likely a copy/paste typo, rather than the texture actually being
bit-swapped. Fix this.

I'm not sure of any games that use the C14X2 texture format, so I'm not
sure this fixes any games, but it does make the code cleaner for when we
clean it up in the future, and merge some of these similar loops.
2014-09-04 17:30:53 -07:00
Jasper St. Pierre a5297f6da8 PixelEngine: Remove unused AllowIdleSkipping and all references to it 2014-09-04 17:25:59 -07:00
Sonicadvance1 e32b2e1771 Work around Intel's failings with with buffer_storage 2014-09-04 19:03:49 -05:00
Yuriy O'Donnell d8d9bc8c6c Render: Implemented simple render target pool
This avoids creating and destroying render targets every frame,
which is a significant CPU overhead.

Old render targets are destroyed after 3 frames.
2014-09-04 22:21:06 +02:00
Dolphin Bot 830a03c540 Merge pull request #957 from degasus/frame_skipping
VideoCommon: rewrite frame skipping code
2014-09-04 18:27:19 +02:00
degasus 8b84ddce9a VideoCommon: rewrite frame skipping code 2014-09-04 18:07:39 +02:00
degasus ef6f6a7fa9 VideoCommon: remove XFReg copy optimization
This code is just ugly and I doubt there is a way that copying twice is faster.
2014-09-04 17:56:17 +02:00
skidau 86db0bf8c3 Added the EmuRunningState check to the GPU thread's FIFO loop so that the GPU thread services any waiting save states. This is needed for games that have the Sync GPU option enabled. 2014-09-04 22:02:21 +10:00
comex dd5be7c0dc Merge pull request #924 from comex/fifo-command-runnable
Refactor opcode decoding a bit to kill FifoCommandRunnable.
2014-09-02 23:27:30 -04:00
Shawn Hoffman 266992684d msvc: remove some remnants of SDL and DSound from projects and general cleanup. 2014-09-01 21:27:44 -07:00
Fiora b51aa4fa89 Rename Log2 and add IsPow2 to MathUtils for future use
Also remove unused pow2/pow2f functions.
2014-09-01 20:41:07 -07:00
comex 608f9bcd67 Refactor opcode decoding a bit to kill FifoCommandRunnable.
Separated out from my gpu-determinism branch by request.  It's not a big
commit; I just like to write long commit messages.

The main reason to kill it is hopefully a slight performance improvement
from avoiding the double switch (especially in single core mode);
however, this also improves cycle calculation, as described below.

- FifoCommandRunnable is removed; in its stead, Decode returns the
number of cycles (which only matters for "sync" GPU mode), or 0 if there
was not enough data, and is also responsible for unknown opcode alerts.

Decode and DecodeSemiNop are almost identical, so the latter is replaced
with a skipped_frame parameter to Decode.  Doesn't mean we can't improve
skipped_frame mode to do less work; if, at such a point, branching on it
has too much overhead (it certainly won't now), it can always be changed
to a template parameter.

- FifoCommandRunnable used a fixed, large cycle count for display lists,
regardless of the contents.  Presumably the actual hardware's processing
time is mostly the processing time of whatever commands are in the list,
and with this change InterpretDisplayList can just return the list's
cycle count to be added to the total.  (Since the calculation for this
is part of Decode, it didn't seem easy to split this change up.)

To facilitate this, Decode also gains an explicit 'end' parameter in
lieu of FifoCommandRunnable's call to GetVideoBufferEndPtr, which can
point to there or to the end of a display list (or elsewhere in
gpu-determinism, but that's another story).  Also, as a small
optimization, InterpretDisplayList now calls OpcodeDecoder_Run rather
than having its own Decode loop, to allow Decode to be inlined (haven't
checked whether this actually happens though).

skipped_frame mode still does not traverse display lists and uses the
old fake value of 45 cycles.  degasus has suggested that this hack is
not essential for performance and can be removed, but I want to separate
any potential performance impact of that from this commit.
2014-09-01 14:35:23 -04:00
Pierre Bourdon 494a60e41b VertexLoader: Change VtxDesc to use u64 instead of u32
This is required to make packing consistent between compilers: with u32, MSVC
would not allocate a bitfield that spans two u32s (it would leave a "hole").
2014-09-01 11:18:02 +02:00
Lioncash 4af8d9d248 VideoCommon: Clean up brace placements 2014-08-30 18:06:45 -04:00
Lioncash 1d706b2311 Get rid of C-style empty function parameter indicators 2014-08-30 15:23:48 -04:00
comex a4a533e39f Re-enable the vertex loader JIT on OS X.
Why was it ever disabled?
2014-08-27 23:50:59 -04:00
comex e31d6feaa2 Unify three types of non-FIFO requests to the GPU thread around Common::Event and Common::Flag.
The only possible functionality change is that s_efbAccessRequested and
s_swapRequested are no longer reset at init and shutdown of the OGL
backend (only; this is the only interaction any files other than
MainBase.cpp have with them).  I am fairly certain this was entirely
vestigial.

Possible performance implications: efbAccessReady now uses an Event
rather than spinning, which might be slightly slower, but considering
the slow loop the flags are being checked in from the GPU thread, I
doubt it's noticeable.

Also, this uses sequentially consistent rather than release/acquire
memory order, which might be slightly slower, especially on ARM...
something to improve in Event/Flag, really.
2014-08-26 12:43:39 -04:00
comex 45a4236283 A tiny restructuring to allow inlining of FifoCommandRunnable. Probably useless. 2014-08-26 12:43:39 -04:00
comex 14125cf951 Refactor SetCpStatus into two functions for from-GPU and from-CPU mode rather than a boolean parameter.
This shouldn't affect functionality.  I'm not sure if the breakpoint
distinction is actually necessary (my commit messages from the old
dc-netplay last year claim that breakpoints are broken anyway, but I
don't remember why), but I don't actually need to change this part of
the code (yet), so I'll stick with the trimmings change for now.
2014-08-26 12:43:39 -04:00
Tillmann Karras 07c7e6f35e CommandProcessor: mark some functions as static 2014-08-25 21:09:42 +02:00
Pierre Bourdon bf93920c05 Revert "Catch broken configurations inside of the Post Processing shaders." 2014-08-25 14:33:41 +02:00
Dolphin Bot 2f2f992bc7 Merge pull request #828 from Sonicadvance1/pp-shader-catch-broken-config
Catch broken configurations inside of the Post Processing shaders.
2014-08-25 09:17:30 +02:00
Shawn Hoffman 327d35377d windows: remove now-extraneous NOMINMAX and WIN32_LEAN_AND_MEAN #defines from dolphin code.
Wrap dinput.h in a header defining DIRECTINPUT_VERSION instead of repeating it multiple places.
2014-08-23 10:48:48 -07:00
Lioncash f17dcd2019 Merge pull request #764 from magcius/new-nogui-2
Rewrite GLInterface
2014-08-21 14:14:54 -04:00
Shawn Hoffman 4bf031c064 msvc: resolve all warnings in VideoCommon. 2014-08-19 22:33:46 -07:00
Jasper St. Pierre 63f1a16969 Core: Remove UpdateFPSDisplay
This is effectively unused, as the window handles that we pass to the
GLInterface are window handles for the frame which isn't ever a real
toplevel window. Host_UpdateTitle is what actually sets the proper title
on the render window.
2014-08-19 10:05:58 -04:00
Jasper St. Pierre 7ca8d8dfc7 Core: Don't pass through a reference to the window handle
Now that MainNoGUI is properly architected and GLX doesn't need to
sometimes craft its own windows sometimes which we have to thread back
into MainNoGUI, we don't need to thread the window handle that GLX
creates at all.

This removes the reference to pass back here, and the g_pWindowHandle
always be the same as the window returned by Host_GetRenderHandle().

A future cleanup could remove g_pWindowHandle entirely.
2014-08-19 10:05:58 -04:00
Ryan Houdek 7d90a00bbe Qualcomm fixed screen rotation in their latest v66 development drivers.
The framebuffer is no longer rotated the wrong way around in Qualcomm's latest development drivers.
They did something right, only took them over a year.
2014-08-18 00:17:09 -05:00
Ryan Houdek 2d624780c0 Catch broken configurations inside of the Post Processing shaders.
This catches most instances of configuration failures that can happen in a post processing shader.
Gives a user a helpful error message that lets them know what they have failed to set up correctly
2014-08-17 23:59:21 -05:00
shuffle2 2270c3e90a Merge pull request #797 from shuffle2/msvc-pch
Windows: Use a shared precompiled header for dolphin code under Source/
2014-08-16 14:58:28 -07:00
Charles Rozhon 6f34a8ac47 Removed warnings by assigning to bool 2014-08-16 14:16:10 -05:00
degasus a64b0bf499 VertexLoader: cache NativeVertexFormat
This fix a performance regression of PR #672.
2014-08-16 12:58:52 +02:00
Shawn Hoffman f1b82a34b2 Windows: Use a shared precompiled header for dolphin code under Source/ 2014-08-14 23:51:13 -07:00
Ryan Houdek b8a21b3744 Add the PostProcessing class object to RenderBase in VideoCommon.
Backends will initialize this variable with their own inherited PostProcessing class object.
2014-08-13 01:05:14 -05:00
Ryan Houdek 6bdc32c54a Add the VideoCommon PostProcessing class.
This class loads all the common PP shader configuration options and passes those options through to a inherited class that OpenGL or D3D will have.
Makes it so all the common code for PP shaders is in VideoCommon instead of duplicating the code across each backend.
2014-08-13 01:05:10 -05:00
Jasper St. Pierre c54fef5496 VideoBackendBase: Remove unused stub Initialize implementation
Both D3D and OGL have their own overrides, so this isn't used.
2014-08-06 21:35:52 -04:00
Pierre Bourdon 16f180524c VertexLoader: do not prepare for vertices if we need to skip them 2014-08-04 20:47:02 -07:00
Pierre Bourdon 15920d0f10 Merge pull request #394 from degasus/d3d_lighting_fix
VideoCommon: normalize light direction
2014-08-03 21:21:23 -07:00
Pierre Bourdon 4c42b38de1 Merge pull request #428 from Sonicadvance1/x86_32-removal
Remove x86_32 support from Dolphin.
2014-08-03 21:17:28 -07:00
Ryan Houdek d9b5482840 Remove x86_32 from VertexLoader. 2014-08-03 13:44:37 -05:00
Pierre Bourdon 6f715a1fbe VertexLoader: Remove more global state dependencies (this time IndexGenerator and VertexManager) 2014-08-02 09:34:39 -07:00
Pierre Bourdon 83838a645f Merge pull request #690 from Armada651/d3dfullscreen_fixes
Exclusive fullscreen fixes
2014-07-30 16:28:56 -07:00
Jules Blok 3b5625c76b VideoConfig: Ignore Borderless Fullscreen setting when the backend does not support exclusive fullscreen.
This was expected to be handled by VerifyValidity(), but that only verifies the validity of the INI files.
2014-07-30 12:15:58 +02:00
Jules Blok 4501aeefbe CFrame: Check borderless fullscreen setting before enabling exclusive fullscreen in the video config.
Fixes a bug where "Use Fullscreen" would initialize into exclusive fullscreen regardless of the borderless fullscreen setting.

Also relieves the need for the video renderer to check the borderless fullscreen setting each time.
2014-07-30 12:15:26 +02:00
Lioncash 522a5c35ad Convert some more header inclusions into forward declarations 2014-07-29 20:55:07 -04:00
Jules Blok ec402a0d5f FPSCounter: Initialize members. 2014-07-26 14:37:18 +02:00
Pierre Bourdon 8e865f3848 Merge pull request #506 from Armada651/d3dfullscreen
D3D: Add exclusive fullscreen support.
2014-07-26 13:22:11 +02:00
Jules Blok 6724ce6275 Cosmetic changes based on feedback on PR #506. 2014-07-26 13:04:39 +02:00
Jules Blok bd9953d97e Remove the 3D Vision hack.
The hack was needed because the Nvidia 3D Vision heuristics are documented to only support surfaces that are the same size as the backbuffer. This would be the case if you enabled the hack and selected the "Auto (Window Size)" internal resolution.

However, on recent drivers the same effect is achieved by selecting the "Auto (Multiple)" internal resolution. Therefore the hack is no longer required.
2014-07-26 12:45:10 +02:00
Pierre Bourdon 73f9a22e2e VertexLoader: Remove global state dependency on g_nativeVertexFmt 2014-07-26 01:35:09 +02:00
Pierre Bourdon 78c3a22060 VertexLoader: take the VAT object directly for RunVertices 2014-07-24 01:51:37 +02:00
Pierre Bourdon 069801a7d1 VertexLoader: Simplify SetVAT 2014-07-24 01:25:23 +02:00
Pierre Bourdon 20369743a4 VertexLoaderUID: remove global state dependency 2014-07-24 01:12:12 +02:00
Jules Blok 009b4dd376 Exit exclusive fullscreen when the stop confirmation is shown.
Also have the renderer remember its own fullscreen state. This is done to prevent a case where we exit exclusive fullscreen through the configuration and a focus shift at the same time. In this case the renderer would fail to detect that the fullscreen state was changed.
2014-07-21 20:50:48 +02:00
Jules Blok cd94ff1966 VideoConfig: Add "Borderless Fullscreen" option.
This option will disable exclusive fullscreen for users who prefer the old behaviour.
2014-07-20 22:02:57 +02:00
Jules Blok 77bc879384 D3D: Add exclusive fullscreen support. 2014-07-19 21:14:44 +02:00
Ryan Houdek bc9ef95643 Support Sampler binding in the shader.
In the cases where we support the binding layout keyword, use it for more than binding UBO location.
This changes it so it is supported for samplers as well.

Instances when this is enabled is if a device supports GL_ARB_shading_language_420pack, or if it supports GLES 3.10.
2014-07-18 17:04:03 -05:00
Dolphin Bot b9dc69105d Merge pull request #595 from Armada651/pref_log
FPSCounter: Flush the logs every second and close them when the renderer is shut down.
2014-07-18 12:59:04 +02:00
Jules Blok eaa7460636 FPSCounter: Remove redundant destructor. 2014-07-18 12:49:40 +02:00
Jules Blok 3b978f7c27 Turn the FPSCounter namespace into a class. 2014-07-16 20:40:40 +02:00
degasus 01fd96ab31 PixelShaderGen: fix indentation 2014-07-16 17:24:43 +02:00
Tillmann Karras 4063694d20 VideoCommon: fix ifdef expression 2014-07-15 04:15:49 +02:00
shuffle2 0c6eeaff05 Merge pull request #617 from Tilka/clang_bug
VideoCommon: fix clang version check
2014-07-14 01:55:31 -07:00
Tillmann Karras dbc30c6c76 VideoCommon: make version check easier to read 2014-07-14 03:05:56 +02:00
Tillmann Karras 0be03252cc VideoCommon: fix clang version check
That was... er... a typo!
2014-07-14 02:59:31 +02:00
shuffle2 3f67ec0d50 Merge pull request #611 from Tilka/clang_bug
VideoCommon: version-check clang for workaround
2014-07-13 17:52:54 -07:00
Tillmann Karras b6f3ae23bc VideoCommon: version-check clang for workaround
The bug was fixed in clang 3.4.
2014-07-14 02:12:48 +02:00
Pierre Bourdon 8876ee120a Change libav* autodetection to support framedumping on Ubuntu 14.04
Add an "ugly" workaround in the AVIDump code, but looking at other project this
seems to be the most common way to handle this API change.
2014-07-13 23:06:20 +02:00
Tillmann Karras 0ccee6c87b Fix warnings unearthed by #579 2014-07-13 02:16:51 +02:00
degasus 7e79806efc remove unused globals
Also change globals into statics which are only used in one file
2014-07-11 16:10:20 +02:00
degasus 81ed17be53 avoid the extern keyword in .cpp files 2014-07-11 16:10:20 +02:00
degasus 6d3f249dcc mark all local variables as static 2014-07-11 16:10:20 +02:00
degasus 22e1aa5bb4 mark all local functions as static 2014-07-11 16:07:23 +02:00
Jules Blok 6def4ead01 FPSCounter: Flush the logs every second and close them when the renderer is shut down. 2014-07-10 23:11:28 +02:00
Jules Blok 1754cbda9d Move FPSCounter calls to RenderBase. 2014-07-10 23:11:09 +02:00
shuffle2 15c1250d9d Merge pull request #596 from delroth/master
AVIDump: fix FFV1 encoding
2014-07-09 18:02:40 -07:00
Pierre Bourdon da697df6ee AVIDump: fix FFV1 encoding
ffmpeg 2.0 changed requirements for the FFV1 encoder and made them more strict,
requiring more fields of the input frame to be initialized. Explicitly setting
pixfmt, width and height solve the EINVAL issues with FFV1 encoding.

Original fix from http://ffmpeg.org/pipermail/libav-user/2013-October/005759.html
2014-07-10 02:53:12 +02:00
Jules Blok 09304cab57 FPSCounter: Change format string to match value. 2014-07-09 19:45:56 +02:00
Jules Blok 95b579746f Replace "Log FPS to file" by the "Log render time to file" feature. 2014-07-09 17:56:11 +02:00
Jules Blok 61d44cf73f FPSCounter: Use a Timer for the FPS update time. 2014-07-09 17:53:41 +02:00
Jules Blok efeadb7fe9 FPSCounter: Add "Log render time to file" feature.
Allows for a more accurate performance measurement.
2014-07-09 17:53:31 +02:00
Lioncash ec1e52de53 VideoCommon: Get rid of an snprintf call in VideoConfig.cpp 2014-07-06 15:33:08 -04:00
Dolphin Bot cc3dda5b22 Merge pull request #362 from Tilka/ffmpeg_libav_new
AVIDump: use new ffmpeg/libav API
2014-07-06 19:33:27 +02:00
Tony Wasserka a798548c30 Merge pull request #546 from workhorsy/header_guard_to_pragma_once
Changed lingering header include guards to pragma once.
2014-07-06 14:19:32 +02:00
Ryan Houdek 4483b64bcb Merge pull request #463 from degasus/vertex_format_cache
VideoCommon: Cache native vertex formats
2014-07-06 05:26:42 -05:00
Lioncash 48ff45b8a8 VideoCommon: Remove some unused constants from VertexShaderGen.h. 2014-07-05 23:46:07 -04:00
degasus bb2fc8ecbb VideoCommon: Cache native vertex formats
We are used to have a 1:1 mapping of GX vertex formats and the native (OGL + D3D) ones, but there are by far more GX ones.
This new cache maps them directly so that we don't flush on GX vertex format changes as long as the native one doesn't change.

The idea is stolen from galop1n.
2014-07-04 14:39:27 +02:00
degasus 02ac5e95c8 VideoCommon: normalize lighting direction.
It seems that the lighting direction must be normalized. This fixes lots of lighting issues mostly shown in the d3d backend.
2014-07-03 21:08:19 +02:00
Lioncash 00efaedb02 FPS counter cleanup
- Isolate it into it's own namespace
- Shorten function names, the namespace self-documents.
- Just use the std I/O, we can just write directly to the stream for
  logging.
2014-07-02 20:23:09 -04:00
Matthew Brennan Jones 124210c50f Changed lingering header include guards to pragma once.
Some headers where using #ifndef to guard being including multiple times. But most were using pragma once. So for consistency I changed them all to use pragma once.
2014-07-01 22:17:33 -07:00
Tillmann Karras 6b3e6e6ffb AVIDump: rename frame variables 2014-06-27 19:48:36 +02:00
Tillmann Karras c2c46d7573 AVIDump: update ffmpeg/libav API usage
libav 10 was released on May 10th, 2014 and it drops support for some
long-deprecated stuff like avcodec_encode_video().
2014-06-27 19:48:36 +02:00
Tillmann Karras e3fef8c990 AVIDump: cleanup 2014-06-27 19:48:35 +02:00
degasus 7db5a4b22d Statistics: Reformat stats string 2014-06-27 09:36:50 +02:00
degasus f1ddd3c66a VideoCommon: remove unused stats 2014-06-27 09:35:26 +02:00
Lioncash ca5340ebde Centralize the logging code into its own folder in Common. 2014-06-25 22:11:42 -04:00
Lioncash 8b13afbb8e Remove the 32-bit config platform from the VS solution and projects 2014-06-24 22:07:26 -04:00
Pierre Bourdon 5dff577339 Merge pull request #500 from lioncash/ini
Use only section-based ini reading.
2014-06-22 17:21:45 +02:00
degasus 924ad1ee9f LightingShader: hard code const variable 2014-06-19 16:46:53 +02:00
degasus e456a5e64f PixelShader: remove the duplicated ppl constants 2014-06-19 16:33:33 +02:00
degasus d93f2973f7 PixelShader: use the vertex const buffer for ppl 2014-06-19 16:33:33 +02:00