Commit Graph

2261 Commits

Author SHA1 Message Date
Jules Blok d583720a59 GeometryShaderGen: Support stereoscopy on GPUs without support for instancing. 2014-11-23 14:26:56 +01:00
Jules Blok 176191dc16 ShaderGenCommon: Move uniforms into a common static string. 2014-11-23 14:24:09 +01:00
Jules Blok fa32f751d3 ShaderGen: Handle ShaderCode objects directly.
ShaderGeneratorInterface does not have virtual function members, so we have to implement each type explicitly.
2014-11-23 14:24:09 +01:00
Jules Blok b236c363de ShaderGen: Add a stereoscopy flag in the UID data. 2014-11-23 14:23:42 +01:00
Jules Blok 5944d15021 TextureCache: Check the number of layers before reusing a texture. 2014-11-23 14:23:42 +01:00
Jules Blok 272ea90ca5 GeometryShaderGen: Allow stereoscopy to be disabled.
Will facilitate future use of this generator for other purposes.
2014-11-23 14:23:41 +01:00
Jules Blok d9e280e338 PixelShaderGen: Sample the correct texture layer. 2014-11-23 14:23:41 +01:00
Jules Blok f6ea293027 VertexShaderManager: Compute stereoscopy projection matrices. 2014-11-23 14:23:41 +01:00
Jules Blok c64486075d PostProcessing: Add layered stereoscopy support. 2014-11-23 14:23:41 +01:00
Jules Blok 2d8ec62beb Pass VS_OUTPUT structs between shaders. 2014-11-23 14:23:41 +01:00
Jules Blok b005f61a2e Add geometry shader generator for stereo 3D. 2014-11-23 14:22:55 +01:00
degasus 6670cacddc use GL_TEXTURE_2D_ARRAY for most of our textures 2014-11-23 14:22:22 +01:00
degasus 6f3e20ac42 OGL: disable bbox writes if not supported 2014-11-22 15:17:57 +01:00
Matthew Parlane 4ef0ab2731 Merge pull request #1534 from FioraAeterna/fixd3dtex1x1
D3D: fix issues with multi-level 1x1 textures on D3D
2014-11-21 19:12:58 +13:00
Matthew Parlane 21e4e035cc Merge pull request #1281 from Stevoisiak/RenameEuRGB60
Renamed EuRGB to PAL60
2014-11-21 19:09:42 +13:00
Fiora 3ddf82a318 Vertex Loader: SSE implementations of more position/texcoord/normal formats
~35-45% faster NFS:HP2, possibly other vertex-bound games.
2014-11-20 02:13:19 -08:00
comex fb50cb6d99 Merge pull request #1550 from degasus/bbox
OGL: implement bounding box support with ssbo
2014-11-19 20:25:23 -05:00
skidau ca3e5ce5e1 Added an exception check when the game is close to overflowing. Fixes the fifo overflow that occurs in Battalion Wars 2.
Changed the CPEnd loop check to an exact match.
2014-11-19 12:48:09 +11:00
skidau 3d448e49c6 Update CPStatus before processing the FIFO events and force an exception check on interrupts.
Added more information into the FIFO unknown opcode error message.
2014-11-19 12:48:08 +11:00
skidau b2c02e216c Separated out the CPU and GPU thread path to avoid clobbering.
Removed the Eternal Darkness check as it is no longer required.

Fixes issue 7835.
2014-11-19 12:48:08 +11:00
Stevoisiak e7a82c4ded Renamed EuRGB60 to PAL60 2014-11-18 16:51:21 -05:00
degasus c211450b99 OGL: implement bounding box support with ssbo
This implemention tries to be as accurate as the old SW implemention, but it will remove the dependcy of our vertexloader on videosw.
2014-11-17 21:20:32 +01:00
degasus 90613a1bda OpcodeDecoder: Skip recursiv display lists 2014-11-15 16:24:06 +01:00
Stevoisiak b25e1a2eb4 Various formatting and consistency fixes 2014-11-13 22:42:18 -05:00
Fiora 733795891c D3D: fix issues with multi-level 1x1 textures on D3D
Fixes NBA 2K11, maybe other things.
2014-11-12 21:43:48 -08:00
Yuriy O'Donnell dc08de028c Moved projection epsilon to a more reasonable place 2014-11-09 15:25:49 +01:00
Jasper St. Pierre 44b879dac2 Destroy OpenMP 2014-11-06 18:38:24 -08:00
Lioncash f6b4b4dbba Merge pull request #1497 from lioncash/host
Host: Kill off Host_SysMessage
2014-11-06 20:41:53 -05:00
Lioncash 884ec2ed13 Host: Kill off Host_SysMessage
Equivalent facilities already exist.
2014-11-05 02:30:48 -05:00
Tillmann Karras c34d99e40e Work around LLVM header peculiarity
Bug report: http://llvm.org/bugs/show_bug.cgi?id=21472
2014-11-04 02:29:33 +01:00
comex 08b61fdd9c Merge pull request #1465 from degasus/master
VideoCommon: Remove GetPointer in fifo code
2014-11-02 19:58:45 -05:00
Ryan Houdek 6e43562496 Merge pull request #1468 from Tilka/cleanup
Small cleanup
2014-11-02 11:02:35 -06:00
Tillmann Karras ff41dd479b Fix warnings about non-static variables 2014-11-02 04:51:44 +01:00
Lioncash 9ab924513e VideoCommon/VideoBackends: Remove unnecessary wxWidgets references.
EmuWindow doesn't even exist anymore. wxWidgets is also decoupled from the backends.
2014-11-01 19:19:00 -04:00
degasus cd9f0c34e4 VideoCommon: Remove GetPointer in fifo code 2014-11-01 12:24:43 +01:00
Ryan Houdek 3e82cb4628 Merge pull request #1440 from Sonicadvance1/attributeless-workaround
Implements PP shader system using attribute workaround.
2014-10-30 12:46:40 -06:00
Ryan Houdek 9da7e6ae79 Adds a DriverDetails bug to track Qualcomm attributeless rendering.
This particular issue was fixed in the v66 (07-08-2014) development drivers from Qualcomm.
To make sure we cover all drivers that may or may not have the issue fixed, make sure to mandate v95 minimum to work around the issue.
The next commit is the actual work around for post processing for this.
2014-10-29 19:58:18 -05:00
Ryan Houdek daabcfd6fc Removes Qualcomm's rotated framebuffer bug from DriverDetails.
Due to changes in how we render to the final framebuffer we no longer encounter this bug.
With the change to post processing being enabled at all times and no longer using glBlitFramebuffer, Qualcomm no longer has the chance to rotate our
framebuffer underneath of us.
2014-10-29 19:57:51 -05:00
comex 67452c53f1 Merge pull request #1386 from booto/small-loop-fix
VideoCommon: loop bug in ShaderGenCommon.h debug
2014-10-29 17:28:10 -04:00
Ryan Houdek bbaf8f9c0e Merge pull request #1434 from Sonicadvance1/fix-qualcomm
Fixes missing objects on Adreno hardware.
2014-10-29 10:38:23 -06:00
Ryan Houdek 6d4867e36a Fixes missing objects on Adreno hardware.
This particular bug from our friends over at Qualcomm manifests itself due to our alpha testing code having a conditional if statement in it.
This is a fairly recent breakage this time around, it was introduced in the v95 driver which comes with Android 5.0 on the Nexus 5.

So to break this issue down; In our alpha testing code we have two comparisons that happen and if they are true we will continue rendering, but if
they aren't true we do an early discard and return. This is summed up with a fairly simple if statement.

if (!(condition_1 <logic op> condition_2)) { /* discard and return */ }

This particular issue isn't actually due to the conditions within the if statement, but the negation of the result. This is the particular issue that
causes Qualcomm to fall flat on its face while doing so.

I've got two simple test cases that demonstrate this.
Non-working: http://hastebin.com/evugohixov.avrasm
Working: http://hastebin.com/afimesuwen.avrasm

As one can see, the disassembled output between the two shaders is different even though in reality it should have the same visual result.

I'm currently writing up a simple test program for Qualcomm to enjoy, since they will be asking for one when I tell them about the bug.
It will be tracked in our video driver failure spreadsheet along with the others.
2014-10-29 06:21:03 -05:00
comex 089e32ba7d Merge pull request #1307 from comex/bitset
Higher level bitset wrapper
2014-10-28 23:39:35 -04:00
Scott Mansell ba58cc47a3 Remove old (and now incorrect) error checking code.
We will now rely on Memory::CopyFromEmu to do bounds checking.

Some games actually load palettes from 0x00000000, despite the
fact no valid palette data should ever be there.

Fixes Issue 7792.
2014-10-29 08:53:53 +13:00
skidau b13ba0680c Merge pull request #1345 from sgadrat/fix-avidump-framerate
Fix timing of AVI files dumped on Linux
2014-10-28 12:50:01 +11:00
comex b29e5146ec Convert some VideoCommon stuff to BitSet.
Now with a minor performance improvement removed for no reason.
2014-10-25 16:57:25 -04:00
comex eb7f4dac50 Convert registersInUse to BitSet. 2014-10-25 16:57:25 -04:00
comex 5c2a470b97 Fix 'sizeof' which broke in my reference-to-pointer conversion. 2014-10-25 15:02:12 -04:00
skidau bc26cb1b19 Merge pull request #1322 from degasus/ogl-pp
OGL: force enable postprocessing
2014-10-25 13:48:27 +11:00
booto 6afdff6023 VideoCommon: loop bug in ShaderGenCommon.h debug 2014-10-25 01:52:31 +08:00
skidau 716fe06289 Merge pull request #1349 from comex/good-job-dereferencing-null-on-purpose
Fix some warnings from Clang trunk in an overly aggressive manner
2014-10-24 13:03:09 +11:00
Sylvain Gadrat 3a12c50dc1 Fix timing of AVI files dumped on Linux
The timing information is set on s_scaled_frame->pts, giving precise
timing information to the encoder. Frames arriving too early (less than
one tick after the previous frame) are droped. The setting of packet's
timestamps and flags is done after the call to avcodec_encode_video2()
as this function resets these fields according to its documentation.
2014-10-23 23:34:38 +02:00
comex 1f5b1001ce Merge pull request #1342 from phire/lessGetPointer
Eliminate getPointers which are memcpyed or memset.
2014-10-23 14:42:37 -04:00
Scott Mansell 23832987b5 Revert changes preloading of RGBA8 tiles.
This path should probally be optimised, but it's out of the
scope of this PR.
2014-10-23 18:15:29 +13:00
degasus 7292ea6a04 OGL: force enable postprocessing 2014-10-23 00:21:52 +02:00
Yuriy O'Donnell db497cc55f Added projection matrix epsilon that fixes depth clipping issues in some games 2014-10-23 00:20:47 +02:00
comex 6e774f1b64 Add missing includes where headers depend on other headers having been included first.
This is good hygiene, and also happens to be required to build Dolphin
using Clang modules.

(Under this setup, each header file becomes a module, and each #include
is automatically translated to a module import.  Recursive includes
still leak through (by default), but modules are compiled independently,
and can't depend on defines or types having previously been set up.  The
main reason to retrofit it onto Dolphin is compilation performance - no
more textual includes whatsoever, rather than putting a few blessed
common headers into a PCH.  Unfortunately, I found multiple Clang bugs
while trying to build Dolphin this way, so it's not ready yet, but I can
start with this prerequisite.)
2014-10-21 21:22:16 -04:00
comex 8492d04dfa Use pointers instead of references in GetUidData to avoid the undefined behavior of *(T *)nullptr (ewwww) 2014-10-21 21:20:05 -04:00
Scott Mansell 3aa979d7d7 Remove another 3 getPointers.
Thanks neobrain for spotting these.
2014-10-21 12:18:54 +13:00
skidau f65bb10c93 Merge pull request #1308 from kayru/shader_generator_write_opt
Workaround for MSVC not optimizing away Write() in GeneratePixelShader
2014-10-20 17:15:53 +11:00
skidau 81efd0e87f Merge pull request #1315 from RisingFog/movie-menu-input-display
Moved Input Display to Movie Menu
2014-10-20 14:34:02 +11:00
Ryan Houdek b3cee80faa Merge pull request #1321 from Sonicadvance1/Qualcomm-v95
Change driver details to reflect Qualcomm's changes with their v95 driver
2014-10-19 08:48:42 -05:00
Ryan Houdek 9108a11af4 Change driver details to reflect Qualcomm's changes with their v95 driver.
They fixed their issues with dynamic UBO array member access.
There are many other issues though.
2014-10-18 02:50:57 -05:00
skidau a5674bbe84 Merge pull request #475 from kayru/d3d_state_cache
D3D: Implemented cache for dynamic render states
2014-10-18 13:11:39 +11:00
Fog 467ab1a629 Moved Input Display to Movie Menu 2014-10-17 21:08:34 -04:00
Yuriy O'Donnell d23da7dbef Added __forceinline to AlphaTest::TestResult() to make MSVC inline it 2014-10-17 21:50:41 +02:00
Yuriy O'Donnell 5fdda135d2 Workaround for MSVC not optimizing away Write() in GeneratePixelShader
ShaderConstantProfile and ShaderUid now have an empty implementation
of Write() that uses variadic templates instead of varargs. MSVC is now
able to inline and optimize away this when necessary.
2014-10-17 21:37:42 +02:00
comex 4134a0ad54 Make some variables static (should probably adjust for coding style too, but I'm not the one who merged code with bad style...) 2014-10-16 17:03:37 -04:00
comex 1ff86a4716 Merge pull request #1295 from crudelios/remove-bbox-settings
Remove setting to enable or disable Bounding Box calculation.
2014-10-16 17:00:37 -04:00
degasus 8f403696ea DriverDetails: mark intel buffer_storage bug as fixed 2014-10-16 22:51:32 +02:00
skidau 3023abc1b5 Merge pull request #1285 from degasus/master
PixelShaderGen: replace multiplication with shift
2014-10-16 14:04:25 +11:00
Yuriy O'Donnell 2e4667caaa D3D: Moved render state cache into separate source files.
Refactored  StateCache::Get() to early out for narrower indentation.
Added comments to clarify ownership of objects returned by  StateCache::Get().
2014-10-15 20:22:39 +02:00
crudelios d281b4d7e1 Remove setting to enable or disable Bounding Box calculation. 2014-10-15 19:02:54 +01:00
skidau 8ef21bc5e2 Merge pull request #1272 from RisingFog/sconfig-dump-frames
Move bDumpFrames to SConfig (and it's references)
2014-10-15 13:42:37 +11:00
Markus Wick 1227bd2ba6 PixelShaderGen: replace multiplication with shift
iirc both nvidia and i965 doesn't optimize this
2014-10-14 12:34:37 +02:00
skidau 9551650c42 Merge pull request #1095 from crudelios/sw-bbox
Reimplement Bounding Box calculation using the software renderer.
2014-10-13 15:57:11 +11:00
Fog 8d424b114a Move bDumpFrames to SConfig (and it's references) 2014-10-12 23:56:16 -04:00
Fog cd0c784d5a Changed Dump Frames References 2014-10-12 19:51:13 -04:00
skidau 18c81dbc33 Merge pull request #1261 from lioncash/mesa-resonance-cascade
AVIDump: Add missing CoreTiming header
2014-10-12 14:26:23 +11:00
Lioncash e283839b68 AVIDump: Add missing CoreTiming header
Fixes build on the mesa buildbot
2014-10-11 23:12:19 -04:00
skidau a00ad6871c Merge pull request #1220 from RisingFog/avsync
Proper Audio/Video Dumping
2014-10-12 14:04:45 +11:00
crudelios 1e3b9ecdc1 Fix compile errors after rebase. 2014-10-10 12:44:44 +01:00
crudelios 176ea06e82 Get buildbot to compile. 2014-10-10 12:28:15 +01:00
crudelios 47c67f014f Fix linux build and various warnings.
Increase savestate version.
2014-10-10 12:28:13 +01:00
crudelios 2d4b7e3f3f Reimplement Bounding Box calculation using the software renderer. 2014-10-10 12:27:06 +01:00
Fog fc4125cdd1 Proper Audio/Video Dumping 2014-10-09 00:06:04 -04:00
Jules Blok 39f421d45d Support the borderless fullscreen option in all backends. 2014-10-07 16:48:43 +02:00
Jules Blok 7344f752b7 Replace BorderlessFullscreenEnabled by ExclusiveFullscreenEnabled.
Special handling was associated with this function, which only applies to exclusive fullscreen.
2014-10-07 16:43:32 +02:00
Lioncash 16a74a9557 Fifo: Fix tab/space mismatches 2014-10-06 20:04:57 -04:00
Lioncash 7c05d029d3 Merge pull request #1085 from waddlesplash/refactoring
Migrate global init stuff into UICommon.
2014-10-05 21:25:44 -04:00
Augustin Cavalier 19109e2d01 Migrate global init stuff into UICommon.
This avoids code duplication in a bunch of places .
I also moved the NVIDIA Optimus export into VideoCommon.
2014-10-05 20:47:37 -04:00
comex 7f6284c2fc Change a bunch of reference function arguments to pointers.
Per the coding style and sanity.
2014-10-02 03:00:33 -04:00
Rohit Nirmal ce8a4f5cc5 VideoCommon: Silence -Wmaybe-uninitialized warnings. 2014-09-30 16:14:18 -04:00
Tony Wasserka 13fc8e7df1 Merge pull request #578 from RachelBryk/IR
Cleanup Renderer::CalculateTargetSize(), and allow IRs higher than 4x to be set via INI.
2014-09-30 19:21:21 +02:00
comex 2eebdff01b Remove useless STACKALIGN macro.
It only ever did anything on 32-bit OS X.

Anyway, it wasn't even on the right functions, and these days
ABI_PushRegistersAndAdjustStack should handle maintaining the ABI
correctly.
2014-09-30 01:42:47 -04:00
comex 87a95727cd ReadDataFromFifo is always called with len = 32. Remove the parameter to enable optimizations.
And rename some variables around it to be less confusing.
2014-09-29 22:07:16 -04:00
skidau 007ba13cfa Merge pull request #1144 from skidau/fifo-linked
Moved the linking of the FIFO CPWritePointer near where CPWritePointer gets updated
2014-09-29 13:52:33 +10:00
comex 6c0a68d507 Add the override config option.
I hate the config code, but now is not the time to fix it...
2014-09-28 21:34:31 -04:00
comex 3a2048ea57 Add a central variable g_want_determinism which controls whether to try to make things deterministic.
It now affects the GPU determinism mode as well as some miscellaneous
things that were calling IsNetPlayRunning.  Probably incomplete.

Notably, this can change while paused, if the user starts recording a
movie.  The movie code appears to have been missing locking between
setting g_playMode and doing other things, which probably had a small
chance of causing crashes or even desynced movies; fix that with
PauseAndLock.

The next commit will add a hidden config variable to override GPU
determinism mode.
2014-09-28 21:34:31 -04:00
comex 65af90669b Add the 'desynced GPU thread' mode.
It's a relatively big commit (less big with -w), but it's hard to test
any of this separately...

The basic problem is that in netplay or movies, the state of the CPU
must be deterministic, including when the game receives notification
that the GPU has processed FIFO data.  Dual core mode notifies the game
whenever the GPU thread actually gets around to doing the work, so it
isn't deterministic.  Single core mode is because it notifies the game
'instantly' (after processing the data synchronously), but it's too slow
for many systems and games.

My old dc-netplay branch worked as follows: everything worked as normal
except the state of the CP registers was a lie, and the CPU thread only
delivered results when idle detection triggered (waiting for the GPU if
they weren't ready at that point).  Usually, a game is idle iff all the
work for the frame has been done, except for a small amount of work
depending on the GPU result, so neither the CPU or the GPU waiting on
the other affected performance much.  However, it's possible that the
game could be waiting for some earlier interrupt, and any of several
games which, for whatever reason, never went into a detectable idle
(even when I tried to improve the detection) would never receive results
at all.  (The current method should have better compatibility, but it
also has slightly higher overhead and breaks some other things, so I
want to reimplement this, hopefully with less impact on the code, in the
future.)

With this commit, the basic idea is that the CPU thread acts as if the
work has been done instantly, like single core mode, but actually hands
it off asynchronously to the GPU thread (after backing up some data that
the game might change in memory before it's actually done).  Since the
work isn't done, any feedback from the GPU to the CPU, such as real
XFB/EFB copies (virtual are OK), EFB pokes, performance queries, etc. is
broken; but most games work with these options disabled, and there is no
need to try to detect what the CPU thread is doing.

Technically: when the flag g_use_deterministic_gpu_thread (currently
stuck on) is on, the CPU thread calls RunGpu like in single core mode.
This function synchronously copies the data from the FIFO to the
internal video buffer and updates the CP registers, interrupts, etc.
However, instead of the regular ReadDataFromFifo followed by running the
opcode decoder, it runs ReadDataFromFifoOnCPU ->
OpcodeDecoder_Preprocess, which relatively quickly scans through the
FIFO data, detects SetFinish calls etc., which are immediately fired,
and saves certain associated data from memory (e.g. display lists) in
AuxBuffers (a parallel stream to the main FIFO, which is a bit slow at
the moment), before handing the data off to the GPU thread to actually
render.  That makes up the bulk of this commit.

In various circumstances, including the aforementioned EFB pokes and
performance queries as well as swap requests (i.e. the end of a frame -
we don't want the CPU potentially pumping out frames too quickly and the
GPU falling behind*), SyncGPU is called to wait for actual completion.

The overhead mainly comes from OpcodeDecoder_Preprocess (which is,
again, synchronous), as well as the actual copying.

Currently, display lists and such are escrowed from main memory even
though they usually won't change over the course of a frame, and
textures are not even though they might, resulting in a small chance of
graphical glitches.  When the texture locking (i.e. fault on write) code
lands, I can make this all correct and maybe a little faster.

* This suggests an alternate determinism method of just delaying results
until a short time before the end of each frame.  For all I know this
might mostly work - I haven't tried it - but if any significant work
hinges on the competion of render to texture etc., the frame will be
missed.
2014-09-28 21:34:29 -04:00
comex 2d4b7c5900 Make ReadDataFromFifo static. 2014-09-28 21:25:12 -04:00