Commit Graph

2583 Commits

Author SHA1 Message Date
Lioncash 8e59987d46 TextureDecoder_Common: Add missing algorithm include
It was being indirectly included, causing VS to syntactically mark std::min/max usages as being undefined (however it still compiled fine, of course)
2015-04-27 23:17:41 -04:00
skidau 4bf4778cd7 Merge pull request #2312 from comex/shutdown-race-condition
Exit ReadDataFromFifoOnCPU, PushFifoAuxBuffer early if shutting down (GpuRunningState=false)
2015-04-27 19:47:00 +10:00
Dwayne Slater ae83a1b821 Fix OpenGLES 3.0 on Qualcomm's crappy driver, it can't bitshift sometimes.
[fixed lint issues and grammar ~comex]
2015-04-23 16:33:12 -04:00
comex 74c30d1784 Fix code broken by merge 2015-04-23 02:07:45 -04:00
comex ad95454d04 Merge pull request #2223 from phire/imm
Cleanup OpArg, make immediates more explicit.
2015-04-23 01:53:18 -04:00
comex 06dd0ba3b4 Exit ReadDataFromFifoOnCPU, PushFifoAuxBuffer early if shutting down (GpuRunningState=false)
This was causing a race condition where the "absurdly large aux buffer"
panic alert would be triggered in the last bit of fifo processing on the
CPU thread in deterministic mode (i.e. netplay).  SyncGPU is supposed to
move the auxiliary queue data to the beginning of the containing buffer
so we don't have to deal with wraparound; if GpuRunningState is false,
however, it just returns, because it's set to false by another thread -
thus it doesn't know whether RunGpuLoop is still executing (in which
case it can't just reset the pointers, because it may still be using the
buffer) or not (in which case the condition variable it normally waits
for to avoid the previous problem will never be signaled).  However,
SyncGPU's caller PushFifoAuxBuffer wasn't aware of this, so if the
buffer was filling at just the right time, it'd stay full and that
function would complain that it was about to overflow it.  Similar
problem with ReadDataFromFifoOnCPU afaik.  Fix this by returning early
from those as well; other callers of SyncGPU should be safe.  A
*slightly* cleaner alternative would be giving the CPU thread a way to
tell when RunGpuLoop has actually exited, but whatever, this works.
2015-04-21 22:33:29 -04:00
Lioncash 9eb608c9da Merge pull request #2301 from lioncash/const
General: Apply the const specifier where applicable
2015-04-16 23:13:39 -04:00
Lioncash 63393570fb PerfQueryBase: Move common implementation variables into base class 2015-04-15 19:22:16 -04:00
Lioncash b0613bb1c8 General: Apply the const specifier where applicable 2015-04-15 02:04:03 -04:00
degasus 74795b4553 Fifo: rewrite Fifo_PauseAndLock
This lock isn't required any more as our FlushGpu garanty to block until the GPU is idle
2015-04-06 12:35:35 +02:00
degasus b1ffd32f5f Fifo: only touch the SIMD state once in the single core loop 2015-04-06 12:35:35 +02:00
degasus d2c62b1744 Fifo: only sleep once within every ms of emulated time 2015-04-06 12:35:35 +02:00
degasus b020ae1c5d Fifo: rewrite sync on idle skipping hack
Now it's done without a busy loop
2015-04-06 12:35:35 +02:00
degasus 9bdaa00e2d Fifo: use the outer loop on sync GPU 2015-04-06 12:35:35 +02:00
degasus 279c657cda Fifo: Replace busy loop with condition variable 2015-04-06 12:35:27 +02:00
Tillmann Karras 9da86092ae VertexLoaderX64: use common code for FORMAT_FLOAT 2015-03-18 12:12:21 +01:00
Tillmann Karras 7030542546 VertexLoaderX64: support SSE2 as a fallback
With suggestions by Fiora and magumagu.
2015-03-18 12:12:21 +01:00
Tillmann Karras 8d90ecda7f VertexLoaders: make positions more compact 2015-03-18 12:09:06 +01:00
Scott Mansell 858ff69c01 Make OpArg.offset and operandReg private.
Also cleaned up WriteRest function.
2015-03-17 18:49:30 +13:00
magumagu 629fb8fb49 Merge pull request #2222 from Tilka/fix_warnings
Fix warnings
2015-03-16 17:41:46 -07:00
Tillmann Karras f82afd1b2f Fix warnings 2015-03-16 19:02:30 +01:00
Shawn Hoffman ad64336137 quiet some warnings which appear on vs2015.
quieted warnings include shadowed variable names and integer extensions.
2015-03-15 19:28:47 -07:00
skidau cdff138c67 Show no more than one FIFO error per session. 2015-03-13 23:25:15 +11:00
Tillmann Karras 3987725217 VertexLoaderX64: fix harmless off-by-one error 2015-03-08 04:43:59 +01:00
degasus 35373c5185 TextureCache: load all mipmap levels from custom textures
This drops the "feature" to load level 0 from the custom texture
and all other levels from the native one if the size matches.
But in my opinion, when a custom texture only provide one level,
no more should be used at all.
2015-03-02 00:09:09 +01:00
magumagu 7f7973efa5 Merge pull request #2148 from Tilka/fifo_cleanup
Small FIFO-related cleanup
2015-03-01 13:06:43 -08:00
degasus 7ca24f90d1 TexCache: increase TEXTURE_KILL_THRESHOLD
Xenoblade uses more than 40 textures alternately per frame for eg water effects.
So don't try to drop them as aggressive.
2015-03-01 13:41:14 +01:00
Tillmann Karras 9493c713dd Fifo: small cleanup 2015-02-28 15:40:01 +01:00
Tillmann Karras e28c97f6bd Fifo: drop unused functions 2015-02-28 15:40:00 +01:00
Lioncash d10571a86a PixelShaderManager: Remove unnecessary casts.
EFBToScaledXf and EFBScaledYf return a float, so the cast isn't needed here.
2015-02-28 00:04:05 -05:00
Lioncash 7408de7e79 Merge pull request #2058 from Stevoisiak/Codemaid-Cleanup-Take2
Basic Formatting/Whitespace Cleanup
2015-02-25 18:07:56 -05:00
Stevoisiak 93b16a4a2d Formatting/Whitespace Cleanup
Various fixes to formatting and whitespace
2015-02-25 10:48:21 -05:00
degasus 967eaad8df VideoCommon: rename efb2tex and efb2ram 2015-02-24 23:10:13 +01:00
degasus 1313d3461f VideoCommon: always enable efb copy 2015-02-24 23:01:01 +01:00
Tillmann Karras e2fec13ab6 Fix some -Wsign-compare warnings 2015-02-24 10:29:59 +01:00
Tillmann Karras f298f00e1b Clean up the intrinsics #ifdef mess 2015-02-24 01:02:36 +01:00
skidau 593563e16c Merge pull request #2087 from Armada651/epsilon-3d
VertexShaderManager: Turn the epsilon hack back on for 3D Vision.
2015-02-23 13:12:55 +11:00
skidau f8e51a1a26 Merge pull request #2050 from Tilka/reset_vertex_loader_stats
VertexLoaderManager: reset stats properly
2015-02-23 13:10:38 +11:00
degasus b35fa222f5 VideoCommon: perf querys by async events 2015-02-22 08:41:15 +01:00
degasus edbd402101 VideoCommon: bbox by async events 2015-02-22 08:41:15 +01:00
degasus ad7264da7d VideoCommon: implement swap requests in the full async way 2015-02-22 08:41:15 +01:00
degasus bc248f8941 VideoCommon: use a new async event system for efb access 2015-02-22 08:41:15 +01:00
Jules Blok ff4127cf50 VertexShaderManager: Turn the epsilon hack back on for 3D Vision.
The bug is fixed in version 347.52 of the drivers.
2015-02-21 12:09:49 +01:00
Jules Blok 139ad3b2b9 TextureConversionShader: Use a Texture2DArray to match the shader resource view. 2015-02-21 11:50:20 +01:00
Markus Wick 6bbf774507 Merge pull request #2075 from magumagu/titantron-fix
Partially fix WWE12 titantron videos.
2015-02-21 10:09:47 +01:00
Scott Mansell 355be1719e Fix regression with directx when zfreeze=true and ztest=false. 2015-02-21 10:52:29 +13:00
magumagu 074397c12d Explicitly set up AllocateTexture configuration for palette conversion.
No functional change.
2015-02-19 15:57:05 -08:00
magumagu ddc815dd7a Remove TextureAddress struct. 2015-02-19 15:36:32 -08:00
magumagu c0a4760f0e Decode EFB copies used as paletted textures.
A number of games make an EFB copy in I4/I8 format, then use it as a
texture in C4/C8 format.  Detect when this happens, and decode the copy on
the GPU using the specified palette.

This has a few advantages: it allows using EFB2Tex for a few more games,
it, it preserves the resolution of scaled EFB copies, and it's probably a
bit faster.

D3D only at the moment, but porting to OpenGL should be straightforward..
2015-02-19 15:09:27 -08:00
magumagu 4cdf9f543f Partially fix WWE12 titantron videos.
The obvious question here is, why does it matter if we round or truncate?
The key is that GC/Wii does fixed-point interpolation, where PC GPUs do
floating-point interpolation. Discarding fractional bits makes the conversion
from floating-point to fixed point give more consistent results.

I'm not confident this is really the right fix, or that my explanation is
completely correct; ideally, we don't want to depend on floating-point
interpolation at all.
2015-02-18 19:41:00 -08:00
mimimi085181 2f8e0c9bb9 Allow multiple texture cache entries for textures at the same address
This is the same trick which is used for Metroid's fonts/texts, but for all textures. If 2 different textures at the same address are loaded during the same frame, create a 2nd entry instead of overwriting the existing one. If the entry was overwritten in this case, there wouldn't be any caching, which results in a big performance drop.

The restriction to textures, which are loaded during the same frame, prevents creating lots of textures when textures are used in the regular way. This restriction is new. Overwriting textures, instead of creating new ones is faster, if the old ones are unlikely to be used again.

Since this would break efb copies, don't do it for efb copies.

Castlevania 3 goes from 80 fps to 115 fps for me.

There might be games that need a higher texture cache accuracy with this, but those games should also see a performance boost from this PR.

Some games, which use paletted textures, which are not efb copies, might be faster now. And also not require a higher texture cache accuracy anymore. (similar sitation as PR https://github.com/dolphin-emu/dolphin/pull/1916)
2015-02-18 23:54:40 +01:00
Ryan Houdek 3aa605236d Merge pull request #2041 from Sonicadvance1/AArch64_vertex_loader
[AArch64] Vertex loader and things
2015-02-17 00:51:51 -06:00
Ryan Houdek ed008c3a69 [AArch64] Change the vertex loader over to using unscaled loadstores.
In nearly all direct loadstore cases we can use unscaled loadstores.
Still have a fallback in case we hit a situation that we /can't/ do a unscaled loadstore.
2015-02-16 22:03:09 -06:00
Ryan Houdek b4b03641b3 [AArch64] Implement vertex loader recompiler.
Shows a noticeable reduction in time spent in the vertex loader.
2015-02-16 16:51:32 -06:00
Pierre Bourdon 3500740dd4 Windows AVIDump: support "silent" frame dumping
When enabled, the silent option will avoid popping up dialog boxes for
overwrite confirmation or codec selection. The codec selection defaults to
uncompressed RGB.

This is required for FifoCI on Windows which needs to drive Dolphin from the
command line exclusively.
2015-02-14 23:38:14 +01:00
Markus Wick 405444d4fe Merge pull request #1803 from lioncash/rgb
OnScreenDisplay: Allow for different colored messages
2015-02-14 10:47:47 +01:00
Tillmann Karras 1a52cff1c9 VertexLoaderManager: reset stats properly 2015-02-14 02:30:05 +01:00
Ryan Houdek 15e41c67f8 Change RunVertices' function arguments.
This reduces some dumb state shuffling when calling the emitted vertex loaders.
2015-02-13 12:16:06 -06:00
degasus c404e87226 ShaderGen: Fix pixel offset correction
We want to move the vertex by 1/12 pixel, but the old code
did miss the perspective division. So by multiplying with pos.w,
the position is moved correctly after the perspective division.
2015-02-11 20:54:15 +01:00
magumagu d9988ee9b5 Merge pull request #1987 from magumagu/thread-safety
Cleanup usage of atomic/threadsafe functions
2015-02-10 13:48:12 -08:00
Lioncash 9d5c6c55fe OnScreenDisplay: Allow for different colored messages 2015-02-07 17:35:21 -05:00
Lioncash e07679114b Use emplace_* functions where in-place construction is preferable 2015-02-04 11:39:08 -05:00
magumagu 57d94de2ad Fix regression for D3D EFB depth copies.
On D3D, we read from the depth buffer using the format
DXGI_FORMAT_R24_UNORM_X8_TYPELESS (essentially, the "r" component contains
the depth, and the other components contain nothing).
2015-02-03 11:27:27 -08:00
Markus Wick 3c475b91ea Merge pull request #1993 from Armada651/line-perspective
GeometryShaderGen: Perspective divide the line coordinates before comparing the angle.
2015-01-31 23:45:54 +01:00
Jules Blok 8c55ec0d51 GeometryShaderGen: Perspective divide the line coordinates before comparing the angle. 2015-01-31 23:32:23 +01:00
Tillmann Karras 1aac65f988 VertexLoaderManager: assimilate GetVertexSize() 2015-01-31 09:23:50 +01:00
magumagu 47be9d8e6b Clean up usage of ScheduleEvent_Threadsafe. 2015-01-30 14:48:23 -08:00
degasus 20628b6e5d OpcodeDecoder: Calculate decoding time for vertices 2015-01-29 19:55:28 +01:00
Gabriel Corona a4adfe194a JitRegister: overload Register with a [start,end) variant 2015-01-28 09:50:19 +01:00
Markus Wick beaa9905a6 Merge pull request #1966 from magumagu/unify-efb-encode
Unify EFB encoding shader generation
2015-01-27 23:14:18 +01:00
Markus Wick 43605f8716 Merge pull request #1948 from magumagu/remove-efb-cache
Remove EFB to RAM cache, and simplify code.
2015-01-27 09:42:15 +01:00
Tillmann Karras 3dbd6cd384 VertexLoaderX64: save XMM0 if the ABI requires it 2015-01-26 22:24:06 +01:00
Tillmann Karras 8416a86b6d VertexLoaderBase: fix crash on invalid formats 2015-01-26 22:24:06 +01:00
Tillmann Karras 66f28707e7 VertexLoader: small clean up 2015-01-26 22:24:06 +01:00
magumagu b56025e6eb Don't use boolean negation. 2015-01-25 23:28:59 -08:00
magumagu 92189823f3 Fix RGBA8 encoding. 2015-01-25 22:53:30 -08:00
magumagu b0b99b6922 Fix shader so it's possible to use with D3D Map().
Well, that's not strictly true, but trying to memcpy between two buffers
using different row lengths and different strides is at minimum extremely
unintuitive.
2015-01-25 19:57:09 -08:00
magumagu 6c1bdfe04c More work. 2015-01-25 19:57:07 -08:00
magumagu ef75f3005d WIP. 2015-01-25 15:49:35 -08:00
Jules Blok 5c4ee2f71e PostProcessing: Move default pixel shader to PostProcessingShaderConfiguration.
Reduces code complexity and fixes a bug where the shader is not properly invalidated.
2015-01-25 23:08:49 +01:00
Jules Blok 262c3b19ec PostProcessing: Add support for user-supplied anaglyph shaders.
There are lots of different anaglyph glasses out there and there may be even more creative uses for stereoscopic post-processing shaders.
2015-01-25 22:07:03 +01:00
Scott Mansell 61215e7180 Fix a buffer underrun in CalculateZSlope. 2015-01-25 20:31:20 +13:00
Lioncash 9cdfe889af Coding style cleanup from the zfreeze merge 2015-01-24 15:16:48 -05:00
Markus Wick ae514cb0f2 Merge pull request #1955 from degasus/master
TexCache: Rewrite the texID generation for paletted textures
2015-01-24 15:37:25 +01:00
degasus 51990fcdfa TexCache: Rewrite the texID generation for paletted textures
This changes the behavior if both texture are available. The old code did
try to load the modfied texID, the new code tries the unmodified texID first.
2015-01-24 13:58:20 +01:00
Markus Wick 4f6d0049a7 Merge pull request #1951 from Sonicadvance1/Remove_old_defines
Remove an old GLES define that I missed.
2015-01-24 13:38:26 +01:00
Tony Wasserka 43036af944 Merge pull request #1812 from phire/real_zfreeze
Add proper zfreeze support.
2015-01-24 13:29:57 +01:00
Scott Mansell 14baf038e7 Stop doing nastly shit to OpenGL stream buffers.
Instead we keep the loaded vertices in CPU memory.
2015-01-24 14:41:51 +13:00
Ryan Houdek 189528171b Remove an old GLES define that I missed. 2015-01-23 14:30:23 -06:00
magumagu 6659c15bed Remove EFB to RAM cache, and simplify code. 2015-01-23 10:48:15 -08:00
Scott Mansell 5510c86b81 Move Zfreeze code out individual backends into videoCommon
Also:
 * Implement support for per-vertex PosMatrixIndex
 * Only update zslope constant once when zfreeze is activated.
 * Added a bunch of comments.
2015-01-24 03:22:27 +13:00
Scott Mansell daf760b202 A few small cleanups based on code review. 2015-01-23 04:38:36 +13:00
Scott Mansell e88c02dece Ensure that ZSlopes save/restore state correctly.
Had to re-do *ShaderManager so they saved their constant arrays
instead of completly rebuilding them on restore state.
2015-01-23 03:32:31 +13:00
Scott Mansell 128d303656 Reduce number of divisions in screenspace transform.
This is closer to what the hardware does anyway.
2015-01-23 03:32:31 +13:00
NanoByte011 add59b3bea Fixes Mario Tennis Gimmick Courts and adds support for FastDepthCalc
- Calculate ZSlope every flush but only set PixelShader Constant on Reset Buffer when zfreeze
- Fixed another Pixel Shader bug in D3D that was giving me grief
2015-01-23 03:32:31 +13:00
Scott Mansell 6d5065c58d Fix pixelshader constant offsets. 2015-01-23 03:32:31 +13:00
Scott Mansell 88c7afd315 Make zfreeze use screenspace coordinates independant of IR.
OpenGL requires the y coordinates to be flipped.

Also refactored PixelGen code to remove duplicate code.
2015-01-23 03:32:31 +13:00
Scott Mansell 418296961c Fix various issues with zfreeze implemntation.
Results are still not correct, but things are getting closer.

 * Don't cull CULLALL primitives so early so they can be used as reference
        planes.
 * Convert CalculateZSlope to screenspace coordinates.
 * Convert Pixelshader to screenspace coordinates (instead of worldspace
        xy coordinates, which is totally wrong)
 * Divide depth by 2^24 instead of clamping to 0.0-1.0 as was done
        before.

Progress:
 * Rouge Squadron 2/3 appear correct in game (videos in rs2 save file
         selection are missing)
 * Shadows draw 100% correctly in NHL 2003.
 * Mario golf menu renders correctly.
 * NFS: HP2, shadows sometimes render on top of car or below the road.
 * Mario Tennis, courts and shadows render correctly, but at wrong depth
 * Blood Omen 2, doesn't work.
2015-01-23 03:32:31 +13:00
NanoByte011 613781c765 Cleanup and refactor of zfreeze port
Based on the feedback from pull request #1767 I have put in most of
degasus's suggestions in here now.

I think we have a real winner here as moving the code to
VertexManagerBase for a function has allowed OGL to utilize zfreeze now
:)

Correct use of the vertex pointer has also corrected most of the issue
found in pull request #1767 that JMC47 stated.  Which also for me now
has Mario Tennis working with no polygon spikes on the characters
anymore!  Shadows are still an issue and probably in the other games
with shadow problems.  Rebel Strike also seems better but random skybox
glitches can show up.
2015-01-23 03:32:31 +13:00
NanoByte011 937844b9e3 Initial port of zfreeze branch (3.5-1729)
Initial port of original zfreeze branch (3.5-1729) by neobrain into
most recent build of Dolphin.

Makes Rogue Squadron 2 very playable at full speed thanks to recent core
speedups made to Dolphin. Works on DirectX Video plugin only for now.

Enjoy!  and Merry Xmas!!
2015-01-23 03:31:54 +13:00