Commit Graph

229 Commits

Author SHA1 Message Date
Scott Mansell 0d996f512b Multithreadded Shadergen: First pass over pixel Shadergen
Bug Fix: It was theoretically possible for a shader with depth writes
         disabled to map to the same UID as a shader with late depth
	 writes.
	 No known test cases trigger this.
2016-06-26 16:13:20 +12:00
Scott Mansell e99364c7c9 UID Change: Fix bug with indirect stage UIDs
Bug Fix: The normal stage UIDs were randomly overwriting indirect
         stage texture map UID fields. It was possible for multiple
	 shaders with diffrent indirect texture targets to map to
	 the same UID.
         Once again, it dpesn't look like this bug was ever triggered.
2016-06-26 16:13:19 +12:00
Scott Mansell 03f2c9648d Shader UID change: Only store the two bits of components we need.
This frees up 21 bits and allows us to shorten the UID struct by an entire
32 bits.

It's not strictly needed (as it's encoded into the length) but I added a
bit for per-pixel lighiting to make my life easier in the following
commits.
2016-06-26 16:13:19 +12:00
Scott Mansell 53c402dbc5 Multithreadded Shadergen: First Pass over vertex/lighting Shadergens
The only code which touches xfmem is code which writes directly into
uid_data.

All the rest now read their parameters out of uid_data.

I also simplified the lighting code so it always generated seperate
codepaths for alpha and color channels instead of trying to combine
them on the off-chance that the same equation works for all 4 channels.

As modern (post 2008) GPUs generally don't calcualte all 4 channels
in a single vector, this optimisation is pointless. The shader compiler
will undo it during the GLSL/HLSL to IR step.

Bug Fix: The about optimisation was also broken, applying the color light
         equation to the alpha light channel instead of the alpha light
	 euqation. But doesn't look like anything trigged this bug.
2016-06-26 16:13:19 +12:00
Pierre Bourdon 5fcb4bb3ab Further fixes to the formatting change. WX sucks. 2016-06-24 12:16:10 +02:00
Pierre Bourdon 3570c7f03a Reformat all the things. Have fun with merge conflicts. 2016-06-24 10:43:46 +02:00
Jeffrey Pfau d6517a761c VideoCommon: Simplify indirect texture lookup code slightly 2016-04-23 22:55:52 -07:00
Jeffrey Pfau aa736bf258 Revert "VideoBackend: Remove extraneous shifts from indirect texture lookups"
This reverts commit 1f1b127b69.
2016-04-23 22:55:42 -07:00
degasus ef01f234df PixelShaderGen: Fixes implicit type conversion or PR #3772.
This regression did only happen on OpenGL ES.
2016-04-10 12:49:32 +02:00
degasus 10e4f7e7bf PixelShaderGen: Move constant multiplication to constant generation.
No need to do this within the shader per pixel if it can be done once.
2016-04-09 12:25:00 +02:00
Stenzek e6b2212ec0 ShaderGen: Only specify storage qualifier in interface block when needed
Drivers that don't support GL_ARB_shading_language_420pack require that
the storage qualifier be specified even when inside an interface block.

AMD's driver throws a compile error when "centroid in/out" is used within
an interface block.

Our previous behavior was to include the storage qualifier regardless, but
this wasn't working on AMD, therefore we should check for the presence of
the extension and include based on this, instead.
2016-03-30 00:42:50 +10:00
Pierre Bourdon ae4cb12033 Merge pull request #3719 from Sonicadvance1/workaround_osx_video_drivers
Workaround OS X video driver bug #24983074
2016-03-26 01:43:32 +01:00
Ryan Houdek 3ab7806e24 Workaround OS X video driver bug #24983074
OS X's shader compiler has a bug with interface blocks where interface block members don't properly inherit the layout qualifier from the interface
block.
Work around this limitation by explicitly stating the layout qualifier on both the interface block and every single member inside of that block.
2016-03-09 09:11:00 -06:00
Jules Blok 6d1628eda4 Revert "Merge pull request #3578 from Armada651/forced-slow-depth"
This reverts commit e2a1a085b6, reversing
changes made to 2aea549eef.
2016-02-29 00:55:51 +01:00
Jules Blok 9805f70913 VideoConfig: Replace FastDepthCalc by ForcedSlowDepth.
Fast depth is now more accurate than slow depth and should always be used.
The option will be kept in a different form as it is still used as a hack to fix some games.
Also, the slow depth code path will still be relied upon by cards that don't support GL_ARB_clip_control.
2016-02-08 12:26:55 +01:00
Jeffrey Pfau 1f1b127b69 VideoBackend: Remove extraneous shifts from indirect texture lookups 2016-01-25 19:27:26 -08:00
Pierre Bourdon 24c228c6e9 Merge pull request #3523 from lioncash/video
VideoCommon: Header cleanup
2016-01-18 02:24:50 +01:00
Lioncash d9fec92628 VideoCommon: Header cleanup
Also remedies places where the video backends and core rely on things
being indirectly included.
2016-01-17 20:11:45 -05:00
Stenzek edebadc093 PixelShaderGen: Use bitwise AND for wrapping indirect texture coordinates
(x % y) is not defined in GLSL when sign(x) != sign(y).
This also has the added benefit of behaving the same as sampler wrapping modes, in regards to negative inputs.
2016-01-15 19:46:38 +10:00
Stenzek 617f9d9532 ShaderGen: Remove virtual methods from ShaderGeneratorInterface, move string buffer to ShaderCode
This fixes the crashes occuring at startup with a non-empty shader cache.
Because LinearDiskCache reads/writes to the storage of ShaderUid, ShaderUid must be trivially copyable.
Additionally, adds a static assert to LinearDiskCache to ensure this doesn't happen in the future.

The initialization of ShaderUid data has been moved to the code generation functions, so the above condition holds true.
2016-01-02 17:35:06 +10:00
Lioncash c151fe582f ShaderGenerators: Remove unnecessary inline keywords
Static by itself is sufficient
2015-12-26 17:57:32 -05:00
Lioncash 8ce3a4aa70 ShaderGeneration: Get rid of static buffers 2015-12-26 17:01:54 -05:00
degasus e26d9f7c35 MSAA: Store samples in ini files. 2015-12-15 09:41:01 +01:00
Andrew Wickham d9f1523a7b Make cast from int to float explicit in shader
This should fix this panic message I saw when playing Super Mario Strikers:

Failed to compile pixel shader [...]: error C7011: implicit cast from "int" to "float"
2015-11-30 13:35:46 -08:00
Tillmann Karras 71d1eb3c31 VideoCommon: return code/uid from shader gens
rather than passing in non-const references
2015-11-03 14:40:23 +01:00
Sintendo c4b56f06f9 Fix a few typos 2015-11-02 21:17:43 +01:00
Tillmann Karras 983978ee66 VideoCommon: flush vertex manager if components change 2015-11-01 22:39:31 +01:00
Tillmann Karras 7066689131 ShaderCaches: remove unneeded typedefs 2015-10-29 14:43:05 +01:00
Scott Mansell e7b2a22225 Support Conservative Depth as a fallback for EarlyZ
Allows Mesa based drivers to support ZCompLoc
2015-10-18 01:46:54 +13:00
Tillmann Karras 1df455bd13 PixelShaderGen: silence -Wformat-security warnings 2015-10-17 05:05:50 +02:00
Ilia Mirkin 5380fd9dba VideoCommon: fix variable types fed to Write() function 2015-10-16 18:20:36 -04:00
Scott Mansell 645e4cbbee PixelShaderGen: Use arrays of texture samplers. 2015-10-12 05:06:39 +13:00
degasus 1c0366993a VideoBackends: Reimplement SSAA, now for D3D + OGL 2015-09-06 19:40:00 +02:00
Ryan Houdek 9618738278 Remove all of our workarounds for Qualcomm devices we don't support anymore. 2015-09-04 23:45:35 -05:00
Lioncash 71ef0a0245 PixelShaderGen: Use spaces instead of tabs for vertical alignment 2015-09-01 12:20:50 -04:00
Lioncash 91eff28699 PixelShaderGen: Move defines into the implementation file
These aren't used outside of it. This also reduces the amount of things in
the global namespace.
2015-09-01 12:18:18 -04:00
Lioncash c760ffbd28 BPMemory/XFMemory: Convert defines to enums
These actually convey a concrete type, as well as also providing a
symbolic constant during debugging.
2015-09-01 12:07:10 -04:00
Rohit Nirmal 6252d2d71a Fix building with PCH disabled. 2015-08-28 14:13:28 -05:00
Ryan Houdek 3242e1a617 Fix the shader overrunning our max shader size.
The Star Wars games really push the hardware to its limits, which can cause the shaders that are produced to be 18kb or more.
Double our maximum shader size to compensate.
Fixes issue #8860
2015-08-22 01:01:03 -05:00
Ryan Houdek c1df6d7b4e Work around PowerVR's shader compiler.
This bug has been reported to IMGTec at https://pvrsupport.imgtec.com/ticket/472

The basic idea of the bug is that if you're doing a bitwise and  of a constant value vector with a constant scalar value, this causes PowerVR's shader
compiler to fail out with a very non-descriptive message.
Working around the issue by making the value a vector that it is being masked by.
2015-07-20 22:04:16 -05:00
Jules Blok ef1dfa8bcb VideoBackends: Allow the viewport to use the full depth range. 2015-06-06 03:37:46 +02:00
galop1n 2975e53091 D3D: Depth range inversion.
Credits go to Galop1n for designing this technique and to BhaaLseN for cleaning up the commit.
2015-05-26 15:31:31 +02:00
Ryan Houdek 69963dc4b0 Merge pull request #2274 from degasus/disable_bbox
Disable bbox
2015-05-25 08:46:12 -04:00
Tillmann Karras 30ebb2459e Set copyright year to when a file was created 2015-05-25 13:22:31 +02:00
Tillmann Karras cefcb0ace9 Update license headers to GPLv2+ 2015-05-25 13:22:31 +02:00
degasus acd074e291 VideoCommon: Make BBox emulation optional 2015-05-25 09:33:34 +02:00
degasus 3d98f6ab9f PixelShaderGen: apply zfreeze before ztextures
The zfreeze option freezes the depth plane of the rasterization.
So this is done before the TEV stages, where the z-textures are applied.
2015-05-13 20:06:23 +02:00
Jules Blok 1d745d632a PixelShaderGen: Clamp zCoord to the depth range. 2015-05-08 14:43:43 +02:00
Jules Blok 5dbb43ae1d PixelShaderGen: Use new multiplier everywhere and directly cast to int instead or rounding. 2015-05-08 14:32:23 +02:00
Jules Blok c4f85a38e6 VideoBackends: Use proper floating point depth precision. 2015-05-08 14:29:29 +02:00
Dwayne Slater ae83a1b821 Fix OpenGLES 3.0 on Qualcomm's crappy driver, it can't bitshift sometimes.
[fixed lint issues and grammar ~comex]
2015-04-23 16:33:12 -04:00
Markus Wick 6bbf774507 Merge pull request #2075 from magumagu/titantron-fix
Partially fix WWE12 titantron videos.
2015-02-21 10:09:47 +01:00
Scott Mansell 355be1719e Fix regression with directx when zfreeze=true and ztest=false. 2015-02-21 10:52:29 +13:00
magumagu 4cdf9f543f Partially fix WWE12 titantron videos.
The obvious question here is, why does it matter if we round or truncate?
The key is that GC/Wii does fixed-point interpolation, where PC GPUs do
floating-point interpolation. Discarding fractional bits makes the conversion
from floating-point to fixed point give more consistent results.

I'm not confident this is really the right fix, or that my explanation is
completely correct; ideally, we don't want to depend on floating-point
interpolation at all.
2015-02-18 19:41:00 -08:00
Scott Mansell daf760b202 A few small cleanups based on code review. 2015-01-23 04:38:36 +13:00
NanoByte011 add59b3bea Fixes Mario Tennis Gimmick Courts and adds support for FastDepthCalc
- Calculate ZSlope every flush but only set PixelShader Constant on Reset Buffer when zfreeze
- Fixed another Pixel Shader bug in D3D that was giving me grief
2015-01-23 03:32:31 +13:00
Scott Mansell 88c7afd315 Make zfreeze use screenspace coordinates independant of IR.
OpenGL requires the y coordinates to be flipped.

Also refactored PixelGen code to remove duplicate code.
2015-01-23 03:32:31 +13:00
Scott Mansell 418296961c Fix various issues with zfreeze implemntation.
Results are still not correct, but things are getting closer.

 * Don't cull CULLALL primitives so early so they can be used as reference
        planes.
 * Convert CalculateZSlope to screenspace coordinates.
 * Convert Pixelshader to screenspace coordinates (instead of worldspace
        xy coordinates, which is totally wrong)
 * Divide depth by 2^24 instead of clamping to 0.0-1.0 as was done
        before.

Progress:
 * Rouge Squadron 2/3 appear correct in game (videos in rs2 save file
         selection are missing)
 * Shadows draw 100% correctly in NHL 2003.
 * Mario golf menu renders correctly.
 * NFS: HP2, shadows sometimes render on top of car or below the road.
 * Mario Tennis, courts and shadows render correctly, but at wrong depth
 * Blood Omen 2, doesn't work.
2015-01-23 03:32:31 +13:00
NanoByte011 937844b9e3 Initial port of zfreeze branch (3.5-1729)
Initial port of original zfreeze branch (3.5-1729) by neobrain into
most recent build of Dolphin.

Makes Rogue Squadron 2 very playable at full speed thanks to recent core
speedups made to Dolphin. Works on DirectX Video plugin only for now.

Enjoy!  and Merry Xmas!!
2015-01-23 03:31:54 +13:00
NanoByte011 f475e367f2 Lighting Attenuation Fixes 2015-01-21 15:55:32 -07:00
Ryan Houdek 80e6367e46 Merge pull request #1869 from Stevoisiak/GeneralConsistency
Minor consistency changes
2015-01-21 13:46:53 -06:00
Jules Blok f40cd04a29 PixelShaderGen: Fix uninitialized variables. 2015-01-20 23:15:01 +01:00
Stevoisiak cb86db7b68 Minor consistency changes
Mostly small changes, like capitalization and spelling
2015-01-12 15:18:18 -05:00
Scott Mansell 1b771deb56 Move worldpos into it's own varying.
Previously it was packed into spare slots in clippos.xy and normal.w,
but it's ugly and more importantly it's causing bugs.

This was discovered during the debugging of a zfreeze branch, which
expected clippos.xy to be xy position coordinates in clipspace (as
the name suggested).

Turns out the stereoscopy shader had also run into this trap, modifying
clippos.x (introducing errors with per-pixel lighting).

This commit has been moved outside of the zfreeze PR for fast merging.
2015-01-03 09:23:09 +13:00
Jules Blok 3ed777b0f9 PixelShaderGen: Don't assign to input variables. 2014-12-28 23:37:05 +01:00
Jules Blok 7eb353b3bd VideoCommon: Don't pass structs between shaders, use the interface blocks instead. 2014-12-28 23:28:00 +01:00
degasus 01cd11a835 OGL: fix ssbo based bbox support 2014-12-22 19:10:35 +01:00
Jules Blok 0d79e8f32b VideoCommon: Don't specify the redundant in/out qualifier if GL_ARB_shading_language_420pack is supported.
Some driver developers interpreted "can" as "must" in the OpenGL specs. (I'm looking at you AMD)
2014-12-19 22:45:39 +01:00
Jules Blok cdd9e07522 VideoCommon: Add in/out qualifiers to centroid storage qualifier.
Fixes shaders for GPUs that don't support GL_ARB_shading_language_420pack.
2014-12-19 12:19:15 +01:00
Jules Blok 8dc3653ac9 VideoCommon: Don't pass structs between shader stages when geometry shaders are unsupported. 2014-12-18 00:37:16 +01:00
Jules Blok 69df23f725 VideoCommon: Only use interface blocks when geometry shaders are supported. 2014-12-18 00:37:14 +01:00
Jules Blok 782a5adb94 VideoCommon: Pass interface blocks between shader stages to resolve naming conflicts. 2014-12-18 00:36:49 +01:00
Jules Blok 275af9c5e4 VideoCommon: Assume we always use a geometry shader, not just for stereoscopy. 2014-12-15 22:47:41 +01:00
Jules Blok cec5b0ce01 ShaderGen: Remove the GS_OUTPUT struct for OpenGL.
And remove the generator for it since it is no longer used outside of the geometry shader.
2014-12-14 13:28:50 +01:00
Jules Blok fd6b588627 D3D: Define decimals in floating point numbers 2014-12-14 13:28:49 +01:00
Jules Blok b769da23d0 PixelShaderGen: Sample the correct texture slice. 2014-12-14 13:28:45 +01:00
Jules Blok 9253bb7d96 D3D: Add geometry shader stereoscopy support. 2014-12-14 13:28:41 +01:00
degasus bf65c49609 PixelShaderGen: merge OGL+D3D bbox 2014-12-09 19:32:24 +01:00
Markus Wick ff4526b4a9 Merge pull request #1657 from Tinob/master
Add HW bounding Box support to d3d backend
2014-12-08 09:05:22 +01:00
Lioncash 9bcadc8029 Common: Remove locale based functions from CommonFuncs.
Since %f isn't used anymore in the shader generators, these can go.
2014-12-05 20:55:29 -05:00
Rodolfo Bogado 93b4540e19 Add HW bounding Box support to d3d backend 2014-12-05 15:03:24 -03:00
Ryan Houdek 5c3bbf7409 Works around broken Intel Windows video drivers.
Just use regular boolean negation in our pixel shader's depth test everywhere except on Qualcomm.
This works around a bug in the Intel Windows driver where comparing a boolean value against true or false fails but boolean negation works fine.
Quite silly.

Should fix issues #7830 and #7899.
2014-12-03 00:33:42 -06:00
Jules Blok 27f3f804a0 ShaderGen: Only pass VS_OUTPUT between shaders if stereo 3D is enabled.
GLSL130 doesn't support passing structs between shaders.
This is not a problem for stereo 3D which has a GLSL150 requirement.
2014-11-23 14:27:40 +01:00
Jules Blok 9b22e15180 VideoConfigDiag: Add stereoscopy options group. 2014-11-23 14:27:38 +01:00
Jules Blok 63b37e29d1 ShaderGen: Rename "eye" to "layer".
Keeping things generic.
2014-11-23 14:26:56 +01:00
Jules Blok 176191dc16 ShaderGenCommon: Move uniforms into a common static string. 2014-11-23 14:24:09 +01:00
Jules Blok fa32f751d3 ShaderGen: Handle ShaderCode objects directly.
ShaderGeneratorInterface does not have virtual function members, so we have to implement each type explicitly.
2014-11-23 14:24:09 +01:00
Jules Blok b236c363de ShaderGen: Add a stereoscopy flag in the UID data. 2014-11-23 14:23:42 +01:00
Jules Blok d9e280e338 PixelShaderGen: Sample the correct texture layer. 2014-11-23 14:23:41 +01:00
Jules Blok f6ea293027 VertexShaderManager: Compute stereoscopy projection matrices. 2014-11-23 14:23:41 +01:00
Jules Blok 2d8ec62beb Pass VS_OUTPUT structs between shaders. 2014-11-23 14:23:41 +01:00
degasus 6670cacddc use GL_TEXTURE_2D_ARRAY for most of our textures 2014-11-23 14:22:22 +01:00
degasus c211450b99 OGL: implement bounding box support with ssbo
This implemention tries to be as accurate as the old SW implemention, but it will remove the dependcy of our vertexloader on videosw.
2014-11-17 21:20:32 +01:00
Ryan Houdek 6d4867e36a Fixes missing objects on Adreno hardware.
This particular bug from our friends over at Qualcomm manifests itself due to our alpha testing code having a conditional if statement in it.
This is a fairly recent breakage this time around, it was introduced in the v95 driver which comes with Android 5.0 on the Nexus 5.

So to break this issue down; In our alpha testing code we have two comparisons that happen and if they are true we will continue rendering, but if
they aren't true we do an early discard and return. This is summed up with a fairly simple if statement.

if (!(condition_1 <logic op> condition_2)) { /* discard and return */ }

This particular issue isn't actually due to the conditions within the if statement, but the negation of the result. This is the particular issue that
causes Qualcomm to fall flat on its face while doing so.

I've got two simple test cases that demonstrate this.
Non-working: http://hastebin.com/evugohixov.avrasm
Working: http://hastebin.com/afimesuwen.avrasm

As one can see, the disassembled output between the two shaders is different even though in reality it should have the same visual result.

I'm currently writing up a simple test program for Qualcomm to enjoy, since they will be asking for one when I tell them about the bug.
It will be tracked in our video driver failure spreadsheet along with the others.
2014-10-29 06:21:03 -05:00
comex 5c2a470b97 Fix 'sizeof' which broke in my reference-to-pointer conversion. 2014-10-25 15:02:12 -04:00
comex 8492d04dfa Use pointers instead of references in GetUidData to avoid the undefined behavior of *(T *)nullptr (ewwww) 2014-10-21 21:20:05 -04:00
skidau 3023abc1b5 Merge pull request #1285 from degasus/master
PixelShaderGen: replace multiplication with shift
2014-10-16 14:04:25 +11:00
Markus Wick 1227bd2ba6 PixelShaderGen: replace multiplication with shift
iirc both nvidia and i965 doesn't optimize this
2014-10-14 12:34:37 +02:00
skidau b4399dbdf3 Fixed the "Undeclared identifier: uv0" OpenGL shader compile error that appears in NBA2K11. 2014-09-24 00:10:45 +10:00
Ryan Houdek bc9ef95643 Support Sampler binding in the shader.
In the cases where we support the binding layout keyword, use it for more than binding UBO location.
This changes it so it is supported for samplers as well.

Instances when this is enabled is if a device supports GL_ARB_shading_language_420pack, or if it supports GLES 3.10.
2014-07-18 17:04:03 -05:00