Commit Graph

64 Commits

Author SHA1 Message Date
TellowKrinkle 6bd0fc86ba VideoCommon: Properly mask fbfetch logic op emulation 2022-07-13 02:27:45 -05:00
JMC47 cce6133ef6
Merge pull request #10749 from tellowkrinkle/IntelUbershaders
VideoCommon: Fix ubershaders on MoltenVK Intel
2022-07-10 19:35:55 -04:00
JMC47 fac66897af
Merge pull request #10819 from Dentomologist/fix_shader_compilation_warnings
VideoCommon: Fix D3D shader compilation warnings
2022-07-08 18:46:29 -04:00
Dentomologist 71541c1324 VideoCommon: Fix D3D shader warning X4000 (uninitialized variables)
Initialize alpha_A and alpha_B. They were previously only initialized in
cases where they were used, but D3D isn't able to figure that out.
2022-07-08 00:25:14 -07:00
iwubcode cad1d6ce90 VideoCommon: fix support of stereoscopic rendering after moving d3d to SPIRV generation 2022-06-24 18:09:53 -05:00
iwubcode 993fa3bf94 VideoCommon: update UberShaderPixel to properly support logic ops, matching the specialized shader 2022-06-24 18:09:53 -05:00
iwubcode 5dd2704416 D3D / VideoCommon: generate HLSL from SPIRV 2022-06-24 18:09:53 -05:00
TellowKrinkle c7892d7371 VideoCommon: Name ubershaders 2022-06-16 02:08:45 -05:00
TellowKrinkle 25929789c1 VideoCommon: Don't pass State by inout
Spirv-cross's MSL codegen makes the amazing choice of compiling calls to inout functions as `State temp = s; call_function(temp); s = temp`.  Not all Metal backends handle this mess well.  In particular, it causes register spills on Intel, losing about 5% in performance.
2022-06-14 00:48:47 -05:00
OatmealDome 259a5fc7c0 DriverDetails: Add broken discard with early-Z bug on Apple Silicon GPUs 2022-04-20 14:56:34 -04:00
OatmealDome 80dfefb32e UberShaderPixel: Add support for non-dual source shader blending 2022-04-19 10:55:26 -04:00
JosJuice abffa93a72 MoltenVK: Fix pixel shader typo 2022-04-10 20:51:20 +02:00
JosJuice bbb64ff993 Shadergen: Use real_ocol0 workaround for shader logic ops
Previously we were using this workaround when using framebuffer fetch
to emulate dual source blending, but it seems like we also need to use
it when using framebuffer fetch to emulate logic ops, otherwise some
Adreno devices get a crash when compiling OpenGL ES ubershaders.

Using the workaround in specialized shaders doesn't seem to be
necessary, but I've made the same change there for consistency.

This gets us closer to fixing https://bugs.dolphin-emu.org/issues/12791
but doesn't actually fix it.
2022-02-28 18:32:19 +01:00
Pokechu22 444f6fd0cb Treat alpha as 0 if alpha is 1 for blending
This removes the white box in fortune street again, without causing Mario Kart Wii to regress.
2022-02-08 15:15:15 -08:00
Pokechu22 0327e6acb4 Use the same logic for lerp bias for color and alpha
It doesn't make sense for alpha to add the bias ONLY when dividing by 2, while color doesn't apply the bias for divide by 2 only; hardware testing indicates that alpha should have the bias.

This fixes the menus in Mario Kart Wii (https://bugs.dolphin-emu.org/issues/11909) but reintroduces the white rectangle in Fortune Street.

This reverts commit 5aaa5141ed (and several other matching changes elsewhere).
2022-02-08 15:15:15 -08:00
Pokechu22 cc9ed4815d UberShaderPixel: Fix typo in fog calculation 2022-01-26 20:23:35 -08:00
JMC47 b1f79d9ecf
Merge pull request #10215 from OatmealDome/shader-logic-ops
VideoCommon: Support shader logic ops on Metal (Apple GPUs) and OpenGL ES
2021-12-22 16:39:54 -05:00
Pokechu22 f53dc6564f UberShaderPixel: Convert to EnumMap 2021-12-18 12:51:55 -08:00
OatmealDome 74a979db09 UberShaderPixel: Add shader logic ops support on OpenGL ES 2021-12-06 22:36:40 -05:00
OatmealDome a77ae14d94 UberShaderPixel: Add shader logic ops support on Metal 2021-12-06 22:36:40 -05:00
Pokechu22 ddf2691395 VideoCommon: Manually handle texture wrapping and sampling 2021-11-17 20:04:34 -08:00
Pokechu22 9ef228503a VideoCommon: Provide raw texdims to shaders 2021-11-17 20:04:34 -08:00
Pokechu22 555a93057c VideoCommon: Allow BitfieldExtract in specialized shaders 2021-11-17 20:04:33 -08:00
Pokechu22 a372a5947b VideoCommon: Fix color channel logic when per-pixel lighting is in use
This was broken in #10012 (specifically by 06579e4d53 and c3dec34391).
2021-10-13 20:43:32 -07:00
Pokechu22 3b752c4d5d UberShaderPixel: Rename ApiType to api_type 2021-08-01 15:09:20 -07:00
Pokechu22 2feced2e33 Fix indirect textures when format is not ITF_8 2021-07-08 15:48:14 -07:00
Pierre Bourdon e149ad4f0a
treewide: convert GPLv2+ license info to SPDX tags
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
2021-07-05 04:35:56 +02:00
Pokechu22 2f1726e3f3 UberShaderPixel: always set tevcoord, even if the stage has no texture
This fixes NES game graphics when UberShaders are in use.
2021-06-21 13:01:25 -07:00
Pokechu22 5928182a4c Skip indirect operation for out of bounds indirect stages
This fixes rendering issues in Viewtiful Joe (https://bugs.dolphin-emu.org/issues/12525), but it is not entirely hardware accurate, as hardware testing showed other, more complex behavior in this case.  However, it should be good enough for our purposes.
2021-05-27 22:13:42 -07:00
Pokechu22 e1d45e9ba6 UberShaderPixel: always run indirect stage logic
Hardware testing has confirmed that fb_addprev and wrapping both run even when the indirect stage is disabled.
2021-05-07 16:37:47 -07:00
Pokechu22 5e3360c2cc UberShaderPixel: Fix OOB tex coord indices
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.
2021-05-07 16:37:47 -07:00
Pokechu22 ed02034967 UberShaderPixel: Return fixed-point values from selectTexCoord
This change should have no behavioral differences itself, but allows for changing the behavior of out of bounds tex coord indices more easily in the next commit.  Without this change, returning tex0 for out of bounds cases and then applying the fixed-point logic would use the wrong tex dimension info (tex0 with I_TEXDIMS[1] or such), which is inaccurate.
2021-05-07 16:37:10 -07:00
Pokechu22 002ff4e4dd PixelShaderGen: Remove unused num_texgens argument
It became unused in f039149198.
2021-05-07 16:28:08 -07:00
Pokechu22 c3668e179c Split TevStageIndirect::mid into matrix_index and matrix_id 2021-05-07 16:27:52 -07:00
Pokechu22 0f7c9ef767 Change BitfieldExtract to use a pointer to the bitfield member 2021-05-07 15:11:17 -07:00
Pokechu22 70f9fc4e75 Convert BPMemory to BitField and enum class
Additional changes:
- For TevStageCombiner's ColorCombiner and AlphaCombiner, op/comparison and scale/compare_mode have been split as there are different meanings and enums if bias is set to compare.  (Shift has also been renamed to scale)
- In TexMode0, min_filter has been split into min_mip and min_filter.
- In TexImage1, image_type is now cache_manually_managed.
- The unused bit in GenMode is now exposed.
- LPSize's lineaspect is now named adjust_for_aspect_ratio.
2021-03-06 19:27:19 -08:00
Lioncash a5b28f1f07 ShaderGenCommon: Rename WriteFmt() to Write()
Now that we've converted all of the shader generators over to using fmt,
we can drop the old Write() member function and perform a rename
operation on the WriteFmt() to turn it into the new Write() function.

All changes within this are the removal of a <cstdarg> header, since the
previous printf-based Write() required it, and renaming. No functional
changes are made at all.
2020-11-09 02:31:49 -05:00
Lioncash dc72edf0e2 UberShaderPixel: Migrate over to fmt
Completes the migration over to using the fmt-formatting WriteFmt
function. The next PR will rename all usages of WriteFmt, while
simultaneously getting rid of the old printf code.
2020-10-27 13:25:11 -04:00
Lioncash 86f8768268 VideoCommon/ShaderGenCommon: Make template functions regular functions
These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.

We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.

One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.

Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
2020-05-25 21:12:29 -04:00
Lioncash 149a97e396 VideoCommon: Remove unnecessary memset on ShaderUid instances.
Zero-initialization zeroes out all members and padding bits, so this is
safe to do. While we're at it, also add static assertions that enforce
the necessary requirements of a UID type explicitly within the ShaderUid
class.

This way, we can remove several memset calls around the shader
generation code that makes sure the underlying UID data is zeroed out.
Now our ShaderUid class enforces this for us, so we don't need to care about
it at the usage sites.
2019-05-30 06:41:54 -04:00
Stenzek 16294acd2a VideoBackends: Scale bounding box rectangle in the pixel shader 2019-03-25 18:47:58 +10:00
Stenzek f039149198 Move most backend functionality to VideoCommon 2019-02-19 16:57:54 +10:00
Stenzek 1d61041985 ShaderGen: Don't use interface blocks on Vulkan without GS
Doing so causes the Adreno driver to choke and spew errors about
too many output locations/components, when clearly we're under
the limit.
2019-01-24 17:02:17 +10:00
Stenzek 68cb24172b ShaderGen: Omit some unused varyings when possible
Removes the clipPos varying unless slow-depth is used, and the
clipDistance varyings if geometry shaders are not used.
2019-01-23 18:34:22 +10:00
Stenzek 0c0d66809d PixelShaderGen: Split bbox into seperate variables
The Metal shader compiler fails to compile the atomic instructions
when operating on individual components of a vector. Spltting it
into four variables shouldn't make any difference for other
platforms, as they are accessed independently.
2018-11-07 05:41:09 -08:00
Tillmann Karras 56fdcf5f00 VideoCommon: remove unnecessary floor()
floatindex is clamped to the range [0, 9]. For non-negative numbers
floor() is equivalent to trunc(). Truncation happens implicitly when
converting to uint, so the floor() is unnecessary.
2018-10-09 00:31:43 +01:00
Pierre Bourdon 95c2a92f26
Revert "ShaderGen: Drop broken fragment shader index workaround for Vulkan" 2018-09-01 05:32:56 +02:00
Stenzek 3ad7812b53 ShaderGen: Drop broken fragment shader index workaround for Vulkan
AMD appears to have since fixed this in their driver, and it makes
shadergen ever so slightly less messy.
2018-08-28 23:39:47 +10:00
Stenzek 57976c947b ShaderGen: Don't emit integer outputs when logic op is unsupported
This may have been causing issues for D3D10 hardware, where logic op was
not supported.
2018-05-26 00:09:29 +10:00
Stenzek 9a5c2119e5 ShaderCache: Remove unused UID bits before inserting into shader map 2018-05-26 00:09:10 +10:00