Otherwise, texelFetch() will use an out-of-bounds layer for game textures (which have 1 layer, whereas EFB copies have 2 layers in stereoscopic 3D mode), which is undefined behavior (often resulting in a black image). The fast texture sampling path uses texture(), which always clamps the layer (see https://www.khronos.org/opengl/wiki/Array_Texture#Access_in_shaders), so it was unaffected by this difference.
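As a rough illustration of the difference, the clamping that texture() applies to an array layer can be mimicked by hand before an unclamped fetch. The sketch below is plain C++ rather than shader code, and `ClampLayer` is a hypothetical name, not something taken from this change:

```cpp
// Hypothetical helper: clamps a requested array layer into the valid range
// [0, layer_count - 1], similar in spirit to what texture() does implicitly.
// Assumes layer_count >= 1.
constexpr int ClampLayer(int layer, int layer_count)
{
  if (layer < 0)
    return 0;
  if (layer >= layer_count)
    return layer_count - 1;
  return layer;
}
```

With a 1-layer game texture, a request for layer 1 (the second stereoscopic eye) would resolve to layer 0 instead of reading out of bounds.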
Specifically, when using Manual Texture Sampling, rendering previously broke if a texture's actual size didn't match the size the game specifies. That can happen with custom textures, and also with scaled EFB copies at non-native internal resolutions. The most obvious breakage is that the texture coordinates are not scaled (so only part of the texture shows up), but the hardware wrapping functionality also assumes texture sizes are a power of 2 (or else it will behave weirdly in a way that matches how hardware behaves weirdly). The fix is to provide alternative texture wrapping logic when custom texture sizes are possible.
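To see why the power-of-2 assumption matters, compare a mask-based repeat wrap with a modulo-based one. This is only a sketch in C++ under the assumption that the hardware-style path behaves like a bit mask; the function names are hypothetical and the actual shader logic is not shown in this message:

```cpp
// Hypothetical mask-based repeat wrap. Only correct when size is a power of 2,
// because (size - 1) is then a contiguous run of low bits.
constexpr int WrapRepeatPow2(int coord, int size)
{
  return coord & (size - 1);
}

// Hypothetical size-agnostic repeat wrap. Works for any size > 0, including
// negative coordinates, at the cost of an integer remainder.
constexpr int WrapRepeatAnySize(int coord, int size)
{
  const int remainder = coord % size;
  return remainder < 0 ? remainder + size : remainder;
}
```

For a size of 8 both functions agree, but for a size of 6 (as a custom texture might have) the masked version maps coordinate 7 to 5, while the modulo version maps it to 1.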
This adjusts the NaN replacement logic introduced in #9928 to work around the HLSL compiler optimizing away calls to isnan, which caused that functionality to not work with ubershaders on D3D11 and D3D12 (it did work with specialized shaders, despite a warning being logged for both; that warning is also now gone). Note that the `D3DCOMPILE_IEEE_STRICTNESS` flag did not solve this issue, despite the warning suggesting that it might.
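One common way to detect NaN without calling an isnan intrinsic is to inspect the IEEE 754 bit pattern directly. The C++ sketch below illustrates that general technique; it is not necessarily what this change does, and the function names are hypothetical:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit IEEE 754 floats");

// A 32-bit float is NaN when its exponent bits are all ones and its mantissa
// is non-zero, i.e. its magnitude bits exceed those of infinity (0x7F800000).
constexpr bool IsNaNBits(std::uint32_t bits)
{
  return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}

// Hypothetical wrapper: reinterprets the float's bits and applies the test
// above, so no floating-point comparison is left for a compiler to fold away.
inline bool IsNaNNoIntrinsic(float value)
{
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return IsNaNBits(bits);
}
```

Because the check is purely integer arithmetic, it does not depend on how the compiler treats floating-point comparisons involving NaN.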
Suggested by @kayru and @jamiehayes.
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
Now that we've converted all of the shader generators over to using fmt,
we can drop the old Write() member function and rename WriteFmt() so
that it becomes the new Write() function.
The only changes here are that rename and the removal of a <cstdarg>
include, which the previous printf-based Write() required. No
functional changes are made at all.
These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.
We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.
One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.
Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
Migrates most of VideoCommon over to using fmt, with the exception being
the shader generator code. The shader generators are quite large and
have more corner cases to deal with in terms of conversion (shaders have
braces in them, so we need to make sure to escape them).
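In fmt format strings, a literal brace is written by doubling it ("{{" and "}}"). The sketch below shows that escaping rule as a small standalone C++ function; `EscapeBraces` is a hypothetical helper used only for illustration, since in practice the doubled braces would typically be written directly in the format string:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: doubles every '{' and '}' so that the result can be
// embedded in an fmt-style format string and still produce literal braces.
inline std::string EscapeBraces(std::string_view text)
{
  std::string escaped;
  escaped.reserve(text.size());
  for (const char c : text)
  {
    if (c == '{' || c == '}')
      escaped.push_back(c);  // emit the brace an extra time
    escaped.push_back(c);
  }
  return escaped;
}
```

For example, a shader snippet such as "struct VS_OUTPUT { float4 pos; };" would need to appear as "struct VS_OUTPUT {{ float4 pos; }};" inside a format string.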
Because of the large amount of code that would need to be converted, the
conversion of VideoCommon will be in two parts:
- This change, which converts over the general case string formatting.
- A follow-up change that will specifically deal with converting over
  the shader generators.