Using 8-bit integer math here led to precision loss for depth copies,
which broke various effects in games, e.g. lens flare in MK:DD.
It's unlikely the console implements this as a floating-point multiply
(fixed-point perhaps), but since we have the float round trip in our
EFB2RAM shaders anyway, it's not going to make things any worse. If we
do rewrite our shaders to use integer math completely, then it might be
worth switching this conversion back to integers.
However, the range of the values (i.e. the format) would need to be
known, or all values would have to be expanded out to 24 bits first.
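As a hypothetical illustration (not Dolphin's actual conversion code) of
why the intermediate width matters: a 24-bit depth value cannot survive
an 8-bit intermediate, whereas splitting the full value is lossless.

  #include <cassert>
  #include <cstdint>

  int main()
  {
    const uint32_t a = 0x120000;
    const uint32_t b = 0x12FFFF;  // 65535 steps away from a

    // 8-bit intermediate: both depths collapse to the same value.
    assert(static_cast<uint8_t>(a >> 16) == static_cast<uint8_t>(b >> 16));

    // Splitting the full 24-bit value into R8/G8/B8 is lossless.
    const uint8_t r = a >> 16;
    const uint8_t g = (a >> 8) & 0xFF;
    const uint8_t bl = a & 0xFF;
    assert(((static_cast<uint32_t>(r) << 16) | (g << 8) | bl) == a);
    return 0;
  }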
Also makes y_scale a dynamic parameter for EFB copies, as it doesn't
make sense to keep it as part of the uid; otherwise we'd generate
redundant shaders.
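A rough sketch of the idea (field names are illustrative, not the real
structs): anything baked into the shader uid multiplies the number of
generated shaders, so a per-copy value like y_scale is better supplied
at runtime.

  // Illustrative only; the real uid/params structures differ.
  struct EFBCopyShaderUid
  {
    // Only properties that change the generated shader code belong here.
    unsigned copy_format;
    bool is_depth_copy;
    bool is_intensity;
    // No y_scale: including it would generate one shader per scale value.
  };

  struct EFBCopyParams
  {
    // Supplied per copy, e.g. through a uniform / constant buffer.
    float y_scale;
    float gamma;
  };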
HLSL does not define roundEven(), only round(). This means that the
output may differ slightly for OpenGL vs Direct3D. However, it ensures
consistency across OpenGL drivers, as round() in GLSL can go either way.
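The difference is only in how ties are broken; a small C++ analogy
(standing in for the GLSL/HLSL built-ins):

  #include <cmath>
  #include <cstdio>

  int main()
  {
    // round(): halves go away from zero. roundEven()-style rounding
    // (the default FP mode used by nearbyint()) breaks ties toward even.
    std::printf("round(2.5)     = %g\n", std::round(2.5));      // 3
    std::printf("nearbyint(2.5) = %g\n", std::nearbyint(2.5));  // 2
    std::printf("round(3.5)     = %g\n", std::round(3.5));      // 4
    std::printf("nearbyint(3.5) = %g\n", std::nearbyint(3.5));  // 4
    return 0;
  }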
This was breaking Metroid Prime 2: Echoes’s scanner in some rooms, such
as the one in https://fifoci.dolphin-emu.org/dff/mp2-scanner/
It was found on Ivy Bridge with Mesa that the alpha value read back from
the EFB was off by 4 for multiple objects. This was a conversion error:
int4() is equivalent to floor(), and the floating-point value wasn’t
always above the intended integer.
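A minimal C++ illustration of that failure mode (not the shader code
itself): truncating a value that landed just below the intended integer
after float math gives the wrong channel value, while rounding recovers
it.

  #include <cmath>
  #include <cstdio>

  int main()
  {
    const float channel = 127.9999f;  // "almost 128" after a float round trip
    const int truncated = static_cast<int>(channel);            // 127 -- the bug
    const int rounded = static_cast<int>(std::round(channel));  // 128
    std::printf("truncated=%d rounded=%d\n", truncated, rounded);
    return 0;
  }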
Improve bookkeeping around formats. Hopefully make code less confusing.
- Rename TlutFormat -> TLUTFormat to follow conventions.
- Use enum classes to prevent using a Texture format where an EFB Copy format
is expected, or vice versa (see the sketch after this list).
- Use common EFBCopyFormat names regardless of depth and YUV configurations.
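A sketch of what the enum-class separation buys (the enumerator lists
here are abbreviated placeholders, not the real ones):

  enum class TextureFormat { I4, I8, RGB565, RGBA8 /* ... */ };
  enum class EFBCopyFormat { R4, RA8, RGB565, RGBA8 /* ... */ };

  void EncodeEFBCopy(EFBCopyFormat) {}  // hypothetical stand-in for the real encoder

  void Example()
  {
    EncodeEFBCopy(EFBCopyFormat::RGBA8);    // fine
    // EncodeEFBCopy(TextureFormat::RGBA8); // compile error: no implicit conversion
    // EncodeEFBCopy(4);                    // compile error: plain integers rejected too
  }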
Currently, we use the alpha channel from the EFB even if the current
format does not include an alpha channel. Now, the alpha channel is set
to 1 if the format has no alpha channel, and the color channels are
truncated to 5/6 bits where the format calls for it. This matches the
EFB-to-texture behavior.
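Roughly what that means for an RGB565 EFB, as a hypothetical CPU-side
helper (the real work happens in the EFB2RAM shaders, and whether the
low bits are re-expanded afterwards is omitted here):

  #include <cstdint>

  struct RGBA8 { uint8_t r, g, b, a; };

  RGBA8 ApplyRGB565Truncation(RGBA8 in)
  {
    RGBA8 out;
    out.r = static_cast<uint8_t>(in.r & 0xF8);  // keep 5 bits of red
    out.g = static_cast<uint8_t>(in.g & 0xFC);  // keep 6 bits of green
    out.b = static_cast<uint8_t>(in.b & 0xF8);  // keep 5 bits of blue
    out.a = 0xFF;                               // RGB565 stores no alpha
    return out;
  }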
Added a few duplicated depth copy texture formats to the enum
in TextureDecoder.h. These texture formats were already implemented
in TextureCacheBase and the ogl/dx11 texture cache implementations.
On D3D, we read from the depth buffer using the format
DXGI_FORMAT_R24_UNORM_X8_TYPELESS (essentially, the "r" component contains
the depth, and the other components contain nothing).
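For reference, creating such a read view in D3D11 looks roughly like
this (variable names are made up; the actual texture cache code is
structured differently and includes error handling):

  #include <d3d11.h>

  ID3D11ShaderResourceView* CreateDepthReadView(ID3D11Device* device,
                                                ID3D11Texture2D* depth_texture)
  {
    // depth_texture is assumed to be DXGI_FORMAT_R24G8_TYPELESS, with a
    // DXGI_FORMAT_D24_UNORM_S8_UINT view used for depth-stencil writes.
    D3D11_SHADER_RESOURCE_VIEW_DESC desc = {};
    desc.Format = DXGI_FORMAT_R24_UNORM_X8_TYPELESS;  // depth in .r, X8 bits unread
    desc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
    desc.Texture2D.MostDetailedMip = 0;
    desc.Texture2D.MipLevels = 1;

    ID3D11ShaderResourceView* srv = nullptr;
    device->CreateShaderResourceView(depth_texture, &desc, &srv);
    return srv;
  }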
Well, that's not strictly true, but trying to memcpy between two buffers
using different row lengths and different strides is at minimum extremely
unintuitive.
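The straightforward alternative is a row-by-row copy; a sketch
(hypothetical helper, not the code being discussed):

  #include <cstddef>
  #include <cstdint>
  #include <cstring>

  // Copy `rows` rows of `row_bytes` bytes each between buffers whose rows
  // use different strides. A single memcpy would only be valid if both
  // strides matched the row length exactly.
  void CopyRows(uint8_t* dst, std::size_t dst_stride, const uint8_t* src,
                std::size_t src_stride, std::size_t row_bytes, std::size_t rows)
  {
    for (std::size_t y = 0; y < rows; ++y)
      std::memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
  }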