A number of games make an EFB copy in I4/I8 format, then use it as a
texture in C4/C8 format. Detect when this happens, and decode the copy on
the GPU using the specified palette.
This has a few advantages: it allows using EFB2Tex for a few more games,
it, it preserves the resolution of scaled EFB copies, and it's probably a
bit faster.
D3D only at the moment, but porting to OpenGL should be straightforward..
This is the same trick which is used for Metroid's fonts/texts, but for all textures. If 2 different textures at the same address are loaded during the same frame, create a 2nd entry instead of overwriting the existing one. If the entry was overwritten in this case, there wouldn't be any caching, which results in a big performance drop.
The restriction to textures, which are loaded during the same frame, prevents creating lots of textures when textures are used in the regular way. This restriction is new. Overwriting textures, instead of creating new ones is faster, if the old ones are unlikely to be used again.
Since this would break efb copies, don't do it for efb copies.
Castlevania 3 goes from 80 fps to 115 fps for me.
There might be games that need a higher texture cache accuracy with this, but those games should also see a performance boost from this PR.
Some games, which use paletted textures, which are not efb copies, might be faster now. And also not require a higher texture cache accuracy anymore. (similar sitation as PR https://github.com/dolphin-emu/dolphin/pull/1916)
This changes the behavior if both texture are available. The old code did
try to load the modfied texID, the new code tries the unmodified texID first.
Don't change the texID depending on the tlut_hash for paletted textures that are efb copies and don't have an entry in the cache for texID ^ tlut_hash. This makes those textures less broken when using efb to texture.
This is not really fixing those textures, but it's a step forward. The mini map in Twilight Princess for example is in grayscales with this and is more or less usable.
We'll loose data on invalidating them. So just keep them until a new copy is done.
A wrong scaled copy is better than no copy if the game doesn't creates a new one.
This reverts an optimization which isn't worth imo. Every texture uploads have to alloc vram and a staging buffer, so there is no need to do both in the same call.
The D3D / OGL backends only ever used RGBA textures, and the Software
backend uses its own custom code for sampling. The ARGB path seems to
just be dead code.
Since ARGB and RGBA formats are similar, I don't think this will make
the code more difficult to read or unable to be used as
reference. Somebody who wants to use this code to output ARGB can simply
modify the MakeRGBA function to put the shift at the other end.