These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.
We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.
One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.
Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
When looking into the Faron Woods fifolog, I noticed this code was quite
high in the profile (~10%). Clearing 4096 entries from the vector isn't
needed every draw, we only need to do this when the cache was actually
valid in the first place.
Should provide a slight general performance boost.
This was causing the depth buffer to be discarded as well, which
has an effect on mobiles (doesn't get loaded into tile memory).
If we find this is hindering performance (remember, the EFB is
only a 640x528 texture), it may be worth changing the interface to
support discarding only the colour buffer.