These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.
We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.
One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.
Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
Now that we've extracted all of the stateless functions that can be
hidden, it's time to make the index generator a regular class with
active data members.
This can just be a member that sits within the vertex manager base
class. By deglobalizing the state of the index generator we also get rid
of the wonky dual-initializing that was going on within the OpenGL
backend.
Since the renderer is always initialized before the vertex manager, we
now only call Init() once throughout the execution lifecycle.
Previously the logging was a in a little bit of a disarray. Some things
were in namespaces, and other things were not.
Given this code will feature a bit of restructuring during the
transition over to fmt, this is a good time to unify it under a single
namespace and also remove functions and types from the global namespace.
Now, all functions and types are under the Common::Log namespace. The
only outliers being, of course, the preprocessor macros.
This header doesn't actually make use of MathUtil.h within itself, so
this can be removed. Many other source files used VideoCommon.h as an
indirect include to include MathUtil.h, so these includes can also be
adjusted.
While we're at it, we can also migrate valid inclusions of VideoCommon.h
into cpp files where it can feasibly be done to minimize propagating it
via other headers.
Makes the global variable follow our convention of prefixing g_ on
global variables to make it obvious in surrounding code that it's not a
local variable.
Normalizes all variables related to statistics so that they follow our
coding style.
These are relatively low traffic areas, so this modification isn't too
noisy.
Since C++17, non-member std::size() is present in the standard library
which also operates on regular C arrays. Given that, we can just replace
usages of ArraySize with that where applicable.
In many cases, we can just change the actual C array ArraySize() was
called on into a std::array and just use its .size() member function
instead.
In some other cases, we can collapse the loops they were used in, into a
ranged-for loop, eliminating the need for en explicit bounds query.
Greatly simplifies the overall interface when it comes to compiling
shaders. Also allows getting rid of a std::string overload of the same
name. Now std::string and const char* both go through the same function.
Due to the current design, any of the GL state can be mutated after
calling this function, so we can't assume that the tracked state will
match if we call SetPipeline() after ResetAPIState().
glBlitFramebuffer() does not bypass the scissor test, which meant that
part of texture copies (e.g. XFB) could have been clipped when running
under OpenGL ES, as glCopyImageSubData() is not supported.
Coherent mappings have a lower overhead and less GL codes.
So enables coherent mapping by default for all drivers.
Both Qualcomm and ARM performs very bad with explicit flushing, so this change helps them as well.
AFAIK there was one GPU generation which was slower on coherent mapping: nvidia tesla
So Geforce 200 and 300 series should be tested with this PR before merging.
As this was last tested many years ago, this issue might have been fixed as well.
Those GPUs are close to 10 years old and not supported any more by nvidia.
Using 8-bit integer math here lead to precision loss for depth copies,
which broke various effects in games, e.g. lens flare in MK:DD.
It's unlikely the console implements this as a floating-point multiply
(fixed-point perhaps), but since we have the float round trip in our
EFB2RAM shaders anyway, it's not going to make things any worse. If we
do rewrite our shaders to use integer math completely, then it might be
worth switching this conversion back to integers.
However, the range of the values (format) should be known, or we should
expand all values out to 24-bits first.
This excludes the second argument from template deduction.
Otherwise, it is required to manually cast the second argument to
the ConfigInfo type (because implicit conversions won't work).
e.g. to set the value for a ConfigInfo<std::string> from a string
literal, you'd need a ugly `std::string("yourstring")`.
Also move it to MathUtils where it belongs with the rest of the
power-of-two functions. This gets rid of pollution of the current scope
of any translation unit with b<value> macros that aren't intended to be
used directly.
Also makes y_scale a dynamic parameter for EFB copies, as it doesn't
make sense to keep it as part of the uid, otherwise we're generating
redundant shaders.
Depending on which constructor is invoked, m_id or m_compute_program_id
can end up in an uninitialized state. We should ensure that the object
is completely initialized to something deterministic regardless of the
constructor taken.