Type punning like this is undefined behavior. Instead, we use std::memcpy to
copy the necessary data over, which is well defined (as it treats both
the source and destination as unsigned char).
Gets rid of the need to set up memcpy boilerplate to reinterpret between
floating-point and integers.
While we're at it, also do a minor bit of tidying.
This replaces usages of the non-standard __FUNCTION__ macro with the standard
mandated __func__ identifier.
__FUNCTION__ is a preprocessor definition that is provided as an
extension by compilers. This was the only convenient option to rely on
pre-C++11. However, C++11 and greater mandate the predefined identifier
__func__, which lets us accomplish the same thing.
The difference between the two, however, is that __func__ isn't a
preprocessor macro, it's an actual identifier that exists at function
scope. The C++17 draft standard (N4659) at section [dcl.fct.def.general]
paragraph 8 states:
"
The function-local predefined variable __func__ is defined as if a
definition of the form
static const char __func__[] = "function-name ";
had been provided, where function-name is an implementation-defined
string. It is unspecified whether such
a variable has an address distinct from that of any other object in the
program.
"
Thankfully, we don't do any macro or string concatenation with __FUNCTION__
that can't be modified to use __func__.
The PC offset ADRP() path takes a s32 value, but the input offset was
being tested as abs(ptr) < 0xFFFFFFFF. This caused values between
0x80000000 and 0xFFFFFFFF to incorrectly use this path, despite the
offsets not being representable in an s32.
This caused a crash in the VertexLoader on android 8.1 immediate in wind
waker (and possibly all other apps on android 8.1) as the jit and data
sections happened to be loaded 4gb apart in virtual memory, causing some
pointers to hit this
CNTVCT_EL0 is force-enabled on all linux plattforms.
Windows is untested, but as this is the best way to get *any* low
overhead performance counters, they likely use it as well.
Seems like I was wrong that ANDI2R doesn't require a temporary register here.
There is *one* case when the mask won't fit in the ARM AND instruction:
mask = 0xFFFFFFFF
But let's just use MOV instead of AND here for this case...
Wasn't masking by the size of the offset encoding so negative values were killing the instruction
Missed commiting this in my integer gatherpipe PR.
Fixes crashing on AArch64.