And associated cleanup.
On most compilers these days, it'll either inline the memset with vector
fills or rep stosq, or outline with a call to memset.
I trust the compiler to make a better decision here than manual SSE
intrinsics would.
Ends up a couple of percent faster in FMV decoding.
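
A minimal sketch of the idea, not the actual code that was changed: `Block` and `clearBlock` are hypothetical names, and the buffer size is made up. The point is only that the fill is handed to `std::memset` and the compiler picks the expansion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 16-byte-aligned buffer, standing in for whatever was
// previously cleared with hand-written SSE stores.
struct alignas(16) Block
{
    std::uint8_t data[256];
};

// Zero the block with plain memset. The compiler is free to inline this
// as vector stores or rep stos, or to emit a call to memset.
inline void clearBlock(Block* block)
{
    assert(block != nullptr);
    std::memset(block->data, 0, sizeof(block->data));
}
```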
This was only necessary on 32-bit because the sign bit was abused for
representing handlers. Since we're 64-bit only, we use bit 63, which
won't clash with the guest's 32-bit virtual address.
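
Roughly, the tagging scheme could look like the sketch below. All names here (`kHandlerBit`, `makeHandlerEntry`, and so on) are assumptions for illustration, not the identifiers used in the tree; only the use of bit 63 comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; bit 63 marks an entry as a handler index.
constexpr std::uint64_t kHandlerBit = std::uint64_t{1} << 63;

// Tag a handler index so it can share a slot with ordinary entries.
inline std::uint64_t makeHandlerEntry(std::uint64_t handlerIndex)
{
    assert((handlerIndex & kHandlerBit) == 0); // index must not already use bit 63
    return handlerIndex | kHandlerBit;
}

constexpr bool isHandlerEntry(std::uint64_t entry)
{
    return (entry & kHandlerBit) != 0;
}

constexpr std::uint64_t handlerIndexOf(std::uint64_t entry)
{
    return entry & ~kHandlerBit;
}

// A zero-extended 32-bit guest address can never have bit 63 set.
static_assert(!isHandlerEntry(0xFFFFFFFFull), "guest addresses stay below bit 63");
```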
Guest memory is now mapped into a shared memory/file mapping, for use
with fastmem.
64-bit and 128-bit arguments are passed by register/value instead of by
reference/address.
LDL/LDR/SDL/SDR now use 64-bit GPRs instead of SSE.
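
For reference, the merge that LDL/LDR perform can be written with plain 64-bit integer operations. This is a sketch of the little-endian MIPS semantics as I understand them, not the code the recompiler emits; `mergeLDL` and `mergeLDR` are hypothetical helper names, `mem` is the aligned doubleword and `shift` is the low three bits of the effective address.

```cpp
#include <cassert>
#include <cstdint>

// LDL: keep the low bytes of rt, fill the high bytes from memory.
inline std::uint64_t mergeLDL(std::uint64_t rt, std::uint64_t mem, unsigned shift)
{
    assert(shift < 8);
    const unsigned sh = 56 - 8 * shift;
    const std::uint64_t keep = (std::uint64_t{1} << sh) - 1; // 0 when sh == 0
    return (rt & keep) | (mem << sh);
}

// LDR: keep the high bytes of rt, fill the low bytes from memory.
inline std::uint64_t mergeLDR(std::uint64_t rt, std::uint64_t mem, unsigned shift)
{
    assert(shift < 8);
    const unsigned sh = 8 * shift;
    // Avoid shifting by 64, which is undefined, when sh == 0.
    const std::uint64_t keep = sh ? (~std::uint64_t{0} << (64 - sh)) : 0;
    return (rt & keep) | (mem >> sh);
}
```

An unaligned 8-byte load at address A is then LDR at A followed by LDL at A + 7, each reading its own aligned doubleword.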
These have no meaning in x64 (apart from throwing compiler warnings),
and we don't do 32-bit anymore. Also saves needing to include
`Pcsx2Defs.h` in files which don't otherwise need it.
Another small piece of #3451
Moves all VTLB pointer manipulation into dedicated classes, which should make the algorithm much easier to change in the future: only the classes and the recVTLB.cpp assembly need updating, since the emitted assembly obviously can't use the classes.
Also, some of the functions that manipulated the VTLB used POINTER_SIGN_BIT (which is 1 << 63 on 64-bit) while others used a sign-extended 0x80000000. Now they all use the same value (POINTER_SIGN_BIT).
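
To illustrate why mixing the two constants matters on 64-bit, here is a small sketch. The constant and function names are hypothetical, and how the old code actually combined the values is not shown here; this only demonstrates that the two masks differ once pointers are 64 bits wide.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two constants discussed above.
constexpr std::uint64_t kPointerSignBit = std::uint64_t{1} << 63;
// 0x80000000 sign-extended from 32 to 64 bits.
constexpr std::uint64_t kSignExtended32 =
    static_cast<std::uint64_t>(std::int64_t{INT32_MIN});

// Remove a tag by clearing every bit of the given mask.
constexpr std::uint64_t stripTag(std::uint64_t value, std::uint64_t mask)
{
    return value & ~mask;
}

static_assert(kSignExtended32 == 0xFFFFFFFF80000000ull, "bits 31..63 are all set");
static_assert(kSignExtended32 != kPointerSignBit, "the two constants differ on 64-bit");

// Tagging with one constant and untagging with the other leaves stray bits.
static_assert(stripTag(0x1000 | kSignExtended32, kPointerSignBit) != 0x1000,
              "mixed constants do not round-trip");
static_assert(stripTag(0x1000 | kPointerSignBit, kPointerSignBit) == 0x1000,
              "a single constant round-trips");
```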
Note: recVTLB.cpp was updated to keep it compiling, but the rest of the x86-64 compatibility changes were left out.
Also, Cache.cpp seems to assume that a VTLB entry holds both members of the union at the same time, which is impossible. Does anyone know how this actually worked (and whether this patch breaks it), or if it never worked properly in the first place?
Allocate memory in an x86-64-compatible way
Another part of #3451
Note: While this shouldn't change how anything works, it has been the #1 source of 32-bit build breakage in #3451 (it caused win32 to fail to allocate memory, and linux-32 to fail afterward), so we should definitely make sure it gets tested.
See #3523 for more information.
Add a GoemonUnloadTlb function that invalidates the TLB cache.
Currently the function is only used by the interpreter. It fixes a TLB error after data is reloaded.
Next step: port it to the recompiler.
When a TLB miss is detected, the current instruction must be skipped. We need
to immediately switch to the handler.
Typical instruction bug case:
lw a0, 0x8(a0)
a0 mustn't be loaded if we have a miss
v2: create a dedicated exception for tlb miss
v3:
* rename exception to CancelInstruction
* add a basic state machine to the exec loop so we keep the same behavior
for eeloadReplaceOSDSYS and eeGameStarting
v4: remove assert
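
The mechanism can be sketched with a toy interpreter. Everything here except the `CancelInstruction` name is hypothetical (`MiniCpu`, `tlbMissPending`, `handlerPc`, the fake memory value); it only shows how throwing from the memory read keeps the destination register untouched on a miss.

```cpp
#include <cassert>
#include <cstdint>

// Thrown to abandon the instruction currently being executed.
struct CancelInstruction {};

// Hypothetical miniature CPU, just enough to show the control flow.
struct MiniCpu
{
    std::uint32_t gpr[32] = {};
    std::uint32_t pc = 0;
    bool tlbMissPending = false;           // forces a miss on the next read
    std::uint32_t handlerPc = 0x80000000u; // made-up handler address

    std::uint32_t read32(std::uint32_t /*addr*/)
    {
        if (tlbMissPending)
        {
            pc = handlerPc;                // switch to the handler immediately
            throw CancelInstruction{};
        }
        return 0x12345678u;                // stand-in for real memory
    }

    // lw rt, imm(rs)
    void lw(unsigned rt, unsigned rs, std::int32_t imm)
    {
        assert(rt < 32 && rs < 32);
        const std::uint32_t value = read32(gpr[rs] + static_cast<std::uint32_t>(imm));
        gpr[rt] = value;                   // never reached on a miss
    }

    void step(unsigned rt, unsigned rs, std::int32_t imm)
    {
        try
        {
            lw(rt, rs, imm);
            pc += 4;
        }
        catch (const CancelInstruction&)
        {
            // pc already points at the handler; rt was not written.
        }
    }
};
```

With `lw a0, 0x8(a0)` and a pending miss, `a0` keeps its old value and execution resumes at the handler.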
VTLB does some nonsense with signed integers for the pointers.
We've got to make sure to set the sign bit at the correct position in 64-bit pointers so it works.