The value being stored must be loaded into a register. In the case of an
immediate value, this means it must be materialized. The value is
eventually byteswapped before performing the store.
This can be simplified for the value 0 for two reasons:
- ARM64 has a dedicated zero register, so zero does not need to be
materialized.
- Byteswapping zero yields zero, so we can skip this step.
We could skip byteswapping for other values by materializing the
already-byteswapped value directly in a register, but the benefits are
less clear there (if the value needs to be materialized anyway, it is
better to do it up front).
Before:
0x5280001b mov w27, #0x0 ; =0
0xb9404fba ldr w26, [x29, #0x4c]
0x12881862 mov w2, #-0x40c4 ; =-16580
0x0b020342 add w2, w26, w2
0x5ac00b61 rev w1, w27
0xb8226b81 str w1, [x28, x2]
After:
0xb9404fbb ldr w27, [x29, #0x4c]
0x12881862 mov w2, #-0x40c4 ; =-16580
0x0b020362 add w2, w27, w2
0xb8226b9f str wzr, [x28, x2]
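As a rough illustration of the reasoning above, the following C++ sketch
models the decision. The names StoreSource, ByteSwap32 and
ChooseStoreSource are hypothetical and are not taken from the JIT; the
sketch only shows why zero needs neither a MOV nor a REV.

```cpp
#include <cstdint>

// Hypothetical model of the two code paths; not the JIT's actual emitter API.
enum class StoreSource
{
  ZeroRegister,            // store WZR directly: no MOV, no REV
  MaterializedAndSwapped,  // load/materialize the value, then REV it
};

// Portable 32-bit byteswap, equivalent to what REV does on a W register.
constexpr std::uint32_t ByteSwap32(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Zero is unchanged by byteswapping, which is what makes the shortcut valid.
static_assert(ByteSwap32(0) == 0);

// Only an immediate zero qualifies; every other value takes the usual path.
constexpr StoreSource ChooseStoreSource(bool is_immediate, std::uint32_t value)
{
  return (is_immediate && value == 0) ? StoreSource::ZeroRegister :
                                        StoreSource::MaterializedAndSwapped;
}
```

A zero held in a register whose value is not known at compile time still
takes the ordinary path in this sketch, since the shortcut depends on
knowing the immediate up front.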
Unlike on x64, inverting EQ or GT in SetCRFieldBit saves us one
instruction. Also unlike on x64, inverting SO or LT in GetCRFieldBit
requires an extra instruction (just like in SetCRFieldBit). Due to this,
replacing an invert in GetCRFieldBit with an invert in SetCRFieldBit
when possible is either equally good or better - never worse.
The game calls GXSetDrawDone and then switches the GP fifo without first
waiting for the draw done interrupt to arrive. Before
e96960e2a6, Dolphin would not execute the
draw done command and potentially also skip other commands in the old GP
fifo. Since that commit, Dolphin executes the remaining commands on the
old GP fifo just before disabling reads for switching. However, because
PixelEngineManager::RaiseEvent() enforces a minimum delay of 500 cycles
for the draw done interrupt, the interrupt arrives after the game has
switched to the new GP fifo, which seems to trigger the deadlock.
This patch replaces the call to GXSetDrawDone with a call to GXDrawDone,
which does the same but also waits for the interrupt.
A MUL followed by a SUB can be combined into a single MSUB instruction.
Before:
0x1b1a7c01 mul w1, w0, w26
0x4b010318 sub w24, w24, w1
After:
0x1b1ae018 msub w24, w0, w26, w24
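The equivalence can be checked with a small C++ model of the two
sequences. The function names MulThenSub and Msub are made up for this
sketch; the arithmetic follows the ARM64 definition of MSUB, which
computes Wa - Wn * Wm with 32-bit wraparound.

```cpp
#include <cstdint>

// Two-instruction form: the product goes through a scratch register first.
constexpr std::uint32_t MulThenSub(std::uint32_t acc, std::uint32_t a, std::uint32_t b)
{
  const std::uint32_t product = a * b;  // mul w1, w0, w26
  return acc - product;                 // sub w24, w24, w1
}

// Fused form: msub w24, w0, w26, w24 computes w24 - w0 * w26 directly.
constexpr std::uint32_t Msub(std::uint32_t acc, std::uint32_t a, std::uint32_t b)
{
  return acc - a * b;
}

// Unsigned arithmetic wraps modulo 2^32 in both forms, so the results agree
// even when the product overflows.
static_assert(MulThenSub(5u, 0xFFFFFFFFu, 2u) == Msub(5u, 0xFFFFFFFFu, 2u));
```

Besides saving an instruction, the fused form no longer needs the scratch
register (w1 in the listing above).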
Removes the explicit EFBAccessEnable=false game overrides for:
- GT6 (Terminator 3: The Redemption)
- GXB (SSX3) [deleted - no other configuration]
- RTH (Tony Hawk's Downhill Jam)
- SNC (SONIC COLOURS) [deleted - no other configuration]