Jit64: fselx - Skip MOVAPS + MOVSD (SSE4.1)

For the non-packed variant of this instruction, a MOVSD instruction was
generated to copy only the lower 64 bits of XMM1 to the destination
register. This was done in order to keep the destination register's
upper half intact.

However, when register c and the destination register are the same,
there is no need for this copy. The mask in XMM0 is zeroed by XORPD and
CMPNLESD only writes its lower 64 bits, so the upper half of the mask
stays zero and BLENDVPD leaves the upper half of the destination
register untouched, as intended.

Since the blend can then operate on the destination register directly,
the MOVAPS that copies Rc into XMM1 can be skipped as well.
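
The reasoning above can be checked with a small software model. This is
only a sketch, not Jit64 code: the Xmm type and the helpers make_mask,
blendvpd and check_upper_half_preserved are hypothetical names, and
a_low stands for the lower double of the operand that CMPNLESD compares
against zero.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical two-lane model of an XMM register (raw 64-bit lanes).
// [0] is the lower 64 bits, [1] is the upper 64 bits.
using Xmm = std::array<std::uint64_t, 2>;

// Models XORPD xmm0,xmm0 followed by CMPNLESD xmm0,<operand>.
// Only lane 0 is compared; lane 1 keeps the zero written by XORPD.
Xmm make_mask(double a_low)
{
  Xmm mask{0, 0};
  if (!(0.0 <= a_low))  // NLE predicate: true for negative values and NaN
    mask[0] = ~std::uint64_t{0};
  return mask;
}

// Models BLENDVPD dst,src with the implicit XMM0 mask:
// each lane takes src only when the sign bit of the mask lane is set.
Xmm blendvpd(Xmm dst, const Xmm& src, const Xmm& mask)
{
  for (int lane = 0; lane < 2; ++lane)
  {
    if (mask[lane] >> 63)
      dst[lane] = src[lane];
  }
  return dst;
}

// With d == c, blending straight into the destination never touches
// its upper lane, whichever way the lower lane is selected.
void check_upper_half_preserved()
{
  const Xmm d{0x1111, 0x2222};  // destination, which also holds c
  const Xmm b{0x3333, 0x4444};

  const Xmm taken_b = blendvpd(d, b, make_mask(-1.0));
  assert(taken_b[0] == 0x3333 && taken_b[1] == 0x2222);

  const Xmm kept_c = blendvpd(d, b, make_mask(1.0));
  assert(kept_c[0] == 0x1111 && kept_c[1] == 0x2222);
}
```

In this model the upper lane of the mask is always zero, so the upper
lane of the result always comes from the destination. That is the
property the removed MOVSD was there to guarantee.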

Before:
66 0F 57 C0          xorpd       xmm0,xmm0
F2 41 0F C2 C6 06    cmpnlesd    xmm0,xmm14
41 0F 28 CE          movaps      xmm1,xmm14
66 41 0F 38 15 CA    blendvpd    xmm1,xmm10,xmm0
F2 44 0F 10 F1       movsd       xmm14,xmm1

After:
66 0F 57 C0          xorpd       xmm0,xmm0
F2 41 0F C2 C6 06    cmpnlesd    xmm0,xmm14
66 45 0F 38 15 F2    blendvpd    xmm14,xmm10,xmm0
Sintendo 2020-10-03 17:34:18 +02:00
parent 9ac324aed3
commit 3499cedde4
1 changed files with 1 additions and 1 deletions

@@ -459,7 +459,7 @@ void Jit64::fselx(UGeckoInstruction inst)
   }
   else if (cpu_info.bSSE4_1)
   {
-    if (packed && d == c)
+    if (d == c)
     {
       BLENDVPD(Rd, Rb);
       return;