There's no need to load the 64-bit immediate into a temporary register.
x64 will sign-extend 32-bit immediates to 64 bits, giving us the exact
value we need in this case.
Before:
48 C7 C0 00 00 FF FF mov rax,0FFFFFFFFFFFF0000h
48 21 C2 and rdx,rax
After:
48 81 E2 00 00 FF FF and rdx,0FFFFFFFFFFFF0000h
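The condition for when this applies can be sketched in a few lines. This
is only an illustration: the helper name FitsInSignExtendedImm32 is made
up here and is not taken from the emitter. A 64-bit constant can be used
directly as an immediate if it is unchanged after being truncated to 32
bits and sign-extended again.

```cpp
#include <cstdint>

// Hypothetical helper (name is an assumption, not Dolphin's API):
// true if `value` can be encoded as a sign-extended 32-bit immediate,
// as accepted by e.g. AND r/m64, imm32.
constexpr bool FitsInSignExtendedImm32(std::uint64_t value)
{
  // Reinterpret as signed, then check it lies in the int32 range.
  const auto as_signed = static_cast<std::int64_t>(value);
  return as_signed >= INT32_MIN && as_signed <= INT32_MAX;
}
```

0xFFFFFFFFFFFF0000 passes this check (it is -65536 as a signed value),
while 0x00000000FFFF0000 does not and would still need a temporary.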
- LEA is a bit silly when the source and the destination are the same. A
simple ADD or SHL will do in those cases.
Before:
66 8D 04 45 00 00 00 00 lea ax,[rax*2]
After:
66 03 C0 add ax,ax
Before:
48 8D 04 00 lea rax,[rax+rax]
After:
48 03 C0 add rax,rax
Before:
66 8D 14 D5 00 00 00 00 lea dx,[rdx*8]
After:
66 C1 E2 03 shl dx,3
- When scaling by 2, consider summing the register with itself instead.
The scaled form has no base register and therefore always needs a
32-bit displacement, so the sum is more compact.
Before:
66 8D 14 45 00 00 00 00 lea dx,[rax*2]
After:
66 8D 14 00 lea dx,[rax+rax]
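The rules above boil down to a small decision table. The sketch below is
hypothetical (ScaleOp and ChooseScaleOp are names invented for this
example, not emitter code) and assumes the scale is 2, 4 or 8. It also
ignores one real difference: ADD and SHL modify the flags, LEA does not,
so the substitution is only valid where the flags are dead.

```cpp
// Hypothetical names for illustration only.
enum class ScaleOp
{
  AddSelf,         // add reg, reg
  Shl,             // shl reg, log2(scale)
  LeaBaseIndex,    // lea dst, [src+src]   (no displacement needed)
  LeaScaledIndex,  // lea dst, [src*scale] (needs a 32-bit displacement)
};

// Picks the most compact way to compute dst = src * scale.
// Assumes scale is 2, 4 or 8 and that clobbering flags is acceptable.
constexpr ScaleOp ChooseScaleOp(bool dst_is_src, int scale)
{
  if (dst_is_src)
    return scale == 2 ? ScaleOp::AddSelf : ScaleOp::Shl;
  return scale == 2 ? ScaleOp::LeaBaseIndex : ScaleOp::LeaScaledIndex;
}
```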
Other than the controller settings and JIT debug settings,
these are the only settings which were defined in Java code
but not defined in the new config system in C++. (There are
still a lot of settings that are defined in the new config
system but not yet saveable in the new config system, though.)
Instead of comparing the game ID, revision, disc number and name,
we can compare a hash of important parts of the disc including
all the aforementioned data but also additional data such as the
FST. The primary reason why I'm making this change is to let us
catch more desyncs before they happen, but this should also fix
https://bugs.dolphin-emu.org/issues/12115. As a bonus, the UI can
now distinguish the case where a client doesn't have the game at
all from the case where a client has the wrong version of the game.
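A rough sketch of how the two cases can be told apart, assuming each
game is described by its game ID plus a content hash. All names here
(GameIdentity, GameMatch, FindBestMatch) and the 20-byte hash size are
assumptions made for the example, not the actual NetPlay types.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a game; the hash is assumed to cover the
// game ID, revision, disc number, name, FST and so on.
struct GameIdentity
{
  std::string game_id;
  std::array<std::uint8_t, 20> content_hash{};
};

enum class GameMatch
{
  Same,          // identical hash
  WrongVersion,  // same game ID, different hash
  NotFound,      // no game with this ID at all
};

// Looks through the client's games for the one the host selected.
GameMatch FindBestMatch(const GameIdentity& host,
                        const std::vector<GameIdentity>& local_games)
{
  GameMatch best = GameMatch::NotFound;
  for (const GameIdentity& game : local_games)
  {
    if (game.content_hash == host.content_hash)
      return GameMatch::Same;
    if (game.game_id == host.game_id)
      best = GameMatch::WrongVersion;
  }
  return best;
}
```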
The CMake Windows build was broken because I added a use of
std::codecvt_utf8_utf16 to StringUtil.cpp. Kinda silly to have
a warning for an API with no standard replacement available...
Turns out, GameCube games actually do check DILENGTH. If DILENGTH is at 0, they'll think the transfer completed successfully even if DEINT is used, since after all, surely that means everything was sent. That caused all sorts of issues, from audio looping when a disc is removed (since it's re-using the same buffer) to just flat-out crashing instead of showing the disc removed screen.
In particular:
- Trying to play audio in a non-ready state returns the state-specific error, not an audio buffer error
- Audio status cannot be requested in non-ready states
- The audio buffer cannot be configured in states other than ReadyNoReadsMade
- Using the stop motor command while the motor is already stopped doesn't change states
Additionally, the internal state IDs are used (which distinguish ReadyNoReadsMade and Ready), instead of the state IDs exposed by the request error command. This makes some of the weird behavior a bit more obvious.
State and error behavior of the seek command was not implemented in this commit.
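The rules listed above can be modeled roughly as follows. This is a
simplified sketch, not the actual implementation: only ReadyNoReadsMade
and Ready are named in this message, so the other two states and all
function names are assumptions, and the error codes returned in each
case are left out.

```cpp
// Hypothetical model of the internal drive states. MotorStopped and
// CoverOpened are assumed here for illustration.
enum class DriveState
{
  ReadyNoReadsMade,
  Ready,
  MotorStopped,
  CoverOpened,
};

constexpr bool IsReady(DriveState s)
{
  return s == DriveState::Ready || s == DriveState::ReadyNoReadsMade;
}

// Playing audio and requesting audio status both require a ready state;
// otherwise the state-specific error is reported.
constexpr bool CanPlayAudio(DriveState s) { return IsReady(s); }
constexpr bool CanRequestAudioStatus(DriveState s) { return IsReady(s); }

// The audio buffer can only be configured before any read was made.
constexpr bool CanConfigureAudioBuffer(DriveState s)
{
  return s == DriveState::ReadyNoReadsMade;
}

// Stopping an already stopped motor leaves the state alone. That other
// states do change is an assumption of this sketch.
constexpr bool StopMotorChangesState(DriveState s)
{
  return s != DriveState::MotorStopped;
}
```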
It is my opinion that nobody should use NKit disc images without
being aware of their drawbacks. Since it seems like almost
nobody who is using NKit disc images knows what NKit is (hmm, now
how could that have happened...?), I am adding a warning to Dolphin
so that you can't run NKit disc images without finding out about the
drawbacks. In case someone really does want to use NKit disc images,
the warning has a "Don't show this again" option. Unfortunately, I
can't retroactively add the warning where it's most needed:
in Dolphin 5.0, which does not support Wii NKit disc images.
That a device doesn't have a touchscreen doesn't necessarily mean
that it doesn't support rumble (though it is usually the case).
setPhoneVibrator already contains a check for whether the device
supports rumble, so we can simply remove the touchscreen check.
Pretty much the same optimization we did for AVX, although slightly more
constrained because we're stuck with the two-operand instruction where
destination and source have to match.
We could also specialize the case where registers b, c, and d are all
distinct, but I decided against it since I couldn't find any game that
does this.
Before:
66 0F 57 C0 xorpd xmm0,xmm0
66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9
41 0F 28 CE movaps xmm1,xmm14
66 41 0F 38 15 CC blendvpd xmm1,xmm12,xmm0
44 0F 28 F1 movaps xmm14,xmm1
After:
66 0F 57 C0 xorpd xmm0,xmm0
66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9
66 45 0F 38 15 F4 blendvpd xmm14,xmm12,xmm0
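To spell out the special case: with the mask in xmm0, BLENDVPD computes
dst = mask ? src : dst, so the copies are only needed when the
destination is not already one of the inputs. The sketch below is a toy
planner that returns instruction strings. PlanBlend is a hypothetical
name, xmm1 is assumed to be a free scratch register, and the roles of
b, c and d are inferred from the listings above.

```cpp
#include <string>
#include <vector>

// Hypothetical planner for d = mask ? b : c using two-operand BLENDVPD.
// The mask is assumed to be in xmm0 and xmm1 is assumed to be scratch.
std::vector<std::string> PlanBlend(const std::string& d, const std::string& b,
                                   const std::string& c)
{
  // d already holds c: blend into it in place.
  if (d == c)
    return {"blendvpd " + d + "," + b + ",xmm0"};

  // General case: go through the scratch register.
  return {"movaps xmm1," + c,
          "blendvpd xmm1," + b + ",xmm0",
          "movaps " + d + ",xmm1"};
}
```

The "all distinct" case mentioned above would copy c straight into d
and blend there, saving one move; it is not modeled here.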
AVX has a four-operand VBLENDVPD instruction, which allows for the first
input and the destination to be different. By taking advantage of this,
we no longer need to copy one of the inputs around and we can just
reference it directly, provided it's already in a register (I have yet
to see this not be the case).
Before:
66 0F 57 C0 xorpd xmm0,xmm0
F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14
41 0F 28 CE movaps xmm1,xmm14
66 41 0F 38 15 CA blendvpd xmm1,xmm10,xmm0
F2 44 0F 10 F1 movsd xmm14,xmm1
After:
66 0F 57 C0 xorpd xmm0,xmm0
F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14
C4 C3 09 4B CA 00 vblendvpd xmm1,xmm14,xmm10,xmm0
F2 44 0F 10 F1 movsd xmm14,xmm1