These values were obtained by setting a breakpoint at a game's entry point, and then observing the register values with Dolphin's register widget.
There are other registers that aren't handled by this PR, including CR, XER, SRR0, SRR1, and "Int Mask" (as well as most of the GPRs). They could be added in a later PR if it turns out that their values matter, but probably most of them don't.
This fixes Datel titles booting with the IPL skipped (see https://bugs.dolphin-emu.org/issues/8223), though when booted this way they are currently missing textures. Due to somewhat janky code, Datel overwrites the syscall interrupt handler and then immediately triggers it (with the `sc` instruction) before they restore the correct one. This works on real hardware due to icache, and also works in Dolphin when the IPL runs due to icache, but prior to this change `HID0.ICE` defaulted to 0 so icache was not enabled when the IPL was skipped.
DSPHLE::Initialize sets the halt and init bits to true (i.e. m_dsp_control.Hex starts as 0x804), which is reasonable behavior (this is the state the DSP will be in when starting a game from the IPL, as after `__OSStopAudioSystem` the control register is 0x804).
However, CMailHandler::m_halted defaults to false, and we only call CMailHandler::SetHalted in DSPHLE::DSP_WriteControlRegister when m_dsp_control.DSPHalt changes, so since DSPHalt defaults to true, if the first thing that happens is writing true to DSPHalt, we won't properly halt the mail handler.
Now, we call CMailHandler::SetHalted on startup. This fixes Datel titles when the IPL is skipped with DSP HLE (though this configuration only works once https://bugs.dolphin-emu.org/issues/8223 is fixed).
This fixes booting Datel titles with DSPHLE (see https://bugs.dolphin-emu.org/issues/12943). Datel messed up their DSP initialization code, so it only works by receiving a mail later on, but if halting isn't implemented then it receives the mail too early and hangs.
It's cleared whenever the uCode changes, so there's no reason to clear it in a destructor or during initialization.
I've also renamed it to ClearPending.
Android Studio 2022.2 "Chipmunk" changes the code formatting rules a
little. Let's apply the new formatting in this PR so that the lint bot
doesn't take it out on innocent PRs.
The # option means that 0x is prepended already, so the old code resulted in 0x0xDEADBEEF instead of the intended 0xDEADBEEF. WriteMailboxLow was already correct.
I don't know what happened here, unfortunately. The version of dsp_mixer.s added to libogc on Nov 14, 2008 in c76d8b851f uses andcf and jlz here, and the version we have matches the one from Feb 5, 2009 in ae5c3a5fb5 exactly (prior to the fixes in my previous commit). I can't see any reason why wait_for_dsp_mail would be changed like this.
ANDCF and ANDF were previously swapped and JNE/JEQ/JZR/JNZ became JNZ/JZ/JLNZ/JLZ on Apr 3, 2009 in 7c4e654253, corresponding to a change Hermes made on Nov 10, 2008 in 2cea6d99ad. But these predate the test being added.
The only other information I can find is that ASNDLIB 1.0 released on November 11, 2008, at https://web.archive.org/web/20120326145022/http://www.entuwii.net/foro/viewtopic.php?f=6&t=87 (but there aren't any surviving links from there).
This change makes assembling DSPTestText match DSPTestBinary, though HermesText doesn't yet match HermesBinary.
The test data was originally added on April 18, 2009 in e7e4ef4481. Then, set16 and set40 were swapped on April 22, 2009 89178f411c, which updated the DSPSpy version of dsp_code, but not the version in DSPTool used for testing. So, when the test was made, the assembled data matched the text, but a few days after it no longer did.
Similarly, on Jul 7, 2009 in 1654c582ab the conditional instructions were adjusted, and 0x1706 was changed from JRL to JRNC and 0x0297 was changed from JGE to JC.
For what it's worth, devkitPro made the same changes on May 31, 2010 in 8a65c85c9b and updated their version of the asnd ucode (which is this ucode) on June 11, 2011 in b1b8ecab3a (though this update also includes other feature changes). Note that at the time, they didn't reassemble the ucode unless they made changes to it; the assembled was stored in the repo until bfb705fe16~...d20f9bdcfb43260c6c759f4fb98d724931443f93.
This fixes the following failures with Hermes:
!! 0015 : 8e00 vs 8f00 - set16 vs set40
!! 016f : 8e00 vs 8f00 - set16 vs set40
and with Hermes:
!! 0014 : 8e00 vs 8f00 - set16 vs set40
!! 0063 : 8e00 vs 8f00 - set16 vs set40
!! 019b : 1706 vs 1701 - jrnc $AR0 vs jrl $AR0
!! 01bf : 0297 vs 0290 - jc 0x01dc vs jge 0x01dc
!! 01d2 : 0297 vs 0290 - jc 0x01dc vs jge 0x01dc
Hermes has the remaining failures:
!! 027b : 03c0 vs 03a0 - andcf $AC1.M, #0x8000 vs andf $AC1.M, #0x8000
!! 027d : 029d vs 0294 - jlz 0x027a vs jnz 0x027a
Before, both 1441 and 147f would disassemble as `lsr $acc0, #1`, when the second should be `lsr $acc0, #-1`, and both 14c1 and 14ff would be `asr $acc0, #1` when the second should be `asr $acc0, #-1`. I'm not entirely sure whether the minus signs actually make sense here, but this change is consistent with the assembler so that's an improvement at least.
devkitPro previously changed the formatting to not require negative signs for lsr and asr; this is probably something we should do in the future: 8a65c85c9b
This fixes the HermesText and HermesBinary tests (HermesText already wrote `lsr $ACC0, #-5`, so this is consistent with what it used before.)
For instance, ending with 0x009e (which you can do with CW 0x009e) indicates a LRI $ac0.m instruction, but there is no immediate value to load, so before whatever garbage in memory existed after the end of the file was used.
The bounds-checking also previously assumed that IRAM or IROM was being used, both of which were exactly 0x1000 long.
Spirv-cross's MSL codegen makes the amazing choice of compiling calls to inout functions as `State temp = s; call_function(temp); s = temp`. Not all Metal backends handle this mess well. In particular, it causes register spills on Intel, losing about 5% in performance.
X30 is used in fewer situations than the comment was claiming.
(I think that when I wrote the comment I was counting the use of X30
as a temp variable in the slowmem code as clobbering X30, but that
happens after pushing X30, so it doesn't actually get clobbered.)
This is used when fastmem isn't available. Instead of always falling
back to the C++ code in MMU.cpp, the JIT translates addresses on its
own by looking them up in a table that Dolphin constructs. This is
slower than fastmem, but faster than the old non-fastmem code.
This is primarily useful for iOS, since that's the only major platform
nowadays where you can't reliably get fastmem. I think it would make
sense to merge this feature to master despite this, since there's
nothing actually iOS-specific about the feature. It would be of use
for me when I have to disable fastmem to stop Android Studio from
constantly breaking on segfaults, for instance.
Co-authored-by: OatmealDome <julian@oatmealdome.me>
Without this, debug builds of Dolphin fail to launch. The OS tries
to locate org.dolphinemu.dolphinemu.debug.DolphinApplication
but fails to find it because its actual name is
org.dolphinemu.dolphinemu.DolphinApplication.
Partially reverts 6b74907f9d.
This hack was added in 8f0cbefbe5, and the part of it in SI_DeviceGCAdapter is present on Android already, so I don't see any reason why this part doesn't apply to Android.
This is mostly a brainless merge, #ifdef-ing anything that doesn't match between the two while preserving common logic. I didn't rename any variables (although similar ones do exist), but I did change one log that was ERROR on android and NOTICE elsewhere to just always be NOTICE. Further merging will follow.
Instead, saturate in OpReadRegister, as all uses of OpReadRegisterAndSaturate called OpReadRegister for other registers (and there isn't anything that writes to $ac0.m or $ac1.m without saturation).
The current code expects new mail almost immediately after the last map was sent for it to be saved properly. However, I have a test program that ends up looping for 32768 iterations before it sends more mail; this resulted in an incomplete result dump. I've changed it to wait a frame between checking for mail, which solves that issue. This does slow down dumping, but the end speed matches the speed at which the UI updates the registers so this isn't a big deal (the UI waits a frame between mail normally). (Theoretically, it could take even longer for dumping to finish, so this is not a perfect solution. However, for tests that take that long to run, it would be better to save the existing results instead of re-running the test and saving that; that'd be something to do with later improvements.)
Loading configs while another thread is messing with stuff just doesn't feel like a good idea
Hopefully fixes Wiimote Scanning Thread crashes on startup
There were 3 bugs here:
- The input register for the full register wasn't actually being used; it was read into RCX but RCX wasn't used by Update_SR_Register16_OverS32 (except as a scratch register). The way the DSP LLE recompiler uses registers is in general confusing, so this commit changes a few uses to have a variable for the register being used, to make code a bit more readable. (Default parameter values were also removed so that they needed to be explicitly specified).
- Update_SR_Register16 was doing a 64-bit test, when it should have been doing a 16-bit test. For the most part this doesn't matter due to sign-extension, but it does come up with e.g. `ORI` or `ANDI`.
- Update_SR_Register16_OverS32 did the over s32 check, and then called Update_SR_Register16. Update_SR_Register16 masks $sr with ~SR_CMP_MASK, clearing the over s32 bit. Now the over s32 check is performed after calling Update_SR_Register16 (without masking a second time). No official uCode cares about the over s32 bit.
We don't have anything called $amD, though we do have $acsD. However, these instructions affect flags based on the whole accumulator, so it's better to just use $acD.
For more information, ApplyWriteBackLog, WriteToBackLog, and ZeroWriteBackLog were added in b787f5f8f7 and the explanatory comment was added in fd40513fed, although it did not mention the specific instructions that could trigger this edge case. The statements about which registers can be written by main opcodes and extension opcodes are based on my own checking of all instructions in the manual.