This isn't entirely necessary, as they are interpreted as barewords expressions,
but it's still nicer to have by default. And my upcoming input changes will
always put `` around single letter inputs.
-Add pause state to FPSCounter.
-Add ability to have more than one "OnStateChanged" callback in core.
-Add GetActualEmulationSpeed() to Core. Returns 1 by default. It's used by my input PRs.
The SaveToSYSCONF call in BootManager.cpp was unintentionally
overriding the temporary NAND set by the preceding
InitializeWiiRoot call. Fixes
https://bugs.dolphin-emu.org/issues/12500.
Verifying a Wii game creates an instance of IOS, and Dolphin
can't handle more than one instance of IOS at the same time.
Properly supporting it is probably more effort than it's worth.
Fixes https://bugs.dolphin-emu.org/issues/12494.
Avoids the need to copy the *.mo files manually *and* more importantly
this ensures that the mo files are always recreated if the build
output directory is cleared.
Update references was failing to update the references, causing input to stay nullptr and crashing.
I fixed the case that triggered that, though also added checks against nullptrs for safety.
(cherry picked from commit 4bdcf707555a5568eddff957fa3604975ffb6ed7)
I think the AArch64 JIT has come far enough that it doesn't have to
be called experimental anymore.
I'm also labeling the x86-64 JIT as x86-64 for consistence with the
AArch64 JIT. This will especially be helpful if we start supporting
AArch64 on macOS, as AArch64 macOS can run both the x86-64 JIT and
the AArch64 JIT depending on whether you enable Rosetta 2.
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
Preparation for following commits.
This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.
Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
Fixes https://bugs.dolphin-emu.org/issues/12388. Might also fix
other games that have problems with float/paired instructions
in JitArm64, but I haven't tested any.
-They might have never drawn if DrawMessages wasn't called before they actually expired
-Their fade was wrong if the duration of the message was less than the fade time
This makes them much more useful for debugging, I know there might be other means
of debugging like logs and imgui, but this was the simplest so that's what I used.
If you want to print the same message every frame, but with a slightly different value
to see the changes, it now work.
To compensate for the fact that they are now always rendered once,
so on start up a lot of old messages (printed while the emulation was off) could show up,
I've added a "drop" time, which means if a msg isn't rendered for the first
time within that time, it will be dropped and never rendered.
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.
Fixes a 62ce1c7 regression.
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.
Before:
45 85 FF test r15d,r15d
75 05 jne normal_path
45 33 FF xor r15d,r15d
EB 0C jmp done
normal_path:
B8 5A 00 00 00 mov eax,5Ah
99 cdq
41 F7 FF idiv eax,r15d
44 8B F8 mov r15d,eax
done:
After:
45 85 FF test r15d,r15d
74 0C je done
B8 5A 00 00 00 mov eax,5Ah
99 cdq
41 F7 FF idiv eax,r15d
44 8B F8 mov r15d,eax
done:
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.
Before:
8B C6 mov eax,esi
85 C0 test eax,eax
8D 70 03 lea esi,[rax+3]
0F 49 F0 cmovns esi,eax
C1 FE 02 sar esi,2
After:
85 F6 test esi,esi
8D 46 03 lea eax,[rsi+3]
0F 48 F0 cmovs esi,eax
C1 FE 02 sar esi,2
Repeated erase() + iteration on a std::multimap is extremely slow.
Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (a N64 VC game that triggers many, many icache
invalidations).
And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png
This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.
Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)
This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
VideoCommon: Change the type of BPMemory.scissorOffset to 10bit signed: S32X10Y10
VideoBackends: Fix Software Clipper.PerspectiveDivide function, use BPMemory.scissorOffset instead of hard code 342
Oversight from #9545, which moved the "new game has been loaded" logic
to a separate OnNewTitleLoad function that has to be called explicitly
*after* a title has loaded.
Coupled with the commit that makes Dolphin not clobber 0x1800-0x3000
when using MIOS, this fixes Wind Waker and other MIOS-patched games
when they are launched from the System Menu.
MIOS puts patch data in low MEM1 (0x1800-0x3000) for its own use.
Overwriting data in this range can cause the IPL to crash when
launching games that get patched by MIOS.
See https://bugs.dolphin-emu.org/issues/11952 for more info.
Not applying the Gecko HLE patches means that Gecko codes will not work
under MIOS, but this is better than the alternative of having specific
games crash.
This particular range is kind of bizarre, and would only interpret
interleave mode 2 as a valid mode, while rejecting interleave mode 1 and
the extension byte mode.
As far as I know, based off the information on Wiibrew, we should be
considering all three values within this range as valid.
texture serialization and deserialization used to involve many memory
allocations and deallocations, along with many copies to and from
those allocations. avoid those by reserving a memory region inside the
output and writing there directly, skipping the allocation and copy to
an intermediate buffer entirely.
This adds a CMake option (DOLPHIN_DEFAULT_UPDATE_TRACK) to allow
configuring SCM_UPDATE_TRACK_STR. This is needed to enable auto-updates
in Windows CMake builds by default.
This adds a function to get the emulated or real Bluetooth device for
an active emulation instance. This lets us deduplicate all the
`ios->GetDeviceByName("/dev/usb/oh1/57e/305")` calls that are currently
scattered in the codebase and ensures Bluetooth passthrough is being
handled correctly.
This also fixes the broken check in WiimoteCommon::UpdateSource.
There was a confusion between "emulated Bluetooth" (as opposed to
"real Bluetooth" aka Bluetooth passthrough) and "emulated Wiimote".
Specifically, 'Scooby-Doo! Mystery Mayhem', 'Scooby-Doo! Unmasked', 'Ed, Edd n Eddy: The Mis-Edventures', and the Wii version of 'Happy Feet'.
The JIT cache causes problems with emulated icache invalidation in these games, resulting in areas failing to load.
This avoids some warnings, which were originally fixed by ignoring loads with a value of zero (see 636bedb207 / #3242).
Note that FifoCI will report some changes, but only on the first frame; these seem to be timing related as they don't happen if a different write is used to replace skipped ones.
They appear to relate to perf queries, and combining them with truely unknown commands would probably hide useful information. Furthermore, 0x20 is issued by every title, so without this every title would be recorded as using an unknown command, which is very unhelpful.
The swaps are confusing and don't accomplish much.
It was originally written like this:
u32 pte = bswap(*(u32*)&base_mem[pteg_addr]);
then bswap was changed to Common::swap32, and then the array access
was replaced with Memory::Read_U32, leading to the useless swaps.
While 6xx_pem.pdf §7.6.1.1 mentions that the number of trailing
zeros in HTABORG must be equal to the number of trailing ones
in the mask (i.e. HTABORG must be properly aligned), this is actually
not a hard requirement. Real hardware will just OR the base address
anyway. Ignoring SDR changes would lead to incorrect emulation.
Logging a warning instead of dropping the SDR update silently is a
saner behaviour.
debaf63fe8 moved the "Sonic epsilon hack"
to vertex shaders. However, it was only done for targets with depth
clamping. If this is not available, for example the target is OpenGL ES,
the Sonic problem appears (https://bugs.dolphin-emu.org/issues/11897).
A version of the "Sonic epsilon hack" is added for targets without
depth clamping.
This changes FileSystemProxy::Open to return a file descriptor wrapper
that will ensure the FD is closed when it goes out of scope.
By using such a wrapper we make it more difficult to forget to close
file descriptors.
This fixes a leak in ReadBootContent. I should have added such a class
from the beginning... In practice, I don't think this would have caused
any obvious issue because ReadBootContent is only called after an IOS
relaunch -- which clears all FDs -- and most titles do not get close
to the FD limit.
JitArm64::DoJit contains a check where it prints a warning and tries
to pause emulation if instructed to compile code at address 0. I'm
assuming this was done in order to provide a nicer error behavior
in cases where PC was accidentally set to null. Unfortunately, it
has started causing us problems recently, as 688bd61 writes and runs
some code at address 0 to simulate the PPC being held in reset.
What makes this worse is that calling Core::SetState from the CPU
thread is actually not allowed and will cause a deadlock instead of
the intended behavior. I don't believe there is anything on a real
console that would stop you from executing code at address 0 (as
long as the MMU has been set up to allow it), and Jit64::DoJit
doesn't contain any check like this, so let's remove the check.
This commit adds a new "discarded" state for registers.
Discarding a register is like flushing it, but without
actually writing its value back to memory. We can discard
a register only when it is guaranteed that no instruction
will read from the register before it is next written to.
Discarding reduces the register pressure a little, and can
also let us skip a few flushes on interpreter fallbacks.
The output of instructions like fabsx and ps_sel is store-safe
if and only if the relevant inputs are. The old code was always
marking the output as store-safe if the output was a single,
and never otherwise.
Also, the old code was treating the output of psq_l/psq_lu as
store-safe, which seems incorrect (if dequantization is disabled).
This improves the speed of verifying Wii WIA/RVZ files.
For me, the verification speed for LZMA2-compressed files
has gone from 11-12 MiB/s to 13-14 MiB/s.
One thing VolumeVerifier does to achieve parallelism is to
compute hashes for one chunk of data while reading the next
chunk of data. In master, when reading data from a Wii
partition, each such chunk is 32 KiB. This is normally fine,
but with WIA and RVZ it leads to rather lopsided read times
(without the compute times being lopsided): The first 32 KiB
of each 2 MiB takes a long time to read, and the remaining
part of the 2 MiB can be read nearly instantly. (The WIA/RVZ
code has to read the entire 2 MiB in order to compute hashes
which appear at the beginning of the 2 MiB, and then caches
the result afterwards.) This leads to us at times not doing
much reading and at other times not doing much computation.
To improve this, this change makes us use 2 MiB chunks
instead of 32 KiB chunks when reading from Wii partitions.
(block = 32 KiB, group = 2 MiB)
This can't actually happen in practice due to how WAD files work,
but it's very easy to add support for thanks to the last commit,
so we might as well add support for it.
The performance gains of doing this aren't too important since you
normally wouldn't run into any disc image that has overlapping blocks
(which by extension means overlapping partitions), but this change also
lets us get rid of things like VolumeVerifier's mutex that used to
exist just for the sake of handling overlapping blocks.
Panic alerts in DiscIO can potentially be very annoying since
large amounts of them can pop up when loading the game list
if you have some particularly weird files in your game list.
This was a much bigger problem back in 5.0 with its
"Tried to decrypt data from a non-Wii volume" panic alert, but
I figured I would take it all the way and remove the remaining
panic alerts that can show up when loading the game list.
I have exempted uses of ASSERT/ASSERT_MSG since they indicate
a bug in Dolphin rather than a malformed file.
If we know at compile time that the PPC carry flag definitely
has a certain value, we can bake that value into the emitted code
and skip having to read from PPCState.