This implements MIOS's PPC bootstrapping functionality, which enables
users to start a GameCube game from the Wii System Menu.
Because we aren't doing Starlet LLE (and don't have a boot1), we can
just jump to MIOS when the emulated software does an ES_LAUNCH or uses
ioctlv 0x25 to launch BC.
Note that the process is more complex on a real Wii and goes through
several more steps before getting to MIOS:
* The System Menu detects a GameCube disc and launches BC (1-100)
instead of the game. [Dolphin does this too.]
* BC, which is reportedly very similar to boot1, lowers the Hollywood
clock speed to the Flipper's and then launches boot2.
* boot2 sees the lowered clock speed and launches MIOS (1-101) instead
of the System Menu.
MIOS runs instead of IOS in GC mode and has an embedded GC IPL (which
is the code actually responsible for loading the disc game) and PPC
bootstrap code. To get things working properly, we simply need to load
both into memory, then jump to the bootstrap code at 0x3400.
Obviously, because of the way this works, a real MIOS is required.
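
As a rough illustration of the "load both, then jump" step, here is a
minimal sketch. EmulatedCpu, LoadBlob and BootMiosSketch are hypothetical
names, not Dolphin's real API, and the IPL load offset is left as a
parameter because the text above does not state it. Only the 0x3400
entry point comes from the description.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the emulated CPU state (not Dolphin's type).
struct EmulatedCpu {
  uint32_t pc = 0;
};

// Entry point of the PPC bootstrap code, as stated in the text.
constexpr uint32_t kBootstrapAddress = 0x3400;

// Copies a blob into emulated RAM; returns false if it would not fit.
bool LoadBlob(std::vector<uint8_t>& ram, uint32_t offset,
              const std::vector<uint8_t>& blob) {
  if (offset > ram.size() || blob.size() > ram.size() - offset)
    return false;
  std::copy(blob.begin(), blob.end(), ram.begin() + offset);
  return true;
}

// Loads the bootstrap code and the IPL, then points the PC at the bootstrap.
// ipl_offset is an assumption supplied by the caller.
bool BootMiosSketch(std::vector<uint8_t>& ram, EmulatedCpu& cpu,
                    const std::vector<uint8_t>& bootstrap,
                    const std::vector<uint8_t>& ipl, uint32_t ipl_offset) {
  if (!LoadBlob(ram, kBootstrapAddress, bootstrap))
    return false;
  if (!LoadBlob(ram, ipl_offset, ipl))
    return false;
  cpu.pc = kBootstrapAddress;
  return true;
}
```

The PC is only changed once both blobs are in place, so a failed load
leaves the CPU state untouched.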
TryReadInstruction doesn't validate the address it resolves, so
Memory::GetPointer can fail and return nullptr, which then leads
to a nullptr dereference and a crash.
Added PowerPC::HostIsInstructionRAMAddress, which works the same
way as PowerPC::HostIsRAMAddress but for the IBAT.
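
The pattern of checking an address before dereferencing it can be
sketched as follows. FakeMemory, IsInstructionRamAddress and
TryReadInstructionChecked are hypothetical stand-ins written for this
example; they do not reproduce the real functions' signatures.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one RAM region starting at `base`.
struct FakeMemory {
  uint32_t base;
  std::vector<uint8_t> bytes;
};

// True only if all four bytes of the instruction lie inside the region.
bool IsInstructionRamAddress(const FakeMemory& mem, uint32_t address) {
  if (address < mem.base)
    return false;
  const uint64_t offset = address - mem.base;
  return offset + 4 <= mem.bytes.size();
}

// Validates first, then reads a big-endian 32-bit instruction word.
std::optional<uint32_t> TryReadInstructionChecked(const FakeMemory& mem,
                                                  uint32_t address) {
  if (!IsInstructionRamAddress(mem, address))
    return std::nullopt;  // no pointer is formed for a bad address
  const uint8_t* p = mem.bytes.data() + (address - mem.base);
  return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) |
         (uint32_t{p[2]} << 8) | uint32_t{p[3]};
}
```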
Dolphin emulates GeckoCodes by fiddling with the CPU state when a
VI Interrupt occurs. The problem with this is that we don't know
where the PC is so it's non-deterministic and not necessarily
suitable for use with the codehandler.
There are two options: Patch the game like Gecko OS either directly
or using HLE::Patch, or use a trampoline so we can branch from any
PC even if it would otherwise not be valid. The problem with Gecko OS
patches is that there are 10 of them and they have to be configured
manually (i.e. Game INIs would need to have a [Core]GeckoHookType
property).
HLE_Misc::GeckoReturnTrampoline enables the Code Handler to be
entered from anywhere; the trampoline restores all the registers that
had to be secretly saved to the stack.
Fundamentally, all this does is enforce the invariant that we always
translate effective addresses based on the current BAT registers and
page table before we do anything else with them.
This change can be logically divided into three parts. The first part is
creating a table to represent the current BAT state, and keeping it up to
date (PowerPC::IBATUpdated, PowerPC::DBATUpdated, etc.). This does
nothing by itself, but it's necessary for the other parts.
The second part (mostly in MMU.cpp) is simply removing all the hardcoded
checks for specific untranslated addresses, and consistently translating
addresses using the current BAT configuration. Very straightforward, but a
lot of code changes because we hardcoded assumptions all over the place.
The third part (mostly in Memmap.cpp) is making the fastmem arena reflect
the current BAT configuration. We do this by redoing the mapping (calling
memmap()) based on the BAT table whenever it changes.
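
To make the first part more concrete, here is a minimal sketch of a
lookup table keyed by effective address. It assumes one entry per
128 KiB block (the smallest BAT block size) with a valid bit stored in
the low bit of each entry. BatTableSketch and its methods are
hypothetical names; the real table layout may differ.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

constexpr uint32_t kBlockShift = 17;  // 128 KiB granularity (assumed)
constexpr uint32_t kBlockCount = 1u << (32 - kBlockShift);
constexpr uint32_t kOffsetMask = (1u << kBlockShift) - 1;
constexpr uint32_t kValidBit = 1u;

class BatTableSketch {
public:
  BatTableSketch() : entries_(kBlockCount, 0) {}

  // Drops every mapping, e.g. before rebuilding from the BAT registers.
  void Clear() { entries_.assign(kBlockCount, 0); }

  // Maps a range; both bases and the length must be 128 KiB multiples.
  bool Map(uint32_t effective_base, uint32_t physical_base, uint32_t length) {
    if ((effective_base | physical_base | length) & kOffsetMask)
      return false;
    const uint64_t first = effective_base >> kBlockShift;
    const uint64_t count = length >> kBlockShift;
    if (first + count > kBlockCount)
      return false;
    for (uint64_t i = 0; i < count; ++i) {
      const uint32_t physical =
          physical_base + static_cast<uint32_t>(i << kBlockShift);
      entries_[first + i] = physical | kValidBit;
    }
    return true;
  }

  // Returns the physical address, or nothing if no BAT covers the block.
  std::optional<uint32_t> Translate(uint32_t effective) const {
    const uint32_t entry = entries_[effective >> kBlockShift];
    if ((entry & kValidBit) == 0)
      return std::nullopt;
    return (entry & ~kOffsetMask) | (effective & kOffsetMask);
  }

private:
  std::vector<uint32_t> entries_;
};
```

In this sketch, a change to a BAT register would be handled by calling
Clear() and then Map() for each enabled BAT; the fastmem arena would be
remapped from the same table afterwards.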
One additional minor change is that translation can now fail in two
distinct ways: either the segment is a direct store segment, or the page
table lookup failed. The difference doesn't usually matter, but it does
affect cache instructions, like dcbz.
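
One way to represent the two failure modes is a result type that the
caller can switch on. The sketch below is illustrative only: the type
names are hypothetical, and the policy shown for dcbz (ignore the
instruction on a direct store segment, raise a DSI on a page table miss)
is an assumption, since the text above only says that the two cases
differ.

```cpp
#include <cstdint>

enum class TranslateStatus { Translated, DirectStoreSegment, PageFault };

struct TranslateResult {
  TranslateStatus status;
  uint32_t physical_address;  // only meaningful when status == Translated
};

enum class DcbzOutcome { ZeroedLine, Ignored, RaisedDsi };

// Assumed policy for a dcbz-like instruction; not confirmed by the source.
DcbzOutcome HandleDcbzSketch(const TranslateResult& result) {
  switch (result.status) {
  case TranslateStatus::Translated:
    return DcbzOutcome::ZeroedLine;
  case TranslateStatus::DirectStoreSegment:
    return DcbzOutcome::Ignored;
  case TranslateStatus::PageFault:
    return DcbzOutcome::RaisedDsi;
  }
  return DcbzOutcome::RaisedDsi;  // unreachable for valid enum values
}
```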
Specifically, don't make any assumptions about what effective addresses
are used for code, and correctly handle changes to MSR.DR/MSR.IR.
(Split off from dynamic-bat.)
Fix Frame Advance and FifoPlayer pause/unpause/stop.
CPU::EnableStepping is not atomic but is called from multiple threads,
which races and leaves the system in a random state. Instruction
stepping was also unstable: m_StepEvent had an almost random value
because of the dual purpose it served, which could cause races where
CPU::Run would SingleStep when it was supposed to be sleeping.
FifoPlayer never FinishStateMove()d, which was causing it to deadlock.
Rather than partially reimplementing CPU::Run, just use CPUCoreBase
and then call CPU::Run(). More DRY and less likely to have weird bugs
specific to the player (e.g. the previous freezing on pause/stop).
Refactor PowerPC::state into CPU since it manages the state of the
CPU Thread which is controlled by CPU, not PowerPC. This simplifies
the architecture somewhat and eliminates races that can be caused by
calling PowerPC state functions directly instead of using CPU's
(because they bypassed the EnableStepping lock).
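
The idea of funnelling every state change through one lock can be
sketched like this. CpuControllerSketch and its state names are
hypothetical and much simpler than the real CPU module; the point is
only that no caller can change the state without holding the mutex.

```cpp
#include <mutex>

enum class CpuStateSketch { Running, Stepping, PowerDown };

// All state transitions go through methods that take the same mutex,
// so concurrent callers cannot interleave half-finished transitions.
class CpuControllerSketch {
public:
  void EnableStepping(bool stepping) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (state_ == CpuStateSketch::PowerDown)
      return;  // a stopped CPU stays stopped
    state_ = stepping ? CpuStateSketch::Stepping : CpuStateSketch::Running;
  }

  void Stop() {
    std::lock_guard<std::mutex> lock(mutex_);
    state_ = CpuStateSketch::PowerDown;
  }

  CpuStateSketch GetState() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return state_;
  }

private:
  mutable std::mutex mutex_;
  CpuStateSketch state_ = CpuStateSketch::Running;
};
```

Keeping the state variable private is the design choice that matters
here: there is no way to bypass the lock by writing the state directly.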
This affects enabling and disabling block profiling on the fly.
Block profiling pauses the CPU cores, then flushes the JIT's block cache
and enables block profiling.
The issue is that when we pause the CPU core, we have no way to tell
whether the JIT recompiler has actually left.
So if the secondary thread that clears the JIT block cache is too quick,
it clears the cache while the recompiler is still running a block that
has just been cleared.
The PowerPC CPU has bits in MSR (DR and IR) which control whether
addresses are translated. We should respect these instead of mixing
physical addresses and translated addresses into the same address space.
This is mostly mass-renaming calls to memory access APIs, so that places
which expect address translation use a different version from those
which do not.
This does very little on its own, but it's the first step to a correct BAT
implementation.
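
A minimal sketch of respecting those bits: MSR.IR (mask 0x20) gates
translation of instruction fetches and MSR.DR (mask 0x10) gates
translation of data accesses. ResolveAddressSketch and the Translator
callback are hypothetical names used only for this example.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

constexpr uint32_t kMsrIR = 0x20;  // instruction address translation enable
constexpr uint32_t kMsrDR = 0x10;  // data address translation enable

enum class AccessKind { Data, Instruction };

constexpr bool TranslationEnabled(uint32_t msr, AccessKind kind) {
  return (msr & (kind == AccessKind::Instruction ? kMsrIR : kMsrDR)) != 0;
}

// Hypothetical translation callback (BAT/page table lookup in practice).
using Translator = std::function<std::optional<uint32_t>(uint32_t)>;

// With translation off the address is already physical; otherwise it is
// passed through the translator, which may fail.
std::optional<uint32_t> ResolveAddressSketch(uint32_t msr, AccessKind kind,
                                             uint32_t address,
                                             const Translator& translate) {
  if (!TranslationEnabled(msr, kind))
    return address;
  return translate(address);
}
```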
Updated PTE.R bit on Write and Instruction fetch.
Added code to read the PTE from MEM2 if the PTE is stored there.
Refactored the two hash functions to reduce code duplication.
Updated save state version.
Also correct behavior with regard to which bits in XER are treated as zero
based on a hwtest (probably doesn't affect any real games, but might as well
be correct).
The register is RBP, previously in the GPR allocation order. The next
commit will investigate whether there are too few GPRs (now or before),
but for now there is no replacement.
Previously, it was accessed RIP relatively; using RBP, anything in the
first 0x100 bytes of ppcState (including all the GPRs) can be accessed
with three fewer bytes. Code to access ppcState is generated constantly
(mostly by register save/load), so in principle, this should improve
instruction cache footprint significantly. It seems that this makes a
significant performance difference in practice.
The vast majority of this commit is mechanically replacing
M(&PowerPC::ppcState.x) with a new macro PPCSTATE(x).
Version 2: gets most of the cases which were using the register access
macros.
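
The three-byte saving comes from x86-64 addressing: a RIP-relative
operand always carries a 4-byte displacement, while a base register plus
a signed 8-bit displacement needs only 1 byte. The sketch below checks
which form a field would get. It assumes the base register points 0x80
bytes into the structure so that the signed 8-bit range covers the first
0x100 bytes; that bias, the struct layout, and all names are assumptions
made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified layout; not the real ppcState.
struct PpcStateSketch {
  uint32_t gpr[32];
  uint32_t pc;
  uint32_t npc;
  uint8_t rest[0x200];
};

// Assumed bias of the base register relative to the start of the struct.
constexpr std::ptrdiff_t kBaseBias = 0x80;

// True if the field is reachable with a signed 8-bit displacement.
constexpr bool FitsDisp8(std::ptrdiff_t field_offset) {
  return field_offset - kBaseBias >= -128 && field_offset - kBaseBias <= 127;
}

// Displacement size in bytes for a base-register-relative access.
constexpr int DisplacementBytes(std::ptrdiff_t field_offset) {
  return FitsDisp8(field_offset) ? 1 : 4;
}

// RIP-relative operands always use a 4-byte displacement.
constexpr int kRipRelativeDisplacementBytes = 4;

// Bytes saved per access compared with the RIP-relative form.
constexpr int BytesSaved(std::ptrdiff_t field_offset) {
  return kRipRelativeDisplacementBytes - DisplacementBytes(field_offset);
}
```

Under these assumptions every GPR falls inside the short-displacement
window, which is where most generated accesses land.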
This value was "helpful" for debugging when the stack got corrupted.
"Helpful" in that if gpr[1] (the stack pointer in the PPC ABI) is zero,
the interpreter would spam huge amounts of annoying text saying that we
had managed to get into a "corrupted" state.
This is incremented every instruction on the interpreter, or every block
run on the JIT64, but only if debugging is enabled (on JIT64 it is a
const variable).
The message is only output when the interpreter is used and debugging is
enabled.