PSP-Specific Yabause Documentation
==================================
Important notice
----------------
PSP support for Yabause is experimental; please be aware that some things
may not work well (or at all).
Unlike Yabause 0.9.10, this version of Yabause now works on all PSPs,
including the original PSP-1000 ("Phat"). However, some games may run
more slowly on PSP Phats because of the limited amount of memory available
for caching dynamically-translated program code.
Installing from a binary distribution
-------------------------------------
The yabause-X.Y.Z.zip archive contains a "PSP" directory (folder); copy
this into the root directory of your Memory Stick. (On Windows, for
example, your Memory Stick might show up as the drive F: -- in this case,
drag the "PSP" folder from the ZIP archive onto the "F:" drive icon in
Windows Explorer.)
The "PSP" directory contains a directory called "GAME", which in turn
contains a directory called "YABAUSE". Inside the "YABAUSE" directory are
two files named "EBOOT.PBP" and "ME.PRX"; these are the program files used
by Yabause, like .EXE and .DLL files on Windows. You'll also need to copy
your CD image and other data files to this directory on your Memory Stick
(see below).
Once you've copied Yabause to your Memory Stick, skip to "How to use
Yabause" below.
Installing from source
----------------------
To build Yabause for PSP from the source code, you'll need a recent (at
least SVN r2450(*)) copy of the unofficial PSP SDK from http://ps2dev.org,
along with the toolchain from the same site; Gentoo Linux users can also
download a Portage overlay from http://achurch.org/portage-psp.tar.bz2 and
"emerge pspsdk". Ensure that the PSP toolchain (psp-gcc) and tools
(psp-prxgen, etc.) are in your $PATH, then configure Yabause with:
    ./configure --host=psp [options...]
(*) Note that the PSP SDK headers and libraries are, at least through
r2493, missing some functions required by Yabause. If you get errors
about the functions sceKernelIcacheInvalidateAll or
sceKernelIcacheInvalidateRange, apply the patch found in
src/psp/icache-funcs-2450.patch to the PSP SDK source, recompile and
reinstall it, then rebuild Yabause. This patch is already included if
you build the SDK from the Gentoo Portage overlay.
You can ignore the warning about the --build option that appears when you
start the configure script. You may also see a warning about "using cross
tools not prefixed with host triplet"; you can usually ignore this as well,
but if you get strange build errors related to libraries like SDL or
OpenGL, try disabling the optional libraries with the options
"--without-sdl" and "--without-opengl".
The following additional options can be used when configuring for PSP:
    --enable-psp-debug
        Enables printing of debug messages to standard error.
    --enable-psp-profile
        Enables printing of profiling statistics to standard error.
        By default, statistics are output every 100 frames; edit
        src/psp/main.c to change this. Note that profiling has a
        significant impact on emulation speed.
    --with-psp-me-test
        Builds an additional program, "me-test.prx", which tests the
        functionality of the Media Engine access library included with
        Yabause. Only useful for debugging or extending the library.
Note that if you build with optimization disabled (-O0) or at too low a
level, you may get compilation errors in src/psp/satopt-sh2.c. -O3 is
recommended; set this flag in the CFLAGS environment variable before
running the "configure" script. For example, if you use the "bash" shell:
    CFLAGS=-O3 ./configure --host=psp [options...]
After the configure script completes, run "make" to build Yabause. The
build process will create the EBOOT.PBP and me.prx (note that the latter
is lowercase) files in the src/psp/ subdirectory; create a directory for
Yabause under /PSP/GAME on your memory stick (e.g. /PSP/GAME/YABAUSE) and
copy the files there.
How to use Yabause (PSP-specific notes)
---------------------------------------
All files you intend to use with Yabause (BIOS images, CD images, backup
RAM images) must be stored in the same directory as the EBOOT.PBP and
ME.PRX files mentioned above. The default filenames used by Yabause are
as follows:
BIOS.BIN -- BIOS image
CD.ISO -- CD image (can also be a *.CUE file)
BACKUP.BIN -- Backup RAM image (will be created if it does not exist)
You can choose other files from the Yabause configuration menu, which is
displayed the first time you start Yabause and can also be brought up at
any time by pressing the Select button; see below for details. If you do
not already have a backup RAM image, just leave the backup RAM filename at
its default setting, and the file will be created the first time backup RAM
is saved.
The directional pad and analog stick can both be used to emulate the
Saturn controller's directional pad. The default button controls, listed
as "Saturn button -- PSP button", are as follows:
    Start -- Start
    A     -- Cross
    B     -- Circle
    C     -- (unassigned)
    X     -- Square
    Y     -- Triangle
    Z     -- (unassigned)
    L     -- L
    R     -- R
Button controls can be changed via the configuration menu.
The Yabause PSP configuration menu
----------------------------------
When you first run Yabause, the configuration menu will be displayed,
allowing you to choose the CD image you want to run and configure other
Yabause options. You can also press Select while the emulator is running
to bring up the menu; the emulator will remain paused while you have the
menu open.
The main menu contains six options:
* "Configure general options..."
This opens a submenu with the following options:
* "Start emulator immediately"
When enabled, the emulator will start running immediately when
you load Yabause, instead of showing the configuration menu.
* "Select BIOS/CD/backup files..."
This opens a submenu which allows you to select the files
containing the BIOS image, CD image, and backup data you want
to use. Selecting one of the three options will open a file
selector, allowing you to choose any file in the Yabause
directory on your Memory Stick.
Note that changing any of the files will reset the emulator.
* "Auto-save backup RAM"
When enabled, automatically saves the contents of backup RAM to
your Memory Stick whenever you save your game in the emulator.
The emulator will display "Backup RAM saved." on the screen for
a short time when an autosave occurs. Note that the emulator
may pause for a fraction of a second while autosaving. This
option is enabled by default.
Be aware that backup RAM is _not_ saved to the Memory Stick
when you quit Yabause; if you disable this option, you need to
manually save it using the "Save backup RAM now" option when
appropriate.
* "Save backup RAM now"
Immediately saves the contents of backup RAM to your Memory
Stick. If you have auto-save disabled, you should use this
option to save backup RAM before quitting Yabause.
* "Save backup RAM as..."
Allows you to enter a new filename (using the PSP's built-in
on-screen keyboard) for the backup RAM save file. This can be
useful if you want to keep separate backup RAM files for
different games, or if you want to save more slots than a game
normally allows. Yabause will immediately save backup RAM to
the filename you enter, and will also use that filename when
later auto-saving backup RAM (or when you manually use "Save
backup RAM now"). However, the new filename will only be used
until you quit Yabause, unless you select "Save options" on the
main menu.
Note that the emulator will _not_ be reset when you use this
option, so you can feel free to select it while playing a game.
(However, don't select it while the game is in the middle of
loading or saving, as this can corrupt backup RAM -- just as if
you tried to remove the PSP's Memory Stick while saving a game
on your PSP.)
NOTE: For reasons currently unknown, the top part of the
on-screen keyboard display may flicker or appear corrupted.
However, text can be entered as usual.
* "Configure controller buttons..."
This opens a submenu which allows you to configure which PSP button
corresponds to which button on the emulated Saturn controller.
Pressing one of the Circle, Cross, Triangle, or Square buttons on
the PSP will assign that button to the currently selected Saturn
controller button. The PSP's Start, L, and R buttons are always
assigned to the same-named buttons on the Saturn controller, and
cannot be changed.
Since both the Circle and Cross buttons are used for button
assignment, the Start button is used to return to the main menu.
* "Configure video options..."
This opens a submenu with the following options:
* "Use hardware video renderer" / "Use software video renderer"
These options allow you to choose between the PSP-specific
hardware renderer and the default software renderer built into
Yabause for displaying Saturn graphics. The hardware renderer
is significantly faster; for simple 2-D graphics, it can run at
a full 60fps without frame skipping (if the game program itself
can be emulated quickly enough). However, a number of more
complex graphics features are not supported, so if a game does
not display correctly, try using the software renderer instead.
The selected renderer can be changed while the emulator is
running without disturbing your game in progress. However,
changing the renderer may cause the screen to blank out or
display corrupted graphics for a short time.
* "Configure hardware rendering settings..."
This option opens another submenu which allows you to change
certain aspects of the hardware video renderer's behavior:
* "Aggressively cache pixel data"
When enabled, Yabause will try to store a copy of all
graphic data in the PSP's native pixel format, to speed up
drawing. However, Yabause may not always notice when the
data is changed, causing incorrect graphics to appear.
(This can be fixed by disabling the option, exiting the
menu for a moment, then re-enabling the option.) When
disabled, all graphics are redrawn from the Saturn data
every frame. This option is enabled by default.
* "Smooth textures and sprites"
When enabled, smoothing (antialiasing) is applied to all
3-D textures and sprites drawn on the screen. This can
make 3-D environments look smoother than on a real Saturn,
but it will also cause zoomed sprites to look blurry, which
may not be the game's intended behavior.
* "Smooth high-resolution graphics"
When enabled, high-resolution graphics (which ordinarily
would not fit on the PSP's screen) are displayed by
averaging adjacent pixels to give a smoother look to the
display; this can particularly help in reading small text
on a high-resolution screen. However, this smoothing is
significantly slower than the default method of just
skipping every second pixel.
* "Enable rotated/distorted graphics"
Selects whether to display rotated or distorted graphics
at all. Most such graphics cannot be rendered by the
PSP's hardware, so Yabause has to draw them in software,
which can be a major source of slowdown. Disabling this
option will turn such graphics off entirely. This option
is enabled by default.
* "Optimize rotated/distorted graphics"
When enabled, Yabause will try to detect certain types of
rotated or distorted graphics which can be approximated by
PSP hardware operations such as 3D transformations, and use
the PSP's hardware to draw them quickly. However, this
will often result in graphics that look different from the
game as played on an actual Saturn, so this option can be
used to disable the optimizations and draw the graphics more
accurately (at the expense of speed). This option is
enabled by default.
Note that none of the above options have any effect when the
software video renderer is in use.
* "Configure frame-skip settings..."
This option opens another submenu which allows you to configure
the hardware renderer's frame-skip behavior:
* "Frame-skip mode"
This option is intended to allow you to switch between
manual setting and automatic adjustment of frame-skip
parameters. However, automatic mode is not yet
implemented, so always leave this set to "Manual".
* "Number of frames to skip"
In Manual mode, sets the number of frames to skip for every
frame drawn. 0 means "draw every frame", 1 means "draw
every second frame" (skip 1 frame for every frame drawn),
and so on.
* "Limit to 30fps for interlaced display"
Always skip at least one frame when drawing interlaced
(high-resolution) screens. Has no effect unless the number
of frames to skip is set to zero. This option is enabled
by default.
* "Halve framerate for rotated backgrounds"
Reduce the frame rate by half (in other words, skip every
second frame that would otherwise be drawn) when rotated or
distorted background graphics are displayed. Since rotation
and distortion take a long time to process on the PSP, this
option can help keep games playable even when they make use
of these Saturn hardware features. This option is enabled
by default.
Note that this option does not apply to rotated or
distorted graphics which are displayed using an optimized
algorithm (see the "Optimize rotated/distorted graphics"
option above).
Frame skipping is not supported by the software renderer, so
none of these options will have any effect when the software
renderer is in use.
* "Show FPS"
When enabled, the emulator's current speed in emulated frames per
second (FPS) will be displayed in the upper-right corner of the
screen as "FPS: XX.X (Y/Z)". The number "XX.X" is the average
frame rate, calculated from the last few seconds of emulation;
"Y" shows the number of Saturn frames emulated since the previous
frame was shown, while "Z" is the actual time that passed in
60ths of a second. (Thus, the instantaneous frame rate can be
calculated as (Y/Z)*60.)
This option has no effect when the software renderer is in use.
* "Configure advanced settings..."
This opens a submenu with the following options:
* "Use SH-2 recompiler"
This option allows you to choose between the default SH-2 core,
which recompiles Saturn SH-2 code into native MIPS code for the
PSP, and the SH-2 interpreter built into Yabause. The SH-2
interpreter is much slower, often by an order of magnitude or
more, so there is generally no reason to disable this option
unless you suspect a bug in the recompiler.
Note that changing this option will reset the emulator. As with
"Reset emulator" on the main menu, you must hold L and R while
changing this option to avoid an accidental reset.
* "Select SH-2 optimizations..."
This option opens up another submenu which allows you to turn on
or off certain optimizations used by the SH-2 recompiler. These
are shortcuts taken by the recompiler to allow games to run more
quickly, but in rare cases they can cause games to misbehave or
even crash. If a game doesn't work correctly, turning one or
more of these options off may fix it.
These options can be changed while the emulator is running
without disturbing your game in progress. However, changing them
causes the emulator to clear out any recompiled code it has in
memory, so the game may run slowly for a short time after exiting
the menu as the emulator recompiles SH-2 code using the new
options.
All optimizations are enabled by default.
* "Configure Media Engine options..."
This option opens up another submenu with options for
configuring the Media Engine:
* "Use Media Engine for emulation"
Enables the use of the PSP's Media Engine CPU to handle part
of the emulation in parallel with the main CPU. This can
provide a moderate boost to emulation speed; however, since
the Media Engine is not designed for this sort of parallel
processing, some games may behave incorrectly or even crash.
As such, this option is still considered experimental; use
it at your own risk.
IMPORTANT: It is not currently possible to suspend the PSP
while the Media Engine is in use. If you start Yabause with
the Media Engine enabled, the "suspend" function of the
PSP's power switch will be disabled, so you must save your
game inside the emulator and exit Yabause before putting the
PSP into suspend mode.
This option only takes effect when Yabause is started, so if
you change it, make sure you select "Save options" in the
main menu and then quit and restart Yabause.
* "Cache writeback frequency"
Sets the frequency at which the main CPU and Media Engine
caches are synchronized, relative to the frequency of code
execution on the Media Engine. The default frequency of 1/1
is safest; lower frequencies (1/2, 1/4, and so on) can
increase emulation speed, but are also more likely to cause
sound glitches, crashes, or other incorrect behavior
depending on the particular game. However, adjusting the
size of the write-through region (see below) can mitigate
these problems for some games.
Naturally, this option has no effect if the Media Engine is
not being used for emulation.
* "Sound RAM write-through region"
Sets the size of the region at the beginning of sound RAM
which is written through the PSP's cache. Writing through
the cache is an order of magnitude slower than normal
operation, so setting this to a large value can slow down
games significantly. However, most games only use a small
portion of sound RAM for communication with the sound CPU,
so by tuning this value appropriately, you may be able to
reduce the cache writeback frequency (see above) while still
getting stable operation. From experimentation, a value of
2k seems to work well for some games.
Naturally, this option has no effect if the Media Engine is
not being used for emulation.
* "Use more precise emulation timing"
When enabled, the emulator will keep the various parts of the
emulated Saturn hardware more precisely in sync with each other.
This carries a noticeable speed penalty, but some games may
require this more precise timing to work correctly.
* "Sync audio output to emulation"
When enabled, the emulator will synchronize audio output with
the rest of the emulation. In general, this improves audio/video
synchronization but causes more frequent audio dropouts (or
"popping") when the emulator runs more slowly than real time.
However, the exact effect of this option can vary:
- When disabled, the audio can get ahead of the video if the
emulator is running slowly; this can be seen, for example,
in the Saturn BIOS startup animation. On the other hand,
game code that uses the audio output speed for timing (such
as the movie player in Panzer Dragoon Saga) can actually run
faster with synchronization disabled. MIDI-style background
music will also play more smoothly, though of course the
music tempo will slow down depending on the emulation speed.
- When enabled, the audio output will match the output of a
real Saturn much more closely. In particular, this option
is needed to avoid popping in streamed audio such as Red
Book audio tracks when the emulator runs at full speed
(60fps). On the flip side, the audio will momentarily drop
out (as described above) whenever the emulator takes more
than 1/60th of a second to process an emulated frame.
This option is enabled by default.
* "Sync Saturn clock to emulation"
When enabled, the Saturn's internal clock is synchronized with
the emulation, rather than following real time regardless of
emulation speed. If the emulator is running slow, for example,
this option will slow the Saturn's clock down to match the speed
at which the emulator is running. This option is enabled by
default.
* "Always start from 1998-01-01 12:00"
When enabled, the Saturn's internal clock will always be
initialized to 12:00 noon on January 1, 1998, rather than the
current time when the emulator starts. When used with the clock
sync option above, this is useful in debugging because it ensures
a consistent environment each time the emulator is started.
Outside of debugging, however, there is usually no reason to
enable this option.
* "Save options"
Save the current settings, so Yabause will use them automatically the
next time you start it up.
* "Reset emulator"
Reset the emulator, as though you had pressed the Saturn's RESET
button. To avoid accidentally resetting the emulator, you must hold
the PSP's L and R buttons while selecting this option.
Pressing Select on any menu screen will exit the menu and return to the
Saturn emulation.
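The frame-skip and "Show FPS" arithmetic described under "Configure video
options" above can be sketched in C as follows. This is only an
illustration of the formulas given in the text, not code taken from
Yabause; the function names and the NATIVE_FPS constant are made up for
this example.

```c
/* Nominal display rate assumed by this sketch (60 frames per second). */
#define NATIVE_FPS 60.0

/* Frames drawn per second for a manual "Number of frames to skip" value:
 * skipping N frames per drawn frame shows one frame out of every N+1. */
double drawn_fps(unsigned frames_to_skip)
{
    return NATIVE_FPS / (double)(frames_to_skip + 1);
}

/* Instantaneous rate from the "(Y/Z)" part of the FPS display:
 * Y emulated frames took Z sixtieths of a second, i.e. (Y/Z)*60.
 * Returns 0 if Z is 0 to avoid dividing by zero. */
double instantaneous_fps(unsigned y_frames, unsigned z_sixtieths)
{
    if (z_sixtieths == 0)
        return 0.0;
    return (double)y_frames * 60.0 / (double)z_sixtieths;
}
```

For example, a frame-skip count of 1 gives drawn_fps(1) == 30, and a
display of "(1/2)" corresponds to an instantaneous rate of 30 emulated
frames per second.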
Troubleshooting
---------------
Q: "My game runs too slowly!"
A: C'est la vie. The PSP is unfortunately just not powerful enough to
emulate the Saturn at full speed (see "Technical notes" below for the
gory details). Here are some things you can do to improve the speed of
the emulator:
* Make sure you are using the hardware video renderer (in the
"Configure video options" menu) and the SH-2 recompiler (in the
"Configure advanced settings" menu).
* Under "Configure video options" / "Configure hardware rendering
settings", turn off "Enable rotated/distorted graphics". A single
distorted background can take the equivalent of 2 to 3 frames at
60fps to render on the PSP.
* Under "Configure video options" / "Configure frame-skip settings",
set the frame-skip mode to manual and increase the number of frames
to skip. (Many games only run at 30 frames per second, so using a
frame-skip count of 1 won't actually make a visible difference
compared to a count of 0.)
* Under "Configure advanced settings" / "Select SH-2 optimizations",
make sure all optimizations are enabled.
* Under "Configure advanced settings", if "Use more precise emulation
timing" is enabled, try disabling it. (This may cause the game to
freeze or crash, however.)
* Try turning on the "Use Media Engine for emulation" option under
"Configure advanced settings" / "Configure Media Engine options", but
note that this option is experimental and may cause your game to
misbehave or even crash.
* If the Media Engine is enabled, try lowering the cache writeback
frequency in the same "Configure Media Engine options" menu.
Typically, 1/4 to 1/8 will provide a noticeable speed increase over
1/1, while 1/16 and lower are not likely to have much effect.
Q: "My game suddenly froze!"
A: Try pressing Select to open the Yabause menu.
* If the menu doesn't open, then either you've hit a bug in Yabause,
or the SH-2 optimizer has caused the program to misbehave. Restart
Yabause, then go to "Configure advanced settings" / "Select SH-2
optimizations" and disable all of the options there.
If that fixes the problem, you can then try turning the options on
one by one to find the one that caused the crash (you may need to
repeat whatever actions you performed in the game in order to
determine whether the crash occurs or not), and disable only that
option to keep the emulator running as fast as possible.
* If the menu does open, then one likely cause is a timing issue;
this can be seen, for example, when starting Dead or Alive with the
"Use more precise emulation timing" option disabled. Try enabling
this option under the "Configure advanced settings" menu and
resetting the emulator to see if it fixes the problem.
In either of the above cases, it's also possible that the game itself
has a bug. Look in FAQs or other online resources and see if any
similar problems have been reported.
Technical notes
---------------
The Saturn, like the PSOne, is only one step down in power from the PSP
itself, so full-speed emulation is a fairly difficult proposition from the
outset. To make matters worse, the Saturn's architecture is about as
different from the PSP as two modern computer architectures can be:
different primary CPUs (SH-2 versus MIPS Allegrex), big-endian byte order
(Saturn) versus little-endian (PSP), tile-based graphics (Saturn) versus
texture-based graphics (PSP), and so on. As such, Yabause must take a
number of shortcuts to make games even somewhat playable.
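As an illustration of the byte-order mismatch mentioned above, here is a
minimal C sketch of reading a big-endian (Saturn-order) 32-bit value on a
host of any endianness, plus a plain byte swap. The function names are
invented for this example, and the code is not claimed to match how
Yabause itself stores or swaps Saturn memory.

```c
#include <stdint.h>

/* Read a 32-bit big-endian (Saturn-order) value from a byte buffer.
 * Assembling the value byte by byte gives the same result on hosts of
 * either endianness, including the little-endian PSP. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reverse the byte order of a 32-bit word that was loaded natively. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

Either operation costs several host instructions per access, which is
part of the per-instruction overhead discussed in the next section.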
<<< SH-2 emulation >>>
Emulation of the Saturn's two SH-2 CPUs in particular is problematic.
These processors run at either 26 or 28 MHz, and they use a RISC-like
instruction set in which most instructions execute in one clock cycle, so
in a worst-case scenario Yabause would need to process 56 million SH-2
instructions per second--on top of sound, video, and other hardware
emulation--to maintain full speed. But the PSP's single(*) Allegrex CPU
runs at a maximum of 333MHz, meaning that the SH-2 emulator must be able to
execute each instruction (including accessing the register file, swapping
byte order in memory accesses, updating the SH-2 clock cycle counter, and
so on) within at most 6 native clock cycles for full-speed emulation. In
fact, the demands of emulating the other Saturn hardware reduce this to
something closer to 4 native clock cycles.
(*) The PSP actually has a second CPU, the Media Engine, but limitations
of the PSP architecture make it unsuitable for use as a full-fledged
second processor. See below for details.
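The cycle budget quoted above is a simple division. A minimal sketch,
using a made-up function name and the clock figures from the text:

```c
/* Host clock cycles available per emulated SH-2 instruction, assuming
 * the worst case of one instruction per SH-2 clock cycle on every CPU. */
double host_cycles_per_insn(double host_hz, double sh2_hz, int sh2_cpus)
{
    return host_hz / (sh2_hz * (double)sh2_cpus);
}
```

With a 333MHz host and two 28MHz SH-2s, host_cycles_per_insn(333e6,
28e6, 2) comes to a little under 6 cycles, before any time is set aside
for sound, video, and other hardware emulation.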
With these limitations, interpreted execution of SH-2 code is out of the
question--merely looking up the instruction handler would exhaust the
instruction's quota of execution time. For this reason, the PSP port uses
a dynamic translator to convert blocks of SH-2 code into blocks of native
MIPS code. When the emulator encounters a block of SH-2 code for the first
time, it scans through the block, generating equivalent native code for the
block which is then executed directly on the native CPU. This naturally
causes the emulator to pause for a short time when it encounters a lot of
new code at once, such as when loading a new part of a game from CD; this
is the price that must be paid for the speed of native code execution.
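The translate-on-first-use scheme described above can be sketched as a
small lookup table keyed by SH-2 address. Everything in this sketch
(the table size, the BlockEntry layout, and the dummy translate_block()
stand-in) is an assumption made for illustration; it is not the data
structure actually used by the PSP port.

```c
#include <stdint.h>

#define CACHE_SLOTS 1024u           /* hypothetical table size */

typedef struct {
    uint32_t sh2_pc;                /* start address of the SH-2 block */
    int      valid;                 /* nonzero once the slot is filled */
    uint32_t native_id;             /* stand-in for a native code pointer */
} BlockEntry;

static BlockEntry block_cache[CACHE_SLOTS];
static unsigned translate_count;    /* number of translations performed */

/* Stand-in for the translator; a real one would emit MIPS code here. */
static uint32_t translate_block(uint32_t sh2_pc)
{
    translate_count++;
    return sh2_pc ^ 0xA5A5A5A5u;    /* dummy "native code" handle */
}

/* Return the native-code handle for the block starting at sh2_pc,
 * translating it only if it is not already in its table slot. */
uint32_t lookup_block(uint32_t sh2_pc)
{
    /* SH-2 instructions are 16 bits wide, so bit 0 of the address
     * carries no information and is dropped before indexing. */
    BlockEntry *e = &block_cache[(sh2_pc >> 1) % CACHE_SLOTS];
    if (!e->valid || e->sh2_pc != sh2_pc) {
        e->sh2_pc = sh2_pc;
        e->native_id = translate_block(sh2_pc);
        e->valid = 1;
    }
    return e->native_id;
}

/* Report how many times translate_block() has run so far. */
unsigned translations_so_far(void)
{
    return translate_count;
}
```

Calling lookup_block() twice with the same address translates only once;
the pause described above corresponds to many first-time lookups
happening in quick succession.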
Even with this dynamic translation, however, there are still a number of
hurdles to fast emulation. For example:
* Every time the end of a code block is reached, the emulator must look up
the next block to execute. This lookup consumes precious cycles which do
not directly correspond to SH-2 instruction emulation (around 35 cycles
per lookup in the current version).
In order to streamline code translation and increase the optimizability
of individual blocks, the dynamic translator tends to choose minimally-
sized blocks for translation. Tests showed that this was an improvement
over an older algorithm that used larger blocks, but the resulting
overhead of block lookups imposes a limit on execution speed for certain
types of code, particularly algorithms which rely heavily on subroutine
calls.
At the other end of the spectrum, one might consider modifying a true
compiler like GCC to accept SH-2 instructions as input, then running
each code block through the compiler itself to generate native code.
This could undoubtedly produce efficient output with larger blocks, but
it would also impose significant additional overhead when translating.
* The SH-2 is unable to load arbitrary constants into registers, instead
using PC-relative accesses to load values outside the range of a MOV #imm
instruction from memory. However, Saturn programs also use PC-relative
accesses for function-local static variables, meaning that there is no
general way to tell whether a given value is actually a constant or
merely a variable that may be modified elsewhere.
This presents a particular problem in optimizing memory accesses, since
if a pointer loaded from a PC-relative address is not known to be
constant, the translated code must incur the overhead of checking the
pointer's value every time the block is executed. The SH-2 core includes
an optional optimization, SH2_OPTIMIZE_LOCAL_POINTERS, which takes the
stance that all such pointers either are constant or will always point
within the same memory region (high system RAM, VDP2 RAM, etc.). This
optimization shows a marked improvement in execution speed in some cases,
but any code which violates the assumption above will cause the emulator
to crash.
* Some games make use of self-modifying code, presumably in an attempt to
increase execution speed; one example can be found in the "light ray"
animation used in Panzer Dragoon Saga when obtaining an item. Naturally,
the use of self-modifying code has a severe impact on execution time in a
dynamic translation environment, as each modification requires every
block containing the modified instruction to be retranslated. (A similar
effect can be seen on modern x86-family CPUs, which internally translate
x86 instructions to native micro-ops for execution; self-modifying code
can slow down the processor by an order of magnitude or more.)
The SH-2 core attempts to detect frequently modified instructions and
pass them directly to the interpreter to avoid the overhead of repeated
translation, but there is unfortunately no true solution to the problem
other than rewriting the relevant part of the game program itself.
* Memory accesses are difficult to implement efficiently; in fact, the SH-2
emulator devotes over 1,000 lines of source code to handling load and
store operations, independently of the memory access handlers in the
Yabause core. The current implementation is able to handle accesses to
true RAM fairly quickly, but any access which falls back to the default
MappedMemory*() handlers incurs a significant access penalty (typically
20-30 cycles plus any handling needed for the specific address).
This is most obvious while loading data from the emulated CD, since the
game program must access a hardware register in a loop while waiting for
the CD data to be loaded, and additionally some games read CD data
directly out of the CD data register rather than using DMA to load the
data into memory. Currently, the only way to speed up such code blocks
is through handwritten translation (see src/psp/satopt-sh2.c).
Patches to either speed up specific games or to improve the translation
algorithm generally are of course welcome.
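To illustrate the difference between the fast RAM path and the fallback
handlers mentioned in the last item above, here is a minimal sketch of a
32-bit access routine that handles high work RAM (1MB at 0x06000000)
inline and sends everything else to a generic handler. The slow_read32()
and slow_write32() functions are hypothetical stand-ins, not the
MappedMemory*() functions themselves, and the sketch ignores address
mirrors and the other RAM regions.

```c
#include <stdint.h>

#define HIGH_RAM_BASE 0x06000000u   /* Saturn high work RAM */
#define HIGH_RAM_SIZE 0x00100000u   /* 1 MiB */

/* RAM contents, kept in Saturn (big-endian) byte order for simplicity. */
static uint8_t high_ram[HIGH_RAM_SIZE];
static unsigned slow_path_calls;

/* Hypothetical generic handlers covering I/O registers and so on. */
static uint32_t slow_read32(uint32_t addr)
{
    (void)addr;
    slow_path_calls++;
    return 0;
}

static void slow_write32(uint32_t addr, uint32_t value)
{
    (void)addr;
    (void)value;
    slow_path_calls++;
}

uint32_t sh2_read32(uint32_t addr)
{
    uint32_t off = addr - HIGH_RAM_BASE;    /* wraps if addr < base */
    if (off <= HIGH_RAM_SIZE - 4) {         /* fast path: plain RAM */
        const uint8_t *p = &high_ram[off];
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    return slow_read32(addr);               /* everything else */
}

void sh2_write32(uint32_t addr, uint32_t value)
{
    uint32_t off = addr - HIGH_RAM_BASE;    /* wraps if addr < base */
    if (off <= HIGH_RAM_SIZE - 4) {         /* fast path: plain RAM */
        uint8_t *p = &high_ram[off];
        p[0] = (uint8_t)(value >> 24);
        p[1] = (uint8_t)(value >> 16);
        p[2] = (uint8_t)(value >> 8);
        p[3] = (uint8_t)value;
        return;
    }
    slow_write32(addr, value);              /* everything else */
}

/* Report how many accesses fell through to the slow handlers. */
unsigned slow_calls(void)
{
    return slow_path_calls;
}
```

A polling loop on a hardware register takes the slow path on every
iteration, which is why such loops dominate the time spent while loading
from CD.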
<<< Use of the Media Engine >>>
Aside from the two SH-2 cores, a third major consumer of CPU time is the
SCSP, the Saturn's sound processor, and particularly the MC68EC000
("68k") CPU used therein. While most games don't run particularly complex
code on the 68k, it is nonetheless a proper CPU in its own right, and
requires a fair amount of time to emulate; multi-channel FM background
music takes time to generate as well. Currently, the PSP port of Yabause
has the ability to make use of the PSP's Media Engine CPU to process 68k
instructions and audio generation in parallel with the rest of the
emulation, but this use of the Media Engine is a considerable departure
from Sony's design and thus a risky endeavor.
The primary difficulty with using the ME as a "second core" in the sense
of the multi-core processors used in PCs is that of cache coherency.
Unlike generic multiprocessor or multi-core systems, the PSP's two CPUs
do not implement cache coherency; this means that neither CPU knows what
the other CPU has in its cache, and one CPU may inadvertently clobber the
other's changes, causing stores to memory to get lost. As an example,
consider these two simple loops, operating in parallel on a two-element
array initialized to {1,1} that resides in a single cache line:
    Core 1                          Core 2
    ------                          ------
    for (;;) {                      for (;;) {
        array[0] += array[1];           array[1] += array[0];
    }                               }
This illustrates two problems caused by the lack of cache coherency:
* On a cache-coherent (or single-core) system, the two array elements
will increase unpredictably as each loop sees the updated value stored
by the other loop. On the PSP, however, each element will increase by
exactly 1 per iteration of its own loop; once each CPU loads the cache
line, it never sees any stores performed by the other CPU, because
accesses to the array always hit the cache.
* On a cache-coherent system, if the cache line is flushed to memory, it
will always contain the current values of both array elements. On the
PSP, however, the array element _not_ updated by the flushing CPU will
be written with the same value it had when the cache line was loaded
by that CPU. In particular, if the other CPU had already flushed the
cache line, that change will be clobbered--for example (here "SC" is
the main CPU and "ME" is the Media Engine):
    Time   Operation    SC cache   ME cache   Memory   Desired
    ----   ----------   --------   --------   ------   -------
     T1    Initialize    {1,1}      {1,1}     {1,1}     {1,1}
     T2    SC flush      {A,1}      {1,B}     {A,1}     {A,B}
     T3    ME flush      {C,1}      {1,D}     {1,D}     {C,D}
Note that at no time after initialization are the contents of memory
correct, and in particular, the value "A" written by the SC is lost
when the ME flushes {1,D} from its cache, even though the ME loop
never actually modified that array element.
In order for Yabause to have even a hope of stable operation, therefore,
the use of both CPUs' caches must be carefully controlled to avoid data
loss.
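The clobbering sequence in the table above can be reproduced on any
machine with a small model in which each CPU keeps a private copy of the
cache line. This is a simulation written for illustration only; the type
and function names are invented, and no PSP cache functions are involved.

```c
typedef struct { int elem[2]; } Line;   /* one cache line holding the array */

typedef struct {
    Line memory;    /* contents of main RAM */
    Line sc;        /* the SC's private cached copy */
    Line me;        /* the ME's private cached copy */
} System;

/* T1: memory holds {1,1}, and each CPU loads the line exactly once. */
void sys_init(System *s)
{
    s->memory.elem[0] = 1;
    s->memory.elem[1] = 1;
    s->sc = s->memory;
    s->me = s->memory;
}

/* One loop iteration on each CPU; both touch only their own copy. */
void sc_step(System *s) { s->sc.elem[0] += s->sc.elem[1]; }
void me_step(System *s) { s->me.elem[1] += s->me.elem[0]; }

/* Writing back a line overwrites the whole line in memory, including
 * the element that this CPU never modified. */
void sc_flush(System *s) { s->memory = s->sc; }
void me_flush(System *s) { s->memory = s->me; }
```

Running two SC iterations and one ME iteration, then flushing the SC
followed by the ME, leaves memory holding {1,2}: the SC's value of 3 for
the first element is overwritten by the stale 1 in the ME's copy of the
line.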
When use of the Media Engine is enabled, the following steps are taken
to avoid data corruption due to the lack of cache coherency:
* SCSP state variables used for inter-thread communication are divided into
separate, 64-byte (cache-line) aligned data sections, based on which
thread (the main Yabause thread, running on the SC, or the SCSP thread,
running on the ME) writes to them.
* SCSP state variables are accessed using uncached (0x4nnnnnnn) addresses
in two cases: when _reading_ data written by the other CPU (to avoid an
old value getting stuck in the cache), and when _writing_ data which is
also written by the other CPU (to avoid the cache line clobbering problem
described above).
* Sound RAM is accessed _with_ caching (except in one case described
below), because forcing every sound RAM access through an uncached
pointer causes significant slowdown. Instead, cached CPU data is written
back to RAM at strategic points.
* The SC's data cache is flushed (written back and invalidated) immediately
before waiting for the SCSP thread to finish processing, e.g. for
ScspReset(). The data cache is written back on every ScspExec() call
(though the writeback frequency may be reduced through the configuration
menu), but it is _not_ flushed for performance reasons; instead, sound
RAM read accesses from the SC are made through uncached addresses, as
with SCSP state variables above.
* The ME's data cache is flushed after each iteration of the SCSP thread
loop. This flushing is not coded directly into scsp.c, but instead
takes place in the YabThreadYield() and YabThreadSleep() implementations.
(These functions are naturally meaningless on the ME, but since the SCSP
thread calls one or the other at the end of each loop, it's a convenient
place to flush the cache.)
* The 68k state block, along with dynamically-generated native code when
dynamic translation is enabled, is stored in a separately allocated pool
and managed with custom memory allocation functions (local_malloc() and
friends in psp-m68k.c), since the standard memory management functions
are not designed to work with the ME and would likely cause a crash due
to cache desynchronization.
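Two of the techniques listed above, uncached address aliases and
cache-line-aligned data sections, can be sketched as follows. The
structure layout and all names here are hypothetical and chosen only to
show the idea; the actual SCSP state variables are organized differently.
The sketch uses C11 _Alignas; with older compilers, GCC's
__attribute__((aligned(64))) serves the same purpose.

```c
#include <stddef.h>
#include <stdint.h>

#define UNCACHED_BIT 0x40000000u    /* selects the 0x4nnnnnnn alias */
#define CACHE_LINE   64             /* cache line size in bytes */

/* Convert a cached PSP address to its uncached alias. */
uint32_t uncached_alias(uint32_t addr)
{
    return addr | UNCACHED_BIT;
}

/* Hypothetical shared state: variables are grouped by which CPU writes
 * them, and each group starts on its own cache line, so that a
 * writeback by one CPU can never overwrite the other CPU's group. */
typedef struct {
    _Alignas(CACHE_LINE) struct {
        uint32_t command;
        uint32_t cycles_requested;
    } written_by_sc;
    _Alignas(CACHE_LINE) struct {
        uint32_t status;
        uint32_t cycles_done;
    } written_by_me;
} SharedState;
```

With this layout, offsetof(SharedState, written_by_me) is a multiple of
64, so the two groups never share a cache line.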
In general, using the ME provides a moderate speed improvement (10-15%) to
overall emulation speed. There are, however, some cases in which the lack
of cache coherency could cause games to misbehave or even crash Yabause:
* If a game writes (from the SH-2) to a portion of sound RAM containing 68k
program code while the 68k is executing, the 68k may execute incorrect
code, or the dynamic translation memory pool may be corrupted. Normally,
games should only load code while the 68k is stopped, but there may be
cases when the SH-2 writes to a variable in sound RAM which is located in
the same region as 68k code, thus triggering this issue.
* Games which rely on the precise relative timing of the SH-2 and 68k
processors are likely to fail in any multithreaded emulator, but are more
likely to fail when using the ME due to delays in data being written out
from the data caches.