PSP-Specific Yabause Documentation
==================================

Important notice
----------------

PSP support for Yabause is experimental; please be aware that some things may not work well (or at all).

Unlike Yabause 0.9.10, this version of Yabause now works on all PSPs, including the original PSP-1000 ("Phat"). However, some games may run more slowly on PSP Phats because of the limited amount of memory available for caching dynamically-translated program code.

Installing from a binary distribution
-------------------------------------

The yabause-X.Y.Z.zip archive contains a "PSP" directory (folder); copy this into the root directory of your Memory Stick. (On Windows, for example, your Memory Stick might show up as the drive F: -- in this case, drag the "PSP" folder from the ZIP archive onto the "F:" drive icon in Windows Explorer.)

The "PSP" directory contains a directory called "GAME", which in turn contains a directory called "YABAUSE". Inside the "YABAUSE" directory are two files named "EBOOT.PBP" and "ME.PRX"; these are the program files used by Yabause, like .EXE and .DLL files on Windows. You'll also need to copy your CD image and other data files to this directory on your Memory Stick (see below).

Once you've copied Yabause to your Memory Stick, skip to "How to use Yabause" below.

Installing from source
----------------------

To build Yabause for PSP from the source code, you'll need a recent (at least SVN r2450(*)) copy of the unofficial PSP SDK from http://ps2dev.org, along with the toolchain from the same site; Gentoo Linux users can also download a Portage overlay from http://achurch.org/portage-psp.tar.bz2 and "emerge pspsdk". Ensure that the PSP toolchain (psp-gcc) and tools (psp-prxgen, etc.) are in your $PATH, then configure Yabause with:

    ./configure --host=psp [options...]

(*) Note that the PSP SDK headers and libraries are, at least through r2493, missing some functions required by Yabause.
If you get errors about the functions sceKernelIcacheInvalidateAll or sceKernelIcacheInvalidateRange, apply the patch found in src/psp/icache-funcs-2450.patch to the PSP SDK source, recompile and reinstall it, then rebuild Yabause. This patch is already included if you build the SDK from the Gentoo Portage overlay.

You can ignore the warning about the --build option that appears when you start the configure script. You may also see a warning about "using cross tools not prefixed with host triplet"; you can usually ignore this as well, but if you get strange build errors related to libraries like SDL or OpenGL, try disabling the optional libraries with the options "--without-sdl" and "--without-opengl".

The following additional options can be used when configuring for PSP:

    --enable-psp-debug
        Enables printing of debug messages to standard error.

    --enable-psp-profile
        Enables printing of profiling statistics to standard error. By default, statistics are output every 100 frames; edit src/psp/main.c to change this. Note that profiling has a significant impact on emulation speed.

    --with-psp-me-test
        Builds an additional program, "me-test.prx", which tests the functionality of the Media Engine access library included with Yabause. Only useful for debugging or extending the library.

Note that if you build with optimization disabled (-O0) or at too low a level, you may get compilation errors in src/psp/satopt-sh2.c. -O3 is recommended; set this flag in the CFLAGS environment variable before running the "configure" script. For example, if you use the "bash" shell:

    CFLAGS=-O3 ./configure --host=psp [options...]

After the configure script completes, run "make" to build Yabause. The build process will create the EBOOT.PBP and me.prx (note that the latter is lowercase) files in the src/psp/ subdirectory; create a directory for Yabause under /PSP/GAME on your memory stick (e.g. /PSP/GAME/YABAUSE) and copy the files there.
How to use Yabause (PSP-specific notes)
---------------------------------------

All files you intend to use with Yabause (BIOS images, CD images, backup RAM images) must be stored in the same directory as the EBOOT.PBP and ME.PRX files mentioned above. The default filenames used by Yabause are as follows:

    BIOS.BIN    -- BIOS image
    CD.ISO      -- CD image (can also be a *.CUE file)
    BACKUP.BIN  -- Backup RAM image (will be created if it does not exist)

You can choose other files from the Yabause configuration menu, which is displayed the first time you start Yabause and can also be brought up at any time by pressing the Select button; see below for details. If you do not already have a backup RAM image, just leave the backup RAM filename at its default setting, and the file will be created the first time backup RAM is saved.

The directional pad and analog stick can both be used to emulate the Saturn controller's directional pad. The default button controls (Saturn button -- PSP button) are as follows:

    Start  -- Start
    A      -- Cross
    B      -- Circle
    C      -- (unassigned)
    X      -- Square
    Y      -- Triangle
    Z      -- (unassigned)
    L      -- L
    R      -- R

Button controls can be changed via the configuration menu.

The Yabause PSP configuration menu
----------------------------------

When you first run Yabause, the configuration menu will be displayed, allowing you to choose the CD image you want to run and configure other Yabause options. You can also press Select while the emulator is running to bring up the menu; the emulator will remain paused while you have the menu open.

The main menu contains six options:

  * "Configure general options..."

    This opens a submenu with the following options:

      * "Start emulator immediately"

        When enabled, the emulator will start running immediately when you load Yabause, instead of showing the configuration menu.

      * "Select BIOS/CD/backup files..."

        This opens a submenu which allows you to select the files containing the BIOS image, CD image, and backup data you want to use.
Selecting one of the three options will open a file selector, allowing you to choose any file in the Yabause directory on your Memory Stick. Note that changing any of the files will reset the emulator. * "Auto-save backup RAM" When enabled, automatically saves the contents of backup RAM to your Memory Stick whenever you save your game in the emulator. The emulator will display "Backup RAM saved." on the screen for a short time when an autosave occurs. Note that the emulator may pause for a fraction of a second while autosaving. This option is enabled by default. Be aware that backup RAM is _not_ saved to the Memory Stick when you quit Yabause; if you disable this option, you need to manually save it using the "Save backup RAM now" option when appropriate. * "Save backup RAM now" Immediately saves the contents of backup RAM to your Memory Stick. If you have auto-save disabled, you should use this option to save backup RAM before quitting Yabause. * "Save backup RAM as..." Allows you to enter a new filename (using the PSP's built-in on-screen keyboard) for the backup RAM save file. This can be useful if you want to keep separate backup RAM files for different games, or if you want to save more slots than a game normally allows. Yabause will immediately save backup RAM to the filename you enter, and will also use that filename when later auto-saving backup RAM (or when you manually use "Save backup RAM now"). However, the new filename will only be used until you quit Yabause, unless you select "Save options" on the main menu. Note that the emulator will _not_ be reset when you use this option, so you can feel free to select it while playing a game. (However, don't select it while the game is in the middle of loading or saving, as this can corrupt backup RAM -- just as if you tried to remove the PSP's Memory Stick while saving a game on your PSP.) NOTE: For reasons currently unknown, the top part of the on-screen keyboard display may flicker or appear corrupted. 
However, text can be entered as usual. * "Configure controller buttons..." This opens a submenu which allows you to configure which PSP button corresponds to which button on the emulated Saturn controller. Pressing one of the Circle, Cross, Triangle, or Square buttons on the PSP will assign that button to the currently selected Saturn controller button. The PSP's Start, L, and R buttons are always assigned to the same-named buttons on the Saturn controller, and cannot be changed. Since both the Circle and Cross buttons are used for button assignment, the Start button is used to return to the main menu. * "Configure video options..." This opens a submenu with the following options: * "Use hardware video renderer" / "Use software video renderer" These options allow you to choose between the PSP-specific hardware renderer and the default software renderer built into Yabause for displaying Saturn graphics. The hardware renderer is significantly faster; for simple 2-D graphics, it can run at a full 60fps without frame skipping (if the game program itself can be emulated quickly enough). However, a number of more complex graphics features are not supported, so if a game does not display correctly, try using the software renderer instead. The selected renderer can be changed while the emulator is running without disturbing your game in progress. However, changing the renderer may cause the screen to blank out or display corrupted graphics for a short time. * "Configure hardware rendering settings..." This option opens another submenu which allows you to change certain aspects of the hardware video renderer's behavior: * "Aggressively cache pixel data" When enabled, Yabause will try to store a copy of all graphic data in the PSP's native pixel format, to speed up drawing. However, Yabause may not always notice when the data is changed, causing incorrect graphics to appear. 
            (This can be fixed by disabling the option, exiting the menu for a moment, then re-enabling the option.) When disabled, all graphics are redrawn from the Saturn data every frame. This option is enabled by default.

          * "Smooth textures and sprites"

            When enabled, smoothing (antialiasing) is applied to all 3-D textures and sprites drawn on the screen. This can make 3-D environments look smoother than on a real Saturn, but it will also cause zoomed sprites to look blurry, which may not be the game's intended behavior.

          * "Smooth high-resolution graphics"

            When enabled, high-resolution graphics (which ordinarily would not fit on the PSP's screen) are displayed by averaging adjacent pixels to give a smoother look to the display; this can particularly help in reading small text on a high-resolution screen. However, this smoothing is significantly slower than the default method of just skipping every second pixel.

          * "Enable rotated/distorted graphics"

            Selects whether to display rotated or distorted graphics at all. Most such graphics cannot be rendered by the PSP's hardware, so Yabause has to draw them in software, which can be a major source of slowdown. Disabling this option will turn such graphics off entirely. This option is enabled by default.

          * "Optimize rotated/distorted graphics"

            When enabled, Yabause will try to detect certain types of rotated or distorted graphics which can be approximated by PSP hardware operations such as 3D transformations, and use the PSP's hardware to draw them quickly. However, this will often result in graphics that look different from the game as played on an actual Saturn, so this option can be used to disable the optimizations and draw the graphics more accurately (at the expense of speed). This option is enabled by default.

        Note that none of the above options have any effect when the software video renderer is in use.

      * "Configure frame-skip settings..."
This option opens another submenu which allows you to configure the hardware renderer's frame-skip behavior: * "Frame-skip mode" This option is intended to allow you to switch between manual setting and automatic adjustment of frame-skip parameters. However, automatic mode is not yet implemented, so always leave this set on "Manual". * "Number of frames to skip" In Manual mode, sets the number of frames to skip for every frame drawn. 0 means "draw every frame", 1 means "draw every second frame" (skip 1 frame for every frame drawn), and so on. * "Limit to 30fps for interlaced display" Always skip at least one frame when drawing interlaced (high-resolution) screens. Has no effect unless the number of frames to skip is set to zero. This option is enabled by default. * "Halve framerate for rotated backgrounds" Reduce the frame rate by half (in other words, skip every second frame that would otherwise be drawn) when rotated or distorted background graphics are displayed. Since rotation and distortion take a long time to process on the PSP, this option can help keep games playable even when they make use of these Saturn hardware features. This option is enabled by default. Note that this option does not apply to rotated or distorted graphics which are displayed using an optimized algorithm (see the "Optimize rotated/distorted graphics" option above). Frame skipping is not supported by the software renderer, so none of these options will have any effect when the software renderer is in use. * "Show FPS" When enabled, the emulator's current speed in emulated frames per second (FPS) will be displayed in the upper-right corner of the screen as "FPS: XX.X (Y/Z)". The number "XX.X" is the average frame rate, calculated from the last few seconds of emulation; "Y" shows the number of Saturn frames emulated since the previous frame was shown, while "Z" is the actual time that passed in 60ths of a second. (Thus, the instantaneous frame rate can be calculated as (Y/Z)*60.) 
This option has no effect when the software renderer is in use. * "Configure advanced settings..." This opens a submenu with the following options: * "Use SH-2 recompiler" This option allows you to choose between the default SH-2 core, which recompiles Saturn SH-2 code into native MIPS code for the PSP, and the SH-2 interpreter built into Yabause. The SH-2 interpreter is much slower, often by an order of magnitude or more, so there is generally no reason to disable this option unless you suspect a bug in the recompiler. Note that changing this option will reset the emulator. As with "Reset emulator" on the main menu, you must hold L and R while changing this option to avoid an accidental reset. * "Select SH-2 optimizations..." This option opens up another submenu which allows you to turn on or off certain optimizations used by the SH-2 recompiler. These are shortcuts taken by the recompiler to allow games to run more quickly, but in rare cases they can cause games to misbehave or even crash. If a game doesn't work correctly, turning one or more of these options off may fix it. These options can be changed while the emulator is running without disturbing your game in progress. However, changing them causes the emulator to clear out any recompiled code it has in memory, so the game may run slowly for a short time after exiting the menu as the emulator recompiles SH-2 code using the new options. All optimizations are enabled by default. * "Configure Media Engine options..." This option opens up another submenu with options for configuring the Media Engine: * "Use Media Engine for emulation" Enables the use of the PSP's Media Engine CPU to handle part of the emulation in parallel with the main CPU. This can provide a moderate boost to emulation speed; however, since the Media Engine is not designed for this sort of parallel processing, some games may behave incorrectly or even crash. As such, this option is still considered experimental; use it at your own risk. 
IMPORTANT: It is not currently possible to suspend the PSP while the Media Engine is in use. If you start Yabause with the Media Engine enabled, the "suspend" function of the PSP's power switch will be disabled, so you must save your game inside the emulator and exit Yabause before putting the PSP into suspend mode. This option only takes effect when Yabause is started, so if you change it, make sure you select "Save options" in the main menu and then quit and restart Yabause. * "Cache writeback frequency" Sets the frequency at which the main CPU and Media Engine caches are synchronized, relative to the frequency of code execution on the Media Engine. The default frequency of 1/1 is safest; lower frequencies (1/2, 1/4, and so on) can increase emulation speed, but are also more likely to cause sound glitches, crashes, or other incorrect behavior depending on the particular game. However, adjusting the size of the write-through region (see below) can mitigate these problems for some games. Naturally, this option has no effect if the Media Engine is not being used for emulation. * "Sound RAM write-through region" Sets the size of the region at the beginning of sound RAM which is written through the PSP's cache. Writing through the cache is an order of magnitude slower than normal operation, so setting this to a large value can slow down games significantly. However, most games only use a small portion of sound RAM for communication with the sound CPU, so by tuning this value appropriately, you may be able to reduce the cache writeback frequency (see above) while still getting stable operation. From experimentation, a value of 2k seems to work well for some games. Naturally, this option has no effect if the Media Engine is not being used for emulation. * "Use more precise emulation timing" When enabled, the emulator will keep the various parts of the emulated Saturn hardware more precisely in sync with each other. 
This carries a noticeable speed penalty, but some games may require this more precise timing to work correctly. * "Sync audio output to emulation" When enabled, the emulator will synchronize audio output with the rest of the emulation. In general, this improves audio/video synchronization but causes more frequent audio dropouts (or "popping") when the emulator runs more slowly than real time. However, the exact effect of this option can vary: - When disabled, the audio can get ahead of the video if the emulator is running slowly; this can be seen, for example, in the Saturn BIOS startup animation. On the other hand, game code that uses the audio output speed for timing (such as the movie player in Panzer Dragoon Saga) can actually run faster with synchronization disabled. MIDI-style background music will also play more smoothly, though of course the music tempo will slow down depending on the emulation speed. - When enabled, the audio output will match the output of a real Saturn much more closely. In particular, this option is needed to avoid popping in streamed audio such as Red Book audio tracks when the emulator runs at full speed (60fps). On the flip side, the audio will momentarily drop out (as described above) whenever the emulator takes more than 1/60th of a second to process an emulated frame. This option is enabled by default. * "Sync Saturn clock to emulation" When enabled, the Saturn's internal clock is synchronized with the emulation, rather than following real time regardless of emulation speed. If the emulator is running slow, for example, this option will slow the Saturn's clock down to match the speed at which the emulator is running. This option is enabled by default. * "Always start from 1998-01-01 12:00" When enabled, the Saturn's internal clock will always be initialized to 12:00 noon on January 1, 1998, rather than the current time when the emulator starts. 
        When used with the clock sync option above, this is useful in debugging because it ensures a consistent environment each time the emulator is started. Outside of debugging, however, there is usually no reason to enable this option.

  * "Save options"

    Save the current settings, so Yabause will use them automatically the next time you start it up.

  * "Reset emulator"

    Reset the emulator, as though you had pressed the Saturn's RESET button. To avoid accidentally resetting the emulator, you must hold the PSP's L and R buttons while selecting this option.

Pressing Select on any menu screen will exit the menu and return to the Saturn emulation.

Troubleshooting
---------------

Q: "My game runs too slowly!"

A: C'est la vie. The PSP is unfortunately just not powerful enough to emulate the Saturn at full speed (see "Technical notes" below for the gory details). Here are some things you can do to improve the speed of the emulator:

  * Make sure you are using the hardware video renderer (in the "Configure video options" menu) and the SH-2 recompiler (in the "Configure advanced settings" menu).

  * Under "Configure video options" / "Configure hardware rendering settings", turn off "Enable rotated/distorted graphics". A single distorted background can take the equivalent of 2 to 3 frames at 60fps to render on the PSP.

  * Under "Configure video options" / "Configure frame-skip settings", set the frame-skip mode to manual and increase the number of frames to skip. (Many games only run at 30 frames per second, so using a frame-skip count of 1 won't actually make a visible difference compared to a count of 0.)

  * Under "Configure advanced emulation options" / "Select SH-2 optimizations", make sure all optimizations are enabled.

  * Under "Configure advanced emulation options", if "Use more precise emulation timing" is enabled, try disabling it, since precise timing carries a noticeable speed penalty. (Disabling it may cause the game to freeze or crash, however.)
  * Try turning on the "Use Media Engine for emulation" option in the "Configure advanced emulation options" menu, but note that this option is experimental and may cause your game to misbehave or even crash.

  * If the Media Engine is enabled, try lowering the cache writeback frequency in the "advanced emulation options" menu. Typically, 1/4 to 1/8 will provide a noticeable speed increase over 1/1, while 1/16 and lower are not likely to have much effect.

Q: "My game suddenly froze!"

A: Try pressing Select to open the Yabause menu.

  * If the menu doesn't open, then either you've hit a bug in Yabause, or the SH-2 optimizer has caused the program to misbehave. Restart Yabause, then go to the "Configure advanced emulation options" / "Select SH-2 optimizations" and disable all of the options there. If that fixes the problem, you can then try turning the options on one by one to find the one that caused the crash (you may need to repeat whatever actions you performed in the game in order to determine whether the crash occurs or not), and disable only that option to keep the emulator running as fast as possible.

  * If the menu does open, then one likely cause is a timing issue; this can be seen, for example, when starting Dead or Alive with the "Use more precise emulation timing" option disabled. Try enabling this option under the "Configure advanced emulation options" menu and resetting the emulator to see if it fixes the problem.

   In either of the above cases, it's also possible that the game itself has a bug. Look in FAQs or other online resources and see if any similar problems have been reported.

Technical notes
---------------

The Saturn, like the PSOne, is only one step down in power from the PSP itself, so full-speed emulation is a fairly difficult proposition from the outset.
To make matters worse, the Saturn's architecture is about as different from the PSP as two modern computer architectures can be: different primary CPUs (SH-2 versus MIPS Allegrex), big-endian byte order (Saturn) versus little-endian (PSP), tile-based graphics (Saturn) versus texture-based graphics (PSP), and so on. As such, Yabause must take a number of shortcuts to make games even somewhat playable. <<< SH-2 emulation >>> Emulation of the Saturn's two SH-2 CPUs in particular is problematic. These processors run at either 26 or 28 MHz, and they use a RISC-like instruction set in which most instructions execute in one clock cycle, so in a worst-case scenario Yabause would need to process 56 million SH-2 instructions per second--on top of sound, video, and other hardware emulation--to maintain full speed. But the PSP's single(*) Allegrex CPU runs at a maximum of 333MHz, meaning that the SH-2 emulator must be able to execute each instruction (including accessing the register file, swapping byte order in memory accesses, updating the SH-2 clock cycle counter, and so on) within at most 6 native clock cycles for full-speed emulation. In fact, the demands of emulating the other Saturn hardware reduce this to something closer to 4 native clock cycles. (*) The PSP actually has a second CPU, the Media Engine, but limitations of the PSP architecture make it unsuitable for use as a full-fledged second processor. See below for details. With these limitations, interpreted execution of SH-2 code is out of the question--merely looking up the instruction handler would exhaust the instruction's quota of execution time. For this reason, the PSP port uses a dynamic translator to convert blocks of SH-2 code into blocks of native MIPS code. When the emulator encounters a block of SH-2 code for the first time, it scans through the block, generating equivalent native code for the block which is then executed directly on the native CPU. 
This naturally causes the emulator to pause for a short time when it encounters a lot of new code at once, such as when loading a new part of a game from CD; this is the price that must be paid for the speed of native code execution. Even with this dynamic translation, however, there are still a number of hurdles to fast emulation. For example: * Every time the end of a code block is reached, the emulator must look up the next block to execute. This lookup consumes precious cycles which do not directly correspond to SH-2 instruction emulation (around 35 cycles per lookup in the current version). In order to streamline code translation and increase the optimizability of individual blocks, the dynamic translator tends to choose minimally- sized blocks for translation. Tests showed that this was an improvement over an older algorithm that used larger blocks, but the resulting overhead of block lookups imposes a limit on execution speed for certain types of code, particularly algorithms which rely heavily on subroutine calls. At the other end of the spectrum, one might consider modifying a true compiler like GCC to accept SH-2 instructions as input, then running each code block through the compiler itself to generate native code. This could undoubtedly produce efficient output with larger blocks, but it would also impose significant additional overhead when translating. * The SH-2 is unable to load arbitrary constants into registers, instead using PC-relative accesses to load values outside the range of a MOV #imm instruction from memory. However, Saturn programs also use PC-relative accesses for function-local static variables, meaning that there is no general way to tell whether a given value is actually a constant or merely a variable that may be modified elsewhere. 
This presents a particular problem in optimizing memory accesses, since if a pointer loaded from a PC-relative address is not known to be constant, the translated code must incur the overhead of checking the pointer's value every time the block is executed. The SH-2 core includes an optional optimization, SH2_OPTIMIZE_LOCAL_POINTERS, which takes the stance that all such pointers either are constant or will always point within the same memory region (high system RAM, VDP2 RAM, etc.). This optimization shows a marked improvement in execution speed in some cases, but any code which violates the assumption above will cause the emulator to crash. * Some games make use of self-modifying code, presumably in an attempt to increase execution speed; one example can be found in the "light ray" animation used in Panzer Dragoon Saga when obtaining an item. Naturally, the use of self-modifying code has a severe impact on execution time in a dynamic translation environment, as each modification requires every block containing the modified instruction to be retranslated. (A similar effect can be seen on modern x86-family CPUs, which internally translate x86 instructions to native micro-ops for execution; self-modifying code can slow down the processor by an order of magnitude or more.) The SH-2 core attempts to detect frequently modified instructions and pass them directly to the interpreter to avoid the overhead of repeated translation, but there is unfortunately no true solution to the problem other than rewriting the relevant part of the game program itself. * Memory accesses are difficult to implement efficiently; in fact, the SH-2 emulator devotes over 1,000 lines of source code to handling load and store operations, independently of the memory access handlers in the Yabause core. 
The current implementation is able to handle accesses to true RAM fairly quickly, but any access which falls back to the default MappedMemory*() handlers incurs a significant access penalty (typically 20-30 cycles plus any handling needed for the specific address). This is most obvious while loading data from the emulated CD, since the game program must access a hardware register in a loop while waiting for the CD data to be loaded, and additionally some games read CD data directly out of the CD data register rather than using DMA to load the data into memory. Currently, the only way to speed up such code blocks is through handwritten translation (see src/psp/satopt-sh2.c). Patches to either speed up specific games or to improve the translation algorithm generally are of course welcome. <<< Use of the Media Engine >>> Aside from the two SH-2 cores, a third major consumer of CPU time is the SCSP, the Saturn's sound processor, and particularly the MC68EC000 ("68k") CPU used therein. While most games don't run particularly complex code on the 68k, it is nonetheless a proper CPU in its own right, and requires a fair amount of time to emulate; multi-channel FM background music takes time to generate as well. Currently, the PSP port of Yabause has the ability to make use of the PSP's Media Engine CPU to process 68k instructions and audio generation in parallel with the rest of the emulation, but this use of the Media Engine is a considerable departure from Sony's design and thus a risky endeavor. The primary difficulty with using the ME as a "second core" in the sense of the multi-core processors used in PCs is that of cache coherency. Unlike generic multiprocessor or multi-core systems, the PSP's two CPUs do not implement cache coherency; this means that neither CPU knows what the other CPU has in its cache, and one CPU may inadvertently clobber the other's changes, causing stores to memory to get lost. 
As an example, consider these two simple loops, operating in parallel on a two-element array initialized to {1,1} that resides in a single cache line:

    Core 1                       Core 2
    ------                       ------
    for (;;) {                   for (;;) {
        array[0] += array[1];        array[1] += array[0];
    }                            }

This illustrates two problems caused by the lack of cache coherency:

  * On a cache-coherent (or single-core) system, the two array elements will increase unpredictably as each loop sees the updated value stored by the other loop. On the PSP, however, each element will simply increase by 1 per iteration; once each CPU loads the cache line, it never sees any stores performed by the other CPU, because accesses to the array always hit the cache.

  * On a cache-coherent system, if the cache line is flushed to memory, it will always contain the current values of both array elements. On the PSP, however, the array element _not_ updated by the flushing CPU will be written with the same value it had when the cache line was loaded by that CPU. In particular, if the other CPU had already flushed the cache line, that change will be clobbered--for example (here "SC" is the main CPU and "ME" is the Media Engine):

        Time  Operation   SC cache  ME cache  Memory  Desired
        ----  ----------  --------  --------  ------  -------
        T1    Initialize  {1,1}     {1,1}     {1,1}   {1,1}
        T2    SC flush    {A,1}     {1,B}     {A,1}   {A,B}
        T3    ME flush    {C,1}     {1,D}     {1,D}   {C,D}

    Note that at no time after initialization are the contents of memory correct, and in particular, the value "A" written by the SC is lost when the ME flushes {1,D} from its cache, even though the ME loop never actually modified that array element.

In order for Yabause to have even a hope of stable operation, therefore, the use of both CPUs' caches must be carefully controlled to avoid data loss.
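
The clobbering sequence described above can be reproduced on any host with a toy model. The following is only a sketch and is not code from Yabause: ToyCache, toy_fill(), toy_writeback() and simulate_clobber() are hypothetical names, and each "cache" is modeled as a private copy of the single line holding the array, written back whole.

```c
#include <assert.h>
#include <string.h>

#define LINE_WORDS 2  /* the two-element array occupies one cache line */

/* Hypothetical model of one CPU's private, non-coherent data cache. */
typedef struct {
    int line[LINE_WORDS];
    int valid;
} ToyCache;

/* Load the line from memory only on a miss; later accesses always hit. */
static void toy_fill(ToyCache *c, const int *memory)
{
    if (!c->valid) {
        memcpy(c->line, memory, sizeof c->line);
        c->valid = 1;
    }
}

/* Write back the whole line: every word in it, modified or not. */
static void toy_writeback(const ToyCache *c, int *memory)
{
    assert(c->valid);
    memcpy(memory, c->line, sizeof c->line);
}

/* Run both loops for `iterations` passes, then flush the SC cache
 * followed by the ME cache.  The final RAM contents are left in memory[]. */
static void simulate_clobber(int iterations, int memory[LINE_WORDS])
{
    ToyCache sc = {{0}, 0}, me = {{0}, 0};

    memory[0] = memory[1] = 1;             /* T1: initialize to {1,1} */
    toy_fill(&sc, memory);
    toy_fill(&me, memory);

    for (int i = 0; i < iterations; i++) {
        sc.line[0] += sc.line[1];          /* Core 1: array[1] is stale */
        me.line[1] += me.line[0];          /* Core 2: array[0] is stale */
    }

    toy_writeback(&sc, memory);            /* T2: memory becomes {A,1} */
    toy_writeback(&me, memory);            /* T3: memory becomes {1,D} */
}
```

Calling simulate_clobber(3, mem) leaves mem as {1,4}: the value 4 that the SC computed for element 0 is overwritten by the stale 1 held in the ME's copy of the line, matching row T3 of the table.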
When use of the Media Engine is enabled, the following steps are taken to avoid data corruption due to the lack of cache coherency: * SCSP state variables used for inter-thread communication are divided into separate, 64-byte (cache-line) aligned data sections, based on which thread (the main Yabause thread, running on the SC, or the SCSP thread, running on the ME) writes to them. * SCSP state variables are accessed using uncached (0x4nnnnnnn) addresses in two cases: when _reading_ data written by the other CPU (to avoid an old value getting stuck in the cache), and when _writing_ data which is also written by the other CPU (to avoid the cache line clobbering problem described above). * Sound RAM is accessed _with_ caching (except in one case described below), because forcing every sound RAM access through an uncached pointer causes significant slowdown. Instead, cached CPU data is written back to RAM at strategic points. * The SC's data cache is flushed (written back and invalidated) immediately before waiting for the SCSP thread to finish processing, e.g. for ScspReset(). The data cache is written back on every ScspExec() call (though the writeback frequency may be reduced through the configuration menu), but it is _not_ flushed for performance reasons; instead, sound RAM read accesses from the SC are made through uncached addresses, as with SCSP state variables above. * The ME's data cache is flushed after each iteration of the SCSP thread loop. This flushing is not coded directly into scsp.c, but instead takes place in the YabThreadYield() and YabThreadSleep() implementations. (These functions are naturally meaningless on the ME, but since the SCSP thread calls one or the other at the end of each loop, it's a convenient place to flush the cache.) 
* The 68k state block, along with dynamically-generated native code when dynamic translation is enabled, is stored in a separately allocated pool and managed with custom memory allocation functions (local_malloc() and friends in psp-m68k.c), since the standard memory management functions are not designed to work with the ME and would likely cause a crash due to cache desynchronization. In general, using the ME provides a moderate speed improvement (10-15%) to overall emulation speed. There are, however, some cases in which the lack of cache coherency could cause games to misbehave or even crash Yabause: * If a game writes (from the SH-2) to a portion of sound RAM containing 68k program code while the 68k is executing, the 68k may execute incorrect code, or the dynamic translation memory pool may be corrupted. Normally, games should only load code while the 68k is stopped, but there may be cases when the SH-2 writes to a variable in sound RAM which is located in the same region as 68k code, thus triggering this issue. * Games which rely on the precise relative timing of the SH-2 and 68k processors are likely to fail in any multithreaded emulator, but are more likely to fail when using the ME due to delays in data being written out from the data caches.
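
The first two precautions listed earlier for SCSP state variables (grouping variables by writer into 64-byte aligned sections, and going through uncached addresses) can be sketched as follows. This is an illustration under stated assumptions, not the layout used in scsp.c: SharedScspState, its field names and uncached_alias() are hypothetical, and forming the alias by setting bit 30 is inferred from the "0x4nnnnnnn" notation above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE    64           /* cache line size given in the text */
#define UNCACHED_BIT  0x40000000u  /* assumed: selects the 0x4nnnnnnn alias */

/* Hypothetical shared state.  Each writer's fields begin on their own
 * cache line, so a writeback by one CPU never carries stale copies of
 * fields owned by the other CPU. */
typedef struct {
    _Alignas(CACHE_LINE) uint32_t sc_command;      /* written only by the SC */
    uint32_t sc_cycles_requested;
    _Alignas(CACHE_LINE) uint32_t me_cycles_done;  /* written only by the ME */
    uint32_t me_status;
} SharedScspState;

_Static_assert(offsetof(SharedScspState, me_cycles_done) % CACHE_LINE == 0,
               "ME-written fields must start on a new cache line");

/* Map a cached 32-bit PSP address to its uncached alias.  This is pure
 * address arithmetic; nothing is dereferenced, so it runs on any host. */
static uint32_t uncached_alias(uint32_t address)
{
    assert(address < UNCACHED_BIT);  /* expects a cached-form address */
    return address | UNCACHED_BIT;
}
```

With this layout, the reader of a field written by the other CPU would access it through the uncached alias of its address, so that no stale copy can linger in the reader's cache.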