bsnes/higan/sfc/dsp/dsp.cpp

258 lines
4.1 KiB
C++
Raw Normal View History

#include <sfc/sfc.hpp>
namespace SuperFamicom {
DSP dsp;
#define REG(n) state.regs[n]
#define VREG(n) state.regs[v.vidx + n]
#include "gaussian.cpp"
#include "counter.cpp"
#include "envelope.cpp"
#include "brr.cpp"
#include "misc.cpp"
#include "voice.cpp"
#include "echo.cpp"
#include "serialization.cpp"
DSP::DSP() {
static_assert(sizeof(int) >= 32 / 8, "int >= 32-bits");
static_assert((int8_t)0x80 == -0x80, "8-bit sign extension");
static_assert((int16_t)0x8000 == -0x8000, "16-bit sign extension");
static_assert((uint16_t)0xffff0000 == 0, "16-bit unsigned clip");
static_assert((-1 >> 1) == -1, "arithmetic shift right");
//-0x8000 <= n <= +0x7fff
assert(sclamp<16>(+0x8000) == +0x7fff);
assert(sclamp<16>(-0x8001) == -0x8000);
}
/* timing */
auto DSP::step(uint clocks) -> void {
Update to v100r14 release. byuu says: (Windows: compile with -fpermissive to silence an annoying error. I'll fix it in the next WIP.) I completely replaced the time management system in higan and overhauled the scheduler. Before, processor threads would have "int64 clock"; and there would be a 1:1 relationship between two threads. When thread A ran for X cycles, it'd subtract X * B.Frequency from clock; and when thread B ran for Y cycles, it'd add Y * A.Frequency from clock. This worked well and allowed perfect precision; but it doesn't work when you have more complicated relationships: eg the 68K can sync to the Z80 and PSG; the Z80 to the 68K and PSG; so the PSG needs two counters. The new system instead uses a "uint64 clock" variable that represents time in attoseconds. Every time the scheduler exits, it subtracts the smallest clock count from all threads, to prevent an overflow scenario. The only real downside is that rounding errors mean that roughly every 20 minutes, we have a rounding error of one clock cycle (one 20,000,000th of a second.) However, this only applies to systems with multiple oscillators, like the SNES. And when you're in that situation ... there's no such thing as a perfect oscillator anyway. A real SNES will be thousands of times less out of spec than 1hz per 20 minutes. The advantages are pretty immense. First, we obviously can now support more complex relationships between threads. Second, we can build a much more abstracted scheduler. All of libco is now abstracted away completely, which may permit a state-machine / coroutine version of Thread in the future. We've basically gone from this: auto SMP::step(uint clocks) -> void { clock += clocks * (uint64)cpu.frequency; dsp.clock -= clocks; if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread); if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread); } To this: auto SMP::step(uint clocks) -> void { Thread::step(clocks); synchronize(dsp); synchronize(cpu); } As you can see, we don't have to do multiple clock adjustments anymore. This is a huge win for the SNES CPU that had to update the SMP, DSP, all peripherals and all coprocessors. Likewise, we don't have to synchronize all coprocessors when one runs, now we can just synchronize the active one to the CPU. Third, when changing the frequencies of threads (think SGB speed setting modes, GBC double-speed mode, etc), it no longer causes the "int64 clock" value to be erroneous. Fourth, this results in a fairly decent speedup, mostly across the board. Aside from the GBA being mostly a wash (for unknown reasons), it's about an 8% - 12% speedup in every other emulation core. Now, all of this said ... this was an unbelievably massive change, so ... you know what that means >_> If anyone can help test all types of SNES coprocessors, and some other system games, it'd be appreciated. ---- Lastly, we have a bitchin' new about screen. It unfortunately adds ~200KiB onto the binary size, because the PNG->C++ header file transformation doesn't compress very well, and I want to keep the original resource files in with the higan archive. I might try some things to work around this file size increase in the future, but for now ... yeah, slightly larger archive sizes, sorry. The logo's a bit busted on Windows (the Label control's background transparency and alignment settings aren't working), but works well on GTK. I'll have to fix Windows before the next official release. For now, look on my Twitter feed if you want to see what it's supposed to look like. ---- EDIT: forgot about ICD2::Enter. It's doing some weird inverse run-to-save thing that I need to implement support for somehow. So, save states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
Thread::step(clocks);
}
auto DSP::Enter() -> void {
while(true) scheduler.synchronize(), dsp.main();
}
auto DSP::main() -> void {
voice5(voice[0]);
voice2(voice[1]);
tick();
voice6(voice[0]);
voice3(voice[1]);
tick();
voice7(voice[0]);
voice4(voice[1]);
voice1(voice[3]);
tick();
voice8(voice[0]);
voice5(voice[1]);
voice2(voice[2]);
tick();
voice9(voice[0]);
voice6(voice[1]);
voice3(voice[2]);
tick();
voice7(voice[1]);
voice4(voice[2]);
voice1(voice[4]);
tick();
voice8(voice[1]);
voice5(voice[2]);
voice2(voice[3]);
tick();
voice9(voice[1]);
voice6(voice[2]);
voice3(voice[3]);
tick();
voice7(voice[2]);
voice4(voice[3]);
voice1(voice[5]);
tick();
voice8(voice[2]);
voice5(voice[3]);
voice2(voice[4]);
tick();
voice9(voice[2]);
voice6(voice[3]);
voice3(voice[4]);
tick();
voice7(voice[3]);
voice4(voice[4]);
voice1(voice[6]);
tick();
voice8(voice[3]);
voice5(voice[4]);
voice2(voice[5]);
tick();
voice9(voice[3]);
voice6(voice[4]);
voice3(voice[5]);
tick();
voice7(voice[4]);
voice4(voice[5]);
voice1(voice[7]);
tick();
voice8(voice[4]);
voice5(voice[5]);
voice2(voice[6]);
tick();
voice9(voice[4]);
voice6(voice[5]);
voice3(voice[6]);
tick();
voice1(voice[0]);
voice7(voice[5]);
voice4(voice[6]);
tick();
voice8(voice[5]);
voice5(voice[6]);
voice2(voice[7]);
tick();
voice9(voice[5]);
voice6(voice[6]);
voice3(voice[7]);
tick();
voice1(voice[1]);
voice7(voice[6]);
voice4(voice[7]);
tick();
voice8(voice[6]);
voice5(voice[7]);
voice2(voice[0]);
tick();
voice3a(voice[0]);
voice9(voice[6]);
voice6(voice[7]);
echo22();
tick();
voice7(voice[7]);
echo23();
tick();
voice8(voice[7]);
echo24();
tick();
voice3b(voice[0]);
voice9(voice[7]);
echo25();
tick();
echo26();
tick();
misc27();
echo27();
tick();
misc28();
echo28();
tick();
misc29();
echo29();
tick();
misc30();
voice3c(voice[0]);
echo30();
tick();
voice4(voice[0]);
voice1(voice[2]);
tick();
}
auto DSP::tick() -> void {
step(3 * 8);
Update to v100r14 release. byuu says: (Windows: compile with -fpermissive to silence an annoying error. I'll fix it in the next WIP.) I completely replaced the time management system in higan and overhauled the scheduler. Before, processor threads would have "int64 clock"; and there would be a 1:1 relationship between two threads. When thread A ran for X cycles, it'd subtract X * B.Frequency from clock; and when thread B ran for Y cycles, it'd add Y * A.Frequency from clock. This worked well and allowed perfect precision; but it doesn't work when you have more complicated relationships: eg the 68K can sync to the Z80 and PSG; the Z80 to the 68K and PSG; so the PSG needs two counters. The new system instead uses a "uint64 clock" variable that represents time in attoseconds. Every time the scheduler exits, it subtracts the smallest clock count from all threads, to prevent an overflow scenario. The only real downside is that rounding errors mean that roughly every 20 minutes, we have a rounding error of one clock cycle (one 20,000,000th of a second.) However, this only applies to systems with multiple oscillators, like the SNES. And when you're in that situation ... there's no such thing as a perfect oscillator anyway. A real SNES will be thousands of times less out of spec than 1hz per 20 minutes. The advantages are pretty immense. First, we obviously can now support more complex relationships between threads. Second, we can build a much more abstracted scheduler. All of libco is now abstracted away completely, which may permit a state-machine / coroutine version of Thread in the future. We've basically gone from this: auto SMP::step(uint clocks) -> void { clock += clocks * (uint64)cpu.frequency; dsp.clock -= clocks; if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread); if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread); } To this: auto SMP::step(uint clocks) -> void { Thread::step(clocks); synchronize(dsp); synchronize(cpu); } As you can see, we don't have to do multiple clock adjustments anymore. This is a huge win for the SNES CPU that had to update the SMP, DSP, all peripherals and all coprocessors. Likewise, we don't have to synchronize all coprocessors when one runs, now we can just synchronize the active one to the CPU. Third, when changing the frequencies of threads (think SGB speed setting modes, GBC double-speed mode, etc), it no longer causes the "int64 clock" value to be erroneous. Fourth, this results in a fairly decent speedup, mostly across the board. Aside from the GBA being mostly a wash (for unknown reasons), it's about an 8% - 12% speedup in every other emulation core. Now, all of this said ... this was an unbelievably massive change, so ... you know what that means >_> If anyone can help test all types of SNES coprocessors, and some other system games, it'd be appreciated. ---- Lastly, we have a bitchin' new about screen. It unfortunately adds ~200KiB onto the binary size, because the PNG->C++ header file transformation doesn't compress very well, and I want to keep the original resource files in with the higan archive. I might try some things to work around this file size increase in the future, but for now ... yeah, slightly larger archive sizes, sorry. The logo's a bit busted on Windows (the Label control's background transparency and alignment settings aren't working), but works well on GTK. I'll have to fix Windows before the next official release. For now, look on my Twitter feed if you want to see what it's supposed to look like. ---- EDIT: forgot about ICD2::Enter. It's doing some weird inverse run-to-save thing that I need to implement support for somehow. So, save states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
synchronize(smp);
}
/* register interface for S-SMP $00f2,$00f3 */
auto DSP::mute() const -> bool {
return REG(FLG) & 0x40;
}
auto DSP::read(uint8 addr) -> uint8 {
return REG(addr);
}
auto DSP::write(uint8 addr, uint8 data) -> void {
REG(addr) = data;
if((addr & 0x0f) == ENVX) {
state.envxBuffer = data;
} else if((addr & 0x0f) == OUTX) {
state.outxBuffer = data;
} else if(addr == KON) {
state.konBuffer = data;
} else if(addr == ENDX) {
//always cleared, regardless of data written
state.endxBuffer = 0;
REG(ENDX) = 0;
}
}
/* initialization */
auto DSP::load(Markup::Node node) -> bool {
return true;
}
auto DSP::power() -> void {
Update to v098r04 release. byuu says: Changelog: - SFC: fixed behavior of 21fx $21fe register when no device is connected (must return zero) - SFC: reduced 21fx buffer size to 1024 bytes in both directions to mirror the FT232H we are using - SFC: eliminated dsp/modulo-array.hpp [1] - higan: implemented higan/video interface and migrated all cores to it [2] [1] the echo history buffer was 8-bytes, so there was no need for it at all here. Not sure what I was thinking. The BRR buffer was 12-bytes, and has very weird behavior ... but there's only a single location in the code where it actually writes to this buffer. It's much easier to just write to the buffer three times there instead of implementing an entire class just to abstract away two lines of code. This change actually boosted the speed from ~124.5fps to around ~127.5fps, but that's within the margin of error for GCC. I doubt it's actually faster this way. The DSP core could really use a ton of work. It comes from a port of blargg's spc_dsp to my coding style, but he was extremely fond of using 32-bit signed integers everywhere. There's a lot of opportunity to remove red tape masking by resizing the variables to their actual state sizes. I really need to find where I put spc_dsp6.sfc from blargg. It's a great test to verify if I've made any mistakes in my implementation that would cause regressions. Don't suppose anyone has it? [2] so again, the idea is that higan/audio and higan/video are going to sit between the emulation cores and the user interfaces. The hope is to output raw encoding data from the emulation cores without having to worry about the video display format (generally 24-bit RGB) of the host display. And also to avoid having to repeat myself with eg three separate implementations of interframe blending, and so on. Furthermore, the idea is that the user interface can configure its side of the settings, and the emulation cores can configure their sides. Thus, neither has to worry about the other end. And now we can spin off new user interfaces much easier without having to mess with all of these things. Right now, I've implemented color emulation, interframe blending and SNES horizontal color bleed. I did not implement scanlines (and interlace effects for them) yet, but I probably will at some point. Further, for right now, the WonderSwan/Color screen rotation is busted and will only show games in the horizontal orientation. Obviously this must be fixed before the next official release, but I'll want to think about how to implement it. Also, the SNES light gun pointers are missing for now. Things are a bit messy right now as I've gone through several revisions of how to handle these things, so a good house cleaning is in order once everything is feature-complete again. I need to sit down and think through how and where I want to handle things like light gun cursors, LCD icons, and maybe even rasterized text messages. And obviously ... higan/audio is still just nall::DSP's headers. I need to revamp that whole interface. I want to make it quite powerful with a true audio mixer so I can handle things like SNES+SGB+MSU1+Voicer-Kun+SNES-CD (five separate audio streams at once.) The video system has the concept of "effects" for things like color bleed and interframe blending. I want to extend on this with useful other effects, such as NTSC simulation, maybe bringing back my mini-HQ2x filter, etc. I'd also like to restore the saturation/gamma/luma adjustment sliders ... I always liked allowing people to compensate for their displays without having to change settings system-wide. Lastly, I've always wanted to see some audio effects. Although I doubt we'll ever get my dream of CoreAudio-style profiles, I'd like to get some basic equalizer settings and echo/reverb effects in there.
2016-04-11 21:29:56 +00:00
memory::fill(&state, sizeof(State));
for(auto n : range(8)) {
Update to v098r04 release. byuu says: Changelog: - SFC: fixed behavior of 21fx $21fe register when no device is connected (must return zero) - SFC: reduced 21fx buffer size to 1024 bytes in both directions to mirror the FT232H we are using - SFC: eliminated dsp/modulo-array.hpp [1] - higan: implemented higan/video interface and migrated all cores to it [2] [1] the echo history buffer was 8-bytes, so there was no need for it at all here. Not sure what I was thinking. The BRR buffer was 12-bytes, and has very weird behavior ... but there's only a single location in the code where it actually writes to this buffer. It's much easier to just write to the buffer three times there instead of implementing an entire class just to abstract away two lines of code. This change actually boosted the speed from ~124.5fps to around ~127.5fps, but that's within the margin of error for GCC. I doubt it's actually faster this way. The DSP core could really use a ton of work. It comes from a port of blargg's spc_dsp to my coding style, but he was extremely fond of using 32-bit signed integers everywhere. There's a lot of opportunity to remove red tape masking by resizing the variables to their actual state sizes. I really need to find where I put spc_dsp6.sfc from blargg. It's a great test to verify if I've made any mistakes in my implementation that would cause regressions. Don't suppose anyone has it? [2] so again, the idea is that higan/audio and higan/video are going to sit between the emulation cores and the user interfaces. The hope is to output raw encoding data from the emulation cores without having to worry about the video display format (generally 24-bit RGB) of the host display. And also to avoid having to repeat myself with eg three separate implementations of interframe blending, and so on. Furthermore, the idea is that the user interface can configure its side of the settings, and the emulation cores can configure their sides. Thus, neither has to worry about the other end. And now we can spin off new user interfaces much easier without having to mess with all of these things. Right now, I've implemented color emulation, interframe blending and SNES horizontal color bleed. I did not implement scanlines (and interlace effects for them) yet, but I probably will at some point. Further, for right now, the WonderSwan/Color screen rotation is busted and will only show games in the horizontal orientation. Obviously this must be fixed before the next official release, but I'll want to think about how to implement it. Also, the SNES light gun pointers are missing for now. Things are a bit messy right now as I've gone through several revisions of how to handle these things, so a good house cleaning is in order once everything is feature-complete again. I need to sit down and think through how and where I want to handle things like light gun cursors, LCD icons, and maybe even rasterized text messages. And obviously ... higan/audio is still just nall::DSP's headers. I need to revamp that whole interface. I want to make it quite powerful with a true audio mixer so I can handle things like SNES+SGB+MSU1+Voicer-Kun+SNES-CD (five separate audio streams at once.) The video system has the concept of "effects" for things like color bleed and interframe blending. I want to extend on this with useful other effects, such as NTSC simulation, maybe bringing back my mini-HQ2x filter, etc. I'd also like to restore the saturation/gamma/luma adjustment sliders ... I always liked allowing people to compensate for their displays without having to change settings system-wide. Lastly, I've always wanted to see some audio effects. Although I doubt we'll ever get my dream of CoreAudio-style profiles, I'd like to get some basic equalizer settings and echo/reverb effects in there.
2016-04-11 21:29:56 +00:00
memory::fill(&voice[n], sizeof(Voice));
voice[n].brrOffset = 1;
voice[n].vbit = 1 << n;
voice[n].vidx = n * 0x10;
}
}
auto DSP::reset() -> void {
create(Enter, 32040.0 * 768.0);
stream = Emulator::audio.createStream(2, 32040.0);
REG(FLG) = 0xe0;
state.noise = 0x4000;
state.echoHistoryOffset = 0;
state.everyOtherSample = 1;
state.echoOffset = 0;
state.counter = 0;
}
#undef REG
#undef VREG
}