bsnes/higan/sfc/smp/memory.cpp

220 lines
4.5 KiB
C++
Raw Normal View History

alwaysinline auto SMP::ramRead(uint16 addr) -> uint8 {
Update to v099r14 release. byuu says: Changelog: - (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel like they were contributing enough to be worth it] - cleaned up nall::integer,natural,real functionality - toInteger, toNatural, toReal for parsing strings to numbers - fromInteger, fromNatural, fromReal for creating strings from numbers - (string,Markup::Node,SQL-based-classes)::(integer,natural,real) left unchanged - template<typename T> numeral(T value, long padding, char padchar) -> string for print() formatting - deduces integer,natural,real based on T ... cast the value if you want to override - there still exists binary,octal,hex,pointer for explicit print() formatting - lstring -> string_vector [but using lstring = string_vector; is declared] - would be nice to remove the using lstring eventually ... but that'd probably require 10,000 lines of changes >_> - format -> string_format [no using here; format was too ambiguous] - using integer = Integer<sizeof(int)*8>; and using natural = Natural<sizeof(uint)*8>; declared - for consistency with boolean. These three are meant for creating zero-initialized values implicitly (various uses) - R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees up struct IO {} io; naming] - SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {} (status,registers); now - still some CPU::Status status values ... they didn't really fit into IO functionality ... will have to think about this more - SFC CPU, PPU, SMP now use step() exclusively instead of addClocks() calling into step() - SFC CPU joypad1_bits, joypad2_bits were unused; killed them - SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it - SFC PPU OAM moved into PPU::Object; since nothing else uses it - the raw uint8[544] array is gone. OAM::read() constructs values from the OAM::Object[512] table now - this avoids having to determine how we want to sub-divide the two OAM memory sections - this also eliminates the OAM::synchronize() functionality - probably more I'm forgetting The FPS fluctuations are driving me insane. This WIP went from 128fps to 137fps. Settled on 133.5fps for the final build. But nothing I changed should have affected performance at all. This level of fluctuation makes it damn near impossible to know whether I'm speeding things up or slowing things down with changes.
2016-07-01 11:50:32 +00:00
if(addr >= 0xffc0 && io.iplromEnable) return iplrom[addr & 0x3f];
if(io.ramDisable) return 0x5a; //0xff on mini-SNES
return dsp.apuram[addr];
}
alwaysinline auto SMP::ramWrite(uint16 addr, uint8 data) -> void {
//writes to $ffc0-$ffff always go to apuram, even if the iplrom is enabled
if(io.ramWritable && !io.ramDisable) dsp.apuram[addr] = data;
}
auto SMP::portRead(uint2 port) const -> uint8 {
if(port == 0) return io.cpu0;
if(port == 1) return io.cpu1;
if(port == 2) return io.cpu2;
if(port == 3) return io.cpu3;
unreachable;
}
auto SMP::portWrite(uint2 port, uint8 data) -> void {
if(port == 0) io.apu0 = data;
if(port == 1) io.apu1 = data;
if(port == 2) io.apu2 = data;
if(port == 3) io.apu3 = data;
}
auto SMP::busRead(uint16 addr) -> uint8 {
uint result;
switch(addr) {
case 0xf0: //TEST (write-only register)
return 0x00;
case 0xf1: //CONTROL (write-only register)
return 0x00;
case 0xf2: //DSPADDR
Update to v099r14 release. byuu says: Changelog: - (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel like they were contributing enough to be worth it] - cleaned up nall::integer,natural,real functionality - toInteger, toNatural, toReal for parsing strings to numbers - fromInteger, fromNatural, fromReal for creating strings from numbers - (string,Markup::Node,SQL-based-classes)::(integer,natural,real) left unchanged - template<typename T> numeral(T value, long padding, char padchar) -> string for print() formatting - deduces integer,natural,real based on T ... cast the value if you want to override - there still exists binary,octal,hex,pointer for explicit print() formatting - lstring -> string_vector [but using lstring = string_vector; is declared] - would be nice to remove the using lstring eventually ... but that'd probably require 10,000 lines of changes >_> - format -> string_format [no using here; format was too ambiguous] - using integer = Integer<sizeof(int)*8>; and using natural = Natural<sizeof(uint)*8>; declared - for consistency with boolean. These three are meant for creating zero-initialized values implicitly (various uses) - R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees up struct IO {} io; naming] - SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {} (status,registers); now - still some CPU::Status status values ... they didn't really fit into IO functionality ... will have to think about this more - SFC CPU, PPU, SMP now use step() exclusively instead of addClocks() calling into step() - SFC CPU joypad1_bits, joypad2_bits were unused; killed them - SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it - SFC PPU OAM moved into PPU::Object; since nothing else uses it - the raw uint8[544] array is gone. OAM::read() constructs values from the OAM::Object[512] table now - this avoids having to determine how we want to sub-divide the two OAM memory sections - this also eliminates the OAM::synchronize() functionality - probably more I'm forgetting The FPS fluctuations are driving me insane. This WIP went from 128fps to 137fps. Settled on 133.5fps for the final build. But nothing I changed should have affected performance at all. This level of fluctuation makes it damn near impossible to know whether I'm speeding things up or slowing things down with changes.
2016-07-01 11:50:32 +00:00
return io.dspAddr;
case 0xf3: //DSPDATA
//0x80-0xff are read-only mirrors of 0x00-0x7f
Update to v099r14 release. byuu says: Changelog: - (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel like they were contributing enough to be worth it] - cleaned up nall::integer,natural,real functionality - toInteger, toNatural, toReal for parsing strings to numbers - fromInteger, fromNatural, fromReal for creating strings from numbers - (string,Markup::Node,SQL-based-classes)::(integer,natural,real) left unchanged - template<typename T> numeral(T value, long padding, char padchar) -> string for print() formatting - deduces integer,natural,real based on T ... cast the value if you want to override - there still exists binary,octal,hex,pointer for explicit print() formatting - lstring -> string_vector [but using lstring = string_vector; is declared] - would be nice to remove the using lstring eventually ... but that'd probably require 10,000 lines of changes >_> - format -> string_format [no using here; format was too ambiguous] - using integer = Integer<sizeof(int)*8>; and using natural = Natural<sizeof(uint)*8>; declared - for consistency with boolean. These three are meant for creating zero-initialized values implicitly (various uses) - R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees up struct IO {} io; naming] - SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {} (status,registers); now - still some CPU::Status status values ... they didn't really fit into IO functionality ... will have to think about this more - SFC CPU, PPU, SMP now use step() exclusively instead of addClocks() calling into step() - SFC CPU joypad1_bits, joypad2_bits were unused; killed them - SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it - SFC PPU OAM moved into PPU::Object; since nothing else uses it - the raw uint8[544] array is gone. OAM::read() constructs values from the OAM::Object[512] table now - this avoids having to determine how we want to sub-divide the two OAM memory sections - this also eliminates the OAM::synchronize() functionality - probably more I'm forgetting The FPS fluctuations are driving me insane. This WIP went from 128fps to 137fps. Settled on 133.5fps for the final build. But nothing I changed should have affected performance at all. This level of fluctuation makes it damn near impossible to know whether I'm speeding things up or slowing things down with changes.
2016-07-01 11:50:32 +00:00
return dsp.read(io.dspAddr & 0x7f);
case 0xf4: //CPUIO0
synchronize(cpu);
return io.apu0;
case 0xf5: //CPUIO1
synchronize(cpu);
return io.apu1;
case 0xf6: //CPUIO2
synchronize(cpu);
return io.apu2;
case 0xf7: //CPUIO3
Update to v100r14 release. byuu says: (Windows: compile with -fpermissive to silence an annoying error. I'll fix it in the next WIP.) I completely replaced the time management system in higan and overhauled the scheduler. Before, processor threads would have "int64 clock"; and there would be a 1:1 relationship between two threads. When thread A ran for X cycles, it'd subtract X * B.Frequency from clock; and when thread B ran for Y cycles, it'd add Y * A.Frequency from clock. This worked well and allowed perfect precision; but it doesn't work when you have more complicated relationships: eg the 68K can sync to the Z80 and PSG; the Z80 to the 68K and PSG; so the PSG needs two counters. The new system instead uses a "uint64 clock" variable that represents time in attoseconds. Every time the scheduler exits, it subtracts the smallest clock count from all threads, to prevent an overflow scenario. The only real downside is that rounding errors mean that roughly every 20 minutes, we have a rounding error of one clock cycle (one 20,000,000th of a second.) However, this only applies to systems with multiple oscillators, like the SNES. And when you're in that situation ... there's no such thing as a perfect oscillator anyway. A real SNES will be thousands of times less out of spec than 1hz per 20 minutes. The advantages are pretty immense. First, we obviously can now support more complex relationships between threads. Second, we can build a much more abstracted scheduler. All of libco is now abstracted away completely, which may permit a state-machine / coroutine version of Thread in the future. We've basically gone from this: auto SMP::step(uint clocks) -> void { clock += clocks * (uint64)cpu.frequency; dsp.clock -= clocks; if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread); if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread); } To this: auto SMP::step(uint clocks) -> void { Thread::step(clocks); synchronize(dsp); synchronize(cpu); } As you can see, we don't have to do multiple clock adjustments anymore. This is a huge win for the SNES CPU that had to update the SMP, DSP, all peripherals and all coprocessors. Likewise, we don't have to synchronize all coprocessors when one runs, now we can just synchronize the active one to the CPU. Third, when changing the frequencies of threads (think SGB speed setting modes, GBC double-speed mode, etc), it no longer causes the "int64 clock" value to be erroneous. Fourth, this results in a fairly decent speedup, mostly across the board. Aside from the GBA being mostly a wash (for unknown reasons), it's about an 8% - 12% speedup in every other emulation core. Now, all of this said ... this was an unbelievably massive change, so ... you know what that means >_> If anyone can help test all types of SNES coprocessors, and some other system games, it'd be appreciated. ---- Lastly, we have a bitchin' new about screen. It unfortunately adds ~200KiB onto the binary size, because the PNG->C++ header file transformation doesn't compress very well, and I want to keep the original resource files in with the higan archive. I might try some things to work around this file size increase in the future, but for now ... yeah, slightly larger archive sizes, sorry. The logo's a bit busted on Windows (the Label control's background transparency and alignment settings aren't working), but works well on GTK. I'll have to fix Windows before the next official release. For now, look on my Twitter feed if you want to see what it's supposed to look like. ---- EDIT: forgot about ICD2::Enter. It's doing some weird inverse run-to-save thing that I need to implement support for somehow. So, save states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
synchronize(cpu);
return io.apu3;
case 0xf8: //AUXIO4
return io.aux4;
case 0xf9: //AUXIO5
return io.aux5;
case 0xfa: //T0TARGET
case 0xfb: //T1TARGET
case 0xfc: //T2TARGET (write-only registers)
return 0x00;
case 0xfd: //T0OUT (4-bit counter value)
result = timer0.stage3;
timer0.stage3 = 0;
return result;
case 0xfe: //T1OUT (4-bit counter value)
result = timer1.stage3;
timer1.stage3 = 0;
return result;
case 0xff: //T2OUT (4-bit counter value)
result = timer2.stage3;
timer2.stage3 = 0;
return result;
}
return ramRead(addr);
}
auto SMP::busWrite(uint16 addr, uint8 data) -> void {
switch(addr) {
case 0xf0: //TEST
if(r.p.p) break; //writes only valid when P flag is clear
Update to v103r06 release. byuu says: Changelog: - processor/spc700: restored fetch/load/store/pull/push shorthand functions - processor/spc700: split functions that tested the algorithm used (`op != &SPC700:...`) to separate instructions - mostly for code clarity over code size: it was awkward having cycle counts change based on a function parameter - processor/spc700: implemented Overload's new findings on which cycles are truly internal (no bus reads) - sfc/smp: TEST register emulation has been vastly improved¹ ¹: it turns out that TEST.d4,d5 is the external clock divider (used when accessing RAM through the DSP), and TEST.d6,d7 is the internal clock divider (used when accessing IPLROM, IO registers, or during idle cycles.) The DSP (24576khz) feeds its clock / 12 through to the SMP (2048khz). The clock divider setting further divides the clock by 2, 4, 8, or 16. Since 8 and 16 are not cleanly divislbe by 12, the SMP cycle count glitches out and seems to take 10 and 2 clocks instead of 8 or 16. This can on real hardware either cause the SMP to run very slowly, or more likely, crash the SMP completely until reset. What's even stranger is the timers aren't affected by this. They still clock by 2, 4, 8, or 16. Note that technically I could divide my own clock counters by 24 and reduce these to {1,2,5,10} and {1,2,4,8}, I instead chose to divide by 12 to better illustrate this hardware issue and better model that the SMP clock runs at 2048khz and not 1024khz. Further, note that things aren't 100% perfect yet. This seems to throw off some tests, such as blargg's `test_timer_speed`. I can't tell how far off I am because blargg's test tragically doesn't print out fail values. But you can see the improvements in that higan is now passing all of Revenant's tests that were obviously completely wrong before.
2017-07-03 07:24:47 +00:00
io.timersDisable = data.bit (0);
io.ramWritable = data.bit (1);
io.ramDisable = data.bit (2);
io.timersEnable = data.bit (3);
io.externalWaitStates = data.bits(4,5);
io.internalWaitStates = data.bits(6,7);
timer0.synchronizeStage1();
timer1.synchronizeStage1();
timer2.synchronizeStage1();
break;
case 0xf1: //CONTROL
//0->1 transistion resets timers
if(!timer0.enable && data.bit(0)) {
timer0.stage2 = 0;
timer0.stage3 = 0;
}
timer0.enable = data.bit(0);
if(!timer1.enable && data.bit(1)) {
timer1.stage2 = 0;
timer1.stage3 = 0;
}
timer1.enable = data.bit(1);
if(!timer2.enable && data.bit(2)) {
timer2.stage2 = 0;
timer2.stage3 = 0;
}
timer2.enable = data.bit(2);
if(data.bit(4)) {
synchronize(cpu);
io.apu0 = 0x00;
io.apu1 = 0x00;
}
if(data.bit(5)) {
synchronize(cpu);
io.apu2 = 0x00;
io.apu3 = 0x00;
}
io.iplromEnable = data.bit(7);
break;
case 0xf2: //DSPADDR
Update to v099r14 release. byuu says: Changelog: - (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel like they were contributing enough to be worth it] - cleaned up nall::integer,natural,real functionality - toInteger, toNatural, toReal for parsing strings to numbers - fromInteger, fromNatural, fromReal for creating strings from numbers - (string,Markup::Node,SQL-based-classes)::(integer,natural,real) left unchanged - template<typename T> numeral(T value, long padding, char padchar) -> string for print() formatting - deduces integer,natural,real based on T ... cast the value if you want to override - there still exists binary,octal,hex,pointer for explicit print() formatting - lstring -> string_vector [but using lstring = string_vector; is declared] - would be nice to remove the using lstring eventually ... but that'd probably require 10,000 lines of changes >_> - format -> string_format [no using here; format was too ambiguous] - using integer = Integer<sizeof(int)*8>; and using natural = Natural<sizeof(uint)*8>; declared - for consistency with boolean. These three are meant for creating zero-initialized values implicitly (various uses) - R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees up struct IO {} io; naming] - SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {} (status,registers); now - still some CPU::Status status values ... they didn't really fit into IO functionality ... will have to think about this more - SFC CPU, PPU, SMP now use step() exclusively instead of addClocks() calling into step() - SFC CPU joypad1_bits, joypad2_bits were unused; killed them - SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it - SFC PPU OAM moved into PPU::Object; since nothing else uses it - the raw uint8[544] array is gone. OAM::read() constructs values from the OAM::Object[512] table now - this avoids having to determine how we want to sub-divide the two OAM memory sections - this also eliminates the OAM::synchronize() functionality - probably more I'm forgetting The FPS fluctuations are driving me insane. This WIP went from 128fps to 137fps. Settled on 133.5fps for the final build. But nothing I changed should have affected performance at all. This level of fluctuation makes it damn near impossible to know whether I'm speeding things up or slowing things down with changes.
2016-07-01 11:50:32 +00:00
io.dspAddr = data;
break;
case 0xf3: //DSPDATA
Update to v099r14 release. byuu says: Changelog: - (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel like they were contributing enough to be worth it] - cleaned up nall::integer,natural,real functionality - toInteger, toNatural, toReal for parsing strings to numbers - fromInteger, fromNatural, fromReal for creating strings from numbers - (string,Markup::Node,SQL-based-classes)::(integer,natural,real) left unchanged - template<typename T> numeral(T value, long padding, char padchar) -> string for print() formatting - deduces integer,natural,real based on T ... cast the value if you want to override - there still exists binary,octal,hex,pointer for explicit print() formatting - lstring -> string_vector [but using lstring = string_vector; is declared] - would be nice to remove the using lstring eventually ... but that'd probably require 10,000 lines of changes >_> - format -> string_format [no using here; format was too ambiguous] - using integer = Integer<sizeof(int)*8>; and using natural = Natural<sizeof(uint)*8>; declared - for consistency with boolean. These three are meant for creating zero-initialized values implicitly (various uses) - R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees up struct IO {} io; naming] - SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {} (status,registers); now - still some CPU::Status status values ... they didn't really fit into IO functionality ... will have to think about this more - SFC CPU, PPU, SMP now use step() exclusively instead of addClocks() calling into step() - SFC CPU joypad1_bits, joypad2_bits were unused; killed them - SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it - SFC PPU OAM moved into PPU::Object; since nothing else uses it - the raw uint8[544] array is gone. OAM::read() constructs values from the OAM::Object[512] table now - this avoids having to determine how we want to sub-divide the two OAM memory sections - this also eliminates the OAM::synchronize() functionality - probably more I'm forgetting The FPS fluctuations are driving me insane. This WIP went from 128fps to 137fps. Settled on 133.5fps for the final build. But nothing I changed should have affected performance at all. This level of fluctuation makes it damn near impossible to know whether I'm speeding things up or slowing things down with changes.
2016-07-01 11:50:32 +00:00
if(io.dspAddr & 0x80) break; //0x80-0xff are read-only mirrors of 0x00-0x7f
dsp.write(io.dspAddr & 0x7f, data);
break;
case 0xf4: //CPUIO0
synchronize(cpu);
io.cpu0 = data;
break;
case 0xf5: //CPUIO1
synchronize(cpu);
io.cpu1 = data;
break;
case 0xf6: //CPUIO2
synchronize(cpu);
io.cpu2 = data;
break;
case 0xf7: //CPUIO3
Update to v100r14 release. byuu says: (Windows: compile with -fpermissive to silence an annoying error. I'll fix it in the next WIP.) I completely replaced the time management system in higan and overhauled the scheduler. Before, processor threads would have "int64 clock"; and there would be a 1:1 relationship between two threads. When thread A ran for X cycles, it'd subtract X * B.Frequency from clock; and when thread B ran for Y cycles, it'd add Y * A.Frequency from clock. This worked well and allowed perfect precision; but it doesn't work when you have more complicated relationships: eg the 68K can sync to the Z80 and PSG; the Z80 to the 68K and PSG; so the PSG needs two counters. The new system instead uses a "uint64 clock" variable that represents time in attoseconds. Every time the scheduler exits, it subtracts the smallest clock count from all threads, to prevent an overflow scenario. The only real downside is that rounding errors mean that roughly every 20 minutes, we have a rounding error of one clock cycle (one 20,000,000th of a second.) However, this only applies to systems with multiple oscillators, like the SNES. And when you're in that situation ... there's no such thing as a perfect oscillator anyway. A real SNES will be thousands of times less out of spec than 1hz per 20 minutes. The advantages are pretty immense. First, we obviously can now support more complex relationships between threads. Second, we can build a much more abstracted scheduler. All of libco is now abstracted away completely, which may permit a state-machine / coroutine version of Thread in the future. We've basically gone from this: auto SMP::step(uint clocks) -> void { clock += clocks * (uint64)cpu.frequency; dsp.clock -= clocks; if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread); if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread); } To this: auto SMP::step(uint clocks) -> void { Thread::step(clocks); synchronize(dsp); synchronize(cpu); } As you can see, we don't have to do multiple clock adjustments anymore. This is a huge win for the SNES CPU that had to update the SMP, DSP, all peripherals and all coprocessors. Likewise, we don't have to synchronize all coprocessors when one runs, now we can just synchronize the active one to the CPU. Third, when changing the frequencies of threads (think SGB speed setting modes, GBC double-speed mode, etc), it no longer causes the "int64 clock" value to be erroneous. Fourth, this results in a fairly decent speedup, mostly across the board. Aside from the GBA being mostly a wash (for unknown reasons), it's about an 8% - 12% speedup in every other emulation core. Now, all of this said ... this was an unbelievably massive change, so ... you know what that means >_> If anyone can help test all types of SNES coprocessors, and some other system games, it'd be appreciated. ---- Lastly, we have a bitchin' new about screen. It unfortunately adds ~200KiB onto the binary size, because the PNG->C++ header file transformation doesn't compress very well, and I want to keep the original resource files in with the higan archive. I might try some things to work around this file size increase in the future, but for now ... yeah, slightly larger archive sizes, sorry. The logo's a bit busted on Windows (the Label control's background transparency and alignment settings aren't working), but works well on GTK. I'll have to fix Windows before the next official release. For now, look on my Twitter feed if you want to see what it's supposed to look like. ---- EDIT: forgot about ICD2::Enter. It's doing some weird inverse run-to-save thing that I need to implement support for somehow. So, save states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
synchronize(cpu);
io.cpu3 = data;
break;
case 0xf8: //AUXIO4
io.aux4 = data;
break;
case 0xf9: //AUXIO5
io.aux5 = data;
break;
case 0xfa: //T0TARGET
timer0.target = data;
break;
case 0xfb: //T1TARGET
timer1.target = data;
break;
case 0xfc: //T2TARGET
timer2.target = data;
break;
case 0xfd: //T0OUT
case 0xfe: //T1OUT
case 0xff: //T2OUT (read-only registers)
break;
}
ramWrite(addr, data); //all writes, even to I/O registers, appear on bus
}
Update to v103r06 release. byuu says: Changelog: - processor/spc700: restored fetch/load/store/pull/push shorthand functions - processor/spc700: split functions that tested the algorithm used (`op != &SPC700:...`) to separate instructions - mostly for code clarity over code size: it was awkward having cycle counts change based on a function parameter - processor/spc700: implemented Overload's new findings on which cycles are truly internal (no bus reads) - sfc/smp: TEST register emulation has been vastly improved¹ ¹: it turns out that TEST.d4,d5 is the external clock divider (used when accessing RAM through the DSP), and TEST.d6,d7 is the internal clock divider (used when accessing IPLROM, IO registers, or during idle cycles.) The DSP (24576khz) feeds its clock / 12 through to the SMP (2048khz). The clock divider setting further divides the clock by 2, 4, 8, or 16. Since 8 and 16 are not cleanly divislbe by 12, the SMP cycle count glitches out and seems to take 10 and 2 clocks instead of 8 or 16. This can on real hardware either cause the SMP to run very slowly, or more likely, crash the SMP completely until reset. What's even stranger is the timers aren't affected by this. They still clock by 2, 4, 8, or 16. Note that technically I could divide my own clock counters by 24 and reduce these to {1,2,5,10} and {1,2,4,8}, I instead chose to divide by 12 to better illustrate this hardware issue and better model that the SMP clock runs at 2048khz and not 1024khz. Further, note that things aren't 100% perfect yet. This seems to throw off some tests, such as blargg's `test_timer_speed`. I can't tell how far off I am because blargg's test tragically doesn't print out fail values. But you can see the improvements in that higan is now passing all of Revenant's tests that were obviously completely wrong before.
2017-07-03 07:24:47 +00:00
auto SMP::idle() -> void {
wait();
}
auto SMP::read(uint16 addr) -> uint8 {
Update to v103r06 release. byuu says: Changelog: - processor/spc700: restored fetch/load/store/pull/push shorthand functions - processor/spc700: split functions that tested the algorithm used (`op != &SPC700:...`) to separate instructions - mostly for code clarity over code size: it was awkward having cycle counts change based on a function parameter - processor/spc700: implemented Overload's new findings on which cycles are truly internal (no bus reads) - sfc/smp: TEST register emulation has been vastly improved¹ ¹: it turns out that TEST.d4,d5 is the external clock divider (used when accessing RAM through the DSP), and TEST.d6,d7 is the internal clock divider (used when accessing IPLROM, IO registers, or during idle cycles.) The DSP (24576khz) feeds its clock / 12 through to the SMP (2048khz). The clock divider setting further divides the clock by 2, 4, 8, or 16. Since 8 and 16 are not cleanly divislbe by 12, the SMP cycle count glitches out and seems to take 10 and 2 clocks instead of 8 or 16. This can on real hardware either cause the SMP to run very slowly, or more likely, crash the SMP completely until reset. What's even stranger is the timers aren't affected by this. They still clock by 2, 4, 8, or 16. Note that technically I could divide my own clock counters by 24 and reduce these to {1,2,5,10} and {1,2,4,8}, I instead chose to divide by 12 to better illustrate this hardware issue and better model that the SMP clock runs at 2048khz and not 1024khz. Further, note that things aren't 100% perfect yet. This seems to throw off some tests, such as blargg's `test_timer_speed`. I can't tell how far off I am because blargg's test tragically doesn't print out fail values. But you can see the improvements in that higan is now passing all of Revenant's tests that were obviously completely wrong before.
2017-07-03 07:24:47 +00:00
wait(addr);
uint8 data = busRead(addr);
Update to v094r05 release. byuu says: Commands can be prefixed with: (cpu|smp|ppu|dsp|apu|vram|oam|cgram)/ to set their source. Eg "vram/hex 0800" or "smp/breakpoints.append execute ffc0"; default is cpu. These overlap a little bit in odd ways, but that's just the way the SNES works: it's not a very orthogonal system. CPU is both a processor and the main bus (ROM, RAM, WRAM, etc), APU is the shared memory by the SMP+DSP (eg use it to catch writes from either chip); PPU probably won't ever be used since it's broken down into three separate buses (VRAM, OAM, CGRAM), but DSP could be useful for tracking bugs like we found in Koushien 2 with the DSP echo buffer corrupting SMP opcodes. Technically the PPU memory pools are only ever tripped by the CPU poking at them, as the PPU doesn't ever write. I now have run.for, run.to, step.for, step.to. The difference is that run only prints the next instruction after running, whereas step prints all of the instructions along the way as well. run.to acts the same as "step over" here. Although it's not quite as nice, since you have to specify the address of the next instruction. Logging the Field/Vcounter/Hcounter on instruction listings now, good for timing information. Added in the tracer mask, as well as memory export, as well as VRAM/OAM/CGRAM/SMP read/write/execute breakpoints, as well as an APU usage map (it tracks DSP reads/writes separately, although I don't currently have debugger callbacks on DSP accesses just yet.) Have not hooked up actual SMP debugging just yet, but I plan to soon. Still thinking about how I want to allow / block interleaving of instructions (terminal output and tracing.) So ... remaining tasks at this point: - full SMP debugging - CPU+SMP interleave support - aliases - hotkeys - save states (will be kind of tricky ... will have to suppress breakpoints during synchronization, or abort a save in a break event.) - keep track of window geometry between runs
2014-02-05 11:30:08 +00:00
return data;
}
auto SMP::write(uint16 addr, uint8 data) -> void {
Update to v103r06 release. byuu says: Changelog: - processor/spc700: restored fetch/load/store/pull/push shorthand functions - processor/spc700: split functions that tested the algorithm used (`op != &SPC700:...`) to separate instructions - mostly for code clarity over code size: it was awkward having cycle counts change based on a function parameter - processor/spc700: implemented Overload's new findings on which cycles are truly internal (no bus reads) - sfc/smp: TEST register emulation has been vastly improved¹ ¹: it turns out that TEST.d4,d5 is the external clock divider (used when accessing RAM through the DSP), and TEST.d6,d7 is the internal clock divider (used when accessing IPLROM, IO registers, or during idle cycles.) The DSP (24576khz) feeds its clock / 12 through to the SMP (2048khz). The clock divider setting further divides the clock by 2, 4, 8, or 16. Since 8 and 16 are not cleanly divislbe by 12, the SMP cycle count glitches out and seems to take 10 and 2 clocks instead of 8 or 16. This can on real hardware either cause the SMP to run very slowly, or more likely, crash the SMP completely until reset. What's even stranger is the timers aren't affected by this. They still clock by 2, 4, 8, or 16. Note that technically I could divide my own clock counters by 24 and reduce these to {1,2,5,10} and {1,2,4,8}, I instead chose to divide by 12 to better illustrate this hardware issue and better model that the SMP clock runs at 2048khz and not 1024khz. Further, note that things aren't 100% perfect yet. This seems to throw off some tests, such as blargg's `test_timer_speed`. I can't tell how far off I am because blargg's test tragically doesn't print out fail values. But you can see the improvements in that higan is now passing all of Revenant's tests that were obviously completely wrong before.
2017-07-03 07:24:47 +00:00
wait(addr);
busWrite(addr, data);
}
auto SMP::readDisassembler(uint16 addr) -> uint8 {
if((addr & 0xfff0) == 0x00f0) return 0x00;
if(addr >= 0xffc0 && io.iplromEnable) return iplrom[addr & 0x3f];
return dsp.apuram[addr];
}