From c50723ef6158988b708e85d3902a90b02db450a7 Mon Sep 17 00:00:00 2001
From: Tim Allen
Date: Sun, 31 Jul 2016 12:11:20 +1000
Subject: [PATCH] Update to v100r15 release.

byuu wrote:

Aforementioned scheduler changes added. Longer explanation of why here: http://hastebin.com/raw/toxedenece

Again, we really need to test this as thoroughly as possible for regressions :/ This is a really major change that affects absolutely everything: all emulation cores, all coprocessors, etc.

Also added ADDX and SUB to the 68K core, which brings us just barely above 50% of the instruction encoding space completed.

[Editor's note: The "aforementioned scheduler changes" were described in a previous forum post:

Unfortunately, 64-bits just wasn't enough precision (we were getting misalignments ~230 times a second on 21/24MHz clocks), so I had to move to 128-bit counters. This of course doesn't exist on 32-bit architectures (and probably not on all 64-bit ones either), so for now ... higan's only going to compile on 64-bit machines until we figure something out. Maybe we offer a "lower precision" fallback for machines that lack uint128_t or something. Using the Booth algorithm would be way too slow.

Anyway, the precision is now 2^-96, which is roughly 10^-29. That puts us far beyond the yoctosecond. Suck it, MAME :P I'm jokingly referring to it as the byuusecond. The other 32-bits of precision allows a 1Hz clock to run up to one full second before all clocks need to be normalized to prevent overflow.

I fixed a serious wobbling issue where I was using clock > other.clock for synchronization instead of clock >= other.clock; and also another aliasing issue when two threads share a common frequency, but don't run in lock-step. The latter I don't even fully understand, but I did observe it in testing.

nall/serializer.hpp has been extended to support 128-bit integers, but without explicitly naming them (yay generic code), so nall will still compile on 32-bit platforms for all other applications.

Speed is basically a wash now. FC's a bit slower, SFC's a bit faster.

The "longer explanation" in the linked hastebin is:

Okay, so the idea is that we can have an arbitrary number of oscillators. Take the SNES:

- CPU/PPU clock = 21477272.727272hz
- SMP/DSP clock = 24576000hz
- Cartridge DSP1 clock = 8000000hz
- Cartridge MSU1 clock = 44100hz
- Controller Port 1 modem controller clock = 57600hz
- Controller Port 2 barcode battler clock = 115200hz
- Expansion Port exercise bike clock = 192000hz

Is this a pathological case? Of course it is, but it's possible. The first four do exist in the wild already: see Rockman X2 MSU1 patch. Manifest files with higan let you specify any frequency you want for any component.

The old trick higan used was to hold an int64 counter for each thread:thread synchronization, and adjust it like so:

- if thread A steps X clocks; then clock += X * threadB.frequency
- if clock >= 0; switch to threadB
- if thread B steps X clocks; then clock -= X * threadA.frequency
- if clock < 0; switch to threadA

But there are also system configurations where one processor has to synchronize with more than one other processor. Take the Genesis:

- the 68K has to sync with the Z80 and PSG and YM2612 and VDP
- the Z80 has to sync with the 68K and PSG and YM2612
- the PSG has to sync with the 68K and Z80 and YM2612

Now I could do this by having an int64 clock value for every association. But these clock values would have to be outside the individual Thread class objects, and we would have to update every relationship's clock value. So the 68K would have to update the Z80, PSG, YM2612 and VDP clocks. That's four expensive 64-bit multiply-adds per clock step event instead of one.

As such, we have to account for both possibilities.

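The pairwise counter trick described above can be sketched without cothreads. This is only an illustration, not higan's code: `interleave` and its parameters are hypothetical names, and the "switch" is modelled as a boolean flag rather than a real co_switch. It assumes thread A runs first with the counter at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of higan): records which of two components
// runs at each step when they share ONE signed relative counter.
// A adds stepA * freqB and yields once the counter is >= 0;
// B subtracts stepB * freqA and yields once the counter is < 0.
std::string interleave(uint64_t freqA, uint64_t freqB,
                       uint64_t stepA, uint64_t stepB, std::size_t count) {
  assert(freqA > 0 && freqB > 0);
  std::string order;
  int64_t clock = 0;     // relative time between A and B
  bool runningA = true;  // assumption: A is the thread that starts
  while (order.size() < count) {
    if (runningA) {
      order += 'A';
      clock += static_cast<int64_t>(stepA * freqB);
      if (clock >= 0) runningA = false;  // A is ahead: switch to B
    } else {
      order += 'B';
      clock -= static_cast<int64_t>(stepB * freqA);
      if (clock < 0) runningA = true;    // B is ahead: switch to A
    }
  }
  return order;
}
```

With A at twice B's frequency and single-clock steps, `interleave(2, 1, 1, 1, 12)` settles into two A steps per B step ("ABABAABAABAA"). One such counter covers exactly one pair of threads, which is why the Genesis layout above would need a separate counter, and a separate multiply-add, for every relationship.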
The only way to do this is with a single time base. We do this like so:

- setup: scalar = timeBase / frequency
- step: clock += scalar * clocks

Once per second, we look at every thread, find the smallest clock value. Then subtract that value from all threads. This prevents the clock counters from overflowing.

Unfortunately, these oscillator values are psychotic, unpredictable, and often times repeating fractions. Even with a timeBase of 1,000,000,000,000,000,000 (one attosecond); we get rounding errors every ~16,300 synchronizations. Specifically, this happens with a CPU running at 21477273hz (rounded) and SMP running at 24576000hz. That may be good enough for most emulators, but ... you know how I am.

Plus, even at the attosecond level, we're really pushing against the limits of 64-bit integers. Given the reciprocal inverse, a frequency of 1Hz (which does exist in higan!) would have a scalar that consumes 1/18th of the entire range of a uint64 on every single step. Yes, I could raise the frequency, and then step by that amount, I know. But I don't want to have weird gotchas like that in the scheduler core.

Until I increase the accuracy to about 100 times greater than a yoctosecond, the rounding errors are too great. And since the only choice above 64-bit values is 128-bit values; we might as well use all the extra headroom. 2^-96 as a timebase gives me the ability to have both a 1Hz and 4GHz clock; and run them both for a full second; before an overflow event would occur.

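The setup and step rules above, with a 2^96 time base, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the scheduler itself: it relies on the GCC/Clang `unsigned __int128` extension, and the names `scalarFor`, `driftPerSecond` and `Clock` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the compiler provides unsigned __int128 (GCC and Clang do on
// 64-bit targets); this is an extension, not standard C++.
using uint128 = unsigned __int128;

// One second is divided into 2^96 units of time.
const uint128 timeBase = uint128(1) << 96;

// setup: scalar = timeBase / frequency (integer division truncates)
inline uint128 scalarFor(uint32_t frequency) {
  assert(frequency > 0);
  return timeBase / frequency;
}

// Truncation error accumulated over one simulated second, in 2^-96 s units.
// It is always smaller than the frequency itself.
inline uint128 driftPerSecond(uint32_t frequency) {
  return timeBase - scalarFor(frequency) * frequency;
}

// Hypothetical stand-in for a thread's timing state.
struct Clock {
  uint128 scalar = 0;
  uint128 clock = 0;
  // step: clock += scalar * clocks
  void step(uint32_t clocks) { clock += scalar * clocks; }
};
```

A 1Hz component has a scalar of exactly 2^96, so a single step moves its clock forward by one second and leaves the upper 32 bits of the counter as whole-second headroom. For any frequency, the drift from truncation is below `frequency` units of 2^-96 seconds per simulated second.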
Another hastebin includes demonstration code:

    #include
    #include
    using namespace nall;

    //

    cothread_t mainThread = nullptr;
    const uint iterations = 100'000'000;
    const uint cpuFreq = 21477272.727272 + 0.5;
    const uint smpFreq = 24576000.000000 + 0.5;
    const uint cpuStep = 4;
    const uint smpStep = 5;

    //

    struct ThreadA {
      cothread_t handle = nullptr;
      uint64 frequency = 0;
      int64 clock = 0;

      auto create(auto (*entrypoint)() -> void, uint frequency) {
        this->handle = co_create(65536, entrypoint);
        this->frequency = frequency;
        this->clock = 0;
      }
    };

    struct CPUA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      CPUA() { create(&CPUA::Enter, cpuFreq); }
    } cpuA;

    struct SMPA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      SMPA() { create(&SMPA::Enter, smpFreq); }
    } smpA;

    uint8 queueA[iterations];
    uint offsetA;
    cothread_t resumeA = cpuA.handle;

    auto EnterA() -> void {
      offsetA = 0;
      co_switch(resumeA);
    }

    auto QueueA(uint value) -> void {
      queueA[offsetA++] = value;
      if(offsetA >= iterations) {
        resumeA = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUA::Enter() -> void { while(true) cpuA.main(); }

    auto CPUA::main() -> void {
      QueueA(1);
      smpA.clock -= cpuStep * smpA.frequency;
      if(smpA.clock < 0) co_switch(smpA.handle);
    }

    auto SMPA::Enter() -> void { while(true) smpA.main(); }

    auto SMPA::main() -> void {
      QueueA(2);
      smpA.clock += smpStep * cpuA.frequency;
      if(smpA.clock >= 0) co_switch(cpuA.handle);
    }

    //

    struct ThreadB {
      cothread_t handle = nullptr;
      uint128_t scalar = 0;
      uint128_t clock = 0;

      auto print128(uint128_t value) {
        string s;
        while(value) {
          s.append((char)('0' + value % 10));
          value /= 10;
        }
        s.reverse();
        print(s, "\n");
      }

      //femtosecond (10^15) = 16306
      //attosecond (10^18) = 688838
      //zeptosecond (10^21) = 13712691
      //yoctosecond (10^24) = 13712691 (hitting a dead-end on a rounding error causing a wobble)
      //byuusecond? ( 2^96) = (perfect? 79,228 times more precise than a yoctosecond)

      auto create(auto (*entrypoint)() -> void, uint128_t frequency) {
        this->handle = co_create(65536, entrypoint);

        uint128_t unitOfTime = 1;
        //for(uint n : range(29)) unitOfTime *= 10;
        unitOfTime <<= 96;  //2^96 time units ...

        this->scalar = unitOfTime / frequency;
        print128(this->scalar);
        this->clock = 0;
      }

      auto step(uint128_t clocks) -> void { clock += clocks * scalar; }
      auto synchronize(ThreadB& thread) -> void { if(clock >= thread.clock) co_switch(thread.handle); }
    };

    struct CPUB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      CPUB() { create(&CPUB::Enter, cpuFreq); }
    } cpuB;

    struct SMPB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      SMPB() { create(&SMPB::Enter, smpFreq); clock = 1; }
    } smpB;

    auto correct() -> void {
      auto minimum = min(cpuB.clock, smpB.clock);
      cpuB.clock -= minimum;
      smpB.clock -= minimum;
    }

    uint8 queueB[iterations];
    uint offsetB;
    cothread_t resumeB = cpuB.handle;

    auto EnterB() -> void {
      correct();
      offsetB = 0;
      co_switch(resumeB);
    }

    auto QueueB(uint value) -> void {
      queueB[offsetB++] = value;
      if(offsetB >= iterations) {
        resumeB = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUB::Enter() -> void { while(true) cpuB.main(); }

    auto CPUB::main() -> void {
      QueueB(1);
      step(cpuStep);
      synchronize(smpB);
    }

    auto SMPB::Enter() -> void { while(true) smpB.main(); }

    auto SMPB::main() -> void {
      QueueB(2);
      step(smpStep);
      synchronize(cpuB);
    }

    //

    #include
    auto nall::main(string_vector) -> void {
      mainThread = co_active();

      uint masterCounter = 0;
      while(true) {
        print(masterCounter++, " ...\n");

        auto A = clock();
        EnterA();
        auto B = clock();
        print((double)(B - A) / CLOCKS_PER_SEC, "s\n");

        auto C = clock();
        EnterB();
        auto D = clock();
        print((double)(D - C) / CLOCKS_PER_SEC, "s\n");

        for(uint n : range(iterations)) {
          if(queueA[n] != queueB[n]) return print("fail at ", n, "\n");
        }
      }
    }

...and that's everything.]
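The periodic normalization and the `>=` comparison mentioned in the note can be sketched in isolation. This is a simplified illustration and not the code in this patch: `Thread` here is a hypothetical stand-in that holds only a clock, `normalize` and `shouldYield` are made-up names, and `unsigned __int128` is assumed to be available as a compiler extension.

```cpp
#include <cassert>
#include <vector>

// Assumption: GCC/Clang 128-bit integer extension.
using uint128 = unsigned __int128;

// Hypothetical stand-in for a scheduler thread; only the clock matters here.
struct Thread {
  uint128 clock = 0;
};

// Subtract the smallest clock from every thread. Relative offsets between
// threads are preserved, while absolute values stay far from overflow.
inline void normalize(std::vector<Thread>& threads) {
  if (threads.empty()) return;
  uint128 minimum = threads.front().clock;
  for (const auto& thread : threads) {
    if (thread.clock < minimum) minimum = thread.clock;
  }
  for (auto& thread : threads) thread.clock -= minimum;
}

// A thread yields once it has caught up with OR passed the other thread.
// Using > instead of >= is the wobble described in the note.
inline bool shouldYield(const Thread& self, const Thread& other) {
  return self.clock >= other.clock;
}
```

After normalization at least one thread sits at zero, and the ordering between any two threads is unchanged, so synchronization decisions made before and after the subtraction agree.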
--- higan/emulator/emulator.hpp | 14 +---- higan/emulator/scheduler.hpp | 61 ++++++++++----------- higan/emulator/thread.hpp | 44 ++++++++------- higan/fc/controller/controller.cpp | 4 +- higan/fc/fc.hpp | 19 +++---- higan/gb/gb.hpp | 19 +++---- higan/gb/ppu/io.cpp | 4 +- higan/gba/gba.hpp | 28 ++++------ higan/md/md.hpp | 19 +++---- higan/processor/m68k/disassembler.cpp | 12 ++++ higan/processor/m68k/instruction.cpp | 48 ++++++++++++++++ higan/processor/m68k/instructions.cpp | 47 +++++++++++++--- higan/processor/m68k/m68k.hpp | 11 +++- higan/processor/spc700/serialization.cpp | 2 +- higan/sfc/controller/controller.cpp | 4 +- higan/sfc/coprocessor/hitachidsp/memory.cpp | 2 +- higan/sfc/coprocessor/icd2/icd2.cpp | 2 +- higan/sfc/coprocessor/sa1/io.cpp | 4 +- higan/sfc/sfc.hpp | 19 +++---- higan/sfc/smp/timing.cpp | 4 +- higan/sfc/system/serialization.cpp | 18 +++--- higan/ws/ws.hpp | 28 ++++------ nall/serializer.hpp | 18 +++--- 23 files changed, 253 insertions(+), 178 deletions(-) diff --git a/higan/emulator/emulator.hpp b/higan/emulator/emulator.hpp index 0eeed2bc..828c9446 100644 --- a/higan/emulator/emulator.hpp +++ b/higan/emulator/emulator.hpp @@ -11,25 +11,15 @@ using namespace nall; namespace Emulator { static const string Name = "higan"; - static const string Version = "100.14"; + static const string Version = "100.15"; static const string Author = "byuu"; static const string License = "GPLv3"; static const string Website = "http://byuu.org/"; //incremented only when serialization format changes - static const string SerializerVersion = "100"; + static const string SerializerVersion = "100.15"; namespace Constants { - namespace Time { - static constexpr double Second = 1.0; - static constexpr double Millisecond = 1'000.0; - static constexpr double Microsecond = 1'000'000.0; - static constexpr double Nanosecond = 1'000'000'000.0; - static constexpr double Picosecond = 1'000'000'000'000.0; - static constexpr double Femtosecond = 1'000'000'000'000'000.0; - 
static constexpr double Attosecond = 1'000'000'000'000'000'000.0; - } - namespace Colorburst { static constexpr double NTSC = 315.0 / 88.0 * 1'000'000.0; static constexpr double PAL = 283.75 * 15'625.0 + 25.0; diff --git a/higan/emulator/scheduler.hpp b/higan/emulator/scheduler.hpp index 0b1468d2..d9e2ca48 100644 --- a/higan/emulator/scheduler.hpp +++ b/higan/emulator/scheduler.hpp @@ -17,56 +17,55 @@ struct Scheduler { Synchronize, }; - auto active(Thread& thread) const -> bool { - return co_active() == thread.handle(); - } + inline auto synchronizing() const -> bool { return _mode == Mode::SynchronizeSlave; } auto reset() -> void { - threads.reset(); + _host = co_active(); + _threads.reset(); } auto primary(Thread& thread) -> void { - master = _resume = thread.handle(); - host = co_active(); + _master = _resume = thread.handle(); } auto append(Thread& thread) -> bool { - if(threads.find(&thread)) return false; - return threads.append(&thread), true; + if(_threads.find(&thread)) return false; + thread._clock += _threads.size(); //this bias prioritizes threads appended earlier first + return _threads.append(&thread), true; } auto remove(Thread& thread) -> bool { - if(auto offset = threads.find(&thread)) return threads.remove(*offset), true; + if(auto offset = _threads.find(&thread)) return _threads.remove(*offset), true; return false; } - auto enter(Mode mode_ = Mode::Run) -> Event { - mode = mode_; - host = co_active(); + auto enter(Mode mode = Mode::Run) -> Event { + _mode = mode; + _host = co_active(); co_switch(_resume); - return event; + return _event; } inline auto resume(Thread& thread) -> void { - if(mode != Mode::SynchronizeSlave) co_switch(thread.handle()); + if(_mode != Mode::SynchronizeSlave) co_switch(thread.handle()); } - auto exit(Event event_) -> void { - uint64 minimum = ~0ull >> 1; - for(auto thread : threads) { + auto exit(Event event) -> void { + uint128_t minimum = -1; + for(auto thread : _threads) { if(thread->_clock < minimum) minimum = 
thread->_clock; } - for(auto thread : threads) { + for(auto thread : _threads) { thread->_clock -= minimum; } - event = event_; + _event = event; _resume = co_active(); - co_switch(host); + co_switch(_host); } - auto synchronize(Thread& thread) -> void { - if(thread.handle() == master) { + inline auto synchronize(Thread& thread) -> void { + if(thread.handle() == _master) { while(enter(Mode::SynchronizeMaster) != Event::Synchronize); } else { _resume = thread.handle(); @@ -74,21 +73,21 @@ struct Scheduler { } } - auto synchronize() -> void { - if(co_active() == master) { - if(mode == Mode::SynchronizeMaster) return exit(Event::Synchronize); + inline auto synchronize() -> void { + if(co_active() == _master) { + if(_mode == Mode::SynchronizeMaster) return exit(Event::Synchronize); } else { - if(mode == Mode::SynchronizeSlave) return exit(Event::Synchronize); + if(_mode == Mode::SynchronizeSlave) return exit(Event::Synchronize); } } private: - cothread_t host = nullptr; //program thread (used to exit scheduler) + cothread_t _host = nullptr; //program thread (used to exit scheduler) cothread_t _resume = nullptr; //resume thread (used to enter scheduler) - cothread_t master = nullptr; //primary thread (used to synchronize components) - Mode mode = Mode::Run; - Event event = Event::Step; - vector threads; + cothread_t _master = nullptr; //primary thread (used to synchronize components) + Mode _mode = Mode::Run; + Event _event = Event::Step; + vector _threads; }; } diff --git a/higan/emulator/thread.hpp b/higan/emulator/thread.hpp index 7eebbbe4..88e0f66f 100644 --- a/higan/emulator/thread.hpp +++ b/higan/emulator/thread.hpp @@ -7,31 +7,35 @@ struct Thread { if(_handle) co_delete(_handle); } - auto handle() const { return _handle; } - auto frequency() const { return _frequency; } - auto scalar() const { return _scalar; } - auto clock() const { return _clock; } - - auto create(auto (*entrypoint)() -> void, double frequency, bool resetClock = true) -> void { - if(_handle) 
co_delete(_handle); - _handle = co_create(64 * 1024 * sizeof(void*), entrypoint); - if(resetClock) _clock = 0; - setFrequency(frequency); - } + inline auto active() const { return co_active() == _handle; } + inline auto handle() const { return _handle; } + inline auto frequency() const { return _frequency; } + inline auto scalar() const { return _scalar; } + inline auto clock() const { return _clock; } auto setFrequency(double frequency) -> void { - _frequency = frequency; - _scalar = 1.0L / frequency * Constants::Time::Attosecond + 0.5L; + _frequency = frequency + 0.5; + _scalar = ((uint128_t)1 << 96) / _frequency; + } + + auto setScalar(uint128_t scalar) -> void { + _scalar = scalar; + } + + auto setClock(uint128_t clock) -> void { + _clock = clock; + } + + auto create(auto (*entrypoint)() -> void, double frequency) -> void { + if(_handle) co_delete(_handle); + _handle = co_create(64 * 1024 * sizeof(void*), entrypoint); + setFrequency(frequency); } inline auto step(uint clocks) -> void { _clock += _scalar * clocks; } - inline auto synchronize(Thread& thread) -> void { - if(_clock > thread._clock) co_switch(thread._handle); - } - auto serialize(serializer& s) -> void { s.integer(_frequency); s.integer(_scalar); @@ -40,9 +44,9 @@ struct Thread { protected: cothread_t _handle = nullptr; - uint64 _frequency = 0; - uint64 _scalar = 0; - uint64 _clock = 0; + uint32_t _frequency = 0; + uint128_t _scalar = 0; + uint128_t _clock = 0; friend class Scheduler; }; diff --git a/higan/fc/controller/controller.cpp b/higan/fc/controller/controller.cpp index 91946ea7..836861e6 100644 --- a/higan/fc/controller/controller.cpp +++ b/higan/fc/controller/controller.cpp @@ -15,8 +15,8 @@ Controller::~Controller() { auto Controller::Enter() -> void { while(true) { scheduler.synchronize(); - if(scheduler.active(*peripherals.controllerPort1)) peripherals.controllerPort1->main(); - if(scheduler.active(*peripherals.controllerPort2)) peripherals.controllerPort2->main(); + 
if(peripherals.controllerPort1->active()) peripherals.controllerPort1->main(); + if(peripherals.controllerPort2->active()) peripherals.controllerPort2->main(); } } diff --git a/higan/fc/fc.hpp b/higan/fc/fc.hpp index c2ad69ef..8d241c34 100644 --- a/higan/fc/fc.hpp +++ b/higan/fc/fc.hpp @@ -18,8 +18,14 @@ namespace Famicom { extern Cheat cheat; struct Thread : Emulator::Thread { - auto create(auto (*entrypoint)() -> void, double frequency) -> void; - auto synchronize(Thread& thread) -> void; + auto create(auto (*entrypoint)() -> void, double frequency) -> void { + Emulator::Thread::create(entrypoint, frequency); + scheduler.append(*this); + } + + inline auto synchronize(Thread& thread) -> void { + if(clock() >= thread.clock()) scheduler.resume(thread); + } }; #include @@ -29,15 +35,6 @@ namespace Famicom { #include #include #include - - inline auto Thread::create(auto (*entrypoint)() -> void, double frequency) -> void { - Emulator::Thread::create(entrypoint, frequency); - scheduler.append(*this); - } - - inline auto Thread::synchronize(Thread& thread) -> void { - if(_clock > thread._clock) scheduler.resume(thread); - } } #include diff --git a/higan/gb/gb.hpp b/higan/gb/gb.hpp index 7a912d77..a1a3dd28 100644 --- a/higan/gb/gb.hpp +++ b/higan/gb/gb.hpp @@ -18,8 +18,14 @@ namespace GameBoy { extern Cheat cheat; struct Thread : Emulator::Thread { - auto create(auto (*entrypoint)() -> void, double frequency, bool resetClock) -> void; - auto synchronize(Thread& thread) -> void; + auto create(auto (*entrypoint)() -> void, double frequency) -> void { + Emulator::Thread::create(entrypoint, frequency); + scheduler.append(*this); + } + + inline auto synchronize(Thread& thread) -> void { + if(clock() >= thread.clock()) scheduler.resume(thread); + } }; #include @@ -28,15 +34,6 @@ namespace GameBoy { #include #include #include - - inline auto Thread::create(auto (*entrypoint)() -> void, double frequency, bool resetClock = true) -> void { - Emulator::Thread::create(entrypoint, 
frequency, resetClock); - scheduler.append(*this); - } - - inline auto Thread::synchronize(Thread& thread) -> void { - if(_clock > thread._clock) scheduler.resume(thread); - } } #include diff --git a/higan/gb/ppu/io.cpp b/higan/gb/ppu/io.cpp index 5d4a50ce..0fd85715 100644 --- a/higan/gb/ppu/io.cpp +++ b/higan/gb/ppu/io.cpp @@ -118,7 +118,9 @@ auto PPU::writeIO(uint16 addr, uint8 data) -> void { status.lx = 0; //restart cothread to begin new frame - create(Enter, 4 * 1024 * 1024, false); + auto clock = Thread::clock(); + create(Enter, 4 * 1024 * 1024); + Thread::setClock(clock); } status.displayEnable = data & 0x80; diff --git a/higan/gba/gba.hpp b/higan/gba/gba.hpp index 5c7363ae..55eddfa0 100644 --- a/higan/gba/gba.hpp +++ b/higan/gba/gba.hpp @@ -27,9 +27,18 @@ namespace GameBoyAdvance { }; struct Thread : Emulator::Thread { - auto create(auto (*entrypoint)() -> void, double frequency) -> void; - auto synchronize(Thread& thread) -> void; - auto step(uint clocks) -> void; + auto create(auto (*entrypoint)() -> void, double frequency) -> void { + Emulator::Thread::create(entrypoint, frequency); + scheduler.append(*this); + } + + inline auto synchronize(Thread& thread) -> void { + if(clock() >= thread.clock()) scheduler.resume(thread); + } + + inline auto step(uint clocks) -> void { + _clock += clocks; + } }; #include @@ -39,19 +48,6 @@ namespace GameBoyAdvance { #include #include #include - - inline auto Thread::create(auto (*entrypoint)() -> void, double frequency) -> void { - Emulator::Thread::create(entrypoint, frequency); - scheduler.append(*this); - } - - inline auto Thread::synchronize(Thread& thread) -> void { - if(_clock > thread._clock) scheduler.resume(thread); - } - - inline auto Thread::step(uint clocks) -> void { - _clock += clocks; - } } #include diff --git a/higan/md/md.hpp b/higan/md/md.hpp index d20a78ed..aa5bd036 100644 --- a/higan/md/md.hpp +++ b/higan/md/md.hpp @@ -16,8 +16,14 @@ namespace MegaDrive { extern Scheduler scheduler; struct Thread : 
Emulator::Thread { - auto create(auto (*entrypoint)() -> void, double frequency) -> void; - auto synchronize(Thread& thread) -> void; + auto create(auto (*entrypoint)() -> void, double frequency) -> void { + Emulator::Thread::create(entrypoint, frequency); + scheduler.append(*this); + } + + inline auto synchronize(Thread& thread) -> void { + if(clock() >= thread.clock()) scheduler.resume(thread); + } }; #include @@ -28,15 +34,6 @@ namespace MegaDrive { #include #include - - inline auto Thread::create(auto (*entrypoint)() -> void, double frequency) -> void { - Emulator::Thread::create(entrypoint, frequency); - scheduler.append(*this); - } - - inline auto Thread::synchronize(Thread& thread) -> void { - if(_clock > thread._clock) scheduler.resume(thread); - } } #include diff --git a/higan/processor/m68k/disassembler.cpp b/higan/processor/m68k/disassembler.cpp index 26fb1e62..42e708db 100644 --- a/higan/processor/m68k/disassembler.cpp +++ b/higan/processor/m68k/disassembler.cpp @@ -112,6 +112,10 @@ template auto M68K::disassembleADDQ(uint4 immediate, EffectiveAddress return {"addq", _suffix(), " #", immediate, ",", _effectiveAddress(modify)}; } +template auto M68K::disassembleADDX(EffectiveAddress target, EffectiveAddress source) -> string { + return {"addx", _suffix(), " ", _effectiveAddress(target), ",", _effectiveAddress(source)}; +} + template auto M68K::disassembleANDI(EffectiveAddress ea) -> string { return {"andi", _suffix(), " ", _immediate(), ",", _effectiveAddress(ea)}; } @@ -343,6 +347,14 @@ auto M68K::disassembleRTS() -> string { return {"rts "}; } +template auto M68K::disassembleSUB(EffectiveAddress source, DataRegister target) -> string { + return {"sub", _suffix(), " ", _effectiveAddress(source), ",", _dataRegister(target)}; +} + +template auto M68K::disassembleSUB(DataRegister source, EffectiveAddress target) -> string { + return {"sub", _suffix(), " ", _dataRegister(source), ",", _effectiveAddress(target)}; +} + template auto 
M68K::disassembleSUBQ(uint4 immediate, EffectiveAddress ea) -> string { return {"subq", _suffix(), " #", immediate, _effectiveAddress(ea)}; } diff --git a/higan/processor/m68k/instruction.cpp b/higan/processor/m68k/instruction.cpp index 5a39868c..77393d14 100644 --- a/higan/processor/m68k/instruction.cpp +++ b/higan/processor/m68k/instruction.cpp @@ -94,6 +94,24 @@ M68K::M68K() { if(mode == 1) unbind(opcode | 0 << 6); } + //ADDX + for(uint3 treg : range(8)) + for(uint3 sreg : range(8)) { + auto opcode = pattern("1101 ---1 ++00 ----") | treg << 9 | sreg << 0; + + EffectiveAddress dataTarget{DataRegisterDirect, treg}; + EffectiveAddress dataSource{DataRegisterDirect, sreg}; + bind(opcode | 0 << 6 | 0 << 3, ADDX, dataTarget, dataSource); + bind(opcode | 1 << 6 | 0 << 3, ADDX, dataTarget, dataSource); + bind(opcode | 2 << 6 | 0 << 3, ADDX, dataTarget, dataSource); + + EffectiveAddress addressTarget{AddressRegisterIndirectWithPreDecrement, treg}; + EffectiveAddress addressSource{AddressRegisterIndirectWithPreDecrement, sreg}; + bind(opcode | 0 << 6 | 1 << 3, ADDX, addressTarget, addressSource); + bind(opcode | 1 << 6 | 1 << 3, ADDX, addressTarget, addressSource); + bind(opcode | 2 << 6 | 1 << 3, ADDX, addressTarget, addressSource); + } + //ANDI for(uint3 mode : range(8)) for(uint3 reg : range(8)) { @@ -645,6 +663,36 @@ M68K::M68K() { bind(opcode, RTS); } + //SUB + for(uint3 dreg : range(8)) + for(uint3 mode : range(8)) + for(uint3 reg : range(8)) { + auto opcode = pattern("1001 ---0 ++-- ----") | dreg << 9 | mode << 3 | reg << 0; + if(mode == 7 && reg >= 5) continue; + + EffectiveAddress source{mode, reg}; + DataRegister target{dreg}; + bind(opcode | 0 << 6, SUB, source, target); + bind(opcode | 1 << 6, SUB, source, target); + bind(opcode | 2 << 6, SUB, source, target); + + if(mode == 1) unbind(opcode | 0 << 6); + } + + //SUB + for(uint3 dreg : range(8)) + for(uint3 mode : range(8)) + for(uint3 reg : range(8)) { + auto opcode = pattern("1001 ---1 ++-- ----") | dreg << 9 
| mode << 3 | reg << 0; + if(mode <= 1 || (mode == 7 && reg >= 2)) continue; + + DataRegister source{dreg}; + EffectiveAddress target{mode, reg}; + bind(opcode | 0 << 6, SUB, source, target); + bind(opcode | 1 << 6, SUB, source, target); + bind(opcode | 2 << 6, SUB, source, target); + } + //SUBQ for(uint3 data : range(8)) for(uint3 mode : range(8)) diff --git a/higan/processor/m68k/instructions.cpp b/higan/processor/m68k/instructions.cpp index 7f5ca589..2bf984f8 100644 --- a/higan/processor/m68k/instructions.cpp +++ b/higan/processor/m68k/instructions.cpp @@ -58,12 +58,14 @@ template auto M68K::negative(uint32 result) -> bool { // -template auto M68K::ADD(uint32 source, uint32 target) -> uint32 { +template auto M68K::ADD(uint32 source, uint32 target) -> uint32 { uint64 result = (uint64)source + (uint64)target; + if(Extend) result += r.x; r.c = sign(result >> 1) < 0; r.v = sign(~(target ^ source) & (target ^ result)) < 0; - r.z = clip(result) == 0; + if(Extend == 0) r.z = clip(result) == 0; + if(Extend == 1) if(clip(result)) r.z = 0; r.n = sign(result) < 0; r.x = r.c; @@ -104,6 +106,13 @@ template auto M68K::instructionADDQ(uint4 immediate, EffectiveAddress write(modify, result); } +template auto M68K::instructionADDX(EffectiveAddress target_, EffectiveAddress source_) -> void { + auto source = read(source_); + auto target = read(target_); + auto result = ADD(source, target); + write(target, result); +} + template auto M68K::instructionANDI(EffectiveAddress ea) -> void { auto source = readPC(); auto target = read(ea); @@ -591,17 +600,39 @@ auto M68K::instructionRTS() -> void { r.pc = pop(); } -template auto M68K::instructionSUBQ(uint4 immediate, EffectiveAddress ea) -> void { - uint64 target = read(ea); - uint64 source = immediate; - uint64 result = target - source; - write(ea, result); +template auto M68K::SUB(uint32 source, uint32 target) -> uint32 { + uint64 result = source - target; + if(Extend) result -= r.x; r.c = sign(result >> 1) < 0; r.v = sign((target ^ 
source) & (target ^ result)) < 0; - r.z = clip(result) == 0; + if(Extend == 0) r.z = clip(result == 0); + if(Extend == 1) if(clip(result)) r.z = 0; r.n = sign(result) < 0; r.x = r.c; + + return result; +} + +template auto M68K::instructionSUB(EffectiveAddress source_, DataRegister target_) -> void { + auto source = read(source_); + auto target = read(target_); + auto result = SUB(source, target); + write(target_, result); +} + +template auto M68K::instructionSUB(DataRegister source_, EffectiveAddress target_) -> void { + auto source = read(source_); + auto target = read(target_); + auto result = SUB(source, target); + write(target_, result); +} + +template auto M68K::instructionSUBQ(uint4 immediate, EffectiveAddress ea) -> void { + auto source = immediate; + auto target = read(ea); + auto result = SUB(source, target); + write(ea, result); } template auto M68K::instructionTST(EffectiveAddress ea) -> void { diff --git a/higan/processor/m68k/m68k.hpp b/higan/processor/m68k/m68k.hpp index ddb7cb7c..e242a1eb 100644 --- a/higan/processor/m68k/m68k.hpp +++ b/higan/processor/m68k/m68k.hpp @@ -7,7 +7,7 @@ namespace Processor { struct M68K { enum : bool { User, Supervisor }; enum : uint { Byte, Word, Long }; - enum : bool { NoUpdate = 0, Reverse = 1 }; + enum : bool { NoUpdate = 0, Reverse = 1, Extend = 1 }; enum : uint { DataRegisterDirect, @@ -97,11 +97,12 @@ struct M68K { template auto zero(uint32 result) -> bool; template auto negative(uint32 result) -> bool; - template auto ADD(uint32 source, uint32 target) -> uint32; + template auto ADD(uint32 source, uint32 target) -> uint32; template auto instructionADD(DataRegister dr, uint1 direction, EffectiveAddress ea) -> void; template auto instructionADDA(AddressRegister ar, EffectiveAddress ea) -> void; template auto instructionADDI(EffectiveAddress modify) -> void; template auto instructionADDQ(uint4 immediate, EffectiveAddress modify) -> void; + template auto instructionADDX(EffectiveAddress target, EffectiveAddress source) 
-> void;
   template auto instructionANDI(EffectiveAddress ea) -> void;
   auto instructionANDI_TO_CCR() -> void;
   auto instructionANDI_TO_SR() -> void;
@@ -163,6 +164,9 @@ struct M68K {
   template auto instructionROXR(DataRegister shift, DataRegister modify) -> void;
   auto instructionROXR(EffectiveAddress modify) -> void;
   auto instructionRTS() -> void;
+  template auto SUB(uint32 source, uint32 target) -> uint32;
+  template auto instructionSUB(EffectiveAddress source, DataRegister target) -> void;
+  template auto instructionSUB(DataRegister source, EffectiveAddress target) -> void;
   template auto instructionSUBQ(uint4 immediate, EffectiveAddress ea) -> void;
   template auto instructionTST(EffectiveAddress ea) -> void;
@@ -197,6 +201,7 @@ private:
   template auto disassembleADDA(AddressRegister ar, EffectiveAddress ea) -> string;
   template auto disassembleADDI(EffectiveAddress modify) -> string;
   template auto disassembleADDQ(uint4 immediate, EffectiveAddress modify) -> string;
+  template auto disassembleADDX(EffectiveAddress target, EffectiveAddress source) -> string;
   template auto disassembleANDI(EffectiveAddress ea) -> string;
   auto disassembleANDI_TO_CCR() -> string;
   auto disassembleANDI_TO_SR() -> string;
@@ -249,6 +254,8 @@ private:
   template auto disassembleROXR(DataRegister shift, DataRegister modify) -> string;
   auto disassembleROXR(EffectiveAddress modify) -> string;
   auto disassembleRTS() -> string;
+  template auto disassembleSUB(EffectiveAddress source, DataRegister target) -> string;
+  template auto disassembleSUB(DataRegister source, EffectiveAddress target) -> string;
   template auto disassembleSUBQ(uint4 immediate, EffectiveAddress ea) -> string;
   template auto disassembleTST(EffectiveAddress ea) -> string;
diff --git a/higan/processor/spc700/serialization.cpp b/higan/processor/spc700/serialization.cpp
index deda6889..ad195768 100644
--- a/higan/processor/spc700/serialization.cpp
+++ b/higan/processor/spc700/serialization.cpp
@@ -1,5 +1,5 @@
 auto SPC700::serialize(serializer& s) -> void {
-  s.integer(regs.pc);
+  s.integer(regs.pc.w);
   s.integer(regs.a);
   s.integer(regs.x);
   s.integer(regs.y);
diff --git a/higan/sfc/controller/controller.cpp b/higan/sfc/controller/controller.cpp
index ee4ea83d..2a84a8eb 100644
--- a/higan/sfc/controller/controller.cpp
+++ b/higan/sfc/controller/controller.cpp
@@ -19,8 +19,8 @@ Controller::~Controller() {
 auto Controller::Enter() -> void {
   while(true) {
     scheduler.synchronize();
-    if(scheduler.active(*peripherals.controllerPort1)) peripherals.controllerPort1->main();
-    if(scheduler.active(*peripherals.controllerPort2)) peripherals.controllerPort2->main();
+    if(peripherals.controllerPort1->active()) peripherals.controllerPort1->main();
+    if(peripherals.controllerPort2->active()) peripherals.controllerPort2->main();
   }
 }
diff --git a/higan/sfc/coprocessor/hitachidsp/memory.cpp b/higan/sfc/coprocessor/hitachidsp/memory.cpp
index dbbc123d..167c9f33 100644
--- a/higan/sfc/coprocessor/hitachidsp/memory.cpp
+++ b/higan/sfc/coprocessor/hitachidsp/memory.cpp
@@ -36,7 +36,7 @@ auto HitachiDSP::write(uint24 addr, uint8 data) -> void {
 }
 auto HitachiDSP::romRead(uint24 addr, uint8 data) -> uint8 {
-  if(scheduler.active(hitachidsp) || regs.halt) {
+  if(hitachidsp.active() || regs.halt) {
     addr = Bus::mirror(addr, rom.size());
     //if(Roms == 2 && mmio.r1f52 == 1 && addr >= (bit::round(rom.size()) >> 1)) return 0x00;
     return rom.read(addr, data);
diff --git a/higan/sfc/coprocessor/icd2/icd2.cpp b/higan/sfc/coprocessor/icd2/icd2.cpp
index f3197471..37308a74 100644
--- a/higan/sfc/coprocessor/icd2/icd2.cpp
+++ b/higan/sfc/coprocessor/icd2/icd2.cpp
@@ -12,7 +12,7 @@ ICD2 icd2;
 auto ICD2::Enter() -> void {
   while(true) {
-    //if(scheduler.synchronizing()) GameBoy::system.runToSave();
+    if(scheduler.synchronizing()) GameBoy::system.runToSave();
     scheduler.synchronize();
     icd2.main();
   }
diff --git a/higan/sfc/coprocessor/sa1/io.cpp b/higan/sfc/coprocessor/sa1/io.cpp
index 247b2d55..7cfe6e41 100644
--- a/higan/sfc/coprocessor/sa1/io.cpp
+++ b/higan/sfc/coprocessor/sa1/io.cpp
@@ -1,5 +1,5 @@
 auto SA1::readIO(uint24 addr, uint8) -> uint8 {
-  scheduler.active(cpu) ? cpu.synchronize(sa1) : synchronize(cpu);
+  cpu.active() ? cpu.synchronize(sa1) : synchronize(cpu);
   switch(0x2300 | addr.bits(0,7)) {
@@ -91,7 +91,7 @@ auto SA1::readIO(uint24 addr, uint8) -> uint8 {
 }
 auto SA1::writeIO(uint24 addr, uint8 data) -> void {
-  scheduler.active(cpu) ? cpu.synchronize(sa1) : synchronize(cpu);
+  cpu.active() ? cpu.synchronize(sa1) : synchronize(cpu);
   switch(0x2200 | addr.bits(0,7)) {
diff --git a/higan/sfc/sfc.hpp b/higan/sfc/sfc.hpp
index 2cee44ea..fc71d2a6 100644
--- a/higan/sfc/sfc.hpp
+++ b/higan/sfc/sfc.hpp
@@ -27,8 +27,14 @@ namespace SuperFamicom {
   extern Cheat cheat;
   struct Thread : Emulator::Thread {
-    auto create(auto (*entrypoint)() -> void, double frequency) -> void;
-    auto synchronize(Thread& thread) -> void;
+    auto create(auto (*entrypoint)() -> void, double frequency) -> void {
+      Emulator::Thread::create(entrypoint, frequency);
+      scheduler.append(*this);
+    }
+
+    inline auto synchronize(Thread& thread) -> void {
+      if(clock() >= thread.clock()) scheduler.resume(thread);
+    }
   };
   #include
@@ -48,15 +54,6 @@ namespace SuperFamicom {
   #include
   #include
-
-  inline auto Thread::create(auto (*entrypoint)(), double frequency) -> void {
-    Emulator::Thread::create(entrypoint, frequency);
-    scheduler.append(*this);
-  }
-
-  inline auto Thread::synchronize(Thread& thread) -> void {
-    if(_clock > thread._clock) scheduler.resume(thread);
-  }
 }
 #include
diff --git a/higan/sfc/smp/timing.cpp b/higan/sfc/smp/timing.cpp
index 900005b5..95407cb5 100644
--- a/higan/sfc/smp/timing.cpp
+++ b/higan/sfc/smp/timing.cpp
@@ -6,8 +6,8 @@ auto SMP::step(uint clocks) -> void {
   synchronize(cpu);
   #else
   //forcefully sync S-SMP to S-CPU in case chips are not communicating
-  //sync if S-SMP is more than 1ms ahead of S-CPU
-  if(clock() - cpu.clock() > (Emulator::Constants::Time::Attosecond / Emulator::Constants::Time::Millisecond)) synchronize(cpu);
+  //sync if S-SMP is more than 24 samples ahead of S-CPU
+  if(clock() - cpu.clock() > frequency() * scalar() / (768 / 24)) synchronize(cpu);
   #endif
 }
diff --git a/higan/sfc/system/serialization.cpp b/higan/sfc/system/serialization.cpp
index 3e83377e..7f883df8 100644
--- a/higan/sfc/system/serialization.cpp
+++ b/higan/sfc/system/serialization.cpp
@@ -2,9 +2,9 @@ auto System::serialize() -> serializer {
   serializer s(serializeSize);
   uint signature = 0x31545342;
-  char version[16] = {0};
-  char hash[64] = {0};
-  char description[512] = {0};
+  char version[16] = {};
+  char hash[64] = {};
+  char description[512] = {};
   memory::copy(&version, (const char*)Emulator::SerializerVersion, Emulator::SerializerVersion.size());
   memory::copy(&hash, (const char*)cartridge.sha256(), 64);
@@ -19,9 +19,9 @@ auto System::serialize() -> serializer {
 auto System::unserialize(serializer& s) -> bool {
   uint signature = 0;
-  char version[16] = {0};
-  char hash[64] = {0};
-  char description[512] = {0};
+  char version[16] = {};
+  char hash[64] = {};
+  char description[512] = {};
   s.integer(signature);
   s.array(version);
@@ -76,9 +76,9 @@ auto System::serializeInit() -> void {
   serializer s;
   uint signature = 0;
-  char version[16] = {0};
-  char hash[64] = {0};
-  char description[512] = {0};
+  char version[16] = {};
+  char hash[64] = {};
+  char description[512] = {};
   s.integer(signature);
   s.array(version);
diff --git a/higan/ws/ws.hpp b/higan/ws/ws.hpp
index 9ef6db8e..8e58c460 100644
--- a/higan/ws/ws.hpp
+++ b/higan/ws/ws.hpp
@@ -26,9 +26,18 @@ namespace WonderSwan {
   enum : uint { Byte = 1, Word = 2, Long = 4 };
   struct Thread : Emulator::Thread {
-    auto create(auto (*entrypoint)() -> void, double frequency) -> void;
-    auto synchronize(Thread& thread) -> void;
-    auto step(uint clocks) -> void;
+    auto create(auto (*entrypoint)() -> void, double frequency) -> void {
+      Emulator::Thread::create(entrypoint, frequency);
+      scheduler.append(*this);
+    }
+
+    inline auto synchronize(Thread& thread) -> void {
+      if(clock() >= thread.clock()) scheduler.resume(thread);
+    }
+
+    inline auto step(uint clocks) -> void {
+      _clock += clocks;
+    }
   };
   #include
@@ -38,19 +47,6 @@ namespace WonderSwan {
   #include
   #include
   #include
-
-  inline auto Thread::create(auto (*entrypoint)() -> void, double frequency) -> void {
-    Emulator::Thread::create(entrypoint, frequency);
-    scheduler.append(*this);
-  }
-
-  inline auto Thread::synchronize(Thread& thread) -> void {
-    if(_clock > thread._clock) scheduler.resume(thread);
-  }
-
-  inline auto Thread::step(uint clocks) -> void {
-    _clock += clocks;
-  }
 }
 #include
diff --git a/nall/serializer.hpp b/nall/serializer.hpp
index e883647b..0704b176 100644
--- a/nall/serializer.hpp
+++ b/nall/serializer.hpp
@@ -11,6 +11,7 @@
 //- only plain-old-data can be stored. complex classes must provide serialize(serializer&);
 //- floating-point usage is not portable across different implementations
+#include
 #include
 #include
 #include
@@ -46,14 +47,14 @@ struct serializer {
   }
   template auto floatingpoint(T& value) -> serializer& {
-    enum { size = sizeof(T) };
+    enum : uint { size = sizeof(T) };
     //this is rather dangerous, and not cross-platform safe;
     //but there is no standardized way to export FP-values
     auto p = (uint8_t*)&value;
     if(_mode == Save) {
-      for(uint n = 0; n < size; n++) _data[_size++] = p[n];
+      for(uint n : range(size)) _data[_size++] = p[n];
     } else if(_mode == Load) {
-      for(uint n = 0; n < size; n++) p[n] = _data[_size++];
+      for(uint n : range(size)) p[n] = _data[_size++];
     } else {
       _size += size;
     }
@@ -61,12 +62,13 @@ struct serializer {
   }
   template auto integer(T& value) -> serializer& {
-    enum { size = std::is_same::value ? 1 : sizeof(T) };
+    enum : uint { size = std::is_same::value ? 1 : sizeof(T) };
     if(_mode == Save) {
-      for(uint n = 0; n < size; n++) _data[_size++] = (uintmax_t)value >> (n << 3);
+      T copy = value;
+      for(uint n : range(size)) _data[_size++] = copy, copy >>= 8;
     } else if(_mode == Load) {
       value = 0;
-      for(uint n = 0; n < size; n++) value |= (uintmax_t)_data[_size++] << (n << 3);
+      for(uint n : range(size)) value |= (T)_data[_size++] << (n << 3);
     } else if(_mode == Size) {
       _size += size;
     }
@@ -74,12 +76,12 @@ struct serializer {
   }
   template auto array(T (&array)[N]) -> serializer& {
-    for(uint n = 0; n < N; n++) operator()(array[n]);
+    for(uint n : range(N)) operator()(array[n]);
     return *this;
   }
   template auto array(T array, uint size) -> serializer& {
-    for(uint n = 0; n < size; n++) operator()(array[n]);
+    for(uint n : range(size)) operator()(array[n]);
     return *this;
   }