Commit Graph

27 Commits

Author SHA1 Message Date
Tim Allen 16f736307e Update to v103r06 release.
byuu says:

Changelog:

  - processor/spc700: restored fetch/load/store/pull/push shorthand
    functions
  - processor/spc700: split functions that tested the algorithm used (`op
    != &SPC700:...`) to separate instructions
      - mostly for code clarity over code size: it was awkward having
        cycle counts change based on a function parameter
  - processor/spc700: implemented Overload's new findings on which
    cycles are truly internal (no bus reads)
  - sfc/smp: TEST register emulation has been vastly improved¹

¹: it turns out that TEST.d4,d5 is the external clock divider (used
when accessing RAM through the DSP), and TEST.d6,d7 is the internal
clock divider (used when accessing IPLROM, IO registers, or during idle
cycles.)

The DSP (24576khz) feeds its clock / 12 through to the SMP (2048khz).
The clock divider setting further divides the clock by 2, 4, 8, or 16.
Since 8 and 16 are not cleanly divislbe by 12, the SMP cycle count
glitches out and seems to take 10 and 2 clocks instead of 8 or 16. This
can on real hardware either cause the SMP to run very slowly, or more
likely, crash the SMP completely until reset.

What's even stranger is the timers aren't affected by this. They still
clock by 2, 4, 8, or 16.

Note that technically I could divide my own clock counters by 24 and
reduce these to {1,2,5,10} and {1,2,4,8}, I instead chose to divide by
12 to better illustrate this hardware issue and better model that the
SMP clock runs at 2048khz and not 1024khz.

Further, note that things aren't 100% perfect yet. This seems to throw
off some tests, such as blargg's `test_timer_speed`. I can't tell how
far off I am because blargg's test tragically doesn't print out fail
values. But you can see the improvements in that higan is now passing
all of Revenant's tests that were obviously completely wrong before.
2017-07-03 17:24:47 +10:00
Tim Allen 40802b0b9f Update to v103r05 release.
byuu says:

Changelog:

  - fc/controller: added ControllerPort class; removed Peripherals class
  - md/controller/gamepad: removed X,Y,Z buttons since this isn't a
    6-button controller
  - ms/controller: added ControllerPort class (not used in Game Gear
    mode); removed Peripherals class
  - pce/controller: added ControllerPort class; removed Peripherals
    class
  - processor/spc700: idle(address) is part of SMP class again, contains
    flag to detect mov (x)+ edge case
  - sfc/controller/super-scope,justifier: use CPU frequency instead of
    hard-coding NTSC frequency
  - sfc/cpu: move 4x8-bit SMP ports to SMP class
  - sfc/smp: move APU RAM to DSP class
  - sfc/smp: improved emulation of TEST registers bits 4-7 [information
    from nocash]
      - d4,d5 is RAM wait states (1,2,5,10)
      - d6,d7 is ROM/IO wait states (1,2,5,10)
  - sfc/smp: code cleanup to new style (order from lowest to highest
    bits; use .bit(s) functions)
  - sfc/smp: $00f8,$00f9 are P4/P5 auxiliary ports; named the registers
    better
2017-07-01 16:15:27 +10:00
Tim Allen ff3750de4f Update to v103r04 release.
byuu says:

Changelog:

  - fc/apu: $4003,$4007 writes initialize duty counter to 0 instead of 7
  - fc/apu: corrected duty table entries for use with decrementing duty
    counter
  - processor/spc700: emulated the behavior of cycle 3 of (x)+
    instructions to not read I/O registers
      - specifically, this prevents reads from $fd-ff from resetting the
        timers, as observed on real hardware
  - sfc/controller: added ControllerPort class to match Mega Drive
    design
  - sfc/expansion: added ExpansionPort class to match Mega Drive design
  - sfc/system: removed Peripherals class
  - sfc/system: changed `colorburst()` to `cpuFrequency()`; added
    `apuFrequency()`
  - sfc: replaced calls to `system.region == System::Region::*` with
    `Region::*()`
  - sfc/expansion: remove thread from scheduler when device is destroyed
  - sfc/smp: `{read,write}Port` now use a separate 4x8-bit buffer instead
    of underlying APU RAM [hex\_usr]
2017-06-30 14:17:23 +10:00
Tim Allen 78f341489e Update to v103r03 release.
byuu says:

Changelog:

  - md/psg: fixed output frequency rate regression from v103r02
  - processor/m68k: fixed calculations for ABCD, NBCD, SBCD [hex\_usr,
    SuperMikeMan]
  - processor/spc700: renamed abbreviated instructions to functional
    descriptions (eg `XCN` → `ExchangeNibble`)
  - processor/spc700: removed memory.cpp shorthand functions (fetch,
    load, store, pull, push)
  - processor/spc700: updated all instructions to follow cycle behavior
    as documented by Overload with a logic analyzer

Once again, the changes to the SPC700 core are really quite massive. And
this time it's not just cosmetic: the idle cycles have been updated to
pull from various memory addresses. This is why I removed the shorthand
functions -- so that I could handle the at-times very bizarre addresses
the SPC700 has on its address bus during its idle cycles.

There is one behavior Overload mentioned that I don't emulate ... one of
the cycles of the (X) transfer functions seems to not actually access
the $f0-ff internal SMP registers? I don't fully understand what
Overload is getting at, so I haven't tried to support it just yet.

Also, there are limits to logic analyzers. In many cases the same
address is read from twice consecutively. It is unclear which of the two
reads the SPC700 actually utilizes. I tried to choose the most logical
values (usually the first one), but ... I don't know that we'll be able
to figure this one out. It's going to be virtually impossible to test
this through software, because the PC can't really execute out of
registers that have side effects on reads.
2017-06-28 17:24:46 +10:00
Tim Allen 3517d5c4a4 Update to v103r02 release.
byuu says:

Changelog:

  - fc/apu: improved phase duty cycle emulation (mode 3 is 25% phase
    inverted; counter decrements)
  - md/apu: power/reset do not cancel 68K bus requests
  - md/apu: 68K is not granted bus access on Z80 power/reset
  - md/controller: replaced System::Peripherals with ControllerPort
    concept
  - md/controller: CTRL port is now read-write, maintains value across
    controller changes (and soon, soft resets)
  - md/psg: PSG sampling rate unintentionally modified¹
  - processor/spc700: improve cycle timing of (indirect),y instructions
    [Overload]
  - processor/spc700: idle() cycles actually read from the program
    counter; much like the 6502 [Overload]
      - some of the idle() cycles should read from other addresses; this
        still needs to be supported
  - processor/spc700: various cleanups to instruction function naming
  - processor/z80: prefix state (HL→IX,IY override) can now be
    serialized
  - icarus: fix install rule for certain platforms (it wasn't buggy on
    FreeBSD, but was on Linux?)

¹: the clock speed of the PSG is oscillator/15. But I was setting the
sampling rate to oscillator/15/16, which was around 223KHz. I am not
sure whether the PSG should be outputting at 3MHz or 223KHz. Amazingly
... I don't really hear a difference either way `o_O` I didn't actually
mean to make this change; I just noticed it after comparing the diff
between r01 and r02. If this turns out to be wrong, set

    stream = Emulator::audio.createStream(1, frequency() / 16.0);

in md/psg.cpp to revert this change.
2017-06-27 11:18:28 +10:00
Tim Allen e7806dd6e8 Update to v102r27 release.
byuu says:

Changelog:

  - processor/gsu: minor code cleanup
  - processor/hg51b: renamed reg(Read,Write) to register(Read,Write)
  - processor/lr35902: minor code cleanup
  - processor/spc700: completed code cleanup (sans disassembler)
      - no longer uses internal global state inside instructions
  - processor/spc700: will no longer hang the emulator if stuck in a WAI
    (SLEEP) or STP (STOP) instruction
  - processor/spc700: fixed bug in handling of OR1 and AND1 instructions
  - processor/z80: minor code cleanup
  - sfc/dsp: revert to initializing registers to 0x00; save for
    ENDX=random(), FLG=0xe0 [Jonas Quinn]

Major testing of the SNES game library would be appreciated, now that
its CPU cores have all been revised.

We know the DSP registers read back as randomized data ... mostly, but
there are apparently internal latches, which we can't emulate with the
current DSP design. So until we know which registers have separate
internal state that actually *is* initialized, I'm going to play it safe
and not break more games.

Thanks again to Jonas Quinn for the continued research into this issue.

EDIT: that said ... `MD works if((ENDX&0x30) > 0)` is only a 3:4 chance
that the game will work. That seems pretty unlikely that the odds of it
working are that low, given hardware testing by others in the past :/ I
thought if worked if `PITCH != 0` before, which would have been way more
likely.

The two remaining CPU cores that need major cleanup efforts are the
LR35902 and ARM cores. Both are very large, complicated, annoying cores
that will probably be better off as full rewrites from scratch. I don't
think I want to delay v103 in trying to accomplish that, however.

So I think it'll be best to focus on allowing the Mega Drive core to not
lock when processors are frozen waiting on a response from other
processors during a save state operation. Then we should be good for a
new release.
2017-06-19 12:07:54 +10:00
Tim Allen 50411a17d1 Update to v102r26 release.
byuu says:

Changelog:

  - md/ym2612: initialize DAC sample to center volume [Cydrak]
  - processor/arm: add accumulate mode extra cycle to mlal [Jonas
    Quinn]
  - processor/huc6280: split off algorithms, improve naming of functions
  - processor/mos6502: split off algorithms
  - processor/spc700: major revamp of entire core (~50% completed)
  - processor/wdc65816: fixed several bugs introduced by rewrite

For the SPC700, this turns out to be very old code as well, with global
object state variables, those annoying `{Boolean,Natural}BitField` types,
`under_case` naming conventions, heavily abbreviated function names, etc.
I'm working to get the code to be in the same design as the MOS6502,
HuC6280, WDC65816 cores, since they're all extremely similar in terms of
architectural design (the SPC700 is more of an off-label
reimplementation of a 6502 core, but still.)

The main thing left is that about 90% of the actual instructions still
need to be adapted to not use the internal state (`aa`, `rd`, `dp`,
`sp`, `bit` variables.) I wanted to finish this today, but ran out of
time before work.

I wouldn't suggest too much testing just yet. We should wait until the
SPC700 core is finished for that. However, if some does want to and
spots regressions, please let me know.
2017-06-16 10:06:17 +10:00
Tim Allen bdc100e123 Update to v102r02 release.
byuu says:

Changelog:

  - I caved on the `samples[] = {0.0}` thing, but I'm very unhappy about it
      - if it's really invalid C++, then GCC needs to stop accepting it
        in strict `-std=c++14` mode
  - Emulator::Interface::Information::resettable is gone
  - Emulator::Interface::reset() is gone
  - FC, SFC, MD cores updated to remove soft reset behavior
  - split GameBoy::Interface into GameBoyInterface,
    GameBoyColorInterface
  - split WonderSwan::Interface into WonderSwanInterface,
    WonderSwanColorInterface
  - PCE: fixed off-by-one scanline error [hex_usr]
  - PCE: temporary hack to prevent crashing when VDS is set to < 2
  - hiro: Cocoa: removed (u)int(#) constants; converted (u)int(#)
    types to (u)int_(#)t types
  - icarus: replaced usage of unique with strip instead (so we don't
    mess up frameworks on macOS)
  - libco: added macOS-specific section marker [Ryphecha]

So ... the major news this time is the removal of the soft reset
behavior. This is a major!! change that results in a 100KiB diff file,
and it's very prone to accidental mistakes!! If anyone is up for
testing, or even better -- looking over the code changes between v102r01
and v102r02 and looking for any issues, please do so. Ideally we'll want
to test every NES mapper type and every SNES coprocessor type by loading
said games and power cycling to make sure the games are all cleanly
resetting. It's too big of a change for me to cover there not being any
issues on my own, but this is truly critical code, so yeah ... please
help if you can.

We technically lose a bit of hardware documentation here. The soft reset
events do all kinds of interesting things in all kinds of different
chips -- or at least they do on the SNES. This is obviously not ideal.
But in the process of removing these portions of code, I found a few
mistakes I had made previously. It simplifies resetting the system state
a lot when not trying to have all the power() functions call the reset()
functions to share partial functionality.

In the future, the goal will be to come up with a way to add back in the
soft reset behavior via keyboard binding as with the Master System core.
What's going to have to happen is that the key binding will have to send
a "reset pulse" to every emulated chip, and those chips are going to
have to act independently to power() instead of reusing functionality.
We'll get there eventually, but there's many things of vastly greater
importance to work on right now, so it'll be a while. The information
isn't lost ... we'll just have to pull it out of v102 when we are ready.

Note that I left the SNES reset vector simulation code in, even though
it's not possible to trigger, for the time being.

Also ... the Super Game Boy core is still disconnected. To be honest, it
totally slipped my mind when I released v102 that it wasn't connected
again yet. This one's going to be pretty tricky to be honest. I'm
thinking about making a third GameBoy::Interface class just for SGB, and
coming up with some way of bypassing platform-> calls when in this
mode.
2017-01-23 08:04:26 +11:00
Tim Allen bf90bdfcc8 Update to v101r31 release.
byuu says:

Changelog:

  - converted Emulator::Interface::Bind to Emulator::Platform
  - temporarily disabled SGB hooks
  - SMS: emulated Game Gear palette (latching word-write behavior not
    implemented yet)
  - SMS: emulated Master System 'Reset' button, Game Gear 'Start' button
  - SMS: removed reset() functionality, driven by the mappable input now
    instead
  - SMS: split interface class in two: one for Master System, one for
    Game Gear
  - SMS: emulated Game Gear video cropping to 160x144
  - PCE: started on HuC6280 CPU core—so far only registers, NOP
    instruction has been implemented

Errata:

  - Super Game Boy support is broken and thus disabled
  - if you switch between Master System and Game Gear without
    restarting, bad things happen:
      - SMS→GG, no video output on the GG
      - GG→SMS, no input on the SMS

I'm not sure what's causing the SMS\<-\>GG switch bug, having a hard
time debugging it. Help would be very much appreciated, if anyone's up
for it. Otherwise I'll keep trying to track it down on my end.
2017-01-13 12:15:45 +11:00
Tim Allen 79c83ade70 Update to v101r29 release.
byuu says:

Changelog:

  - SMS: background VDP clips partial tiles on the left (math may not be
    right ... it's hard to reason about)
  - SMS: fix background VDP scroll locks
  - SMS: fix VDP sprite coordinates
  - SMS: paint black after the end of the visible display
      - todo: shouldn't be a brute force at the end of the main VDP
        loop, should happen in each rendering unit
  - higan: removed emulator/debugger.hpp
  - higan: removed privileged: access specifier
  - SFC: removed debugger hooks
      - todo: remove sfc/debugger.hpp
  - Z80: fixed disassembly of (fd,dd) cb (displacement) (opcode)
    instructions
  - Z80: fix to prevent interrupts from firing between ix/iy prefixes
    and opcodes
      - todo: this is a rather hacky fix that could, if exploited, crash
        the stack frame
  - Z80: fix BIT flags
  - Z80: fix ADD hl,reg flags
  - Z80: fix CPD, CPI flags
  - Z80: fix IND, INI flags
  - Z80: fix INDR, INIT loop flag check
  - Z80: fix OUTD, OUTI flags
  - Z80: fix OTDR, OTIR loop flag check
2017-01-10 08:27:13 +11:00
Tim Allen 8bdf8f2a55 Update to v101r01 release.
byuu says:

Changelog:

  - added eight more 68K instructions
  - split ADD(direction) into two separate ADD functions

I now have 54 out of 88 instructions implemented (thus, 34 remaining.)
The map is missing 25,182 entries out of 65,536. Down from 32,680 for
v101.00

Aside: this version number feels really silly. r10 and r11 surely will
as well ...
2016-08-08 20:12:03 +10:00
Tim Allen c50723ef61 Update to v100r15 release.
byuu wrote:

Aforementioned scheduler changes added. Longer explanation of why here:
http://hastebin.com/raw/toxedenece

Again, we really need to test this as thoroughly as possible for
regressions :/
This is a really major change that affects absolutely everything: all
emulation cores, all coprocessors, etc.

Also added ADDX and SUB to the 68K core, which brings us just barely
above 50% of the instruction encoding space completed.

[Editor's note: The "aformentioned scheduler changes" were described in
a previous forum post:

    Unfortunately, 64-bits just wasn't enough precision (we were
    getting misalignments ~230 times a second on 21/24MHz clocks), so
    I had to move to 128-bit counters. This of course doesn't exist on
    32-bit architectures (and probably not on all 64-bit ones either),
    so for now ... higan's only going to compile on 64-bit machines
    until we figure something out. Maybe we offer a "lower precision"
    fallback for machines that lack uint128_t or something. Using the
    booth algorithm would be way too slow.

    Anyway, the precision is now 2^-96, which is roughly 10^-29. That
    puts us far beyond the yoctosecond. Suck it, MAME :P I'm jokingly
    referring to it as the byuusecond. The other 32-bits of precision
    allows a 1Hz clock to run up to one full second before all clocks
    need to be normalized to prevent overflow.

    I fixed a serious wobbling issue where I was using clock > other.clock
    for synchronization instead of clock >= other.clock; and also another
    aliasing issue when two threads share a common frequency, but don't
    run in lock-step. The latter I don't even fully understand, but I
    did observe it in testing.

    nall/serialization.hpp has been extended to support 128-bit integers,
    but without explicitly naming them (yay generic code), so nall will
    still compile on 32-bit platforms for all other applications.

    Speed is basically a wash now. FC's a bit slower, SFC's a bit faster.

The "longer explanation" in the linked hastebin is:

    Okay, so the idea is that we can have an arbitrary number of
    oscillators. Take the SNES:

    - CPU/PPU clock = 21477272.727272hz
    - SMP/DSP clock = 24576000hz
    - Cartridge DSP1 clock = 8000000hz
    - Cartridge MSU1 clock = 44100hz
    - Controller Port 1 modem controller clock = 57600hz
    - Controller Port 2 barcode battler clock = 115200hz
    - Expansion Port exercise bike clock = 192000hz

    Is this a pathological case? Of course it is, but it's possible. The
    first four do exist in the wild already: see Rockman X2 MSU1
    patch. Manifest files with higan let you specify any frequency you
    want for any component.

    The old trick higan used was to hold an int64 counter for each
    thread:thread synchronization, and adjust it like so:

    - if thread A steps X clocks; then clock += X * threadB.frequency
      - if clock >= 0; switch to threadB
    - if thread B steps X clocks; then clock -= X * threadA.frequency
      - if clock <  0; switch to threadA

    But there are also system configurations where one processor has to
    synchronize with more than one other processor. Take the Genesis:

    - the 68K has to sync with the Z80 and PSG and YM2612 and VDP
    - the Z80 has to sync with the 68K and PSG and YM2612
    - the PSG has to sync with the 68K and Z80 and YM2612

    Now I could do this by having an int64 clock value for every
    association. But these clock values would have to be outside the
    individual Thread class objects, and we would have to update every
    relationship's clock value. So the 68K would have to update the Z80,
    PSG, YM2612 and VDP clocks. That's four expensive 64-bit multiply-adds
    per clock step event instead of one.

    As such, we have to account for both possibilities. The only way to
    do this is with a single time base. We do this like so:

    - setup: scalar = timeBase / frequency
    - step: clock += scalar * clocks

    Once per second, we look at every thread, find the smallest clock
    value. Then subtract that value from all threads. This prevents the
    clock counters from overflowing.

    Unfortunately, these oscillator values are psychotic, unpredictable,
    and often times repeating fractions. Even with a timeBase of
    1,000,000,000,000,000,000 (one attosecond); we get rounding errors
    every ~16,300 synchronizations. Specifically, this happens with a CPU
    running at 21477273hz (rounded) and SMP running at 24576000hz. That
    may be good enough for most emulators, but ... you know how I am.

    Plus, even at the attosecond level, we're really pushing against the
    limits of 64-bit integers. Given the reciprocal inverse, a frequency
    of 1Hz (which does exist in higan!) would have a scalar that consumes
    1/18th of the entire range of a uint64 on every single step. Yes, I
    could raise the frequency, and then step by that amount, I know. But
    I don't want to have weird gotchas like that in the scheduler core.

    Until I increase the accuracy to about 100 times greater than a
    yoctosecond, the rounding errors are too great. And since the only
    choice above 64-bit values is 128-bit values; we might as well use
    all the extra headroom. 2^-96 as a timebase gives me the ability to
    have both a 1Hz and 4GHz clock; and run them both for a full second;
    before an overflow event would occur.

Another hastebin includes demonstration code:

    #include <libco/libco.h>

    #include <nall/nall.hpp>
    using namespace nall;

    //

    cothread_t mainThread = nullptr;
    const uint iterations = 100'000'000;
    const uint cpuFreq = 21477272.727272 + 0.5;
    const uint smpFreq = 24576000.000000 + 0.5;
    const uint cpuStep = 4;
    const uint smpStep = 5;

    //

    struct ThreadA {
      cothread_t handle = nullptr;
      uint64 frequency = 0;
      int64 clock = 0;

      auto create(auto (*entrypoint)() -> void, uint frequency) {
        this->handle = co_create(65536, entrypoint);
        this->frequency = frequency;
        this->clock = 0;
      }
    };

    struct CPUA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      CPUA() { create(&CPUA::Enter, cpuFreq); }
    } cpuA;

    struct SMPA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      SMPA() { create(&SMPA::Enter, smpFreq); }
    } smpA;

    uint8 queueA[iterations];
    uint offsetA;
    cothread_t resumeA = cpuA.handle;

    auto EnterA() -> void {
      offsetA = 0;
      co_switch(resumeA);
    }

    auto QueueA(uint value) -> void {
      queueA[offsetA++] = value;
      if(offsetA >= iterations) {
        resumeA = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUA::Enter() -> void { while(true) cpuA.main(); }

    auto CPUA::main() -> void {
      QueueA(1);
      smpA.clock -= cpuStep * smpA.frequency;
      if(smpA.clock < 0) co_switch(smpA.handle);
    }

    auto SMPA::Enter() -> void { while(true) smpA.main(); }

    auto SMPA::main() -> void {
      QueueA(2);
      smpA.clock += smpStep * cpuA.frequency;
      if(smpA.clock >= 0) co_switch(cpuA.handle);
    }

    //

    struct ThreadB {
      cothread_t handle = nullptr;
      uint128_t scalar = 0;
      uint128_t clock = 0;

      auto print128(uint128_t value) {
        string s;
        while(value) {
          s.append((char)('0' + value % 10));
          value /= 10;
        }
        s.reverse();
        print(s, "\n");
      }

      //femtosecond (10^15) =    16306
      //attosecond  (10^18) =   688838
      //zeptosecond (10^21) = 13712691
      //yoctosecond (10^24) = 13712691 (hitting a dead-end on a rounding error causing a wobble)
      //byuusecond? ( 2^96) = (perfect? 79,228 times more precise than a yoctosecond)

      auto create(auto (*entrypoint)() -> void, uint128_t frequency) {
        this->handle = co_create(65536, entrypoint);

        uint128_t unitOfTime = 1;
      //for(uint n : range(29)) unitOfTime *= 10;
        unitOfTime <<= 96;  //2^96 time units ...

        this->scalar = unitOfTime / frequency;
        print128(this->scalar);
        this->clock = 0;
      }

      auto step(uint128_t clocks) -> void { clock += clocks * scalar; }
      auto synchronize(ThreadB& thread) -> void { if(clock >= thread.clock) co_switch(thread.handle); }
    };

    struct CPUB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      CPUB() { create(&CPUB::Enter, cpuFreq); }
    } cpuB;

    struct SMPB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      SMPB() { create(&SMPB::Enter, smpFreq); clock = 1; }
    } smpB;

    auto correct() -> void {
      auto minimum = min(cpuB.clock, smpB.clock);
      cpuB.clock -= minimum;
      smpB.clock -= minimum;
    }

    uint8 queueB[iterations];
    uint offsetB;
    cothread_t resumeB = cpuB.handle;

    auto EnterB() -> void {
      correct();
      offsetB = 0;
      co_switch(resumeB);
    }

    auto QueueB(uint value) -> void {
      queueB[offsetB++] = value;
      if(offsetB >= iterations) {
        resumeB = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUB::Enter() -> void { while(true) cpuB.main(); }

    auto CPUB::main() -> void {
      QueueB(1);
      step(cpuStep);
      synchronize(smpB);
    }

    auto SMPB::Enter() -> void { while(true) smpB.main(); }

    auto SMPB::main() -> void {
      QueueB(2);
      step(smpStep);
      synchronize(cpuB);
    }

    //

    #include <nall/main.hpp>
    auto nall::main(string_vector) -> void {
      mainThread = co_active();

      uint masterCounter = 0;
      while(true) {
        print(masterCounter++, " ...\n");

        auto A = clock();
        EnterA();
        auto B = clock();
        print((double)(B - A) / CLOCKS_PER_SEC, "s\n");

        auto C = clock();
        EnterB();
        auto D = clock();
        print((double)(D - C) / CLOCKS_PER_SEC, "s\n");

        for(uint n : range(iterations)) {
          if(queueA[n] != queueB[n]) return print("fail at ", n, "\n");
        }
      }
    }

...and that's everything.]
2016-07-31 12:11:20 +10:00
Tim Allen ca277cd5e8 Update to v100r14 release.
byuu says:

(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)

I completely replaced the time management system in higan and overhauled
the scheduler.

Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.

The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.

The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:

    auto SMP::step(uint clocks) -> void {
      clock += clocks * (uint64)cpu.frequency;
      dsp.clock -= clocks;
      if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
      if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
    }

To this:

    auto SMP::step(uint clocks) -> void {
      Thread::step(clocks);
      synchronize(dsp);
      synchronize(cpu);
    }

As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.

Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.

Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.

Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.

----

Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.

The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.

----

EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 13:56:12 +10:00
Tim Allen 76a8ecd32a Update to v100r03 release.
byuu says:

Changelog:
- moved Thread, Scheduler, Cheat functionality into emulator/ for
  all cores
- start of actual Mega Drive emulation (two 68K instructions)

I'm going to be rather terse on MD emulation, as it's too early for any
meaningful dialogue here.
2016-07-10 15:28:26 +10:00
Tim Allen 88c79e56a0 Update to v100r01 release.
[This version, with the internal version number changed back to "v100",
replaced the original v100 source archive on byuu.org soon after v100's
release, because it fixes important bugs in that version. --Ed]

byuu says:

Changelog:
- fixed default paths for Sufami Turbo slotted games
- moved WonderSwan orientation controls to the port rather than the device
  - I do like hex_usr's idea here; but that'll need more consideration;
    so this is a temporary fix
- added new debugger interface (see the public topic for more on that)
2016-07-08 22:31:35 +10:00
Tim Allen 82293c95ae Update to v099r14 release.
byuu says:

Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
  like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
  - toInteger, toNatural, toReal for parsing strings to numbers
  - fromInteger, fromNatural, fromReal for creating strings from numbers
  - (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
    left unchanged
  - template<typename T> numeral(T value, long padding, char padchar)
    -> string for print() formatting
    - deduces integer,natural,real based on T ... cast the value if you
      want to override
    - there still exists binary,octal,hex,pointer for explicit print()
      formatting
- lstring -> string_vector [but using lstring = string_vector; is
  declared]
  - would be nice to remove the using lstring eventually ... but that'd
    probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
  Natural<sizeof(uint)*8>; declared
  - for consistency with boolean. These three are meant for creating
    zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
  up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
  (status,registers); now
  - still some CPU::Status status values ... they didn't really fit into
    IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
  calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
  - the raw uint8[544] array is gone. OAM::read() constructs values from
    the OAM::Object[512] table now
  - this avoids having to determine how we want to sub-divide the two
    OAM memory sections
  - this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting

The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 21:50:32 +10:00
Tim Allen 7a68059f78 Update to v099r12 release.
byuu says:

Changelog:
- fixed FC AxROM / VRC7 regression
- BitField split to BooleanBitField/NaturalBitField (in preparation
  for IntegerBitField)
- BitFieldReference removed
- GB CPU cleaned up
- GB Cartridge + Mappers cleaned up
- SFC CGRAM is now emulated as uint15[256] instead of uint[512]
- sfc/ppu/memory.cpp no longer needed; removed
- purged SFC Debugger hooks for now (some of the operator[] calls were
  bypassing them anyway)

Unfortunately, for reasons that defy all semblance of logic, the CGRAM
change caused a slight speed hit. As have the last few changes. We're
now down to around 129.5fps compared to 123.fps for v099 and 134.5fps
at our peak (v099r01-r02).

I really like the style I came up with for the Game Boy mappers to settle
the purpose(ROM,RAM) vs (rom,ram)Purpose naming convention. If I ever get
around to redoing the NES mappers, that's likely the approach I'll take.
2016-06-28 20:43:47 +10:00
Tim Allen 3a9c7c6843 Update to v099r09 release.
byuu says:

Changelog:
- Emulator::Interface::Medium::bootable removed
- Emulator::Interface::load(bool required) argument removed
  [File::Required makes no sense on a folder]
- Super Famicom.sys now has user-configurable properties (CPU,PPU1,PPU2
  version; PPU1 VRAM size, Region override)
- old nall/property removed completely
- volatile flags supported on coprocessor RAM files now (still not in
  icarus, though)
- (hopefully) fixed SNES Multitap support (needs testing)
- fixed an OAM tiledata range clipping limit in 128KiB VRAM mode (doesn't
  fix Yoshi's Island, sadly)
- (hopefully, again) fixed the input polling bug hex_usr reported
- re-added dialog box for when File::Required files are missing
  - really cool: if you're missing a boot ROM, BIOS ROM, or IPL ROM,
    it warns you immediately
  - you don't have to select a game before seeing the error message
    anymore
- fixed cheats.bml load/save location
2016-06-25 18:53:11 +10:00
Tim Allen 44a8c5a2b4 Update to v099r03 release.
byuu says:

Changelog:
- finished cleaning up the SFC core to my new coding conventions
- removed sfc/controller/usart (superseded by 21fx)
- hid Synchronize Video option from the menu (still in the configuration
  file)

Pretty much the only minor detail left is some variable names in the
SA-1 core that really won't look good at all if I move to camelCase,
so I'll have to rethink how I handle those. It's probably a good area
to attempt using BitFields, to see how it impacts performance. But I'll
do that in a test branch first.

But for the most part, this should be the end of the gigantic diffs (this
one was 174KiB), at least for the SFC/WS cores. Still have the FC/GB/GBA
cores to clean up more fully. Assuming we don't spot any new regressions,
we should be ~95% out of the woods on code cleanups breaking things.
2016-06-17 23:03:54 +10:00
Tim Allen 20ac95ee49 Update to v098r15 release.
byuu says:

Changelog:
- removed template usage from processor/spc700; cleaned up many function
  names and the switch table
  - object size: 176.8kb => 127.3kb
  - source code size: 43.5kb => 37.0kb
- fixed processor/r65816 BRK/COP vector regression [hex_usr]
- corrected HuC3 unmapped RAM read value; fixes Robopon [endrift]
- cosmetic: simplified the butterworth constant calculation
  [Wolfram|Alpha]

The SPC700 core changes took forever, about three hours of work.

Only the LR35902 and R6502 still need their template functions
removed. The point of this is that it doesn't cause any speed penalty
to do so, and it results in smaller binary sizes and faster compilation
times.
2016-06-05 14:52:43 +10:00
Tim Allen 19e1d89f00 Update to v098r01 release.
byuu says:

Changelog:
- SFC: balanced profile removed
- SFC: performance profile removed
- SFC: code for handling non-threaded CPU, SMP, DSP, PPU removed
- SFC: Coprocessor, Controller (and expansion port) shared Thread code
  merged to SFC::Cothread
  - Cothread here just means "Thread with CPU affinity" (couldn't think
    of a better name, sorry)
- SFC: CPU now has vector<Thread*> coprocessors, peripherals;
  - this is the beginning of work to allow expansion port devices to be
    dynamically changed at run-time
- ruby: all audio drivers default to 48000hz instead of 22050hz now if
  no frequency is assigned
  - note: the WASAPI driver can default to whatever the native frequency
    is; doesn't have to be 48000hz
- tomoko: removed the ability to change the frequency from the UI (but
  it will display the frequency used)
- tomoko: removed the timing settings panel
  - the goal is to work toward smooth video via adaptive sync
  - the model is broken by not being in control of the audio frequency
    anyway
  - it's further broken by PAL running at 50hz and WSC running at 75hz
  - it was always broken anyway by SNES interlace timing varying from
    progressive timing
- higan: audio/ stub created (for now, it's just nall/dsp/ moved here
  and included as a header)
- higan: video/ stub created
- higan/GNUmakefile: now includes build rules for essential components
  (libco, emulator, audio, video)

The audio changes are in preparation to merge wareya's awesome WASAPI
work without the need for the nall/dsp resampler.
2016-04-09 13:40:12 +10:00
Tim Allen ef65bb862a Update to 20160215 release.
byuu says:

Got it. Wow, that didn't hurt nearly as much as I thought it was going
to.

Dropped from 127.5fps to 123.5fps to use Natural/Integer for
(u)int(8,16,32,64).

That's totally worth the cost.
2016-02-16 20:27:55 +11:00
Tim Allen 6c83329cae Update to v097r13 release.
byuu says:

I refactored my schedulers. Added about ten lines to each scheduler, and
removed about 100 lines of calling into internal state in the scheduler
for the FC,SFC cores and about 30-40 lines for the other cores. All of
its state is now private.

Also reworked all of the entry points to static auto Enter() and auto
main(). Where Enter() handles all the synchronization stuff, and main()
doesn't need the while(true); loop forcing another layer of indentation
everywhere.

Took a few hours to do, but totally worth it. I'm surprised I didn't do
this sooner.

Also updated icarus gmake install rule to copy over the database.
2016-02-09 22:51:12 +11:00
Tim Allen 47d4bd4d81 Update to v096r01 release.
byuu says:

Changelog:

- restructured the project and removed a whole bunch of old/dead
  directives from higan/GNUmakefile
- huge amounts of work on hiro/cocoa (compiles but ~70% of the
  functionality is commented out)
- fixed a masking error in my ARM CPU disassembler [Lioncash]
- SFC: decided to change board cic=(411,413) back to board
  region=(ntsc,pal) ... the former was too obtuse

If you rename Boolean (it's a problem with an include from ruby, not
from hiro) and disable all the ruby drivers, you can compile an
OS X binary, but obviously it's not going to do anything.

It's a boring WIP, I just wanted to push out the project structure
change now at the start of this WIP cycle.
2015-12-30 17:54:59 +11:00
Tim Allen 4e2eb23835 Update to v093 release.
byuu says:

Changelog:
- added Cocoa target: higan can now be compiled for OS X Lion
  [Cydrak, byuu]
- SNES/accuracy profile hires color blending improvements - fixes
  Marvelous text [AWJ]
- fixed a slight bug in SNES/SA-1 VBR support caused by a typo
- added support for multi-pass shaders that can load external textures
  (requires OpenGL 3.2+)
- added game library path (used by ananke->Import Game) to
  Settings->Advanced
- system profiles, shaders and cheats database can be stored in "all
  users" shared folders now (eg /usr/share on Linux)
- all configuration files are in BML format now, instead of XML (much
  easier to read and edit this way)
- main window supports drag-and-drop of game folders (but not game files
  / ZIP archives)
- audio buffer clears when entering a modal loop on Windows (prevents
  audio repetition with DirectSound driver)
- a substantial amount of code clean-up (probably the biggest
  refactoring to date)

One highly desired target for this release was to default to the optimal
drivers instead of the safest drivers, but because AMD drivers don't
seem to like my OpenGL 3.2 driver, I've decided to postpone that. AMD
has too big a market share. Hopefully with v093 officially released, we
can get some public input on what AMD doesn't like.
2013-08-18 13:21:14 +10:00
Tim Allen 29ea5bd599 Update to v092r09 release.
byuu says:

This will be another massive diff from the previous version.

All of higan was updated to use the new foo& bar syntax, and I also
updated switch statements to be consistent as well (but not in the
disassemblers, was starting to get an RSI just from what I already did.)

phoenix/{windows, cocoa, qt} need to be updated to use "string foo"
instead of "const string& foo", and after that, the major diffs should
be finished.

This archive is the first time I'm posting my copy-on-write,
size+capacity nall::string class, so any feedback on that is welcome as
well.
2013-05-05 19:21:30 +10:00
Tim Allen 94b2538af5 Update to higan v091 release.
byuu says:

Basically just a project rename, with s/bsnes/higan and the new icon
from lowkee added in.

It won't compile on Windows because I forgot to update the resource.rc
file, and a path transform command isn't working on Windows.
It was really just meant as a starting point, so that v091 WIPs can flow
starting from .00 with the new name (it overshadows bsnes v091, so
publicly speaking this "shouldn't exist" and will probably be deleted
from Google Code when v092 is ready.)
2012-12-26 17:46:36 +11:00