Commit Graph

16 Commits

Author SHA1 Message Date
Tim Allen c50723ef61 Update to v100r15 release.
byuu wrote:

Aforementioned scheduler changes added. Longer explanation of why here:
http://hastebin.com/raw/toxedenece

Again, we really need to test this as thoroughly as possible for
regressions :/
This is a really major change that affects absolutely everything: all
emulation cores, all coprocessors, etc.

Also added ADDX and SUB to the 68K core, which brings us just barely
above 50% of the instruction encoding space completed.

[Editor's note: The "aformentioned scheduler changes" were described in
a previous forum post:

    Unfortunately, 64-bits just wasn't enough precision (we were
    getting misalignments ~230 times a second on 21/24MHz clocks), so
    I had to move to 128-bit counters. This of course doesn't exist on
    32-bit architectures (and probably not on all 64-bit ones either),
    so for now ... higan's only going to compile on 64-bit machines
    until we figure something out. Maybe we offer a "lower precision"
    fallback for machines that lack uint128_t or something. Using the
    booth algorithm would be way too slow.

    Anyway, the precision is now 2^-96, which is roughly 10^-29. That
    puts us far beyond the yoctosecond. Suck it, MAME :P I'm jokingly
    referring to it as the byuusecond. The other 32-bits of precision
    allows a 1Hz clock to run up to one full second before all clocks
    need to be normalized to prevent overflow.

    I fixed a serious wobbling issue where I was using clock > other.clock
    for synchronization instead of clock >= other.clock; and also another
    aliasing issue when two threads share a common frequency, but don't
    run in lock-step. The latter I don't even fully understand, but I
    did observe it in testing.

    nall/serialization.hpp has been extended to support 128-bit integers,
    but without explicitly naming them (yay generic code), so nall will
    still compile on 32-bit platforms for all other applications.

    Speed is basically a wash now. FC's a bit slower, SFC's a bit faster.

The "longer explanation" in the linked hastebin is:

    Okay, so the idea is that we can have an arbitrary number of
    oscillators. Take the SNES:

    - CPU/PPU clock = 21477272.727272hz
    - SMP/DSP clock = 24576000hz
    - Cartridge DSP1 clock = 8000000hz
    - Cartridge MSU1 clock = 44100hz
    - Controller Port 1 modem controller clock = 57600hz
    - Controller Port 2 barcode battler clock = 115200hz
    - Expansion Port exercise bike clock = 192000hz

    Is this a pathological case? Of course it is, but it's possible. The
    first four do exist in the wild already: see Rockman X2 MSU1
    patch. Manifest files with higan let you specify any frequency you
    want for any component.

    The old trick higan used was to hold an int64 counter for each
    thread:thread synchronization, and adjust it like so:

    - if thread A steps X clocks; then clock += X * threadB.frequency
      - if clock >= 0; switch to threadB
    - if thread B steps X clocks; then clock -= X * threadA.frequency
      - if clock <  0; switch to threadA

    But there are also system configurations where one processor has to
    synchronize with more than one other processor. Take the Genesis:

    - the 68K has to sync with the Z80 and PSG and YM2612 and VDP
    - the Z80 has to sync with the 68K and PSG and YM2612
    - the PSG has to sync with the 68K and Z80 and YM2612

    Now I could do this by having an int64 clock value for every
    association. But these clock values would have to be outside the
    individual Thread class objects, and we would have to update every
    relationship's clock value. So the 68K would have to update the Z80,
    PSG, YM2612 and VDP clocks. That's four expensive 64-bit multiply-adds
    per clock step event instead of one.

    As such, we have to account for both possibilities. The only way to
    do this is with a single time base. We do this like so:

    - setup: scalar = timeBase / frequency
    - step: clock += scalar * clocks

    Once per second, we look at every thread, find the smallest clock
    value. Then subtract that value from all threads. This prevents the
    clock counters from overflowing.

    Unfortunately, these oscillator values are psychotic, unpredictable,
    and often times repeating fractions. Even with a timeBase of
    1,000,000,000,000,000,000 (one attosecond); we get rounding errors
    every ~16,300 synchronizations. Specifically, this happens with a CPU
    running at 21477273hz (rounded) and SMP running at 24576000hz. That
    may be good enough for most emulators, but ... you know how I am.

    Plus, even at the attosecond level, we're really pushing against the
    limits of 64-bit integers. Given the reciprocal inverse, a frequency
    of 1Hz (which does exist in higan!) would have a scalar that consumes
    1/18th of the entire range of a uint64 on every single step. Yes, I
    could raise the frequency, and then step by that amount, I know. But
    I don't want to have weird gotchas like that in the scheduler core.

    Until I increase the accuracy to about 100 times greater than a
    yoctosecond, the rounding errors are too great. And since the only
    choice above 64-bit values is 128-bit values; we might as well use
    all the extra headroom. 2^-96 as a timebase gives me the ability to
    have both a 1Hz and 4GHz clock; and run them both for a full second;
    before an overflow event would occur.

Another hastebin includes demonstration code:

    #include <libco/libco.h>

    #include <nall/nall.hpp>
    using namespace nall;

    //

    cothread_t mainThread = nullptr;
    const uint iterations = 100'000'000;
    const uint cpuFreq = 21477272.727272 + 0.5;
    const uint smpFreq = 24576000.000000 + 0.5;
    const uint cpuStep = 4;
    const uint smpStep = 5;

    //

    struct ThreadA {
      cothread_t handle = nullptr;
      uint64 frequency = 0;
      int64 clock = 0;

      auto create(auto (*entrypoint)() -> void, uint frequency) {
        this->handle = co_create(65536, entrypoint);
        this->frequency = frequency;
        this->clock = 0;
      }
    };

    struct CPUA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      CPUA() { create(&CPUA::Enter, cpuFreq); }
    } cpuA;

    struct SMPA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      SMPA() { create(&SMPA::Enter, smpFreq); }
    } smpA;

    uint8 queueA[iterations];
    uint offsetA;
    cothread_t resumeA = cpuA.handle;

    auto EnterA() -> void {
      offsetA = 0;
      co_switch(resumeA);
    }

    auto QueueA(uint value) -> void {
      queueA[offsetA++] = value;
      if(offsetA >= iterations) {
        resumeA = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUA::Enter() -> void { while(true) cpuA.main(); }

    auto CPUA::main() -> void {
      QueueA(1);
      smpA.clock -= cpuStep * smpA.frequency;
      if(smpA.clock < 0) co_switch(smpA.handle);
    }

    auto SMPA::Enter() -> void { while(true) smpA.main(); }

    auto SMPA::main() -> void {
      QueueA(2);
      smpA.clock += smpStep * cpuA.frequency;
      if(smpA.clock >= 0) co_switch(cpuA.handle);
    }

    //

    struct ThreadB {
      cothread_t handle = nullptr;
      uint128_t scalar = 0;
      uint128_t clock = 0;

      auto print128(uint128_t value) {
        string s;
        while(value) {
          s.append((char)('0' + value % 10));
          value /= 10;
        }
        s.reverse();
        print(s, "\n");
      }

      //femtosecond (10^15) =    16306
      //attosecond  (10^18) =   688838
      //zeptosecond (10^21) = 13712691
      //yoctosecond (10^24) = 13712691 (hitting a dead-end on a rounding error causing a wobble)
      //byuusecond? ( 2^96) = (perfect? 79,228 times more precise than a yoctosecond)

      auto create(auto (*entrypoint)() -> void, uint128_t frequency) {
        this->handle = co_create(65536, entrypoint);

        uint128_t unitOfTime = 1;
      //for(uint n : range(29)) unitOfTime *= 10;
        unitOfTime <<= 96;  //2^96 time units ...

        this->scalar = unitOfTime / frequency;
        print128(this->scalar);
        this->clock = 0;
      }

      auto step(uint128_t clocks) -> void { clock += clocks * scalar; }
      auto synchronize(ThreadB& thread) -> void { if(clock >= thread.clock) co_switch(thread.handle); }
    };

    struct CPUB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      CPUB() { create(&CPUB::Enter, cpuFreq); }
    } cpuB;

    struct SMPB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      SMPB() { create(&SMPB::Enter, smpFreq); clock = 1; }
    } smpB;

    auto correct() -> void {
      auto minimum = min(cpuB.clock, smpB.clock);
      cpuB.clock -= minimum;
      smpB.clock -= minimum;
    }

    uint8 queueB[iterations];
    uint offsetB;
    cothread_t resumeB = cpuB.handle;

    auto EnterB() -> void {
      correct();
      offsetB = 0;
      co_switch(resumeB);
    }

    auto QueueB(uint value) -> void {
      queueB[offsetB++] = value;
      if(offsetB >= iterations) {
        resumeB = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUB::Enter() -> void { while(true) cpuB.main(); }

    auto CPUB::main() -> void {
      QueueB(1);
      step(cpuStep);
      synchronize(smpB);
    }

    auto SMPB::Enter() -> void { while(true) smpB.main(); }

    auto SMPB::main() -> void {
      QueueB(2);
      step(smpStep);
      synchronize(cpuB);
    }

    //

    #include <nall/main.hpp>
    auto nall::main(string_vector) -> void {
      mainThread = co_active();

      uint masterCounter = 0;
      while(true) {
        print(masterCounter++, " ...\n");

        auto A = clock();
        EnterA();
        auto B = clock();
        print((double)(B - A) / CLOCKS_PER_SEC, "s\n");

        auto C = clock();
        EnterB();
        auto D = clock();
        print((double)(D - C) / CLOCKS_PER_SEC, "s\n");

        for(uint n : range(iterations)) {
          if(queueA[n] != queueB[n]) return print("fail at ", n, "\n");
        }
      }
    }

...and that's everything.]
2016-07-31 12:11:20 +10:00
Tim Allen ca277cd5e8 Update to v100r14 release.
byuu says:

(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)

I completely replaced the time management system in higan and overhauled
the scheduler.

Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.

The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.

The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:

    auto SMP::step(uint clocks) -> void {
      clock += clocks * (uint64)cpu.frequency;
      dsp.clock -= clocks;
      if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
      if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
    }

To this:

    auto SMP::step(uint clocks) -> void {
      Thread::step(clocks);
      synchronize(dsp);
      synchronize(cpu);
    }

As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.

Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.

Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.

Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.

----

Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.

The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.

----

EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 13:56:12 +10:00
Tim Allen 76a8ecd32a Update to v100r03 release.
byuu says:

Changelog:
- moved Thread, Scheduler, Cheat functionality into emulator/ for
  all cores
- start of actual Mega Drive emulation (two 68K instructions)

I'm going to be rather terse on MD emulation, as it's too early for any
meaningful dialogue here.
2016-07-10 15:28:26 +10:00
Tim Allen f48b332c83 Update to v099r08 release.
byuu says:

Changelog:
- nall/vfs work 100% completed; even SGB games load now
- emulation cores now call load() for the base cartridges as well
- updated port/device handling; portmask is gone; device ID bug should
  be resolved now
- SNES controller port 1 multitap option was removed
- added support for 128KiB SNES PPU VRAM (for now, edit sfc/ppu/ppu.hpp
  VRAM::size=0x10000; to enable)

Overall, nall/vfs was a huge success!! We've substantially reduced
the amount of boilerplate code everywhere, while still allowing (even
easier than before) support for RAM-based game loading/saving. All of
nall/stream is dead and buried.

I am considering removing Emulator::Interface::Medium::id and/or
bootable flag. Or at least, doing something different with it. The
values for the non-bootable GB/BS/ST entries duplicate the ID that is
supposed to be unique. They are for GB/GBC and WS/WSC. Maybe I'll use
this as the hardware revision selection ID, and then gut non-bootable
options. There's really no reason for that to be there. I think at one
point I was using it to generate library tabs for non-bootable systems,
but we don't do that anymore anyway.

Emulator::Interface::load() may not need the required flag anymore ... it
doesn't really do anything right now anyway.

I have a few reasons for having the cores load the base cartridge. Most
importantly, it is going to enable a special mode for the WonderSwan /
WonderSwan Color in the future. If we ever get the IPLROMs dumped ... it's
possible to boot these systems with no games inserted to set user profile
information and such. There are also other systems that may accept being
booted without a cartridge. To reach this state, you would load a game and
then cancel the load dialog. Right now, this results in games not loading.

The second reason is this prevents nasty crashes when loading fails. So
if you're missing a required manifest, the emulator won't die a violent
death anymore. It's able to back out at any point.

The third reason is consistency: loading the base cartridge works the
same as the slot cartridges.

The fourth reason is Emulator::Interface::open(uint pathID)
values. Before, the GB, SB, GBC modes were IDs 1,2,3 respectively. This
complicated things because you had to pass the correct ID. But now
instead, Emulator::Interface::load() returns maybe<uint> that is nothing
when no game is selected, and a pathID for a valid game. And now open()
can take this ID to access this game's folder contents.

The downside, which is temporary, is that command-line loading is
currently broken. But I do intend on restoring it. In fact, I want to do
better than before and allow multi-cart booting from the command-line by
specifying the base cartridge and then slot cartridges. The idea should
be pretty simple: keep a queue of pending filenames that we fill from
the command-line and/or drag-and-drop operations on the main window,
and then empty out the queue or prompt for load dialogs from the UI
when booting a system. This also might be a bit more unorthodox compared
to the traditional emulator design of "loadGame(filename)", but ... oh
well. It's easy enough still.

The port/device changes are fun. We simplified things quite a bit. The
portmask stuff is gone entirely. While ports and devices keep IDs,
this is really just sugar-coating so UIs can use for(auto& port :
emulator->ports) and access port.id; rather than having to use for(auto
n : range(emulator->ports)) { auto& port = emulator->ports[n]; ... };
but they should otherwise generally be identical to the order they appear
in their respective ranges. Still, don't rely on that.

Input::id is gone. There was no point since we also got rid of the nasty
Input::order vector. Since I was in here, I went ahead and caved on the
pedantics and renamed Input::guid to Input::userData.

I removed the SNES controller port 1 multitap option. Basically, the only
game that uses this is N-warp Daisakusen and, no offense to d4s, it's
not really a good game anyway. It's just a quick demo to show 8-players
on the SNES. But in the UI, all it does is confuse people into wasting
time mapping a controller they're never going to use, and they're going
to wonder which port to use. If more compelling use cases for 8-players
comes about, we can reconsider this. I left all the code to support this
in place, so all you have to do is uncomment one line to enable it again.

We now have dsnes emulation! :D
If you change PPU::VRAM::size to 0x10000 (words), then you should now
have 128KiB of VRAM. Even better, it serializes the used-VRAM size,
so your save states shouldn't crash on you if you swap between the two
(though if you try this, you're nuts.)

Note that this option does break commercial software. Yoshi's Island in
particular. This game is setting A15 on some PPU register writes, but
not on others. The end result of this is things break horribly in-game.

Also, this option is causing a very tiny speed hit for obvious reasons
with the variable masking value (I'm even using size-1 for now.) Given
how niche this is, I may just leave it a compile-time constant to avoid
the overhead cost. Otherwise, if we keep the option, then it'll go into
Super Famicom.sys/manifest.bml ... I'll flesh that out in the near-future.

----

Finally, some fun for my OCD ... my monitor suddenly cut out on me
in the middle of working on this WIP, about six hours in of non-stop
work. Had to hit a bunch of ctrl+alt+fN commands (among other things)
and trying to log in headless on another TTY to do issue commands,
trying to recover the display. Finally power cycled the monitor and it
came back up. So all my typing ended up going to who knows where.

Usually this sort of thing terrifies me enough that I scrap a WIP and
start over to ensure I didn't screw anything up during the crashed screen
when hitting keys randomly.

Obviously, everything compiles and appears to work fine. And I know
it's extremely paranoid, but OCD isn't logical, so ... I'm going
to go over every line of the 100KiB r07->r08 diff looking for any
corruption/errors/whatever.

----

Review finished.

r08 diff review notes:
- fc/controller/gamepad/gamepad.cpp:
  use uint device = ID::Device::Gamepad; not id = ...;
- gb/cartridge/cartridge.hpp:
  remove redundant uint _pathID; (in Information::pathID already)
- gb/cartridge/cartridge.hpp:
  pull sha256 inside Information
- sfc/cartridge/load/cpp:
  add " - Slot (A,B)" to interface->load("Sufami Turbo"); to be more
  descriptive
- sfc/controller/gamepad/gamepad.cpp:
  use uint device = ID::Device::Gamepad; not id = ...;
- sfc/interface/interface.cpp:
  remove n variable from the Multitap device input generation loop
  (now unused)
- sfc/interface/interface.hpp:
  put struct Port above struct Device like the other classes
- ui-tomoko:
  cheats.bml is reading from/writing to mediumPaths(0) [system folder
  instead of game folder]
- ui-tomoko:
  instead of mediumPaths(1) - call emulator->metadataPathID() or something
  like that
2016-06-24 22:16:53 +10:00
Tim Allen ccd8878d75 Update to v099r07 release.
byuu says:

Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
  Emulator::Interface and ui-tomoko
  - loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
  - whenever this is bumped, all older states will break; but this makes
    bumping state versions way easier
  - also, the version string makes it a lot easier to identify
    compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]

NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.

So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.

The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.

I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.

But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!

In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 22:09:30 +10:00
Tim Allen c074c6e064 Update to v099 release.
byuu says:

Time for a new release. There are a few important emulation improvements
and a few new features; but for the most part, this release focuses on
major code refactoring, the details of which I will mostly spare you.

The major change is that, as of v099, the SNES balanced and performance
cores have been removed from higan. Basically, in addition to my five
other emulation cores, these were too much of a burden to maintain. And
they've come along as far as I was able to develop them. If you need to
use these cores, please use these two from the v098 release.

I'm very well aware that ~80% of the people using higan for SNES
emulation were using the two removed profiles. But they simply had
to go. Hopefully in the future, we can compensate for their loss by
increasing the performance of the accuracy core.

Changelog (since v098):

SFC: balanced profile removed
SFC: performance profile removed
SFC: expansion port devices can now be changed during gameplay (atlhough
    you shouldn't)
SFC: fixed bug in SharpRTC leap year calculations
SFC: emulated new research findings for the S-DD1 coprocessor
SFC: fixed CPU emulation-mode wrapping bug with pei, [dp], [dp]+y
    instructions [AWJ]
SFC: fixed Super Game Boy bug that caused the bottom tile-row to flicker
    in games
GB: added MBC1M (multi-cart) mapper; icarus can't detect these so manual
    manifests are needed for now
GB: corrected return value when HuC3 unmapped RAM is read; fixes Robopon
    [endrift]
GB: improved STAT IRQ emulation; fixes Altered Space, etc [endrift,
    gekkio]
GB: partial emulation of DMG STAT write IRQ bug; fixes Legend of Zerd,
    Road Rash, etc
nall: execute() fix, for some Linux platforms that had trouble detecting
    icarus
nall: new BitField class; which allows for simplifying flag/register
    emulation in various cores
ruby: added Windows WASAPI audio driver (experimental)
ruby: remove attempts to call glSwapIntervalEXT (fixes crashing on some
    Linux systems)
ui: timing settings panel removed
video: restored saturation, gamma, luminance settings
video: added new post-emulation sprite system; light gun cursors are
    now higher-resolution
audio: new resampler (6th-order Butterworth biquad IIR); quite a bit
    faster than the old one
audio: added optional basic reverb filter (for fun)
higan: refresh video outside cooperative threads (workaround for shoddy
    code in AMD graphics drivers)
higan: individual emulation cores no longer have unique names
higan: really substantial code refactoring; 43% reduction in binary size

Off the bat, here are the known bugs:

hiro/Windows: focus stealing bug on startup. Needs to be fixed in hiro,
    not with a cheap hack to tomoko.

higan/SFC: some of the coprocessors are saving some volatile memory to
    disk. Completely harmless, but still needs to be fixed.

ruby/WASAPI: some sound cards have a lot of issues with the current driver
    (eg FitzRoy's). We need to find a clean way to fix this before it
    can be made the default driver. Which would be a huge win because
    the latency improvements are substantial, and in exclusive mode,
    WASAPI allows G-sync to work very well.

[From the v099 WIP thread, here's the changelog since v098r19:

- GB: don't force mode 1 during force-blank; fixes v098r16 regression
  with many Game Boy games
- GB: only perform the STAT write IRQ bug during vblank, not hblank
  (still not hardware accurate, though)

-Ed.]
2016-06-11 11:13:18 +10:00
Tim Allen e2ee6689a0 Update to v098r06 release.
byuu says:

Changelog:
- emulation cores now refresh video from host thread instead of
  cothreads (fix AMD crash)
- SFC: fixed another bug with leap year months in SharpRTC emulation
- SFC: cleaned up camelCase on function names for
  armdsp,epsonrtc,hitachidsp,mcc,nss,sharprtc classes
- GB: added MBC1M emulation (requires manually setting mapper=MBC1M in
  manifest.bml for now, sorry)
- audio: implemented Emulator::Audio mixer and effects processor
- audio: implemented Emulator::Stream interface
  - it is now possible to have more than two audio streams: eg SNES
    + SGB + MSU1 + Voicer-Kun (eventually)
- audio: added reverb delay + reverb level settings; exposed balance
  configuration in UI
- video: reworked palette generation to re-enable saturation, gamma,
  luminance adjustments
- higan/emulator.cpp is gone since there was nothing left in it

I know you guys are going to say the color adjust/balance/reverb stuff
is pointless. And indeed it mostly is. But I like the idea of allowing
some fun special effects and configurability that isn't system-wide.

Note: there seems to be some kind of added audio lag in the SGB
emulation now, and I don't really understand why. The code should be
effectively identical to what I had before. The only main thing is that
I'm sampling things to 48000hz instead of 32040hz before mixing. There's
no point where I'm intentionally introducing added latency though. I'm
kind of stumped, so if anyone wouldn't mind taking a look at it, it'd be
much appreciated :/

I don't have an MSU1 test ROM, but the latency issue may affect MSU1 as
well, and that would be very bad.
2016-04-22 23:35:51 +10:00
Tim Allen 55e507d5df Update to v098r05 release.
byuu says:

Changelog:
- WS/WSC: re-added support for screen rotation (code is inside WS core)
- ruby: changed sample(uint16_t left, uint16_t right) to sample(int16_t
  left, int16_t right);
  - requires casting to uint prior to shifting in each driver, but
    I felt it was misleading to use uint16_t just to avoid that
- ruby: WASAPI is now built in by default; has wareya's improvements,
  and now supports latency adjust
- tomoko: audio settings panel has new "Exclusive Mode" checkbox for
  WASAPI driver only
  - note: although the setting *does* take effect in real-time, I'd
    suggest restarting the emulator after changing it
- tomoko: audio latency can now be set to 0ms (which in reality means
  "the minimum supported by the driver")
- all: increased cothread size from 512KiB to 2MiB to see if it fixes
  bullshit AMD driver crashes
  - this appears to cause a slight speed penalty due to cache locality
    going down between threads, though
2016-04-18 20:49:45 +10:00
Tim Allen 7403e69307 Update to v098r02 release.
byuu says:

Changelog:
- SFC: fixed a regression on auto joypad polling due to missing
  parentheses
- SFC: exported new PPU::vdisp() const -> uint; function [1]
- SFC: merged PPU MMIO functions into the read/write handles (as
  I previously did for the CPU)
- higan: removed individual emulator core names (bnes, bsnes, bgb, bgba,
  bws) [2] Forgot:
- to remove /tomoko from the about dialog

[1] note that technically I was relying on the cached, per-frame
overscan setting when the CPU and light guns were polling the number of
active display scanlines per frame. This was technically incorrect as
you can change this value mid-frame and it'll kick in. I've never seen
any game toggle overscan every frame, we only know about this because
anomie tested this a long time ago. So, nothing should break, but ...
you know how the SNES is. You can't even look at the code without
something breaking, so I figured I'd mention it >_>

[2] I'll probably keep referring to the SNES core as bsnes anyway.
I don't mind if you guys use the b<system> names as shorthand. The
simplification is mostly to make the branding easier.
2016-04-09 15:20:41 +10:00
Tim Allen 680d16561e Update to v097r29 release.
byuu says:

Changelog:
- fixed DAS instruction (Judgment Silversword score)
- fixed [VH]TMR_FREQ writes (Judgement Silversword audio after area 20)
- fixed initialization of SP (fixes seven games that were hanging on
  startup)
- added SER_STATUS and SER_DATA stubs (fixes four games that were
  hanging on startup)
- initialized IEEP data (fixes Super Robot Taisen Compact 2 series)
  - note: you'll need to delete your internal.com in WonderSwan
    (Color).sys folders
- fixed CMPS and SCAS termination condition (fixes serious bugs in four
  games)
- set read/writeCompleted flags for EEPROM status (fixes Tetsujin 28
  Gou)
- major code cleanups to SFC/R65816 and SFC/CPU
  - mostly refactored disassembler to output strings instead of using
    char* buffer
  - unrolled all the subfolders on sfc/cpu to a single directory
  - corrected casing for all of sfc/cpu and a large portion of
    processor/r65816

I kind of went overboard on the code cleanup with this WIP. Hopefully
nothing broke. Any testing one can do with the SFC accuracy core would
be greatly appreciated.

There's still an absolutely huge amount of work left to go, but I do
want to eventually refresh the entire codebase to my current coding
style, which is extremely different from stuff that's been in higan
mostly untouched since ~2006 or so. It's dangerous and fickle work, but
if I don't do it, then the code will be a jumbled mess of several
different styles.
2016-03-26 12:56:15 +11:00
Tim Allen 379ab6991f Update to v097r28 release.
byuu says:

Changelog: (all WSC unless otherwise noted)
- fixed LINECMP=0 interrupt case (fixes FF4 world map during airship
  sequence)
- improved CPU timing (fixes Magical Drop flickering and FF1 battle
  music)
- added per-frame OAM caching (fixes sprite glitchiness in Magical Drop,
  Riviera, etc.)
- added RTC emulation (fixes Dicing Knight and Judgement Silversword)
- added save state support
- added cheat code support (untested because I don't know of any cheat
  codes that exist for this system)
- icarus: can now detect games with RTC chips
- SFC: bugfix to SharpRTC emulation (Dai Kaijuu Monogatari II)
  - ( I was adding the extra leap year day to all 12 months instead of
    just February ... >_< )

Note that the RTC emulation is very incomplete. It's not really
documented at all, and the two games I've tried that use it never even
ask you to set the date/time (so they're probably just using it to count
seconds.) I'm not even sure if I've implement the level-sensitive
behavior correctly (actually, now that I think about it, I need to mask
the clear bit in INT_ACK for the level-sensitive interrupts ...)

A bit worried about the RTC alarm, because it seems like it'll fire
continuously for a full minute. Or even if you turn it off after it
fires, then that doesn't seem to be lowering the line until the next
second ticks on the RTC, so that likely needs to happen when changing
the alarm flag.

Also not sure on this RTC's weekday byte. On the SharpRTC, it actually
computes this for you. Because it's not at all an easy thing to
calculate yourself in 65816 or V30MZ assembler. About 40 lines of code
to do it in C. For now, I'm requiring the program to calculate the value
itself.

Also note that there's some gibberish tiles in Judgement Silversword,
sadly. Not sure what's up there, but the game's still fully playable at
least.

Finally, no surprise: Beat-Mania doesn't run :P
2016-03-25 17:19:08 +11:00
Tim Allen c33065fbd1 Update to v097r24 release.
byuu says:

Changelog:
- WS: fixed bug when IRQs triggered during a rep string instruction
- WS: added sprite attribute caching (per-scanline); absolutely massive
  speed-up
- WS: emulated limit of 32 sprites per scanline
- WS: emulated the extended PPU register bit behavior based on the
  DISP_CTRL tile bit-depth setting
- WS: added "Rotate" key binding; can be used to flip the WS display
  between horizontal and vertical in real-time

The prefix emulation may not be 100% hardware-accurate, but the edge
cases should be extreme enough to not come up in the WS library. No way
to get the emulation 100% down without intensive hardware testing.
trap15 pointed me at a workflow diagram for it, but that diagram is
impossible without a magic internal stack frame that grows with every
IRQ, and can thus grow infinitely large.

The rotation thing isn't exactly the most friendly set-up, but oh well.
I'll see about adding a default rotation setting to manifests, so that
games like GunPey can start in the correct orientation. After that, if
the LCD orientation icon turns out to be reliable, then I'll start using
that. But if there are cases where it's not reliable, then I'll leave it
to manual button presses.

Speaking of icons, I'll need a set of icons to render on the screen.
Going to put them to the top right on vertical orientation, and on the
bottom left for horizontal orientation. Just outside of the video
output, of course.

Overall, WS is getting pretty far along, but still some major bugs in
various games. I really need sound emulation, though. Nobody's going to
use this at all without that.
2016-03-13 11:22:15 +11:00
Tim Allen 3d3ac8c1db Update to v097r22 release.
byuu says:

Changelog:
- WS: fixed lods, scas instructions
- WS: implemented missing GRP4 instructions
- WS: fixed transparency for screen one
- WSC: added color-mode PPU rendering
- WS+WSC: added packed pixel mode support
- WS+WSC: added dummy sound register reads/writes
- SFC: added threading to SuperDisc (it's hanging for right now; need to
  clear IRQ on $21e2 writes)

SuperDisc Timer and Sound Check were failing before due to not turning
off IRQs on $21e4 clear, so I'm happy that's fixed now.

Riviera starts now, and displays the first intro screen before crashing.
Huge, huge amounts of corrupted graphics, though. This game's really
making me work for it :(

No color games seem fully playable yet, but a lot of monochrome and
color games are now at least showing more intro screen graphics before
dying.

This build defaults to horizontal orientation, but I left the inputs
bound to vertical orientation. Whoops. I still need to implement
a screen flip key binding.
2016-03-13 11:22:14 +11:00
Tim Allen 810cbdafb4 Update to v097r16 release.
byuu says:

Changelog:
- sfc/ppu/sprite updated to use new .bit(s) functions; masked sizes
  better; added valid flags instead of using magic numbers
- ws/ppu updates to use new .bit(s) functions
- ws/ppu: added line compare interrupt support
- added ws/eeprom; emulation of WS/WSC internal EEPROM and cartridge
  EEPROM (1kbit - 16kbit supported)
- added basic read/write handlers for remaining WS/WSC PPU registers

WS EEPROM emulation is basically a direct copy of trap15's code. Still
some unknown areas in there, but hopefully it's enough to get further
into games that depend on EEPROM support. Note that you'll have to
manually add the eeprom line to the manifest for now, as icarus doesn't
know how to detect EEPROM/sizes yet.

I figured the changes to the SNES PPU sprites would slow it down a tad,
but it actually sped it up. Most of the impact from the integer classes
are gone now.
2016-03-13 11:22:10 +11:00
Tim Allen a8323d0d2b Update to v097r04 release.
byuu says:

Lots of improvements. We're now able to start executing some V30MZ
instructions. 32 of 256 opcodes implemented so far.

I hope this goes without saying, but there's absolutely no point in
loading WS/WSC games right now. You won't see anything until I have the
full CPU and partial PPU implemented.

ROM bank 2 works properly now, the I/O map is 16-bit (address) x 16-bit
(data) as it should be*, and I have a basic disassembler in place
(adding to it as I emulate new opcodes.)

(* I don't know what happens if you access an 8-bit port in 16-bit mode
or vice versa, so for now I'm just treating the handlers as always being
16-bit, and discarding the upper 8-bits when not needed.)
2016-01-28 22:39:49 +11:00
Tim Allen d7998b23ef Update to v097r03 release.
byuu says:

So, this WIP starts work on something new for higan. Obviously, I can't
keep it a secret until it's ready, because I want to continue daily WIP
releases, and of course, solicit feedback as I go along.
2016-01-27 22:31:39 +11:00