Commit Graph

107 Commits

Author SHA1 Message Date
Tim Allen e7806dd6e8 Update to v102r27 release.
byuu says:

Changelog:

  - processor/gsu: minor code cleanup
  - processor/hg51b: renamed reg(Read,Write) to register(Read,Write)
  - processor/lr35902: minor code cleanup
  - processor/spc700: completed code cleanup (sans disassembler)
      - no longer uses internal global state inside instructions
  - processor/spc700: will no longer hang the emulator if stuck in a WAI
    (SLEEP) or STP (STOP) instruction
  - processor/spc700: fixed bug in handling of OR1 and AND1 instructions
  - processor/z80: minor code cleanup
  - sfc/dsp: revert to initializing registers to 0x00; save for
    ENDX=random(), FLG=0xe0 [Jonas Quinn]

Major testing of the SNES game library would be appreciated, now that
its CPU cores have all been revised.

We know the DSP registers read back as randomized data ... mostly, but
there are apparently internal latches, which we can't emulate with the
current DSP design. So until we know which registers have separate
internal state that actually *is* initialized, I'm going to play it safe
and not break more games.

Thanks again to Jonas Quinn for the continued research into this issue.

EDIT: that said ... `MD works if((ENDX&0x30) > 0)` is only a 3:4 chance
that the game will work. That seems pretty unlikely that the odds of it
working are that low, given hardware testing by others in the past :/ I
thought if worked if `PITCH != 0` before, which would have been way more
likely.

The two remaining CPU cores that need major cleanup efforts are the
LR35902 and ARM cores. Both are very large, complicated, annoying cores
that will probably be better off as full rewrites from scratch. I don't
think I want to delay v103 in trying to accomplish that, however.

So I think it'll be best to focus on allowing the Mega Drive core to not
lock when processors are frozen waiting on a response from other
processors during a save state operation. Then we should be good for a
new release.
2017-06-19 12:07:54 +10:00
Tim Allen 50411a17d1 Update to v102r26 release.
byuu says:

Changelog:

  - md/ym2612: initialize DAC sample to center volume [Cydrak]
  - processor/arm: add accumulate mode extra cycle to mlal [Jonas
    Quinn]
  - processor/huc6280: split off algorithms, improve naming of functions
  - processor/mos6502: split off algorithms
  - processor/spc700: major revamp of entire core (~50% completed)
  - processor/wdc65816: fixed several bugs introduced by rewrite

For the SPC700, this turns out to be very old code as well, with global
object state variables, those annoying `{Boolean,Natural}BitField` types,
`under_case` naming conventions, heavily abbreviated function names, etc.
I'm working to get the code to be in the same design as the MOS6502,
HuC6280, WDC65816 cores, since they're all extremely similar in terms of
architectural design (the SPC700 is more of an off-label
reimplementation of a 6502 core, but still.)

The main thing left is that about 90% of the actual instructions still
need to be adapted to not use the internal state (`aa`, `rd`, `dp`,
`sp`, `bit` variables.) I wanted to finish this today, but ran out of
time before work.

I wouldn't suggest too much testing just yet. We should wait until the
SPC700 core is finished for that. However, if some does want to and
spots regressions, please let me know.
2017-06-16 10:06:17 +10:00
Tim Allen b73d918776 Update to v102r25 release.
byuu says:

Changelog:

  - processor/arm: corrected MUL instruction timings [Jonas Quinn]
  - processor/wdc65816: finished phase two of the rewrite

I'm really pleased with the visual results of the wdc65816 core rewrite.
I was able to eliminate all of the weird `{Boolean,Natural}BitRange`
templates, as well as the need to use unions/structs. Registers are now
just simple `uint24` or `uint16` types (technically they're `Natural<T>`
types, but then all of higan uses those), flags are now just bool types.
I also eliminated all of the implicit object state inside of the core
(aa, rd, dp, sp) and instead do all computations on the stack frame with
local variables. Through using macros to reference the registers and
individual parts of them, I was able to reduce the visual tensity of all
of the instructions. And by using normal types without implicit states,
I was able to eliminate about 15% of the instructions necessary, instead
reusing existing ones.

The final third phase of the rewrite will be to recode the disassembler.
That code is probably the oldest code in all of higan right now, still
using sprintf to generate the output. So it is very long overdue for a
cleanup.

And now for the bad news ... as with any large code cleanup, regression
errors have seeped in. Currently, no games are running at all. I've left
the old disassembler in for this reason: we can compare trace logs of
v102r23 against trace logs of v102r25. The second there's any
difference, we've spotted a buggy instruction and can correct it.

With any luck, this will be the last time I ever rewrite the wdc65816
core. My style has changed wildly over the ~10 years since I wrote this
core, but it's really solidifed in recent years.
2017-06-15 01:55:55 +10:00
Tim Allen 6e8406291c Update to v102r24 release.
byuu says

Changelog:

  - FC: fixed three MOS6502 regressions [hex\_usr]
  - GBA: return fetched instruction instead of 0 for unmapped MMIO
    (passes all of endrift's I/O tests)
  - MD: fix VDP control port read Vblank bit to test screen height
    instead of hard-code 240 (fixes Phantasy Star IV)
  - MD: swap USP,SSP when executing an exception (allows Super Street
    Fighter II to run; but no sprites visible yet)
  - MD: grant 68K access to Z80 bus on reset (fixes vdpdoc demo ROM from
    freezing immediately)
  - SFC: reads from $00-3f,80-bf:4000-43ff no longer update MDR
    [p4plus2]
  - SFC: massive, eight-hour cleanup of WDC65816 CPU core ... still not
    complete

The big change this time around is the SFC CPU core. I've renamed
everything from R65816 to WDC65816, and then went through and tried to
clean up the code as much as possible. This core is so much larger than
the 6502 core that I chose cleaning up the code to rewriting it.

First off, I really don't care for the BitRange style functionality. It
was an interesting experiment, but its fatal flaw are that the types are
just bizarre, which makes them hard to pass around generically to other
functions as arguments. So I went back to the list of bools for flags,
and union/struct blocks for the registers.

Next, I renamed all of the functions to be more descriptive: eg
`op_read_idpx_w` becomes `instructionIndexedIndirectRead16`. `op_adc_b`
becomes `algorithmADC8`. And so forth.

I eliminated about ten instructions because they were functionally
identical sans the index, so I just added a uint index=0 parameter to
said functions. I added a few new ones (adjust→INC,DEC;
pflag→REP,SEP) where it seemed appropriate.

I cleaned up the disaster of the instruction switch table into something
a whole lot more elegant without all the weird argument decoding
nonsense (still need M vs X variants to avoid having to have 4-5
separate switch tables, but all the F/I flags are gone now); and made
some things saner, like the flag clear/set and branch conditions, now
that I have normal types for flags and registers once again.

I renamed all of the memory access functions to be more descriptive to
what they're doing: eg writeSP→push, readPC→fetch,
writeDP→writeDirect, etc. Eliminated some of the special read/write
modes that were only used in one single instruction.

I started to clean up some of the actual instructions themselves, but
haven't really accomplished much here. The big thing I want to do is get
rid of the global state (aa, rd, iaddr, etc) and instead use local
variables like I am doing with my other 65xx CPU cores now. But this
will take some time ... the algorithm functions depend on rd to be set
to work on them, rather than taking arguments. So I'll need to rework
that.

And then lastly, the disassembler is still a mess. I want to finish the
CPU cleanups, and then post a new WIP, and then rewrite the disassembler
after that. The reason being ... I want a WIP that can generate
identical trace logs to older versions, in case the CPU cleanup causes
any regressions. That way I can more easily spot the errors.

Oh ... and a bit of good news. v102 was running at ~140fps on the SNES
core. With the new support to suspend/resume WAI/STP, plus the internal
CPU registers not updating the MDR, the framerate dropped to ~132fps.
But with the CPU cleanups, performance went back to ~140fps. So, hooray.
Of course, without those two other improvements, we'd have ended up at
possibly ~146-148fps, but oh well.
2017-06-13 11:42:31 +10:00
Tim Allen cea64b9991 Update to v102r23 release.
byuu says:

Changelog:
  - rewrote the 6502 CPU core from scratch. Now called MOS6502,
    supported BCD mode
      - Famicom core disables BCD mode via MOS6502::BCD = 0;
  - renamed r65816 folder to wdc65816 (still need to rename the actual
    class, though ...)

Note: need to remove build rules for the now renamed r6502, r65816
objects from processor/GNUmakefile.

So this'll seem like a small WIP, but it was a solid five hours to
rewrite the entire 6502 core. The reason I wanted to do this was because
the old 6502 core was pretty sloppy. My coding style improved a lot, and
I really liked how the HuC6280 CPU core came out, so I wanted the 6502
core to be like that one.

The core can now support BCD mode, so hopefully that will prove useful
to hex\_usr and allow one core to run both the NES and his Atari 2600
cores at some point.

Note that right now, the core doesn't support any illegal instructions.
The old core supported a small number of them, but were mostly the no
operation ones. The goal is support all of the illegal instructions at
some point.

It's very possible the rewrite introduced some regressions, so thorough
testing of the NES core would be appreciated if anyone were up for it.
2017-06-11 11:51:53 +10:00
Tim Allen a4629e1f64 Update to v102r21 release.
byuu says:

Changelog:

  - GBA: fixed WININ2 reads, BG3PB writes [Jonas Quinn]
  - R65816: added support for yielding/resuming from WAI/STP¹
  - SFC: removed status.dmaCounter functionality (also fixes possible
    TAS desync issue)
  - tomoko: added support for combinatorial inputs [hex\_usr\]²
  - nall: fixed missing return value from Arithmetic::operator--
    [Hendricks266]

Now would be the time to start looking for major regressions with the
new GBA PPU renderer, I suppose ...

¹: this doesn't matter for the master thread (SNES CPU), but is
important for slave threads (SNES SA1). If you try to save a state and
the SA1 is inside of a WAI instruction, it will get stuck there forever.
This was causing attempts to create a save state in Super Bomberman
- Panic Bomber W to deadlock the emulator and crash it. This is now
finally fixed.

Note that I still need to implement similar functionality into the Mega
Drive 68K and Z80 cores. They still have the possibility of deadlocking.
The SNES implementation was more a dry-run test for this new
functionality. This possible crashing bug in the Mega Drive core is the
major blocking bug for a new official release.

²: many, many thanks to hex\_usr for coming up with a really nice
design. I mostly implemented it the exact same way, but with a few tiny
differences that don't really matter (display " and ", " or " instead of
" & ", " | " in the input settings windows; append → bind;
assignmentName changed to displayName.)

The actual functionality is identical to the old higan v094 and earlier
builds. Emulated digital inputs let you combine multiple possible keys
to trigger the buttons. This is OR logic, so you can map to eg
keyboard.up OR gamepad.up for instance. Emulated analog inputs always
sum together. Emulated rumble outputs will cause all mapped devices to
rumble, which is probably not at all useful but whatever. Hotkeys use
AND logic, so you have to press every key mapped to trigger them. Useful
for eg Ctrl+F to trigger fullscreen.

Obviously, there are cases where OR logic would be nice for hotkeys,
too. Eg if you want both F11 and your gamepad's guide button to trigger
the fullscreen toggle. Unfortunately, this isn't supported, and likely
won't ever be in tomoko. Something I might consider is a throw switch in
the configuration file to swap between AND or OR logic for hotkeys, but
I'm not going to allow construction of mappings like "(Keyboard.Ctrl and
Keyboard.F) or Gamepad.Guide", as that's just too complicated to code,
and too complicated to make a nice GUI to set up the mappings for.
2017-06-06 23:44:40 +10:00
Tim Allen 82c58527c3 Update to v102r17 release.
byuu says:

Changelog:

  - GBA: process audio at 2MHz instead of 32KHz¹
  - MD: do not allow the 68K to stop the Z80, unless it has been granted
    bus access first
  - MD: do not reset bus requested/granted signals when the 68K resets
    the Z80
      - the above two fix The Lost Vikings
  - MD: clean up the bus address decoding to be more readable
  - MD: add support for a13000-a130ff (#TIME) region; pass to cartridge
    I/O²
  - MD: emulate SRAM mapping used by >16mbit games; bank mapping used
    by >32mbit games³
  - MD: add 'reset pending' flag so that loading save states won't
    reload 68K PC, SP registers
      - this fixes save state support ... mostly⁴
  - MD: if DMA is not enabled, do not allow CD5 to be set [Cydrak]
      - this fixes in-game graphics for Ristar. Title screen still
        corrupted on first run
  - MD: detect and break sprite lists that form an infinite loop
    [Cydrak]
      - this fixes the emulator from dead-locking on certain games
  - MD: add DC offset to sign DAC PCM samples [Cydrak]
      - this improves audio in Sonic 3
  - MD: 68K TAS has a hardware bug that prevents writing the result back
    to RAM
      - this fixes Gargoyles
  - MD: 68K TRAP should not change CPU interrupt level
      - this fixes Shining Force II, Shining in the Darkness, etc
  - icarus: better SRAM heuristics for Mega Drive games

Todo:

  - need to serialize the new cartridge ramEnable, ramWritable, bank
    variables

¹: so technically, the GBA has its FIFO queue (raw PCM), plus a GB
chipset. The GB audio runs at 2MHz. However, I was being lazy and
running the sequencer 64 times in a row, thus decimating the audio to
32KHz. But simply discarding 63 out of every 64 samples resorts in
muddier sound with more static in it.

However ... increasing the audio thread processing intensity 64-fold,
and requiring heavy-duty three-chain lowpass and highpass filters is not
cheap. For this bump in sound quality, we're eating a loss of about 30%
of previous performance.

Also note that the GB audio emulation in the GBA core still lacks many
of the improvements made to the GB core. I was hoping to complete the GB
enhancements, but it seems like I'm never going to pass blargg's
psychotic edge case tests. So, first I want to clean up the GB audio to
my current coding standards, and then I'll port that over to the GBA,
which should further increase sound quality. At that point, it sound
exceed mGBA's audio quality (due to the ridiculously high sampling rate
and strong-attenuation audio filtering.)

²: word writes are probably not handled correctly ... but games are
only supposed to do byte writes here.

³: the SRAM mapping is used by games like "Story of Thor" and
"Phantasy Star IV." Unfortunately, the former wasn't released in the US
and is region protected. So you'll need to change the NTSU to NTSCJ in
md/system/system.cpp in order to boot it. But it does work nicely now.
The write protection bit is cleared in the game, and then it fails to
write to SRAM (soooooooo many games with SRAM write protection do this),
so for now I've had to disable checking that bit. Phantasy Star IV has a
US release, but sadly the game doesn't boot yet. Hitting some other bug.

The bank mapping is pretty much just for the 40mbit Super Street Fighter
game. It shows the Sega and Capcom logos now, but is hitting yet another
bug and deadlocking.

For now, I emulate the SRAM/bank mapping registers on all cartridges,
and set sane defaults. So long as games don't write to $a130XX, they
should all continue to work. But obviously, we need to get to a point
where higan/icarus can selectively enable these registers on a per-game
basis.

⁴: so, the Mega Drive has various ways to lock a chip until another
chip releases it. The VDP can lock the 68K, the 68K can lock the Z80,
etc. If this happens when you save a state, it'll dead-lock the
emulator. So that's obviously a problem that needs to be fixed. The fix
will be nasty ... basically, bypassing the dead-lock, creating a
miniature, one-instruction-long race condition. Extremely unlikely to
cause any issues in practice (it's only a little worse than the SNES
CPU/SMP desync), but ... there's nothing I can do about it. So you'll
have to take it or leave it. But yeah, for now, save states may lock up
the emulator. I need to add code to break the loops when in the process
of creating a save state still.
2017-03-10 21:23:29 +11:00
Tim Allen 04072b278b Update to v102r16 release.
byuu says:

Changelog:

  - Emulator::Stream now allows adding low-pass and high-pass filters
    dynamically
      - also accepts a pass# count; each pass is a second-order biquad
        butterworth IIR filter
  - Emulator::Stream no longer automatically filters out >20KHz
    frequencies for all streams
  - FC: added 20Hz high-pass filter; 20KHz low-pass filter
  - GB: removed simple 'magic constant' high-pass filter of unknown
    cutoff frequency (missed this one in the last WIP)
  - GB,SGB,GBC: added 20Hz high-pass filter; 20KHz low-pass filter
  - MS,GG,MD/PSG: added 20Hz high-pass filter; 20KHz low-pass filter
  - MD: added save state support (but it's completely broken for now;
    sorry)
  - MD/YM2612: fixed Voice#3 per-operator pitch support (fixes sound
    effects in Streets of Rage, etc)
  - PCE: added 20Hz high-pass filter; 20KHz low-pass filter
  - WS,WSC: added 20Hz high-pass filter; 20KHz low-pass filter

So, the point of the low-pass filters is to remove frequencies above
human hearing. If we don't do this, then resampling will introduce
aliasing that results in sounds that are audible to the human ear. Which
basically an annoying buzzing sound. You'll definitely hear the
improvement from these in games like Mega Man 2 on the NES. Of course,
these already existed before, so this WIP won't sound better than
previous WIPs.

The high-pass filters are a little more complicated. Their main role is
to remove DC bias and help to center the audio stream. I don't
understand how they do this at all, but ... that's what everyone who
knows what they're talking about says, thus ... so be it.

I have set all of the high-pass filters to 20Hz, which is below the
limit of human hearing. Now this is where it gets really interesting ...
technically, some of these systems actually cut off a lot of range. For
instance, the GBA should technically use an 800Hz high-pass filter when
output is done through the system's speakers. But of course, if you plug
in headphones, you can hear the lower frequencies.

Now 800Hz ... you definitely can hear. At that level, nearly all of the
bass is stripped out and the audio is very tinny. Just like the real
system. But for now, I don't want to emulate the audio being crushed
that badly.

I'm sticking with 20Hz everywhere since it won't negatively affect audio
quality. In fact, you should not be able to hear any difference between
this WIP and the previous WIP. But theoretically, DC bias should mostly
be removed as a result of these new filters. It may be that we need to
raise the values on some cores in the future, but I don't want to do
that until we know for certain that we have to.

What I can say is that compared to even older WIPs than r15 ... the
removal of the simple one-pole low-pass and high-pass filters with the
newer three-pass, second-order filters should result in much better
attenuation (less distortion of audible frequencies.) Probably not
enough to be noticeable in a blind test, though.
2017-03-09 07:20:40 +11:00
Tim Allen 4c3f9b93e7 Update to v102r12 release.
byuu says:

Changelog:

  - MD/PSG: fixed 68K bus Z80 status read address location
  - MS, GG, MD/PSG: channels post-decrement their counters, not
    pre-decrement [Cydrak]¹
  - MD/VDP: cache screen width registers once per scanline; screen
    height registers once per frame
  - MD/VDP: support 256-width display mode (used in Shining Force, etc)
  - MD/YM2612: implemented timers²
  - MD/YM2612: implemented 8-bit PCM DAC²
  - 68000: TRAP instruction should index the vector location by 32 (eg
    by 128 bytes), fixes Shining Force
  - nall: updated hex(), octal(), binary() functions to take uintmax
    instead of template<typename T> parameter³

¹: this one makes an incredible difference. Sie noticed that lots of
games set a period of 0, which would end up being a really long period
with pre-decrement. By fixing this, noise shows up in many more games,
and sounds way better in games even where it did before. You can hear
extra sound on Lunar - Sanposuru Gakuen's title screen, the noise in
Sonic The Hedgehog (Mega Drive) sounds better, etc.

²: this also really helps sound. The timers allow PSG music to play
back at the correct speed instead of playing back way too quickly. And
the PCM DAC lets you hear a lot of drum effects, as well as the
"Sega!!" sound at the start of Sonic the Hedgehog, and the infamous,
"Rise from your grave!" line from Altered Beast.

Still, most music on the Mega Drive comes from the FM channels, so
there's still not a whole lot to listen to.

I didn't implement Cydrak's $02c test register just yet. Sie wasn't 100%
certain on how the extended DAC bit worked, so I'd like to play it a
little conservative and get sound working, then I'll go back and add a
toggle or something to enable undocumented registers, that way we can
use that to detect any potential problems they might be causing.

³: unfortunately we lose support for using hex() on nall/arithmetic
types. If I have a const Pair& version of the function, then the
compiler gets confused on whether Natural<32> should use uintmax or
const Pair&, because compilers are stupid, and you can't have explicit
arguments in overloaded functions. So even though either function would
work, it just decides to error out instead >_>

This is actually really annoying, because I want hex() to be useful for
printing out nall/crypto keys and hashes directly.

But ... this change had to be made. Negative signed integers would crash
programs, and that was taking out my 68000 disassembler.
2017-02-27 19:45:51 +11:00
Tim Allen 68f04c3bb8 Update to v102r10 release.
byuu says:

Changelog:

  - removed Emulator::Interface::Capabilities¹
  - MS: improved the PSG emulation a bit
  - MS: added cheat code support
  - MS: added save state support²
  - MD: emulated the PSG³

¹: there's really no point to it anymore. I intend to add cheat codes
to the GBA core, as well as both cheat codes and save states to the Mega
Drive core. I no longer intend to emulate any new systems, so these
values will always be true. Further, the GUI doesn't respond to these
values to disable those features anymore ever since the hiro rewrite, so
they're double useless.

²: right now, the Z80 core is using a pointer for HL-\>(IX,IY)
overrides. But I can't reliably serialize pointers, so I need to convert
the Z80 core to use an integer here. The save states still appear to
work fine, but there's the potential for an instruction to execute
incorrectly if you're incredibly unlucky, so this needs to be fixed as
soon as possible. Further, I still need a way to serialize
array<T, Size> objects, and I should also add nall::Boolean
serialization support.

³: I don't have a system in place to share identical sound chips. But
this chip is so incredibly simple that it's not really much trouble to
duplicate it. Further, I can strip out the stereo sound support code
from the Game Gear portion, so it's even tinier.

Note that the Mega Drive only just barely uses the PSG. Not at all in
Altered Beast, and only for a tiny part of the BGM music on Sonic 1,
plus his jump sound effect.
2017-02-23 08:25:01 +11:00
Tim Allen d76c0c7e82 Update to v102r08 release.
byuu says:

Changelog:

  - PCE: restructured VCE, VDCs to run one scanline at a time
  - PCE: bound VDCs to 1365x262 timing (in order to decouple the VDCs
    from the VCE)
  - PCE: the two changes above allow save states to function; also
    grants a minor speed boost
  - PCE: added cheat code support (uses 21-bit bus addressing; compare
    byte will be useful here)
  - 68K: fixed `mov *,ccr` to read two bytes instead of one [Cydrak]
  - Z80: emulated /BUSREQ, /BUSACK; allows 68K to suspend the Z80
    [Cydrak]
  - MD: emulated the Z80 executing instructions [Cydrak]
  - MD: emulated Z80 interrupts (triggered during each Vblank period)
    [Cydrak]
  - MD: emulated Z80 memory map [Cydrak]
  - MD: added stubs for PSG, YM2612 accesses [Cydrak]
  - MD: improved bus emulation [Cydrak]

The PCE core is pretty much ready to go. The only major feature missing
is FM modulation.

The Mega Drive improvements let us start to see the splash screens for
Langrisser II, Shining Force, Shining in the Darkness. I was hoping I
could get them in-game, but no such luck. My Z80 implementation is
probably flawed in some way ... now that I think about it, I believe I
missed the BusAPU::reset() check for having been granted access to the
Z80 first. But I doubt that's the problem.

Next step is to implement Cydrak's PSG core into the Master System
emulator. Once that's in, I'm going to add save states and cheat code
support to the Master System core.

Next, I'll add the PSG core into the Mega Drive. Then I'll add the
'easy' PCM part of the YM2612. Then the rest of the beastly YM2612 core.
Then finally, cap things off with save state and cheat code support.

Should be nearing a new release at that point.
2017-02-20 19:13:10 +11:00
Tim Allen 7c9b78b7bb Update to v102r07 release.
byuu says:

Changelog:

  - PCE: emulated PSG volume controls (vastly enhances audio quality)
  - PCE: emulated PSG noise as a square wave (somewhat enhances audio
    quality)
  - PCE: added save state support (currently broken and deadlocks the
    emulator though)

Thankfully, MAME had some rather easy to read code on how the volume
adjustment works, which they apparently ripped out of expired patents.
Hooray!

The two remaining sound issues are:

1. the random number generator for the noise channel is definitely not
hardware accurate. But it won't affect the sound quality at all. You'd
only be able to tell the difference by looking at hex bytes of a stream
rip.
2. I have no clue how to emulate the LFO (frequency modulation). A comment
in MAME's code (they also don't emulate it) advises that they aren't
aware of any games that even use it. But I'm there has to be at least one?

Given LFO not being used, and the RNG not really mattering all that much
... the sound's pretty close to perfect now.
2017-02-13 10:09:03 +11:00
Tim Allen fa6cbac251 Update to v102r06 release.
byuu says:

Changelog:

  - added higan/emulator/platform.hpp (moved out Emulator::Platform from
    emulator/interface.hpp)
  - moved gmake build paramter to nall/GNUmakefile; both higan and
    icarus use it now
  - added build=profile mode
  - MD: added the region select I/O register
  - MD: started to add region selection support internally (still no
    external select or PAL support)
  - PCE: added cycle stealing when reading/writing to the VDC or VCE;
    and when using ST# instructions
  - PCE: cleaned up PSG to match the behavior of Mednafen (doesn't
    improve sound at all ;_;)
      - note: need to remove loadWaveSample, loadWavePeriod
  - HuC6280: ADC/SBC decimal mode consumes an extra cycle; does not set
    V flag
  - HuC6280: block transfer instructions were taking one cycle too many
  - icarus: added code to strip out PC Engine ROM headers
  - hiro: added options support to BrowserDialog

The last one sure ended in failure. The plan was to put a region
dropdown directly onto hiro::BrowserDialog, and I had all the code for
it working. But I forgot one important detail: the system loads
cartridges AFTER powering on, so even though I could technically change
the system region post-boot, I'd rather not do so.

So that means we have to know what region we want before we even select
a game. Shit.
2017-02-11 10:56:42 +11:00
Tim Allen ee7662a8be Update to v102r04 release.
byuu says:

Changelog:
  - Super Game Boy support is functional once again
  - new GameBoy::SuperGameBoyInterface class
  - system.(dmg,cgb,sgb) is now Model::(Super)GameBoy(Color) ala the PC
    Engine
  - merged WonderSwanInterface, WonderSwanColorInterface shared
    functions to WonderSwan::Interface
  - merged GameBoyInterface, GameBoyColorInterface shared functions to
    GameBoy::Interface
  - Interface::unload() now calls Interface::save() for Master System,
    Game Gear, Mega Drive, PC Engine, SuperGrafx
  - PCE: emulated PCE-CD backup RAM; stored per-game as save.ram (2KiB
    file)
      - this means you can now save your progress in games like Neutopia
      - the PCE-CD I/O registers like BRAM write protect are not
        emulated yet
  - PCE: IRQ sources now hold the IRQ line state, instead of the CPU
    holding it
      - this fixes most SuperGrafx games, which were fighting over the
        VDC IRQ line previously
  - PCE: CPU I/O $14xx should return the pending IRQ bits even if IRQs
    are disabled
  - PCE: VCE and the VDCs now synchronize to each other; fixes pixel
    widths in all games
  - PCE: greatly increased the accuracy of the VPC priority selection
    code (windows may be buggy still)
  - HuC6280: PLA, PLX, PLY should set Z, N flags; fixes many game bugs
    [Jonas Quinn]

The big thing I wanted to do was enslave the VDC(s) to the VCE. But
unfortunately, I forgot about the asynchronous DMA channels that each
VDC supports, so this isn't going to be possible I'm afraid.

In the most demanding case, Daimakaimura in-game, we're looking at 85fps
on my Xeon E3 1276v3. So ... not great, and we don't even have sound
connected yet.

We are going to have to profile and optimize this code once sound
emulation and save states are in.

Basically, think of it like this: the VCE, VDC0, and VDC1 all have the
same overhead, scheduling wise (which is the bulk of the performance
loss) as the dot-renderer for the SNES core. So it's like there's three
bsnes-accuracy PPU threads running just for video.

-----

Oh, just a fair warning ... the hooks for the SGB are a work in
progress.

If anyone is working on higan or a fork and want to do something similar
to it, don't use it as a template, at least not yet.

Right now, higan looks like this:

  - Emulator::Video handles the platform→videoRefresh calls
  - Emulator::Audio handles the platform→audioSample calls
  - each core hard-codes the platform→inputPoll, inputRumble calls
  - each core hard-codes calls to path, open, load to process files
  - dipSettings and notify are specialty hacks, neither are even hooked
    up right now to anything

With the SGB, it's an emulation core inside an emulation core, so
ideally you want to hook all of those functions. Emulator::Video and
Emulator::Audio aren't really abstractions over that, as the GB core
calls them and we have to special case not calling them in SGB mode.

The path, open, load can be implemented without hooks, thanks to the UI
only using one instance of Emulator::Platform for all cores. All we have
to do is override the folder path ID for the "Game Boy.sys" folder, so
that it picks "Super Game Boy.sfc/" and loads its boot ROM instead.
That's just a simple argument to GameBoy::System::load() and we're done.

dipSettings, notify and inputRumble don't matter. But we do also have to
hook inputPoll as well.

The nice idea would be for SuperFamicom::ICD2 to inherit from
Emulator::Platform and provide the desired functions that we need to
overload. After that, we'd just need the GB core to keep an abstraction
over the global Emulator::platform\* handle, to select between the UI
version and the SFC::ICD2 version.

However ... that doesn't work because of Emulator::Video and
Emulator::Audio. They would also have to gain an abstraction over
Emulator::platform\*, and even worse ... you'd have to constantly swap
between the two so that the SFC core uses the UI, and the GB core uses
the ICD2.

And so, for right now, I'm checking Model::SuperGameBoy() -> bool
everywhere, and choosing between the UI and ICD2 targets that way. And
as such, the ICD2 doesn't really need Emulator::Platform inheritance,
although it certainly could do that and just use the functions it needs.

But the SGB is even weirder, because we need additional new signals
beyond just Emulator::Platform, like joypWrite(), etc.

I'd also like to work on the Emulator::Stream for the SGB core. I don't
see why we can't have the GB core create its own stream, and let the
ICD2 just use that instead. We just have to be careful about the ICD2's
CPU soft reset function, to make sure the GB core's Stream object
remains valid. What I think that needs is a way to release an
Emulator::Stream individually, rather than calling
Emulator::Audio::reset() to do it. They are shared\_pointer objects, so
I think if I added a destructor function to remove it from
Emulator::Audio::streams, then that should work.
2017-01-26 12:06:06 +11:00
Tim Allen 186f008574 Update to v102r03 release.
byuu says:

Changelog:

  - PCE: split VCE from VDC
  - HuC6280: changed bus from (uint21 addr) to (uint8 bank, uint13 addr)
  - added SuperGrafx emulation (adds secondary VDC, plus new VPC)

The VDC now has no concept of the actual display raster timing, and
instead is driven by Vpulse (start of frame) and Hpulse (start of
scanline) signals from the VCE. One still can't render the start of the
next scanline onto the current scanline through overly aggressive
timings, but it shouldn't be too much more difficult to allow that to
occur now. This process incurs quite a major speed hit, so low-end
systems with Atom CPUs can't run things at 60fps anymore.

The timing needs a lot of work. The pixels end up very jagged if the VCE
doesn't output batches of 2-4 pixels at a time. But this should not be a
requirement at all, so I'm not sure what's going wrong there.

Yo, Bro and the 512-width mode of TV Sports Basketball is now broken as
a result of these changes, and I'm not sure why.

To load SuperGrafx games, you're going to have to change the .pce
extensions to .sg or .sgx. Or you can manually move the games from the
PC Engine folder to the SuperGrafx folder and change the game folder
extensions. I have no way to tell the games apart. Mednafen uses CRC32
comparisons, and I may consider that since there's only five games, but
I'm not sure yet.

The only SuperGrafx game that's playable right now is Aldynes. And the
priorities are all screwed up. I don't understand how the windows or the
priorities work at all from sgxtech.txt, so ... yeah. It's pretty
broken, but it's a start.

I could really use some help with this, as I'm very lost right now with
rendering :/

-----

Note that the SuperGrafx is technically its own system, it's not an
add-on.

As such, I'm giving it a separate .sys folder, and a separate library.

There's debate over how to name this thing. "SuperGrafx" appears more
popular than "Super Grafx". And you might also call it the "PC Engine
SuperGrafx", but I decided to leave off the prefix so it appears more
distinct.
2017-01-24 08:18:54 +11:00
Tim Allen bdc100e123 Update to v102r02 release.
byuu says:

Changelog:

  - I caved on the `samples[] = {0.0}` thing, but I'm very unhappy about it
      - if it's really invalid C++, then GCC needs to stop accepting it
        in strict `-std=c++14` mode
  - Emulator::Interface::Information::resettable is gone
  - Emulator::Interface::reset() is gone
  - FC, SFC, MD cores updated to remove soft reset behavior
  - split GameBoy::Interface into GameBoyInterface,
    GameBoyColorInterface
  - split WonderSwan::Interface into WonderSwanInterface,
    WonderSwanColorInterface
  - PCE: fixed off-by-one scanline error [hex_usr]
  - PCE: temporary hack to prevent crashing when VDS is set to < 2
  - hiro: Cocoa: removed (u)int(#) constants; converted (u)int(#)
    types to (u)int_(#)t types
  - icarus: replaced usage of unique with strip instead (so we don't
    mess up frameworks on macOS)
  - libco: added macOS-specific section marker [Ryphecha]

So ... the major news this time is the removal of the soft reset
behavior. This is a major!! change that results in a 100KiB diff file,
and it's very prone to accidental mistakes!! If anyone is up for
testing, or even better -- looking over the code changes between v102r01
and v102r02 and looking for any issues, please do so. Ideally we'll want
to test every NES mapper type and every SNES coprocessor type by loading
said games and power cycling to make sure the games are all cleanly
resetting. It's too big of a change for me to cover there not being any
issues on my own, but this is truly critical code, so yeah ... please
help if you can.

We technically lose a bit of hardware documentation here. The soft reset
events do all kinds of interesting things in all kinds of different
chips -- or at least they do on the SNES. This is obviously not ideal.
But in the process of removing these portions of code, I found a few
mistakes I had made previously. It simplifies resetting the system state
a lot when not trying to have all the power() functions call the reset()
functions to share partial functionality.

In the future, the goal will be to come up with a way to add back in the
soft reset behavior via keyboard binding as with the Master System core.
What's going to have to happen is that the key binding will have to send
a "reset pulse" to every emulated chip, and those chips are going to
have to act independently to power() instead of reusing functionality.
We'll get there eventually, but there's many things of vastly greater
importance to work on right now, so it'll be a while. The information
isn't lost ... we'll just have to pull it out of v102 when we are ready.

Note that I left the SNES reset vector simulation code in, even though
it's not possible to trigger, for the time being.

Also ... the Super Game Boy core is still disconnected. To be honest, it
totally slipped my mind when I released v102 that it wasn't connected
again yet. This one's going to be pretty tricky to be honest. I'm
thinking about making a third GameBoy::Interface class just for SGB, and
coming up with some way of bypassing platform-> calls when in this
mode.
2017-01-23 08:04:26 +11:00
Tim Allen ae5968cfeb Update to v102 release.
byuu says (in the public announcement):

This release adds very preliminary emulation of the Sega Master System
(Mark III), Sega Game Gear, Sega Mega Drive (Genesis), and NEC PC Engine
(Turbografx-16). These cores do not yet offer sound emulation, save
states or cheat codes.

I'm always very hesitant to release a new emulation core in its alpha
stages, as in the past this has resulted in lasting bad impressions
of cores that have since improved greatly. For instance, the Game Boy
Advance emulation offered today is easily the second most accurate around,
yet it is still widely judged by its much older alpha implementation.

However, it's always been tradition with higan to not hold onto code
in secret. Rather than delay future releases for another year or two,
I'll put my faith in you all to understand that the emulation of these
systems will improve over time.

I hope that by releasing things as they are now, I might be able to
receive some much needed assistance in improving these cores, as the
documentation for these new systems is very much less than ideal.

byuu says (in the WIP forum):

Changelog:

  - PCE: latch background scroll registers (fixes Neutopia scrolling)
  - PCE: clip background attribute table scrolling (fixes Blazing Lazers
    scrolling)
  - PCE: support background/sprite enable/disable bits
  - PCE: fix large sprite indexing (fixes Blazing Lazers title screen
    sprites)
  - HuC6280: wrap zeropage accesses to never go beyond $20xx
  - HuC6280: fix alternating addresses for block move instructions
    (fixes Neutopia II)
  - HuC6280: block move instructions save and restore A,X,Y registers
  - HuC6280: emulate BCD mode (may not be 100% correct, based on SNES
    BCD) (fixes Blazing Lazers scoring)
2017-01-20 08:01:15 +11:00
Tim Allen b03563426f Update to v101r35 release.
byuu says:

Changelog:
  - PCE: added 384KB HuCard ROM mirroring mode
  - PCE: corrected D-pad polling order
  - PCE: corrected palette color ordering (GRB, not RGB -- yes,
    seriously)
  - PCE: corrected SATB DMA -- should write to SATB, not to VRAM
  - PCE: broke out Background, Sprite VDC settings to separate
    subclasses
  - PCE: emulated VDC backgrounds
  - PCE: emulated VDC sprites
  - PCE: emulated VDC sprite overflow, collision interrupts
  - HuC6280: fixed disassembler output for STi instructions
  - HuC6280: added missing LastCycle check to interrupt()
  - HuC6280: fixed BIT, CMP, CPX, CPY, TRB, TSB, TST flag testing and
    result
  - HuC6280: added extra cycle delays to the block move instructions
  - HuC6280: fixed ordering for flag set/clear instructions (happens
    after LastCycle check)
  - HuC6280: removed extra cycle from immediate instructions
  - HuC6280: fixed indirectLoad, indirectYStore absolute addressing
  - HuC6280: fixed BBR, BBS zeropage value testing
  - HuC6280: fixed stack push/pull direction

Neutopia looks okay until the main title screen, then there's some
gibberish on the bottom. The game also locks up with some gibberish once
you actually start a new game. So, still not playable just yet =(
2017-01-19 19:38:57 +11:00
Tim Allen f500426158 Update to v101r34 release.
byuu says:

Changelog:

  - PCE: emulated gamepad polling
  - PCE: emulated CPU interrupt sources
  - PCE: emulated timer
  - PCE: smarter emulation of ST0,ST1,ST2 instructions
  - PCE: better structuring of CPU, VDP IO registers
  - PCE: connected palette generation to the interface
  - PCE: emulated basic VDC timing
  - PCE: emulated VDC Vblank, Coincidence, and DMA completion IRQs
  - PCE: emulated VRAM, SATB DMA transfers
  - PCE: emulated VDC I/O registers

Everything I've implemented today likely has lots of bugs, and is
untested for obvious reasons.

So basically, after I fix many horrendous bugs, it should now be
possible to implement the VDC and start getting graphical output.
2017-01-17 08:02:56 +11:00
Tim Allen 8499c64756 Update to v101r33 release.
byuu says:

Changelog:

  - PCE: HuC6280 core completed

There's bound to be a countless stream of bugs, and the cycle counts are
almost certainly not exact yet, but ... all instructions are implemented.

So at this point, I can start comparing trace logs against Mednafen's
debugger output.

Of course, we're very likely to immediately slam into a wall of needing
I/O registers implemented for the VDC in order to proceed further.
2017-01-15 11:58:47 +11:00
Tim Allen 26bd7590ad Update to v101r32 release.
byuu says:

Changelog:

  - SMS: fixed controller connection bug
  - SMS: fixed Z80 reset bug
  - PCE: emulated HuC6280 MMU
  - PCE: emulated HuC6280 RAM
  - PCE: emulated HuCard ROM reading
  - PCE: implemented 178 instructions
  - tomoko: removed "soft reset" functionality
  - tomoko: moved "power cycle" to just above "unload" option

I'm not sure of the exact number of HuC6280 instructions, but it's less
than 260.

Many of the ones I skipped are HuC6280-originals that I don't know how
to emulate just yet.

I'm also really unsure about the zero page stuff. I believe we should be
adding 0x2000 to the addresses to hit page 1, which is supposed to be
mapped to the zero page (RAM). But when I look at turboEMU's source, I
have no clue how the hell it could possibly be doing that. It looks to
be reading from page 0, which is almost always ROM, which would be ...
really weird.

I also don't know if I've emulated the T mode opcodes correctly or not.
The documentation on them is really confusing.
2017-01-14 10:59:38 +11:00
Tim Allen bf90bdfcc8 Update to v101r31 release.
byuu says:

Changelog:

  - converted Emulator::Interface::Bind to Emulator::Platform
  - temporarily disabled SGB hooks
  - SMS: emulated Game Gear palette (latching word-write behavior not
    implemented yet)
  - SMS: emulated Master System 'Reset' button, Game Gear 'Start' button
  - SMS: removed reset() functionality, driven by the mappable input now
    instead
  - SMS: split interface class in two: one for Master System, one for
    Game Gear
  - SMS: emulated Game Gear video cropping to 160x144
  - PCE: started on HuC6280 CPU core—so far only registers, NOP
    instruction has been implemented

Errata:

  - Super Game Boy support is broken and thus disabled
  - if you switch between Master System and Game Gear without
    restarting, bad things happen:
      - SMS→GG, no video output on the GG
      - GG→SMS, no input on the SMS

I'm not sure what's causing the SMS\<-\>GG switch bug, having a hard
time debugging it. Help would be very much appreciated, if anyone's up
for it. Otherwise I'll keep trying to track it down on my end.
2017-01-13 12:15:45 +11:00
Tim Allen 0ad70a30f8 Update to v101r30 release.
byuu says:

Changelog:

  - SMS: added cartridge ROM/RAM mirroring (fixes Alex Kidd)
  - SMS: fixed 8x16 sprite mode (fixes Wonder Boy, Ys graphics)
  - Z80: emulated "ex (sp),hl" instruction
  - Z80: fixed INx NF (should be set instead of cleared)
  - Z80: fixed loop condition check for CPxR, INxR, LDxR, OTxR (fixes
    walking in Wonder Boy)
  - SFC: removed Debugger and sfc/debugger.hpp
  - icarus: connected MS, GG, MD importing to the scan dialog
  - PCE: added emulation skeleton to higan and icarus

At this point, Master System games are fairly highly compatible, sans
audio. Game Gear games are running, but I need to crop the resolution
and support the higher color palette that they can utilize. It's really
something else the way they handled the resolution shrink on that thing.

The last change is obviously going to be the biggest news.

I'm very well aware it's not an ideal time to start on a new emulation
core, with the MS and MD cores only just now coming to life with no
audio support.

But, for whatever reason, my heart's really set on working on the PC
Engine. I wanted to write the final higan skeleton core, and get things
ready so that whenever I'm in the mood to work on the PCE, I can do so.

The skeleton is far and away the most tedious and obnoxious part of the
emulator development, because it's basically all just lots of
boilerplate templated code, lots of new files to create, etc.

I really don't know how things are going to proceed ... but I can say
with 99.9% certainty that this will be the final brand new core ever
added to higan -- at least one written by me, that is. This was
basically the last system from my childhood that I ever cared about.
It's the last 2D system with games that I really enjoy playing. No other
system is worth dividing my efforts and reducing the quality and amount
of time to work on the systems I have.

In the future, there will be potential for FDS, Mega CD and PCE-CD
support. But those will all be add-ons, and they'll all be really
difficult and challenge the entire design of higan's UI (it's entirely
cartridge-driven at this time.) None of them will be entirely new cores
like this one.
2017-01-12 07:27:30 +11:00
Tim Allen 79c83ade70 Update to v101r29 release.
byuu says:

Changelog:

  - SMS: background VDP clips partial tiles on the left (math may not be
    right ... it's hard to reason about)
  - SMS: fix background VDP scroll locks
  - SMS: fix VDP sprite coordinates
  - SMS: paint black after the end of the visible display
      - todo: shouldn't be a brute force at the end of the main VDP
        loop, should happen in each rendering unit
  - higan: removed emulator/debugger.hpp
  - higan: removed privileged: access specifier
  - SFC: removed debugger hooks
      - todo: remove sfc/debugger.hpp
  - Z80: fixed disassembly of (fd,dd) cb (displacement) (opcode)
    instructions
  - Z80: fix to prevent interrupts from firing between ix/iy prefixes
    and opcodes
      - todo: this is a rather hacky fix that could, if exploited, crash
        the stack frame
  - Z80: fix BIT flags
  - Z80: fix ADD hl,reg flags
  - Z80: fix CPD, CPI flags
  - Z80: fix IND, INI flags
  - Z80: fix INDR, INIT loop flag check
  - Z80: fix OUTD, OUTI flags
  - Z80: fix OTDR, OTIR loop flag check
2017-01-10 08:27:13 +11:00
Tim Allen a3aea95e6b Update to v101r28 release.
byuu says:

Changelog:

  - SMS: emulated the remaining 240 instructions in the (0xfd, 0xdd)
    0xcb (displacement) (opcode) set
      - 1/8th of these were "legal" instructions, and apparently games
        use them a lot
  - SMS: emulated the standard gamepad controllers
      - reset button not emulated yet

The reset button is tricky. In every other case, reset is a hardware
thing that instantly reboots the entire machine.

But on the SMS, it's more like a gamepad button that's attached to the
front of the device. When you press it, it fires off a reset vector
interrupt and the gamepad polling routine lets you query the status of
the button.

Just having a reset option in the "Master System" hardware menu is not
sufficient to fully emulate the behavior. Even more annoying is that the
Game Gear doesn't have such a button, yet the core information structs
aren't flexible enough for the Master System to have it, and the Game
Gear to not have it, in the main menu. But that doesn't matter anyway,
since it won't work having it in the menu for the Master System.

So as a result, I'm going to have to have a new "input device" called
"Hardware" that has the "Reset" button listed under there. And for the
sake of consistency, I'm not sure if we should treat the other systems
the same way or not :/
2017-01-09 07:55:02 +11:00
Tim Allen 569f5abc28 Update to v101r27 release.
byuu says:

Changelog:

  - SMS: emulated the generic Sega memory mapper (none of the more
    limited forms of it yet)
      - (missing ROM shift, ROM write enable emulation -- no commercial
        games use either, though)
  - SMS: bus I/O returns 0xff instead of 0x00 so games don't think every
    key is being pressed at once
      - (this is a hack until I implement proper controller pad reading)
  - SMS: very limited protection against reading/writing past the end of
    ROM/RAM (todo: should mirror)
  - SMS: VDP background HSCROLL subtracts, rather than adds, to the
    offset (unlike VSCROLL)
  - SMS: VDP VSCROLL is 9-bit, modulates voffset+vscroll to 224 in
    192-line mode (32x28 tilemap)
  - SMS: VDP tiledata for backgrounds and sprites use `7-(x&7)` rather
    than `(x&7)`
  - SMS: fix output color to be 6-bit rather than 5-bit
  - SMS: left clip uses register `#7`, not palette color `#7`
      - (todo: do we want `color[reg7]` or `color[16 + reg7]`?)
  - SMS: refined handling of 0xcb, 0xed prefixes in the Z80 core and its
    disassembler
  - SMS: emulated (0xfd, 0xdd) 0xcb opcodes 0x00-0x0f (still missing
    0x10-0xff)
  - SMS: fixed 0xcb 0b-----110 opcodes to use direct HL and never allow
    (IX,IY)+d
  - SMS: fixed major logic bug in (IX,IY)+d displacement
      - (was using `read(x)` instead of `operand()` for the displacement
        byte fetch before)
  - icarus: fake there always being 32KiB of RAM in all SMS cartridges
    for the time being
      - (not sure how to detect this stuff yet; although I've read it's
        not even really possible `>_>`)

TODO: remove processor/z80/dissassembler.cpp code block at line 396 (as it's unnecessary.)

Lots of commercial games are starting to show trashed graphical output now.
2017-01-06 19:11:38 +11:00
Tim Allen e30780bb72 Update to v101r25 release.
byuu says:

Changelog:

  - Makefile: added $(windres), -lpthread to Windows port
  - GBA: WAITCNT.prefetch is not writable (should fix Donkey Kong: King
    of Swing) \[endrift\]
  - SMS: fixed hcounter shift value \[hex\_usr\]
  - SMS: emulated interrupts (reset button isn't hooked up anywhere, not
    sure where to put it yet)

This WIP actually took a really long time because the documentation on
SMS interrupts was all over the place. I'm hoping I've emulated them
correctly, but I honestly have no idea. It's based off my best
understanding from four or five different sources. So it's probably
quite buggy.

However, a few interrupts fire in Sonic the Hedgehog, so that's
something to start with. Now I just have to hope I've gotten some games
far enough in that I can start seeing some data in the VDP VRAM. I need
that before I can start emulating graphics mode 4 to get some actual
screen output.

Or I can just say to hell with it and use a "Hello World" test ROM.
That'd probably be smarter.
2016-12-26 23:11:08 +11:00
Tim Allen bab2ac812a Update to v101r24 release.
byuu says:

Changelog:

  - SMS: extended bus mapping of in/out ports: now decoding them fully
    inside ms/bus
  - SMS: moved Z80 disassembly code from processor/z80 to ms/cpu
    (cosmetic)
  - SMS: hooked up non-functional silent PSG sample generation, so I can
    cap the framerate at 60fps
  - SMS: hooked up the VDP main loop: 684 clocks/scanline, 262
    scanlines/frame (no PAL support yet)
  - SMS: emulated the VDP Vcounter and Hcounter polling ... hopefully
    it's right, as it's very bizarre
  - SMS: emulated VDP in/out ports (data read, data write, status read,
    control write, register write)
  - SMS: decoding and caching all VDP register flags (variable names
    will probably change)
  - nall: \#undef IN on Windows port (prevent compilation warning on
    processor/z80)

Watching Sonic the Hedgehog, I can definitely see some VDP register
writes going through, which is a good sign.

Probably the big thing that's needed before I can get enough into the
VDP to start showing graphics is interrupt support. And interrupts are
never fun to figure out :/

What really sucks on this front is I'm flying blind on the Z80 CPU core.
Without a working VDP, I can't run any Z80 test ROMs to look for CPU
bugs. And the CPU is certainly too buggy still to run said test ROM
anyway. I can't find any SMS emulators with trace logging from reset.
Such logs vastly accelerate tracking down CPU logic bugs, so without
them, it's going to take a lot longer.
2016-12-17 22:31:34 +11:00
Tim Allen 1d7b674dd4 Update to v101r23 release.
byuu says:

This is a really tiny WIP. Just wanted to add the known fixes before I start debugging it against Mednafen in a fork.

Changelog:

  - Z80: fixed flag calculations on 8-bit ADC, ADD, SBC, SUB
  - Z80: fixed flag calculations on 16-bit ADD
  - Z80: simplified DAA logic \[AWJ\]
  - Z80: RETI sets IFF1=IFF2 (same as RETN)
2016-11-15 18:20:42 +11:00
Tim Allen c2c957a9da Update to v101r22 release.
byuu says:

Changelog:
- Z80: all 25 remaining instructions implemented

Now onto the debugging ... :/
2016-11-01 22:42:25 +11:00
Tim Allen 8cf20dabbf Update to v101r21 release.
byuu says:

Changelog:

- Z80: emulated 83 new instructions
- Z80: timing improvements

DAA is a skeleton implementation to complete the normal opcode set. Also
worth noting that I don't know exactly what the hell RETI is doing,
so for now it acts like RET. RETN probably needs some special handling
besides just setting IFF1=IFF2 as well.

I'm now missing 24 ED-prefix instructions, plus DAA, for a total of 25
opcodes remaining. And then, of course, several weeks worth of debugging
all of the inevitable bugs in the core.
2016-11-01 08:10:33 +11:00
Tim Allen 2707c5316d Update to v101r20 release.
byuu says:

Changelog:
- Z80: emulated 272 new instructions
- hiro/GTK: fixed v101r19 Linux regression [thanks, SuperMikeMan!]
2016-10-29 11:33:30 +11:00
Tim Allen f3e67da937 Update to v101r19 release.
byuu says:

Changelog:

-   added \~130 new PAL games to icarus (courtesy of Smarthuman
    and aquaman)
-   added all three Korean-localized games to icarus
-   sfc: removed SuperDisc emulation (it was going nowhere)
-   sfc: fixed MSU1 regression where the play/repeat flags were not
    being cleared on track select
-   nall: cryptography support added; will be used to sign future
    databases (validation will always be optional)
-   minor shims to fix compilation issues due to nall changes

The real magic is that we now have 25-30% of the PAL SNES library in
icarus!

Signing will be tricky. Obviously if I put the public key inside the
higan archive, then all anyone has to do is change that public key for
their own releases. And if you download from my site (which is now over
HTTPS), then you don't need the signing to verify integrity. I may just
put the public key on my site on my site and leave it at that, we'll
see.
2016-10-28 08:16:58 +11:00
Tim Allen d6e9d94ec3 Update to v101r17 release.
byuu says:

Changelog:

  - Z80: added most opcodes between 0x00 and 0x3f (two or three hard
    ones missing still)
  - Z80: redid register declaration *again* to handle AF', BC', DE',
    HL' (ugggggh, the fuck? Alternate registers??)
      - basically, using `#define <register name>` values to get around
        horrendously awful naming syntax
  - Z80: improved handling of displace() so that it won't ever trigger
    on (BC) or (DE)
2016-09-06 23:53:14 +10:00
Tim Allen 2fbbccf985 Update to v101r16 release.
byuu says:

Changelog:

  - Z80: implemented 113 new instructions (all the easy
    LD/ADC/ADD/AND/OR/SBC/SUB/XOR ones)
  - Z80: used alternative to castable<To, With> type (manual cast inside
    instruction() register macros)
  - Z80: debugger: used register macros to reduce typing and increase
    readability
  - Z80: debugger: smarter way of handling multiple DD/FD prefixes
    (using gotos, yay!)
  - ruby: fixed crash with Windows input driver on exit (from SuperMikeMan)

I have no idea how the P/V flag is supposed to work on AND/OR/XOR, so
that's probably wrong for now. HALT is also mostly a dummy function for
now. But I typically implement those inside instruction(), so it
probably won't need to be changed? We'll see.
2016-09-06 10:09:33 +10:00
Tim Allen 4c3f58150c Update to v101r15 release.
byuu says:

Changelog:

  - added (poorly-named) castable<To, With> template
  - Z80 debugger rewritten to make declaring instructions much simpler
  - Z80 has more instructions implemented; supports displacement on
    (IX), (IY) now
  - added `Processor::M68K::Bus` to mirror `Processor::Z80::Bus`
      - it does add a pointer indirection; so I'm not sure if I want to
        do this for all of my emulator cores ...
2016-09-04 23:51:27 +10:00
Tim Allen d91f3999cc Update to v101r14 release.
byuu says:
Changelog:

  - rewrote the Z80 core to properly handle 0xDD (IX0 and 0xFD (IY)
    prefixes
  - added Processor::Z80::Bus as a new type of abstraction
  - all of the instructions implemented have their proper T-cycle counts
    now
  - added nall/certificates for my public keys

The goal of `Processor::Z80::Bus` is to simulate the opcode fetches being
2-read + 2-wait states; operand+regular reads/writes being 3-read. For
now, this puts the cycle counts inside the CPU core. At the moment, I
can't think of any CPU core where this wouldn't be appropriate. But it's
certainly possible that such a case exists. So this may not be the
perfect solution.

The reason for having it be a subclass of Processor::Z80 instead of
virtual functions for the MasterSystem::CPU core to define is due to
naming conflicts. I wanted the core to say `in(addr)` and have it take
the four clocks. But I also wanted a version of the function that didn't
consume time when called. One way to do that would be for the core to
call `Z80::in(addr)`, which then calls the regular `in(addr)` that goes to
`MasterSystem::CPU::in(addr)`. But I don't want to put the `Z80::`
prefix on all of the opcodes. Very easy to forget it, and then end up not
consuming any time. Another is to use uglier names in the
`MasterSystem::CPU` core, like `read_`, `write_`, `in_`, `out_`, etc. But,
yuck.

So ... yeah, this is an experiment. We'll see how it goes.
2016-09-03 21:26:04 +10:00
Tim Allen 7c96826eb0 Update to v101r13 release.
byuu says:

Changelog:

  - MS: added ms/bus
  - Z80: implemented JP/JR/CP/DI/IM/IN instructions
  - MD/VDP: added window layer emulation
  - MD/controller/gamepad: fixed d2,d3 bits (Altered Beast requires
    this)

The Z80 is definitely a lot nastier than the LR35902. There's a lot of
table duplication with HL→IX→IY; and two of them nest two levels deep
(eg FD CB xx xx), so the design may change as I implement more.
2016-08-27 14:48:21 +10:00
Tim Allen 5df717ff2a Update to v101r12 release.
byuu says:

Changelog:

  - new md/bus/ module for bus reads/writes
      - abstracts byte/word accesses wherever possible (everything but
        RAM; forces all but I/O to word, I/O to byte)
      - holds the system RAM since that's technically not part of the
        CPU anyway
  - added md/controller and md/system/peripherals
  - added emulation of gamepads
  - added stub PSG audio output (silent) to cap the framerate at 60fps
    with audio sync enabled
  - fixed VSRAM reads for plane vertical scrolling (two bugs here: add
    instead of sub; interlave plane A/B)
  - mask nametable read offsets (can't exceed 8192-byte nametables
    apparently)
  - emulated VRAM/VSRAM/CRAM reads from VDP data port
  - fixed sprite width/height size calculations
  - added partial emulation of 40-tile per scanline limitation (enough
    to fix Sonic's title screen)
  - fixed off-by-one sprite range testing
  - fixed sprite tile indexing
  - Vblank happens at Y=224 with overscan disabled
      - unsure what happens when you toggle it between Y=224 and Y=240
        ... probably bad things
  - fixed reading of address register for ADDA, CMPA, SUBA
  - fixed sign extension for MOVEA effect address reads
  - updated MOVEM to increment the read addresses (but not writeback)
    for (aN) mode

With all of that out of the way, we finally have Sonic the Hedgehog
(fully?) playable. I played to stage 1-2 and through the special stage,
at least. EDIT: yeah, we probably need HIRQs for Labyrinth Zone.

Not much else works, of course. Most games hang waiting on the Z80, and
those that don't (like Altered Beast) are still royally screwed. Tons of
features still missing; including all of the Z80/PSG/YM2612.

A note on the perihperals this time around: the Mega Drive EXT port is
basically identical to the regular controller ports. So unlike with the
Famicom and Super Famicom, I'm inheriting the exension port from the
controller class.
2016-08-22 08:11:24 +10:00
Tim Allen f7ddbfc462 Update to v101r11 release.
byuu says:

Changelog:

  - 68K: fixed NEG/NEGX operand order
  - 68K: fixed bug in disassembler that was breaking trace logging
  - VDP: improved sprite rendering (still 100% broken)
  - VDP: added horizontal/vertical scrolling (90% broken)

Forgot:

  - 68K: fix extension word sign bit on indexed modes for disassembler
    as well
  - 68K: emulate STOP properly (use r.stop flag; clear on IRQs firing)

I'm really wearing out fast here. The Genesis documentation is somehow
even worse than Game Boy documentation, but this is a far more complex
system.

It's a massive time sink to sit here banging away at every possible
combination of how things could work, only to see no positive
improvements. Nothing I do seems to get sprites to do a goddamn thing.

squee says the sprite Y field is 10-bits, X field is 9-bits. genvdp says
they're both 10-bits. BlastEm treats them like they're both 10-bits,
then masks off the upper bit so it's effectively 9-bits anyway.

Nothing ever bothers to tell you whether the horizontal scroll values
are supposed to add or subtract from the current X position. Probably
the most basic detail you could imagine for explaining horizontal
scrolling and yet ... nope. Nothing.

I can't even begin to understand how the VDP FIFO functionality works,
or what the fuck is meant by "slots".

I'm completely at a loss as how how in the holy hell the 68K works with
8-bit accesses. I don't know whether I need byte/word handlers for every
device, or if I can just hook it right into the 68K core itself. This
one's probably the most major design detail. I need to know this before
I go and implement the PSG/YM2612/IO ports-\>gamepads/Z80/etc.

Trying to debug the 68K is murder because basically every game likes to
start with a 20,000,000-instruction reset phase of checksumming entire
games, and clearing out the memory as agonizingly slowly as humanly
possible. And like the ARM, there's too many registers so I'd need three
widescreen monitors to comfortably view the entire debugger output lines
onscreen.

I can't get any test ROMs to debug functionality outside of full games
because every **goddamned** test ROM coder thinks it's acceptable to tell
people to go fetch some toolchain from a link that died in the late '90s
and only works on MS-DOS 6.22 to build their fucking shit, because god
forbid you include a 32KiB assembled ROM image in your fucking archives.

... I may have to take a break for a while. We'll see.
2016-08-21 12:50:05 +10:00
Tim Allen 0b70a01b47 Update to v101r10 release.
byuu says:
Changelog:

  - 68K: MOVEQ is 8-bit signed
  - 68K: disassembler was print EOR for OR instructions
  - 68K: address/program-counter indexed mode had the signed-word/long
    bit backward
  - 68K: ADDQ/SUBQ #n,aN always works in long mode; regardless of size
  - 68K→VDP DMA needs to use `mode.bit(0)<<22|dmaSource`; increment by
    one instead of two
  - Z80: added registers and initial two instructions
  - MS: hooked up enough to load and start running games
      - Sonic the Hedgehog can execute exactly one instruction... whoo.
2016-08-20 00:11:26 +10:00
Tim Allen 4d2e17f9c0 Update to v101r09 release.
byuu says:

Sorry, two WIPs in one day. Got excited and couldn't wait.

Changelog:

  - ADDQ, SUBQ shouldn't update flags when targeting an address register
  - ADDA should sign extend effective address reads
  - JSR was pushing the PC too early
  - some improvements to 8-bit register reads on the VDP (still needs
    work)
  - added H/V counter reads to the VDP IO port region
  - icarus: added support for importing Master System and Game Gear ROMs
  - tomoko: added library sub-menus for each manufacturer
      - still need to sort Game Gear after Mega Drive somehow ...

The sub-menu system actually isn't all that bad. It is indeed a bit more
annoying, but not as annoying as I thought it was going to be. However,
it looks a hell of a lot nicer now.
2016-08-18 08:05:50 +10:00
Tim Allen 043f6a8b33 Update to v101r08 release.
byuu says:

Changelog:

  - 68K: fixed read-modify-write instructions
  - 68K: fixed ADDX bug (using wrong target)
  - 68K: fixed major bug with SUB using wrong argument ordering
  - 68K: fixed sign extension when reading address registers from
    effective addressing
  - 68K: fixed sign extension on CMPA, SUBA instructions
  - VDP: improved OAM sprite attribute table caching behavior
  - VDP: improved DMA fill operation behavior
  - added Master System / Game Gear stubs (needed for developing the Z80
    core)
2016-08-17 22:31:22 +10:00
Tim Allen ffd150735b Update to v101r07 release.
byuu says:

Added VDP sprite rendering. Can't get any games far enough in to see if
it actually works. So in other words, it doesn't work at all and is 100%
completely broken.

Also added 68K exceptions and interrupts. So far only the VDP interrupt
is present. It definitely seems to be firing in commercial games, so
that's promising. But the implementation is almost certainly completely
wrong. There is fuck all of nothing for documentation on how interrupts
actually work. I had to find out the interrupt vector numbers from
reading the comments from the Sonic the Hedgehog disassembly. I have
literally no fucking clue what I0-I2 (3-bit integer priority value in
the status register) is supposed to do. I know that Vblank=6, Hblank=4,
Ext(gamepad)=2. I know that at reset, SR.I=7. I don't know if I'm
supposed to block interrupts when I is >, >=, <, <= to the interrupt
level. I don't know what level CPU exceptions are supposed to be.

Also implemented VDP regular DMA. No idea if it works correctly since
none of the commercial games run far enough to use it. So again, it's
horribly broken for usre.

Also improved VDP fill mode. But I don't understand how it takes
byte-lengths when the bus is 16-bit. The transfer times indicate it's
actually transferring at the same speed as the 68K->VDP copy, strongly
suggesting it's actually doing 16-bit transfers at a time. In which case,
what happens when you set an odd transfer length?

Also, both DMA modes can now target VRAM, VSRAM, CRAM. Supposedly there's
all kinds of weird shit going on when you target VSRAM, CRAM with VDP
fill/copy modes, but whatever. Get to that later.

Also implemented a very lazy preliminary wait mechanism to to stall out
a processor while another processor exerts control over the bus. This
one's going to be a major work in progress. For one, it totally breaks
the model I use to do save states with libco. For another, I don't
know if a 68K->VDP DMA instantly locks the CPU, or if it the CPU could
actually keep running if it was executing out of RAM when it started
the DMA transfer from ROM (eg it's a bus busy stall, not a hard chip
stall.) That'll greatly change how I handle the waiting.

Also, the OSS driver now supports Audio::Latency. Sound should be
even lower latency now. On FreeBSD when set to 0ms, it's absolutely
incredible. Cannot detect latency whatsoever. The Mario jump sound seems
to happen at the very instant I hear my cherry blue keyswitch activate.
2016-08-15 14:56:38 +10:00
Tim Allen ac2d0ba1cf Update to v101r05 release.
byuu says:

Changelog:

  - 68K: fixed bug that affected BSR return address
  - VDP: added very preliminary emulation of planes A, B, W (W is
    entirely broken though)
  - VDP: added command/address stuff so you can write to VRAM, CRAM,
    VSRAM
  - VDP: added VRAM fill DMA

I would be really surprised if any commercial games showed anything at
all, so I'd probably recommend against wasting your time trying, unless
you're really bored :P

Also, I wanted to add: I am accepting patches\! So if anyone wants to
look over the 68K core for bugs, that would save me untold amounts of
time in the near future :D
2016-08-13 09:47:30 +10:00
Tim Allen 1df2549d18 Update to v101r04 release.
byuu says:

Changelog:

  - pulled the (u)intN type aliases into higan instead of leaving them
    in nall
  - added 68K LINEA, LINEF hooks for illegal instructions
  - filled the rest of the 68K lambda table with generic instance of
    ILLEGAL
  - completed the 68K disassembler effective addressing modes
      - still unsure whether I should use An to decode absolute
        addresses or not
      - pro: way easier to read where accesses are taking place
      - con: requires An to be valid; so as a disassembler it does a
        poor job
      - making it optional: too much work; ick
  - added I/O decoding for the VDP command-port registers
  - added skeleton timing to all five processor cores
  - output at 1280x480 (needed for mixed 256/320 widths; and to handle
    interlace modes)

The VDP, PSG, Z80, YM2612 are all stepping one clock at a time and
syncing; which is the pathological worst case for libco. But they also
have no logic inside of them. With all the above, I'm averaging around
250fps with just the 68K core actually functional, and the VDP doing a
dumb "draw white pixels" loop. Still way too early to tell how this
emulator is going to perform.

Also, the 320x240 mode of the Genesis means that we don't need an aspect
correction ratio. But we do need to ensure the output window is a
multiple 320x240 so that the scale values work correctly. I was
hard-coding aspect correction to stretch the window an additional \*8/7.
But that won't work anymore so ... the main higan window is now 640x480,
960x720, or 1280x960. Toggling aspect correction only changes the video
width inside the window.

It's a bit jarring ... the window is a lot wider, more black space now
for most modes. But for now, it is what it is.
2016-08-12 11:07:04 +10:00
Tim Allen 9b8c3ff8c0 Update to v101r03 release.
byuu says:

The 68K core now implements all 88 instructions. It ended up being 111
instructions in my core due to splitting up opcodes with the same name
but different addressing modes or directions (removes conditions at the
expense of more code.)

Technically, I don't have exceptions actually implemented yet, and
RESET/STOP don't do anything but set flags. So there's still more to
go. But ... close enough for statistics time!

The M68K core source code is 124,712 bytes in size. The next largest
core is the ARM7 core at 70,203 bytes in size.

The M68K object size is 942KiB; with the next largest being the V30MZ
core at 173KiB.

There are a total of 19,656 invalid opcodes in the 68000 revision (unless
of course I've made mistakes in my mappings, which is very probably.)

Now the fun part ... figuring out how to fix bugs in this core without
VDP emulation :/
2016-08-11 08:02:02 +10:00
Tim Allen 0a57cac70c Update to v101r02 release.
byuu says:

Changelog:

  - Emulator: use `(uintmax)-1 >> 1` for the units of time
  - MD: implemented 13 new 68K instructions (basically all of the
    remaining easy ones); 21 remain
  - nall: replaced `(u)intmax_t` (64-bit) with *actual* `(u)intmax` type
    (128-bit where available)
      - this extends to everything: atoi, string, etc. You can even
        print 128-bit variables if you like

22,552 opcodes still don't exist in the 68K map. Looking like quite a
few entries will be blank once I finish.
2016-08-09 21:07:18 +10:00
Tim Allen 8bdf8f2a55 Update to v101r01 release.
byuu says:

Changelog:

  - added eight more 68K instructions
  - split ADD(direction) into two separate ADD functions

I now have 54 out of 88 instructions implemented (thus, 34 remaining.)
The map is missing 25,182 entries out of 65,536. Down from 32,680 for
v101.00

Aside: this version number feels really silly. r10 and r11 surely will
as well ...
2016-08-08 20:12:03 +10:00
Tim Allen c50723ef61 Update to v100r15 release.
byuu wrote:

Aforementioned scheduler changes added. Longer explanation of why here:
http://hastebin.com/raw/toxedenece

Again, we really need to test this as thoroughly as possible for
regressions :/
This is a really major change that affects absolutely everything: all
emulation cores, all coprocessors, etc.

Also added ADDX and SUB to the 68K core, which brings us just barely
above 50% of the instruction encoding space completed.

[Editor's note: The "aformentioned scheduler changes" were described in
a previous forum post:

    Unfortunately, 64-bits just wasn't enough precision (we were
    getting misalignments ~230 times a second on 21/24MHz clocks), so
    I had to move to 128-bit counters. This of course doesn't exist on
    32-bit architectures (and probably not on all 64-bit ones either),
    so for now ... higan's only going to compile on 64-bit machines
    until we figure something out. Maybe we offer a "lower precision"
    fallback for machines that lack uint128_t or something. Using the
    booth algorithm would be way too slow.

    Anyway, the precision is now 2^-96, which is roughly 10^-29. That
    puts us far beyond the yoctosecond. Suck it, MAME :P I'm jokingly
    referring to it as the byuusecond. The other 32-bits of precision
    allows a 1Hz clock to run up to one full second before all clocks
    need to be normalized to prevent overflow.

    I fixed a serious wobbling issue where I was using clock > other.clock
    for synchronization instead of clock >= other.clock; and also another
    aliasing issue when two threads share a common frequency, but don't
    run in lock-step. The latter I don't even fully understand, but I
    did observe it in testing.

    nall/serialization.hpp has been extended to support 128-bit integers,
    but without explicitly naming them (yay generic code), so nall will
    still compile on 32-bit platforms for all other applications.

    Speed is basically a wash now. FC's a bit slower, SFC's a bit faster.

The "longer explanation" in the linked hastebin is:

    Okay, so the idea is that we can have an arbitrary number of
    oscillators. Take the SNES:

    - CPU/PPU clock = 21477272.727272hz
    - SMP/DSP clock = 24576000hz
    - Cartridge DSP1 clock = 8000000hz
    - Cartridge MSU1 clock = 44100hz
    - Controller Port 1 modem controller clock = 57600hz
    - Controller Port 2 barcode battler clock = 115200hz
    - Expansion Port exercise bike clock = 192000hz

    Is this a pathological case? Of course it is, but it's possible. The
    first four do exist in the wild already: see Rockman X2 MSU1
    patch. Manifest files with higan let you specify any frequency you
    want for any component.

    The old trick higan used was to hold an int64 counter for each
    thread:thread synchronization, and adjust it like so:

    - if thread A steps X clocks; then clock += X * threadB.frequency
      - if clock >= 0; switch to threadB
    - if thread B steps X clocks; then clock -= X * threadA.frequency
      - if clock <  0; switch to threadA

    But there are also system configurations where one processor has to
    synchronize with more than one other processor. Take the Genesis:

    - the 68K has to sync with the Z80 and PSG and YM2612 and VDP
    - the Z80 has to sync with the 68K and PSG and YM2612
    - the PSG has to sync with the 68K and Z80 and YM2612

    Now I could do this by having an int64 clock value for every
    association. But these clock values would have to be outside the
    individual Thread class objects, and we would have to update every
    relationship's clock value. So the 68K would have to update the Z80,
    PSG, YM2612 and VDP clocks. That's four expensive 64-bit multiply-adds
    per clock step event instead of one.

    As such, we have to account for both possibilities. The only way to
    do this is with a single time base. We do this like so:

    - setup: scalar = timeBase / frequency
    - step: clock += scalar * clocks

    Once per second, we look at every thread, find the smallest clock
    value. Then subtract that value from all threads. This prevents the
    clock counters from overflowing.

    Unfortunately, these oscillator values are psychotic, unpredictable,
    and often times repeating fractions. Even with a timeBase of
    1,000,000,000,000,000,000 (one attosecond); we get rounding errors
    every ~16,300 synchronizations. Specifically, this happens with a CPU
    running at 21477273hz (rounded) and SMP running at 24576000hz. That
    may be good enough for most emulators, but ... you know how I am.

    Plus, even at the attosecond level, we're really pushing against the
    limits of 64-bit integers. Given the reciprocal inverse, a frequency
    of 1Hz (which does exist in higan!) would have a scalar that consumes
    1/18th of the entire range of a uint64 on every single step. Yes, I
    could raise the frequency, and then step by that amount, I know. But
    I don't want to have weird gotchas like that in the scheduler core.

    Until I increase the accuracy to about 100 times greater than a
    yoctosecond, the rounding errors are too great. And since the only
    choice above 64-bit values is 128-bit values; we might as well use
    all the extra headroom. 2^-96 as a timebase gives me the ability to
    have both a 1Hz and 4GHz clock; and run them both for a full second;
    before an overflow event would occur.

Another hastebin includes demonstration code:

    #include <libco/libco.h>

    #include <nall/nall.hpp>
    using namespace nall;

    //

    cothread_t mainThread = nullptr;
    const uint iterations = 100'000'000;
    const uint cpuFreq = 21477272.727272 + 0.5;
    const uint smpFreq = 24576000.000000 + 0.5;
    const uint cpuStep = 4;
    const uint smpStep = 5;

    //

    struct ThreadA {
      cothread_t handle = nullptr;
      uint64 frequency = 0;
      int64 clock = 0;

      auto create(auto (*entrypoint)() -> void, uint frequency) {
        this->handle = co_create(65536, entrypoint);
        this->frequency = frequency;
        this->clock = 0;
      }
    };

    struct CPUA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      CPUA() { create(&CPUA::Enter, cpuFreq); }
    } cpuA;

    struct SMPA : ThreadA {
      static auto Enter() -> void;
      auto main() -> void;
      SMPA() { create(&SMPA::Enter, smpFreq); }
    } smpA;

    uint8 queueA[iterations];
    uint offsetA;
    cothread_t resumeA = cpuA.handle;

    auto EnterA() -> void {
      offsetA = 0;
      co_switch(resumeA);
    }

    auto QueueA(uint value) -> void {
      queueA[offsetA++] = value;
      if(offsetA >= iterations) {
        resumeA = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUA::Enter() -> void { while(true) cpuA.main(); }

    auto CPUA::main() -> void {
      QueueA(1);
      smpA.clock -= cpuStep * smpA.frequency;
      if(smpA.clock < 0) co_switch(smpA.handle);
    }

    auto SMPA::Enter() -> void { while(true) smpA.main(); }

    auto SMPA::main() -> void {
      QueueA(2);
      smpA.clock += smpStep * cpuA.frequency;
      if(smpA.clock >= 0) co_switch(cpuA.handle);
    }

    //

    struct ThreadB {
      cothread_t handle = nullptr;
      uint128_t scalar = 0;
      uint128_t clock = 0;

      auto print128(uint128_t value) {
        string s;
        while(value) {
          s.append((char)('0' + value % 10));
          value /= 10;
        }
        s.reverse();
        print(s, "\n");
      }

      //femtosecond (10^15) =    16306
      //attosecond  (10^18) =   688838
      //zeptosecond (10^21) = 13712691
      //yoctosecond (10^24) = 13712691 (hitting a dead-end on a rounding error causing a wobble)
      //byuusecond? ( 2^96) = (perfect? 79,228 times more precise than a yoctosecond)

      auto create(auto (*entrypoint)() -> void, uint128_t frequency) {
        this->handle = co_create(65536, entrypoint);

        uint128_t unitOfTime = 1;
      //for(uint n : range(29)) unitOfTime *= 10;
        unitOfTime <<= 96;  //2^96 time units ...

        this->scalar = unitOfTime / frequency;
        print128(this->scalar);
        this->clock = 0;
      }

      auto step(uint128_t clocks) -> void { clock += clocks * scalar; }
      auto synchronize(ThreadB& thread) -> void { if(clock >= thread.clock) co_switch(thread.handle); }
    };

    struct CPUB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      CPUB() { create(&CPUB::Enter, cpuFreq); }
    } cpuB;

    struct SMPB : ThreadB {
      static auto Enter() -> void;
      auto main() -> void;
      SMPB() { create(&SMPB::Enter, smpFreq); clock = 1; }
    } smpB;

    auto correct() -> void {
      auto minimum = min(cpuB.clock, smpB.clock);
      cpuB.clock -= minimum;
      smpB.clock -= minimum;
    }

    uint8 queueB[iterations];
    uint offsetB;
    cothread_t resumeB = cpuB.handle;

    auto EnterB() -> void {
      correct();
      offsetB = 0;
      co_switch(resumeB);
    }

    auto QueueB(uint value) -> void {
      queueB[offsetB++] = value;
      if(offsetB >= iterations) {
        resumeB = co_active();
        co_switch(mainThread);
      }
    }

    auto CPUB::Enter() -> void { while(true) cpuB.main(); }

    auto CPUB::main() -> void {
      QueueB(1);
      step(cpuStep);
      synchronize(smpB);
    }

    auto SMPB::Enter() -> void { while(true) smpB.main(); }

    auto SMPB::main() -> void {
      QueueB(2);
      step(smpStep);
      synchronize(cpuB);
    }

    //

    #include <nall/main.hpp>
    auto nall::main(string_vector) -> void {
      mainThread = co_active();

      uint masterCounter = 0;
      while(true) {
        print(masterCounter++, " ...\n");

        auto A = clock();
        EnterA();
        auto B = clock();
        print((double)(B - A) / CLOCKS_PER_SEC, "s\n");

        auto C = clock();
        EnterB();
        auto D = clock();
        print((double)(D - C) / CLOCKS_PER_SEC, "s\n");

        for(uint n : range(iterations)) {
          if(queueA[n] != queueB[n]) return print("fail at ", n, "\n");
        }
      }
    }

...and that's everything.]
2016-07-31 12:11:20 +10:00