byuu says:
Changelog:
- gba/cpu: slight speedup to CPU::step()
- processor/arm7tdmi: fixed about ten bugs, ST018 and GBA games are
now playable once again
- processor/arm: removed core from codebase
- processor/v30mz: code cleanup (renamed functions; updated
instruction() for consistency with other cores)
It turns out on my much faster system, the new ARM7TDMI core is very
slightly slower than the old one (by about 2% or so FPS.) But the
CPU::step() improvement basically made it a wash.
So yeah, I'm in really serious trouble with how slow my GBA core is now.
Sigh.
As for higan/processor ... this concludes the first phase of major
cleanups and rewrites.
There will always be work to do, and I have two more phases in mind.
One is that a lot of the instruction disassemblers are very old. One
even uses sprintf still. I'd like to modernize them all. Also, the
ARM7TDMI core (and the ARM core before it) can't really disassemble
because the PC address used for instruction execution is not known prior
to calling instruction(), due to pipeline reload fetches that may occur
inside of said function. I had a nasty hack for debugging the new core,
but I'd like to come up with a clean way to allow tracing the new
ARM7TDMI core.
Another is that I'd still like to rename a lot of instruction function
names in various cores to be more descriptive. I really liked how the
LR35902 core came out there, and would like to get that level of detail
in with the other cores as well.
byuu says
Changelog:
- FC: fixed three MOS6502 regressions [hex\_usr]
- GBA: return fetched instruction instead of 0 for unmapped MMIO
(passes all of endrift's I/O tests)
- MD: fix VDP control port read Vblank bit to test screen height
instead of hard-code 240 (fixes Phantasy Star IV)
- MD: swap USP,SSP when executing an exception (allows Super Street
Fighter II to run; but no sprites visible yet)
- MD: grant 68K access to Z80 bus on reset (fixes vdpdoc demo ROM from
freezing immediately)
- SFC: reads from $00-3f,80-bf:4000-43ff no longer update MDR
[p4plus2]
- SFC: massive, eight-hour cleanup of WDC65816 CPU core ... still not
complete
The big change this time around is the SFC CPU core. I've renamed
everything from R65816 to WDC65816, and then went through and tried to
clean up the code as much as possible. This core is so much larger than
the 6502 core that I chose cleaning up the code to rewriting it.
First off, I really don't care for the BitRange style functionality. It
was an interesting experiment, but its fatal flaw are that the types are
just bizarre, which makes them hard to pass around generically to other
functions as arguments. So I went back to the list of bools for flags,
and union/struct blocks for the registers.
Next, I renamed all of the functions to be more descriptive: eg
`op_read_idpx_w` becomes `instructionIndexedIndirectRead16`. `op_adc_b`
becomes `algorithmADC8`. And so forth.
I eliminated about ten instructions because they were functionally
identical sans the index, so I just added a uint index=0 parameter to
said functions. I added a few new ones (adjust→INC,DEC;
pflag→REP,SEP) where it seemed appropriate.
I cleaned up the disaster of the instruction switch table into something
a whole lot more elegant without all the weird argument decoding
nonsense (still need M vs X variants to avoid having to have 4-5
separate switch tables, but all the F/I flags are gone now); and made
some things saner, like the flag clear/set and branch conditions, now
that I have normal types for flags and registers once again.
I renamed all of the memory access functions to be more descriptive to
what they're doing: eg writeSP→push, readPC→fetch,
writeDP→writeDirect, etc. Eliminated some of the special read/write
modes that were only used in one single instruction.
I started to clean up some of the actual instructions themselves, but
haven't really accomplished much here. The big thing I want to do is get
rid of the global state (aa, rd, iaddr, etc) and instead use local
variables like I am doing with my other 65xx CPU cores now. But this
will take some time ... the algorithm functions depend on rd to be set
to work on them, rather than taking arguments. So I'll need to rework
that.
And then lastly, the disassembler is still a mess. I want to finish the
CPU cleanups, and then post a new WIP, and then rewrite the disassembler
after that. The reason being ... I want a WIP that can generate
identical trace logs to older versions, in case the CPU cleanup causes
any regressions. That way I can more easily spot the errors.
Oh ... and a bit of good news. v102 was running at ~140fps on the SNES
core. With the new support to suspend/resume WAI/STP, plus the internal
CPU registers not updating the MDR, the framerate dropped to ~132fps.
But with the CPU cleanups, performance went back to ~140fps. So, hooray.
Of course, without those two other improvements, we'd have ended up at
possibly ~146-148fps, but oh well.
[This version, with the internal version number changed back to "v100",
replaced the original v100 source archive on byuu.org soon after v100's
release, because it fixes important bugs in that version. --Ed]
byuu says:
Changelog:
- fixed default paths for Sufami Turbo slotted games
- moved WonderSwan orientation controls to the port rather than the device
- I do like hex_usr's idea here; but that'll need more consideration;
so this is a temporary fix
- added new debugger interface (see the public topic for more on that)
byuu says:
Changelog:
- added nall/bit-field.hpp
- updated all CPU cores (sans LR35902 due to some complexities) to use
BitFields instead of bools
- updated as many CPU cores as I could to use BitFields instead of union {
struct { uint8_t ... }; }; pairs
The speed changes are mostly a wash for this. In some instances,
I noticed a ~2-3% speedup (eg SNES emulation), and in others a 2-3%
slowdown (eg Famicom emulation.) It's within the margin of error, so
it's safe to say it has no impact.
This does give us a lot of new useful things, however:
- no more manual reconstruction of flag values from lots of left shifts
and ORs
- no more manual deconstruction of flag values from lots of ANDs
- ability to get completely free aliases to flag groups (eg GSU can
provide alt2, alt1 and also alt (which is alt2,alt1 combined)
- removes the need for the nasty order_lsbN macro hack (eventually will
make higan 100% endian independent)
- saves us from insane compilers that try and do nasty things with
alignment on union-structs
- saves us from insane compilers that try to store bit-field bits in
reverse order
- will allow some really novel new use cases (I'm planning an
instant-decode ARM opcode function, for instance.)
- reduces code size (we can serialize flag registers in one line instead
of one for each flag)
However, I probably won't use it for super critical code that's constantly
reading out register values (eg PPU MMIO registers.) I think there we
would end up with a performance penalty.
byuu says:
Changelog:
- fixed DAS instruction (Judgment Silversword score)
- fixed [VH]TMR_FREQ writes (Judgement Silversword audio after area 20)
- fixed initialization of SP (fixes seven games that were hanging on
startup)
- added SER_STATUS and SER_DATA stubs (fixes four games that were
hanging on startup)
- initialized IEEP data (fixes Super Robot Taisen Compact 2 series)
- note: you'll need to delete your internal.com in WonderSwan
(Color).sys folders
- fixed CMPS and SCAS termination condition (fixes serious bugs in four
games)
- set read/writeCompleted flags for EEPROM status (fixes Tetsujin 28
Gou)
- major code cleanups to SFC/R65816 and SFC/CPU
- mostly refactored disassembler to output strings instead of using
char* buffer
- unrolled all the subfolders on sfc/cpu to a single directory
- corrected casing for all of sfc/cpu and a large portion of
processor/r65816
I kind of went overboard on the code cleanup with this WIP. Hopefully
nothing broke. Any testing one can do with the SFC accuracy core would
be greatly appreciated.
There's still an absolutely huge amount of work left to go, but I do
want to eventually refresh the entire codebase to my current coding
style, which is extremely different from stuff that's been in higan
mostly untouched since ~2006 or so. It's dangerous and fickle work, but
if I don't do it, then the code will be a jumbled mess of several
different styles.
byuu says:
Changelog: (all WSC unless otherwise noted)
- fixed LINECMP=0 interrupt case (fixes FF4 world map during airship
sequence)
- improved CPU timing (fixes Magical Drop flickering and FF1 battle
music)
- added per-frame OAM caching (fixes sprite glitchiness in Magical Drop,
Riviera, etc.)
- added RTC emulation (fixes Dicing Knight and Judgement Silversword)
- added save state support
- added cheat code support (untested because I don't know of any cheat
codes that exist for this system)
- icarus: can now detect games with RTC chips
- SFC: bugfix to SharpRTC emulation (Dai Kaijuu Monogatari II)
- ( I was adding the extra leap year day to all 12 months instead of
just February ... >_< )
Note that the RTC emulation is very incomplete. It's not really
documented at all, and the two games I've tried that use it never even
ask you to set the date/time (so they're probably just using it to count
seconds.) I'm not even sure if I've implement the level-sensitive
behavior correctly (actually, now that I think about it, I need to mask
the clear bit in INT_ACK for the level-sensitive interrupts ...)
A bit worried about the RTC alarm, because it seems like it'll fire
continuously for a full minute. Or even if you turn it off after it
fires, then that doesn't seem to be lowering the line until the next
second ticks on the RTC, so that likely needs to happen when changing
the alarm flag.
Also not sure on this RTC's weekday byte. On the SharpRTC, it actually
computes this for you. Because it's not at all an easy thing to
calculate yourself in 65816 or V30MZ assembler. About 40 lines of code
to do it in C. For now, I'm requiring the program to calculate the value
itself.
Also note that there's some gibberish tiles in Judgement Silversword,
sadly. Not sure what's up there, but the game's still fully playable at
least.
Finally, no surprise: Beat-Mania doesn't run :P
byuu says:
Changelog:
- WS: fixed 8-bit sign-extended imul (fixes Star Hearts completely,
Final Fantasy world map)
- WS: fixed rcl/rcr carry shifting (fixes Crazy Climber, others)
- WS: added sound DMA emulation (Star Hearts rain sound for one example)
- WS: added OAM caching, but it's forced every line for now because
otherwise there are too many sprite glitches
- WS: use headphoneEnable bit instead of speakerEnable bit (fixes muted
audio in games)
- WS: various code cleanups (I/O mapping, audio channel naming, etc)
The hypervoice channel doesn't sound all that great just yet. But I'm
not sure how it's supposed to sound. I need a better example of some
more complex music.
What's left are some unknown register status bits (especially in the
sound area), keypad interrupts, RTC emulation, CPU prefetch emulation.
And then it's all just bugs. Lots and lots of bugs that need to be
fixed.
EDIT: oops, bad typo in the code.
ws/ppu/ppu.cpp line 20: change range(256) to range(224).
Also, delete the r.speed stuff from channel5.cpp to make the rain sound
a lot better in Star Hearts. Apparently that's outdated and not what the
bits really do.
byuu says:
Changelog:
- WS: added HblankTimer and VblankTimer IRQs; although they don't appear
to have any effect on any games that use them :/
- WS: added sound emulation; works perfectly in some games (eg Riviera);
is completely silent in most games (eg GunPey)
The sound emulation only partially supports the hypervoice (headphone
only) channel. I need to implement the SDMA before it'll actually do
anything useful. I'm a bit confused about how exactly things work. It
looks like the speaker volume shift and clamp only applies to speaker
mode and not headphone mode, which is very weird. Then there's the
software possibility of muting the headphones and/or the speaker.
Preferably, I want to leave the emulator always in headphone mode for
the extra audio channel. If there are games that force-mute the
headphones, but not speakers, then I may need to force headphones back
on but with the hypervoice channel disabled. I guess we'll see how
things go.
Rough guess is probably that the channels default to enabled after the
IPLROM, and games aren't bothering to manually enable them or something.
byuu says:
Changelog:
- WS: fixed bug when IRQs triggered during a rep string instruction
- WS: added sprite attribute caching (per-scanline); absolutely massive
speed-up
- WS: emulated limit of 32 sprites per scanline
- WS: emulated the extended PPU register bit behavior based on the
DISP_CTRL tile bit-depth setting
- WS: added "Rotate" key binding; can be used to flip the WS display
between horizontal and vertical in real-time
The prefix emulation may not be 100% hardware-accurate, but the edge
cases should be extreme enough to not come up in the WS library. No way
to get the emulation 100% down without intensive hardware testing.
trap15 pointed me at a workflow diagram for it, but that diagram is
impossible without a magic internal stack frame that grows with every
IRQ, and can thus grow infinitely large.
The rotation thing isn't exactly the most friendly set-up, but oh well.
I'll see about adding a default rotation setting to manifests, so that
games like GunPey can start in the correct orientation. After that, if
the LCD orientation icon turns out to be reliable, then I'll start using
that. But if there are cases where it's not reliable, then I'll leave it
to manual button presses.
Speaking of icons, I'll need a set of icons to render on the screen.
Going to put them to the top right on vertical orientation, and on the
bottom left for horizontal orientation. Just outside of the video
output, of course.
Overall, WS is getting pretty far along, but still some major bugs in
various games. I really need sound emulation, though. Nobody's going to
use this at all without that.
byuu says:
Changelog:
- WS: fixed lods, scas instructions
- WS: implemented missing GRP4 instructions
- WS: fixed transparency for screen one
- WSC: added color-mode PPU rendering
- WS+WSC: added packed pixel mode support
- WS+WSC: added dummy sound register reads/writes
- SFC: added threading to SuperDisc (it's hanging for right now; need to
clear IRQ on $21e2 writes)
SuperDisc Timer and Sound Check were failing before due to not turning
off IRQs on $21e4 clear, so I'm happy that's fixed now.
Riviera starts now, and displays the first intro screen before crashing.
Huge, huge amounts of corrupted graphics, though. This game's really
making me work for it :(
No color games seem fully playable yet, but a lot of monochrome and
color games are now at least showing more intro screen graphics before
dying.
This build defaults to horizontal orientation, but I left the inputs
bound to vertical orientation. Whoops. I still need to implement
a screen flip key binding.
byuu says:
Got it. Wow, that didn't hurt nearly as much as I thought it was going
to.
Dropped from 127.5fps to 123.5fps to use Natural/Integer for
(u)int(8,16,32,64).
That's totally worth the cost.
byuu says:
Alright, well interrupts are in. At least Vblank is.
I also fixed a bug in vector() indexing, MoDRM mod!=3&®==6 using SS
instead of DS, opcodes a0-a3 allowing segment override, and added the
"irq_disable" stuff to the relevant opcodes to suppress IRQs after
certain instructions.
But unfortunately ... still no go on Riviera. It's not reading any
unmapped ports, and although it enables Vblank IRQs and they set port
$b4's status bit, the game never sets the IE flag, so no interrupts ever
actually fire. The game does indeed appear to be sitting in a rather
huge loop, which is probably dependent upon some RAM variable being set
from the Vblank IRQ, but I don't know how I'm supposed to be triggering
it.
... I'm really quite stumped here >_>
byuu says:
All 256 instructions implemented fully. Fixed a major bug with
instructions that both read and write to ModRM with displacement.
Riviera now runs into an infinite loop ... possibly crashed, possibly
waiting on interrupts or in to return something. Added a bunch of PPU
settings registers, but nothing's actually rendering with them yet.
244 of 256 opcodes implemented now, although the interrupt triggering
portions are missing from them still. Much better handling of prefixes
now.
I definitely have a newfound hatepreciation for x86 now >_>
byuu says:
Up to 211 opcodes implemented, with the caveat that the four opcodes
that make up group 3 and group 4 don't do anything yet. Both groups seem
to have some "illegal" instructions in them, so that'll be "fun".
I have a new mechanic in place for opcode prefixes, but it could use
some work still. I also only have it working to override ModRM mem
addressing, but of course it does it in a lot of other places like the
string operations.
Making it about 5.5 million instructions into Gunpey now, but of course
that doesn't mean much. Could be going off the rails at any point due to
CPU bugs or unimplemented ports. Riviera's still crashing.
byuu says:
26 hours in, 173 instructions implemented. Although the four segment
prefix opcodes don't actually do anything yet. There's less than 256
actual instructions on the 80186, not sure of the exact count.
Gunpey gets around ~8,200 instructions in before hitting an unsupported
opcode (loop). Riviera goes off the rails on a retf and ends up
executing an endless stream of bad opcodes in RAM =( Both games hammer
the living shit out of the in/out ports pretty much immediately.
byuu says:
Man, the 80186 is taking a lot longer to implement than I thought it
would. So far I'm 18 hours into this emulator. Whereas I had Super Mario
Bros fully playable (no sound) in 12 hours for the NES >_>
I refactored all the byte/word variant functions to single functions
that take a size parameter. Cuts the amount of code in half.
Also implemented repz/repnz + movsb/movsw, so Riviera now gets 299
instructions in before dying. Nobody really bothers to explain how the
CPU actually implements these instructions, but I think I have it right:
ignore non-string opcodes that follow rep, invoke the string operations
inside the rep opcodes to prevent interrupts from triggering between the
two (which will be even more fun for segment selector overrides ...)
The next opcode needed is 0xC7, which ... throws ModRM on its head. In
this mode, ModRM is only used to determine the target operand (and it
doesn't use the middle bits for that at all), and the source is an
immediate that follows it. Gonna have to waste a few more hours thinking
about how best to handle that.
Also, disabled HiDPI for higan as well on OS X.
byuu says:
More V30MZ implemented, a lot more to go.
icarus now supports importing WS and WSC games. It expects them to have
the correct file extension, same for GB and GBC.
> Ugh, apparently HiDPI icarus doesn't let you press the check boxes.
I set the flag value in the plist to false for now. Forgot to do it for
higan, but hopefully I won't forget before release.
byuu says:
Lots of improvements. We're now able to start executing some V30MZ
instructions. 32 of 256 opcodes implemented so far.
I hope this goes without saying, but there's absolutely no point in
loading WS/WSC games right now. You won't see anything until I have the
full CPU and partial PPU implemented.
ROM bank 2 works properly now, the I/O map is 16-bit (address) x 16-bit
(data) as it should be*, and I have a basic disassembler in place
(adding to it as I emulate new opcodes.)
(* I don't know what happens if you access an 8-bit port in 16-bit mode
or vice versa, so for now I'm just treating the handlers as always being
16-bit, and discarding the upper 8-bits when not needed.)
byuu says:
So, this WIP starts work on something new for higan. Obviously, I can't
keep it a secret until it's ready, because I want to continue daily WIP
releases, and of course, solicit feedback as I go along.