byuu says:
Changelog:
- finished cleaning up the SFC core to my new coding conventions
- removed sfc/controller/usart (superseded by 21fx)
- hid Synchronize Video option from the menu (still in the configuration
file)
Pretty much the only minor detail left is some variable names in the
SA-1 core that really won't look good at all if I move to camelCase,
so I'll have to rethink how I handle those. It's probably a good area
to attempt using BitFields, to see how it impacts performance. But I'll
do that in a test branch first.
But for the most part, this should be the end of the gigantic diffs (this
one was 174KiB), at least for the SFC/WS cores. Still have the FC/GB/GBA
cores to clean up more fully. Assuming we don't spot any new regressions,
we should be ~95% out of the woods on code cleanups breaking things.
byuu says:
Changelog:
- massive cleanups and optimizations on the PPU core
- ~9% speedup over v099 official
This is pretty much it for the low-hanging fruit of speeding up higan. Any
more gains from this point will be extremely hard-fought, unfortunately.
byuu says:
Time for a new release. There are a few important emulation improvements
and a few new features; but for the most part, this release focuses on
major code refactoring, the details of which I will mostly spare you.
The major change is that, as of v099, the SNES balanced and performance
cores have been removed from higan. Basically, in addition to my five
other emulation cores, these were too much of a burden to maintain. And
they've come along as far as I was able to develop them. If you need to
use these cores, please use these two from the v098 release.
I'm very well aware that ~80% of the people using higan for SNES
emulation were using the two removed profiles. But they simply had
to go. Hopefully in the future, we can compensate for their loss by
increasing the performance of the accuracy core.
Changelog (since v098):
SFC: balanced profile removed
SFC: performance profile removed
SFC: expansion port devices can now be changed during gameplay (atlhough
you shouldn't)
SFC: fixed bug in SharpRTC leap year calculations
SFC: emulated new research findings for the S-DD1 coprocessor
SFC: fixed CPU emulation-mode wrapping bug with pei, [dp], [dp]+y
instructions [AWJ]
SFC: fixed Super Game Boy bug that caused the bottom tile-row to flicker
in games
GB: added MBC1M (multi-cart) mapper; icarus can't detect these so manual
manifests are needed for now
GB: corrected return value when HuC3 unmapped RAM is read; fixes Robopon
[endrift]
GB: improved STAT IRQ emulation; fixes Altered Space, etc [endrift,
gekkio]
GB: partial emulation of DMG STAT write IRQ bug; fixes Legend of Zerd,
Road Rash, etc
nall: execute() fix, for some Linux platforms that had trouble detecting
icarus
nall: new BitField class; which allows for simplifying flag/register
emulation in various cores
ruby: added Windows WASAPI audio driver (experimental)
ruby: remove attempts to call glSwapIntervalEXT (fixes crashing on some
Linux systems)
ui: timing settings panel removed
video: restored saturation, gamma, luminance settings
video: added new post-emulation sprite system; light gun cursors are
now higher-resolution
audio: new resampler (6th-order Butterworth biquad IIR); quite a bit
faster than the old one
audio: added optional basic reverb filter (for fun)
higan: refresh video outside cooperative threads (workaround for shoddy
code in AMD graphics drivers)
higan: individual emulation cores no longer have unique names
higan: really substantial code refactoring; 43% reduction in binary size
Off the bat, here are the known bugs:
hiro/Windows: focus stealing bug on startup. Needs to be fixed in hiro,
not with a cheap hack to tomoko.
higan/SFC: some of the coprocessors are saving some volatile memory to
disk. Completely harmless, but still needs to be fixed.
ruby/WASAPI: some sound cards have a lot of issues with the current driver
(eg FitzRoy's). We need to find a clean way to fix this before it
can be made the default driver. Which would be a huge win because
the latency improvements are substantial, and in exclusive mode,
WASAPI allows G-sync to work very well.
[From the v099 WIP thread, here's the changelog since v098r19:
- GB: don't force mode 1 during force-blank; fixes v098r16 regression
with many Game Boy games
- GB: only perform the STAT write IRQ bug during vblank, not hblank
(still not hardware accurate, though)
-Ed.]
byuu says:
Changelog:
- added nall/bit-field.hpp
- updated all CPU cores (sans LR35902 due to some complexities) to use
BitFields instead of bools
- updated as many CPU cores as I could to use BitFields instead of union {
struct { uint8_t ... }; }; pairs
The speed changes are mostly a wash for this. In some instances,
I noticed a ~2-3% speedup (eg SNES emulation), and in others a 2-3%
slowdown (eg Famicom emulation.) It's within the margin of error, so
it's safe to say it has no impact.
This does give us a lot of new useful things, however:
- no more manual reconstruction of flag values from lots of left shifts
and ORs
- no more manual deconstruction of flag values from lots of ANDs
- ability to get completely free aliases to flag groups (eg GSU can
provide alt2, alt1 and also alt (which is alt2,alt1 combined)
- removes the need for the nasty order_lsbN macro hack (eventually will
make higan 100% endian independent)
- saves us from insane compilers that try and do nasty things with
alignment on union-structs
- saves us from insane compilers that try to store bit-field bits in
reverse order
- will allow some really novel new use cases (I'm planning an
instant-decode ARM opcode function, for instance.)
- reduces code size (we can serialize flag registers in one line instead
of one for each flag)
However, I probably won't use it for super critical code that's constantly
reading out register values (eg PPU MMIO registers.) I think there we
would end up with a performance penalty.
byuu says:
Changelog:
- hiro: fixed the BrowserDialog column resizing when navigating to new
folders (prevents clipping of filenames)
- note: this is kind of a quick-fix; but I have a good idea how to do
the proper fix now
- nall: added BitField<T, Lo, Hi> class
- note: not yet working on the SFC CPU class; need to go at it with
a debugger to find out what's happening
- GB: emulated DMG/SGB STAT IRQ bug; fixes Zerd no Densetsu and Road Rash
(won't fix anything else; don't get hopes up)
byuu says:
Changelog:
- fixed Super Game Boy regression from v096r04 with bottom tile row
flickering
- fixed GB STAT IRQ regression from previous WIP
- Altered Space is now playable
- GBVideoPlayer isn't; but nobody seems to know exactly what weird
hardware quirk that one relies on to work
- ~3-4% speed improvement in SuperFX games by eliminating function<>
callback on register assignments
- most noticeable in Doom in-game; least noticeable on Yoshi's Island
title screen (darn)
- finished GSU core and SuperFX coprocessor code cleanups
- did some more work cleaning up the LR35902 core and GB CPU code
Just a fair warning: don't get your hopes up on these GB
fixes. Cliffhanger now hangs completely (har har), and none of the
other bugs are fixed. We pretty much did all this work just for Altered
Space. So, I hope you like playing Altered Space.
byuu says:
Changelog:
- GNUmakefile: reverted $(call unique,) to $(strip)
- processor/r6502: removed templates; reduces object size from 146.5kb
to 107.6kb
- processor/lr35902: removed templates; reduces object size from 386.2kb
to 197.4kb
- processor/spc700: merged op macros for switch table declarations
- sfc/coprocessor/sa1: partial cleanups; flattened directory structure
- sfc/coprocessor/superfx: partial cleanups; flattened directory structure
- sfc/coprocessor/icd2: flattened directory structure
- gb/ppu: changed behavior of STAT IRQs
Major caveat! The GB/GBC STAT IRQ changes has a major bug in it somewhere
that's seriously breaking most games. I'm pushing the WIP anyway, because
I believe the changes to be mostly correct. I'd like to get more people
looking at these changes, and also try more heavy-handed hacking and
diff comparison logging between the previous WIP and this one.
byuu says:
Changelog:
- removed template usage from processor/spc700; cleaned up many function
names and the switch table
- object size: 176.8kb => 127.3kb
- source code size: 43.5kb => 37.0kb
- fixed processor/r65816 BRK/COP vector regression [hex_usr]
- corrected HuC3 unmapped RAM read value; fixes Robopon [endrift]
- cosmetic: simplified the butterworth constant calculation
[Wolfram|Alpha]
The SPC700 core changes took forever, about three hours of work.
Only the LR35902 and R6502 still need their template functions
removed. The point of this is that it doesn't cause any speed penalty
to do so, and it results in smaller binary sizes and faster compilation
times.
byuu says:
Changelog:
- improved attenuation of biquad filter by computing butterworth Q
coefficients correctly (instead of using the same constant)
- adding 1e-25 to each input sample into the biquad filters to try and
prevent denormalization
- updated normalization from [0.0 to 1.0] to [-1.0 to +1.0]; volume/reverb
happen in floating-point mode now
- good amount of work to make the base Emulator::Audio support any number
of output channels
- so that we don't have to do separate work on left/right channels;
and can instead share the code for each channel
- Emulator::Interface::audioSample(int16 left, int16 right); changed to:
- Emulator::Interface::audioSample(double* samples, uint channels);
- samples are normalized [-1.0 to +1.0]
- for now at least, channels will be the value given to
Emulator::Audio::reset()
- fixed GUI crash on startup when audio driver is set to None
I'm probably going to be updating ruby to accept normalized doubles as
well; but I'm not sure if I will try and support anything other 2-channel
audio output. It'll depend on how easy it is to do so; perhaps it'll be
a per-driver setting.
The denormalization thing is fierce. If that happens, it drops the
emulator framerate from 220fps to about 20fps for Game Boy emulation. And
that happens basically whenever audio output is silent. I'm probably
also going to make a nall/denormal.hpp file at some point with
platform-specific functionality to set the CPU state to "denormals as
zero" where applicable. I'll still add the 1e-25 offset (inaudible)
as another fallback.
byuu says:
Changelog:
- nall/dsp returns with new iir/biquad.hpp and resampler/cubic.hpp files
- nall/queue.hpp added (simple ring buffer ... nall/vector wouldn't
cause too many moves with FIFO)
- audio streams now only buffer 20ms; so even if multiple audio streams
desync, latency can never exceed 20ms
- replaced blackman windwed sinc FIR hermite audio filter with transposed
direct form II biquadratic sixth-order IIR butterworth filter (better
attenuation of frequencies above 20KHz, faster, no need for decimation,
less code)
- put in experimental eight-tap echo filter (a lot better than what I
had before, but still rather weak)
- substantial cleanups to the SuperFX GSU processor core (slightly
faster, 479KB->100KB object file, 42.7KB->33.4KB source code size,
way less code duplication)
We'll definitely want to test the whole SuperFX library (not many games)
just to make sure there's no regressions caused by this one.
Not sure what I want to do with audio processing effects yet. I've always
really wanted lots of fun controls to customize audio, and now finally
with this new biquad filter, I can finally start implementing real
effects. For instance, an equalizer wouldn't be too complicated anymore.
The new reverb effect is still a poor man's version. I need to find human
readable source for implementing a comb-filter properly. I'm pretty sure
I can already treat nall::queue as an all-pass filter since all that
does is phase shift (fancy audio term for "delay audio"). What's really
going to be hard is figuring out how to expose user-friendly settings for
controlling it. It looks like you need a bunch of coprime coefficients,
and I don't think casual users are going to be able to hand-enter coprime
values to get the echo effect they want. I uh ... don't even know how
to calculate coprime values dynamically right now >_> But we're going
to have to, as they are correlated to the output sampling rate.
We'll definitely want to make some audio profiles so that users can
quickly select pre-configured themes that sound nice, but expose the
underlying coefficients so that they can tweak stuff to their liking. This
isn't just about higan, this is about me trying to learn digital signal
processing, so please don't be too upset about feature creep or anything
on this.
Anyway ... I'm having some difficulties with my audio right now. When
the reverb effect is enabled, there's a bunch of static on system
reset for just a moment. But this should not be possible. nall::queue
is initializing all previous reverb sample elements to 0.0. I don't
understand where static is coming in from. Further, we have the same
issue with both the windowed sinc and the biquad filters ... a bit of
a popping sound when starting a game. Any help tracking this down would
be appreciated.
There's also one really annoying issue ... I can't seem to do reverb
or volume adjustments with normalized samples. If I say "volume *= 0.5"
in higan/audio/audio.cpp line 68, it doesn't just halve the volume, it
adds a whole bunch of distortion. This makes absolutely zero sense to
me. The sample values are between 0.0 (mute) and 1.0 (full volume) here,
so multiplying a double by 0.5 shouldn't cause distortion. So right now,
I'm doing these adjustments with less precision after denormalizing back
to int16. Anyone ever see something like that? :/
byuu says:
Changelog:
- higan/video: added support for Emulator::Sprite
- higan/resource: a new system for accessing embedded binary files
inside the emulation cores; holds the sprites
- higan/sfc/superscope,justifier: re-enabled display of crosshairs
- higan/sfc/superscope: fixed turbo toggle (also shows different
crosshair color when in turbo mode)
- higan/sfc/ppu: always outputs at 512x480 resolution now
- causes a slight speed-hit from ~127fps to ~125fps;
- but allows high-resolution 32x32 cursors that look way better;
- also avoids the need to implement sprite scaling logic
Right now, the PPU code to always output at 480-height is a really gross
hack. Don't worry, I'll make that nicer before release.
Also, superscope.cpp and justifier.cpp are built around a 256x240
screen. But since we now have 512x480, we can make the cursor's movement
much smoother by doubling the resolution on both axes. The actual games
won't see any accuracy improvements when firing the light guns, but the
cursors will animate nicer so I think it's still worth it. I'll work on
that before the next release as well.
The current 32x32 cursors are nicer, but we can do better now with full
24-bit color. So feel free to submit alternatives. I'll probably reject
them, but you can always try :D
The sprites don't support alpha blending, just color keying (0x00000000
= transparent; anything else is 0xff......). We can revisit that later
if necessary.
The way I have it designed, the only files that do anything with
Emulator::Sprite at all are the superscope and justifier folders.
I didn't have to add any hooks anywhere else. Rendering the sprite is
a lot cleaner than the old code, too.
byuu says:
Changelog:
- fixed nall/path.hpp compilation issue
- fixed ruby/audio/xaudio header declaration compilation issue (again)
- cleaned up xaudio2.hpp file to match my coding syntax (12.5% of the
file was whitespace overkill)
- added null terminator entry to nall/windows/utf8.hpp argc[] array
- nall/windows/guid.hpp uses the Windows API for generating the GUID
- this should stop all the bug reports where two nall users were
generating GUIDs at the exact same second
- fixed hiro/cocoa compilation issue with uint# types
- fixed major higan/sfc Super Game Boy audio latency issue
- fixed higan/sfc CPU core bug with pei, [dp], [dp]+y instructions
- major cleanups to higan/processor/r65816 core
- merged emulation/native-mode opcodes
- use camel-case naming on memory.hpp functions
- simplify address masking code for memory.hpp functions
- simplify a few opcodes themselves (avoid redundant copies, etc)
- rename regs.* to r.* to match modern convention of other CPU cores
- removed device.order<> concept from Emulator::Interface
- cores will now do the translation to make the job of the UI easier
- fixed plurality naming of arrays in Emulator::Interface
- example: emulator.ports[p].devices[d].inputs[i]
- example: vector<Medium> media
- probably more surprises
Major show-stoppers to the next official release:
- we need to work on GB core improvements: LY=153/0 case, multiple STAT
IRQs case, GBC audio output regs, etc.
- we need to re-add software cursors for light guns (Super Scope,
Justifier)
- after the above, we need to fix the turbo button for the Super Scope
I really have no idea how I want to implement the light guns. Ideally,
we'd want it in higan/video, so we can support the NES Zapper with the
same code. But this isn't going to be easy, because only the SNES knows
when its output is interlaced, and its resolutions can vary as
{256,512}x{224,240,448,480} which requires pixel doubling that was
hard-coded to the SNES-specific behavior, but isn't appropriate to be
exposed in higan/video.
byuu says:
Changelog:
- fixed major nall/vector/prepend bug
- renamed hiro/ListView to hiro/TableView
- added new hiro/ListView control which is a simplified abstraction of
hiro/TableView
- updated higan's cheat database window and icarus' scan dialog to use
the new ListView control
- compilation works once again on all platforms (Windows, Cocoa, GTK,
Qt)
- the loki skeleton compiles once again (removed nall/DSP references;
updated port/device ID names)
Small catch: need to capture layout resize events internally in Windows
to call resizeColumns. For now, just resize the icarus window to get it
to use the full window width for list view items.
byuu says:
Changelog:
- nall/vector rewritten from scratch
- higan/audio uses nall/vector instead of raw pointers
- higan/sfc/coprocessor/sdd1 updated with new research information
- ruby/video/glx and ruby/video/glx2: fuck salt glXSwapIntervalEXT!
The big change here is definitely nall/vector. The Windows, OS X and Qt
ports won't compile until you change some first/last strings to
left/right, but GTK will compile.
I'd be really grateful if anyone could stress-test nall/vector. Pretty
much everything I do relies on this class. If we introduce a bug, the
worst case scenario is my entire SFC game dump database gets corrupted,
or the byuu.org server gets compromised. So it's really critical that we
test the hell out of this right now.
The S-DD1 changes mean you need to update your installation of icarus
again. Also, even though the Lunar FMV never really worked on the
accuracy core anyway (it didn't initialize the PPU properly), it really
won't work now that we emulate the hard-limit of 16MiB for S-DD1 games.
byuu says:
Changelog:
- GB: support modeSelect and RAM for MBC1M (Momotarou Collection)
- audio: implemented native resampling support into Emulator::Stream
- audio: removed nall::DSP completely
Unfortunately, the new resampler didn't turn out quite as fast as I had
hoped. The final hermite resampling added some overhead; and I had to
bump up the kernel count to 500 from 400 to get the buzzing to go away
on my main PC. I think that's due to it running at 48000hz output
instead of 44100hz output, maybe?
Compared to Ryphecha's:
(NES) Mega Man 2: 167fps -> 166fps
(GB) Mega Man II: 224fps -> 200fps
(WSC) Riviera: 143fps -> 151fps
Odd that the WS/WSC ends up faster while the DMG/CGB ends up slower.
But this knocks 922 lines down to 146 lines. The only files left in all
of higan not written (or rewritten) by me are ruby/xaudio2.h and
libco/ppc.c
byuu says:
Changelog:
- emulation cores now refresh video from host thread instead of
cothreads (fix AMD crash)
- SFC: fixed another bug with leap year months in SharpRTC emulation
- SFC: cleaned up camelCase on function names for
armdsp,epsonrtc,hitachidsp,mcc,nss,sharprtc classes
- GB: added MBC1M emulation (requires manually setting mapper=MBC1M in
manifest.bml for now, sorry)
- audio: implemented Emulator::Audio mixer and effects processor
- audio: implemented Emulator::Stream interface
- it is now possible to have more than two audio streams: eg SNES
+ SGB + MSU1 + Voicer-Kun (eventually)
- audio: added reverb delay + reverb level settings; exposed balance
configuration in UI
- video: reworked palette generation to re-enable saturation, gamma,
luminance adjustments
- higan/emulator.cpp is gone since there was nothing left in it
I know you guys are going to say the color adjust/balance/reverb stuff
is pointless. And indeed it mostly is. But I like the idea of allowing
some fun special effects and configurability that isn't system-wide.
Note: there seems to be some kind of added audio lag in the SGB
emulation now, and I don't really understand why. The code should be
effectively identical to what I had before. The only main thing is that
I'm sampling things to 48000hz instead of 32040hz before mixing. There's
no point where I'm intentionally introducing added latency though. I'm
kind of stumped, so if anyone wouldn't mind taking a look at it, it'd be
much appreciated :/
I don't have an MSU1 test ROM, but the latency issue may affect MSU1 as
well, and that would be very bad.
byuu says:
Changelog:
- WS/WSC: re-added support for screen rotation (code is inside WS core)
- ruby: changed sample(uint16_t left, uint16_t right) to sample(int16_t
left, int16_t right);
- requires casting to uint prior to shifting in each driver, but
I felt it was misleading to use uint16_t just to avoid that
- ruby: WASAPI is now built in by default; has wareya's improvements,
and now supports latency adjust
- tomoko: audio settings panel has new "Exclusive Mode" checkbox for
WASAPI driver only
- note: although the setting *does* take effect in real-time, I'd
suggest restarting the emulator after changing it
- tomoko: audio latency can now be set to 0ms (which in reality means
"the minimum supported by the driver")
- all: increased cothread size from 512KiB to 2MiB to see if it fixes
bullshit AMD driver crashes
- this appears to cause a slight speed penalty due to cache locality
going down between threads, though