byuu says:
New terminal is in. Much nicer to use now. Command history makes a major
difference in usability.
The SMP is now fully traceable and debuggable. Basically they act as
separate entities, you can trace both at the same time, but for the most
part running and stepping is performed on the chip you select.
I'm going to put off CPU+SMP interleave support for a while. I don't
actually think it'll be too hard. Will get trickier if/when we support
coprocessor debugging.
Remaining tasks:
- aliases
- hotkeys
- save states
- window geometry
Basically, the debugger's done. Just have to add the UI fluff.
I also removed tracing/memory export from higan. It was always meant to
be temporary until the debugger was remade.
byuu says:
Changelog:
- target-ethos/ is now target-higan/ (will unfortunately screw up diffs
pretty badly at this point.)
- had a serious bug in nall::optional<T>::operator=, which is now fixed.
- added tracer (no masking just yet, I need to write a nall::bitvector
class because I don't want to hard-code those anymore.)
- added usage logging (keep track of RWX/EP states for all bus
addresses.)
- added read/write to poke at memory (hex also works for reading, but
this one can poke at MMIO regs and is for one address only.)
- added both run.for (# of instructions) and run.to (program counter
address.)
- added read/write/execute breakpoints with counters for a given
address, and with an optional compare byte (for read/write modes.)
About the only major things left now for loki is support for trace
masking, memory export, and VRAM/OAM/CGRAM access.
For phoenix/Console, I really need to add a history to up+down arrows,
and I should support left/right insert-at.
byuu says:
Changelog:
- ethos: use nall::programpath() instead of realpath(argv[0]) to get
executable path
- loki: add presentation window
- loki: add terminal window
- loki: add interface to emulation core
- loki: add ruby
- loki: add enough support to run games and save data on exit
- load game folders via command-line (or drop folder onto binary),
use "r" to start, "p" to pause ... temporary command names
I'll probably have to say this several times, but for now, loki is only
available on Linux/GTK+, due to the use of the Console widget. Support
for other platforms can come later easily enough.
byuu says:
Changelog:
- port: various compilation fixes for OS X [kode54]
- nall: added programpath() function to return path to process binary
[todo: need to have ethos use this function]
- ruby: XAudio2 will select default game sound device instead of first
sound device
- ruby: DirectInput device IDs are no longer ambiguous when VID+PID are
identical
- ruby: OpenGL won't try and terminate if it hasn't been initialized
- gb: D-pad up+down/left+right not masked in SGB mode
- sfc: rewrote ICD2 video rendering to output in real-time, work with
cycle-based Game Boy renderer
- sfc: rewrote Bus::reduce(), reduces game loading time by about 500ms
- ethos: store save states in {game}/higan/* instead of {game}/bsnes/*
- loki: added target-loki/ (blank stub for now)
- Makefile: purge out/* on make clean
byuu says:
This WIP removes nall/input.hpp entirely, and implements the new
universal cheat format for FC/SFC/GB/GBC/SGB.
GBA is going to be tricky since there's some consternation around
byte/word/dword overrides.
It's also not immediately obvious to me how to implement the code search
in logarithmic time, due to the optional compare value.
Lastly, the cheat values inside cheats.bml seem to be broken for the
SFC. Likely there's a bug somewhere in the conversion process. Obviously
I'll have to fix that before v094.
I received no feedback on the universal cheat format. If nobody adds
anything before v094, then I don't want to hear any complaining about
the formatting :P
byuu says:
Not an official WIP (a WIP WIP? A meta-WIP?), just throwing in the new
fullscreen code, and I noticed that OpenGL colors in 30-bit mode are all
fucked up now for some strange reason. So I'm just using this snapshot
to debug the issue.
byuu says:
Changelog:
- added SA-1 MDR; fixes bug in SD Gundam G-Next where the main
battleship was unable to fire
- added out-of-the-box support for any BSD running Clang 3.3+ (FreeBSD
10+, notably)
- added new video shader, "Display Emulation", which changes the shader
based on the emulated system
- fixed the home button to go to your default library path
- phoenix: Windows port won't send onActivate unless an item is selected
(prevents crashing on pressing enter in file dialog)
- ruby: removed vec4 position from out Vertex {} (helps AMD cards)
- shaders: updated all shaders to use texture() instead of texture2D()
(helps AMD cards)
The "Display Emulation" option works like this: when selected, it tries
to load "<path>/Video Shaders/Emulation/<systemName>.shader/"; otherwise
it falls back to the blur shader. <path> is the usual (next to binary,
then in <config>/higan, then in /usr/share/higan, etc); and <systemName>
is "Famicom", "Super Famicom", "Game Boy", "Game Boy Color", "Game Boy
Advance"
To support BSD, I had to modify the $(platform) variable to
differentiate between Linux and BSD.
As such, the new $(platform) values are:
win -> windows
osx -> macosx
x -> linux or bsd
I am also checking uname -s instead of uname -a now. No reason to
potentially match the hostname to the wrong OS type.
byuu says:
Changelog:
- added Cocoa target: higan can now be compiled for OS X Lion
[Cydrak, byuu]
- SNES/accuracy profile hires color blending improvements - fixes
Marvelous text [AWJ]
- fixed a slight bug in SNES/SA-1 VBR support caused by a typo
- added support for multi-pass shaders that can load external textures
(requires OpenGL 3.2+)
- added game library path (used by ananke->Import Game) to
Settings->Advanced
- system profiles, shaders and cheats database can be stored in "all
users" shared folders now (eg /usr/share on Linux)
- all configuration files are in BML format now, instead of XML (much
easier to read and edit this way)
- main window supports drag-and-drop of game folders (but not game files
/ ZIP archives)
- audio buffer clears when entering a modal loop on Windows (prevents
audio repetition with DirectSound driver)
- a substantial amount of code clean-up (probably the biggest
refactoring to date)
One highly desired target for this release was to default to the optimal
drivers instead of the safest drivers, but because AMD drivers don't
seem to like my OpenGL 3.2 driver, I've decided to postpone that. AMD
has too big a market share. Hopefully with v093 officially released, we
can get some public input on what AMD doesn't like.
byuu describes the changes since v067:
This release officially introduces the accuracy and performance cores,
alongside the previously-existing compatibility core. The accuracy core
allows the most accurate SNES emulation ever seen, with every last
processor running at the lowest possible clock synchronization level.
The performance core allows slower computers the chance to finally use
bsnes. It is capable of attaining 60fps in standard games even on an
entry-level Intel Atom processor, commonly found in netbooks.
The accuracy core is absolutely not meant for casual gaming at all. It
is meant solely for getting as close to 100% perfection as possible, no
matter the cost to speed. It should only be used for testing,
development or debugging.
The compatibility core is identical to bsnes v067 and earlier, but is
now roughly 10% faster. This is the default and recommended core for
casual gaming.
The performance core contains an entirely new S-CPU core, with
range-tested IRQs; and uses blargg's heavily-optimized S-DSP core
directly. Although there are very minor accuracy tradeoffs to increase
speed, I am confident that the performance core is still more accurate
and compatible than any other SNES emulator. The S-CPU, S-SMP, S-DSP,
SuperFX and SA-1 processors are all clock-based, just as in the accuracy
and compatibility cores; and as always, there are zero game-specific
hacks. Its compatibility is still well above 99%, running even the most
challenging games flawlessly.
If you have held off from using bsnes in the past due to its system
requirements, please give the performance core a try. I think you will
be impressed. I'm also not finished: I believe performance can be
increased even further.
I would also strongly suggest Windows Vista and Windows 7 users to take
advantage of the new XAudio2 driver by OV2. Not only does it give you
a performance boost, it also lowers latency and provides better sound by
way of skipping an API emulation layer.
Changelog:
- Split core into three profiles: accuracy, compatibility and
performance
- Accuracy core now takes advantage of variable-bitlength integers (eg
uint24_t)
- Performance core uses a new S-CPU core, written from scratch for speed
- Performance core uses blargg's snes_dsp library for S-DSP emulation
- Binaries are now compiled using GCC 4.5
- Added a workaround in the SA-1 core for a bug in GCC 4.5+
- The clock-based S-PPU renderer has greatly improved OAM emulation;
fixing Winter Gold and Megalomania rendering issues
- Corrected pseudo-hires color math in the clock-based S-PPU renderer;
fixing Super Buster Bros backgrounds
- Fixed a clamping bug in the Cx4 16-bit triangle operation [Jonas
Quinn]; fixing Mega Man X2 "gained weapon" star background effect
- Updated video renderer to properly handle mixed-resolution screens
with interlace enabled; fixing Air Strike Patrol level briefing screen
- Added mightymo's 2010-08-19 cheat code pack
- Windows port: added XAudio2 output support [OV2]
- Source: major code restructuring; virtual base classes for processor
- cores removed, build system heavily modified, etc.
byuu says:
Re-added blargg's DSP, but this time I merged only spc_dsp and also
fixed the initial regs for Dual Orb II. Which restores the 5-10% speedup
on the compatibility and performance cores. Fixed the initial register
values for the fast CPU core, which fixes Armored Police and the other
Atlus game's title screen in the performance core. Added a missing
debugvirtual prefix to op_step, which fixes CPU stepping in the
performance core. I was using the description field for profile
identification in savestates, but that was of course the description for
the state manager. Whoops. Added a new field for profile name to the
save states, and fixed the state manager to work with that change.
Adjusted the about screen colors, which is how you can tell which core
you're using without it being annoying. Probably did some other stuff
too, meh.
byuu says:
Fixed bsnes launcher on Windows XP
Fixed Windows bsnes launcher internationalization support (emulator can
be in a folder with spaces and Japanese characters, and you can drag
a Japanese file name onto the launcher, and it will load it properly)
Moved fast CPU to use a switch table for MMIO, unfortunately for no
speed gain
Bus::read/write take uint24 parameters for address, luckily no speed
penalty
MMIOAccess gained a handle() function, and hid the mmio[] table. Makes
hooking it cleaner
Added malloc.h header to nall/function.hpp to fix a ridiculous GCC 4.5.0
error
Fixed a fairly large bug in the fast CPU IRQ handler, which fixes
Robocop et al
Forgot to bump revision to .24 in the compiled binaries, too lazy to
recompile or hex edit to change them
Unfortunately, in order to add nice battery usage, I have to add the
sleep calls to the video and audio wait loops. But they don't know
anything about the GUI and its settings, nor do I really want to make
them know about this setting. I do not want to force allow it. Even with
the media timer trick, Sleep(0) makes Vsync+Async fail a lot more
frequently than never sleeping at all. I would rather laptop users
suffer 100% utilization of a single core than for all users to not be
able to get good audio+video sync. Not sure what to do about that, so
I'll probably just remove the battery usage comment from performance
mode for now.
byuu says:
Added missing $4200 IRQ lock, which fixes Chou Aniki on the fast CPU
core, so slower PCs can get their brotherly love on.
Added range-based controller IOBit latching to the fast CPU core, which
enables Super Scope and Justifier support. Uses the priority queue as
well, so there is zero speed-hit. Given the way range-testing works, the
trigger point may vary by 1-2 pixels when firing at the same spot. Not
really a big deal when it avoids a massive speed penalty.
Fixed PAL and interlace-mode HVIRQs at V=0,H<2 on the fast CPU core.
Added the dot-renderer's sprite list update-on-OAM-write functionality
to the scanline-based PPU renderer. Unfortunately it looks like all the
speed gain was already taken from the global dirty flag I was using
before, but this certainly won't hurt speed any, so whatever.
Added #ifdef to stop CoInitialize(0) on non-Windows ports.
Added #ifdefs to stop gradient fade on Windows port. Not going to fuck
over the Linux port aesthetic because of Qt bug #47,326,927. If there's
a way to tell what Qt theme is being used, I can leave it enabled for
XP/Vista themes.
Moved HDMA trigger from 1104 to 1112, and reduced channel overhead from
24 to 16, to better simulate one-cycle DMA->CPU sync.
Code clarity: I've re-added my varint.hpp classes, and am actively using
them in the accuracy cores. So far, I haven't done anything that would
detriment speed, but it is certainly cool. The APU ports exposed by the
CPU and SMP now take uint2 address arguments, the CPU WRAM address
register is a uint17, and the IRQ H/VTIME values are uint10. This
basically allows the source to clearly convey the data sizes, and
eliminates the need to manually mask values when writing to registers or
reading from memory. I'm going to be doing this everywhere, and it will
have a speed impact eventually, because the automation means we can't
skip masks when we know the data is already masked off.
Source: archive contains the launcher code, so that I can look into why
it's crashing on XP tomorrow.
It doesn't look like Circuit USA's flags are going to work too well with
this new CPU core. Still not sure what the hell Robocop vs The
Terminator is doing, I'll read through the mega SNES thread for clues
tomorrow. Speedy Gonzales is definitely broken, as modifying the MDR was
breaking things with my current core. Probably because the new CPU core
doesn't wait for a cycle edge to trigger.
I was thinking that perhaps we could keep some form of cheat codes list
to work as game-specific hacks for the performance core. Keeps the hacks
out of the emulator, but could allow the remaining bugs to be worked
around for people who have no choice but to use the performance core.
byuu says:
Added OV2's XAudio2 driver (it's better and faster than the DirectSound
one)
Fixed DirectInput keypad number codes
Added launcher to make the profiles work
Profiles now called: Accuracy, Compatibility, Performance (not debating
names anymore)
The launcher isn't going to work on OS X because of the .app folder
bullshit (yes, yes, .sfc folders.)
It also crashes on Windows XP for god only knows what reason. Works fine
on Windows 7 and Linux. So XP users, rename the .dll files to .exe to
test this release. I'll fix it on Monday.
The color highlighting fucks up the radio boxes on the Windows classic
theme, because Nokia can't afford a god damn QA team.
Lastly, I forgot to add launcher to the make archive-all command, so the
source for it will be in the next WIP.
byuu says:
This moves toward a profile-selection mode. Right now, it is incomplete.
There are three binaries, one for each profile. The GUI selection
doesn't actually do anything yet. There will be a launcher in a future
release that loads each profile's respective binary.
I reverted away from blargg's SMP library for the time being, in favor
of my own. This will fix most of the csnes/bsnes-performance bugs. This
causes a 10% speed hit on 64-bit platforms, and a 15% speed hit on
32-bit platforms. I hope to be able to regain that speed in the future,
I may also experiment with creating my own fast-SMP core which drops bus
hold delays and TEST register support (never used by anything, ever.)
Save states now work in all three cores, but they are not
cross-compatible. The profile name is stored in the description field of
the save states, and it won't load a state if the profile name doesn't
match.
The debugger only works on the research target for now. Give it time and
it will return for the other targets.
Other than that, let's please resume testing on all three once again.
See how far we get this time :)
I can confirm the following games have issues on the performance
profile:
- Armored Police Metal Jacket (minor logo flickering, not a big deal)
- Chou Aniki (won't start, so obviously unplayable)
- Robocop vs The Terminator (major in-game flickering, unplayable)
Anyone still have that gigantic bsnes thread archive from the ZSNES
forum? Maybe I posted about how to fix those two broken games in there,
heh.
I really want to release this as v1.0, but my better judgment says we
need to give it another week. Damn.
byuu says:
Since we're now talking about three splits, that's getting a bit out of
hand.
This WIP combines everything back into one project again. Added the
src/fast folder that has all the speed-oriented cores.
A slight slowdown to csnes from what it was before, I'm using blargg's
accurate DSP. I just don't like the idea of releasing a less accurate
DSP core than Snes9X v1.52 has. Plus the fast DSP core doesn't serialize
yet.
I moved back to snes_spc 0.9.0 because I care more about Tales and Star
Ocean than I do about Earthworm Jim 2. So if you try EWJ2 on csnes,
expect it to sound like it does on Snes9X. In other words, don't wear
headphones if you value your hearing.
The middle-of-the-road bsnes core uses blargg's accurate DSP, because
it's about 3% faster than mine which removes all of blargg's
optimizations. There is absolutely no accuracy loss here. bsnes v067.20
that is included should be equal to v067 official.
Performance:
Code:
asnes = 58fps
bsnes = 172fps +2.97x
csnes = 274fps +1.59x +4.72x
The binaries are not profiled, so that's an additional 15% slower from
the previous builds.
Save states only work on asnes, as I don't know how to serialize
blargg's cores yet. The copy_func thing is very confusing to me for some
reason. The debugger won't work anywhere.
Outside of that, please go ahead and bug test. Once I get the debugger
and save states working, I'll build some profiled v1.0 releases for all
three, and we can test that for a bit and then release.
byuu says:
I wrote a new CPU core from scratch. It has range-based IRQs, and is
good enough even to run F-1 Grand Prix and Sink or Swim. It also uses
a binary min-heap array for the timing priority queue. This resulted in
a ~40% speedup.
I also added in blargg's snes_spc library, which is an S-SMP + S-DSP
emulator. I am still using his accurate DSP core, and not the fast one.
This gives an additional ~10% speedup.
THIS IS NOT PERFECT, THERE WILL BE BUGS!
I already know that Tales of Phantasia and Star Ocean are hitting some
edge cases. Now that it's fast enough, hopefully blargg can take a look
at it. Something he couldn't test before because you can't rip SPCs of
these games, so it's probably something simple.
My CPU core also doesn't nail every last possible edge case. So things
like Wild Guns and the two or three games that rely on NMI/IRQ hold
aren't going to work ... yet. Be patient.
The SuperFX and SA-1 cores are still cycle-accurate. It wouldn't hurt
compatibility to reduce their precision a bit.
End result is that you can now get well over 60fps in normal games even
n a first-generation Intel Atom.
byuu says:
This fixes libsnes and debugger builds, and collapses bsnes/ppu/bppu to
bsnes/ppu and bsnes/dsp/sdsp to bsnes/dsp. It also introduces
bsnes/sync.sh, which will synchronize all of asnes/ with bsnes/,
excepting the custom speed-focused modules. So far, that's bsnes/ppu
(scanline renderer) and bsnes/dsp (state machine.)
Should make keeping the two ports in sync much, much easier. It's
basically the same thing as before, only you run sync.sh and have a few
duplicated folders now. May make it clearer by creating a stub/ or src/
folder inside bsnes to do all of the copying, so that you only see the
custom folders in bsnes/' root directory.