mirror of https://github.com/bsnes-emu/bsnes.git
16 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Tim Allen | c50723ef61 |
Update to v100r15 release.
byuu wrote: Aforementioned scheduler changes added. Longer explanation of why here: http://hastebin.com/raw/toxedenece Again, we really need to test this as thoroughly as possible for regressions :/ This is a really major change that affects absolutely everything: all emulation cores, all coprocessors, etc. Also added ADDX and SUB to the 68K core, which brings us just barely above 50% of the instruction encoding space completed. [Editor's note: The "aformentioned scheduler changes" were described in a previous forum post: Unfortunately, 64-bits just wasn't enough precision (we were getting misalignments ~230 times a second on 21/24MHz clocks), so I had to move to 128-bit counters. This of course doesn't exist on 32-bit architectures (and probably not on all 64-bit ones either), so for now ... higan's only going to compile on 64-bit machines until we figure something out. Maybe we offer a "lower precision" fallback for machines that lack uint128_t or something. Using the booth algorithm would be way too slow. Anyway, the precision is now 2^-96, which is roughly 10^-29. That puts us far beyond the yoctosecond. Suck it, MAME :P I'm jokingly referring to it as the byuusecond. The other 32-bits of precision allows a 1Hz clock to run up to one full second before all clocks need to be normalized to prevent overflow. I fixed a serious wobbling issue where I was using clock > other.clock for synchronization instead of clock >= other.clock; and also another aliasing issue when two threads share a common frequency, but don't run in lock-step. The latter I don't even fully understand, but I did observe it in testing. nall/serialization.hpp has been extended to support 128-bit integers, but without explicitly naming them (yay generic code), so nall will still compile on 32-bit platforms for all other applications. Speed is basically a wash now. FC's a bit slower, SFC's a bit faster. The "longer explanation" in the linked hastebin is: Okay, so the idea is that we can have an arbitrary number of oscillators. Take the SNES: - CPU/PPU clock = 21477272.727272hz - SMP/DSP clock = 24576000hz - Cartridge DSP1 clock = 8000000hz - Cartridge MSU1 clock = 44100hz - Controller Port 1 modem controller clock = 57600hz - Controller Port 2 barcode battler clock = 115200hz - Expansion Port exercise bike clock = 192000hz Is this a pathological case? Of course it is, but it's possible. The first four do exist in the wild already: see Rockman X2 MSU1 patch. Manifest files with higan let you specify any frequency you want for any component. The old trick higan used was to hold an int64 counter for each thread:thread synchronization, and adjust it like so: - if thread A steps X clocks; then clock += X * threadB.frequency - if clock >= 0; switch to threadB - if thread B steps X clocks; then clock -= X * threadA.frequency - if clock < 0; switch to threadA But there are also system configurations where one processor has to synchronize with more than one other processor. Take the Genesis: - the 68K has to sync with the Z80 and PSG and YM2612 and VDP - the Z80 has to sync with the 68K and PSG and YM2612 - the PSG has to sync with the 68K and Z80 and YM2612 Now I could do this by having an int64 clock value for every association. But these clock values would have to be outside the individual Thread class objects, and we would have to update every relationship's clock value. So the 68K would have to update the Z80, PSG, YM2612 and VDP clocks. That's four expensive 64-bit multiply-adds per clock step event instead of one. As such, we have to account for both possibilities. The only way to do this is with a single time base. We do this like so: - setup: scalar = timeBase / frequency - step: clock += scalar * clocks Once per second, we look at every thread, find the smallest clock value. Then subtract that value from all threads. This prevents the clock counters from overflowing. Unfortunately, these oscillator values are psychotic, unpredictable, and often times repeating fractions. Even with a timeBase of 1,000,000,000,000,000,000 (one attosecond); we get rounding errors every ~16,300 synchronizations. Specifically, this happens with a CPU running at 21477273hz (rounded) and SMP running at 24576000hz. That may be good enough for most emulators, but ... you know how I am. Plus, even at the attosecond level, we're really pushing against the limits of 64-bit integers. Given the reciprocal inverse, a frequency of 1Hz (which does exist in higan!) would have a scalar that consumes 1/18th of the entire range of a uint64 on every single step. Yes, I could raise the frequency, and then step by that amount, I know. But I don't want to have weird gotchas like that in the scheduler core. Until I increase the accuracy to about 100 times greater than a yoctosecond, the rounding errors are too great. And since the only choice above 64-bit values is 128-bit values; we might as well use all the extra headroom. 2^-96 as a timebase gives me the ability to have both a 1Hz and 4GHz clock; and run them both for a full second; before an overflow event would occur. Another hastebin includes demonstration code: #include <libco/libco.h> #include <nall/nall.hpp> using namespace nall; // cothread_t mainThread = nullptr; const uint iterations = 100'000'000; const uint cpuFreq = 21477272.727272 + 0.5; const uint smpFreq = 24576000.000000 + 0.5; const uint cpuStep = 4; const uint smpStep = 5; // struct ThreadA { cothread_t handle = nullptr; uint64 frequency = 0; int64 clock = 0; auto create(auto (*entrypoint)() -> void, uint frequency) { this->handle = co_create(65536, entrypoint); this->frequency = frequency; this->clock = 0; } }; struct CPUA : ThreadA { static auto Enter() -> void; auto main() -> void; CPUA() { create(&CPUA::Enter, cpuFreq); } } cpuA; struct SMPA : ThreadA { static auto Enter() -> void; auto main() -> void; SMPA() { create(&SMPA::Enter, smpFreq); } } smpA; uint8 queueA[iterations]; uint offsetA; cothread_t resumeA = cpuA.handle; auto EnterA() -> void { offsetA = 0; co_switch(resumeA); } auto QueueA(uint value) -> void { queueA[offsetA++] = value; if(offsetA >= iterations) { resumeA = co_active(); co_switch(mainThread); } } auto CPUA::Enter() -> void { while(true) cpuA.main(); } auto CPUA::main() -> void { QueueA(1); smpA.clock -= cpuStep * smpA.frequency; if(smpA.clock < 0) co_switch(smpA.handle); } auto SMPA::Enter() -> void { while(true) smpA.main(); } auto SMPA::main() -> void { QueueA(2); smpA.clock += smpStep * cpuA.frequency; if(smpA.clock >= 0) co_switch(cpuA.handle); } // struct ThreadB { cothread_t handle = nullptr; uint128_t scalar = 0; uint128_t clock = 0; auto print128(uint128_t value) { string s; while(value) { s.append((char)('0' + value % 10)); value /= 10; } s.reverse(); print(s, "\n"); } //femtosecond (10^15) = 16306 //attosecond (10^18) = 688838 //zeptosecond (10^21) = 13712691 //yoctosecond (10^24) = 13712691 (hitting a dead-end on a rounding error causing a wobble) //byuusecond? ( 2^96) = (perfect? 79,228 times more precise than a yoctosecond) auto create(auto (*entrypoint)() -> void, uint128_t frequency) { this->handle = co_create(65536, entrypoint); uint128_t unitOfTime = 1; //for(uint n : range(29)) unitOfTime *= 10; unitOfTime <<= 96; //2^96 time units ... this->scalar = unitOfTime / frequency; print128(this->scalar); this->clock = 0; } auto step(uint128_t clocks) -> void { clock += clocks * scalar; } auto synchronize(ThreadB& thread) -> void { if(clock >= thread.clock) co_switch(thread.handle); } }; struct CPUB : ThreadB { static auto Enter() -> void; auto main() -> void; CPUB() { create(&CPUB::Enter, cpuFreq); } } cpuB; struct SMPB : ThreadB { static auto Enter() -> void; auto main() -> void; SMPB() { create(&SMPB::Enter, smpFreq); clock = 1; } } smpB; auto correct() -> void { auto minimum = min(cpuB.clock, smpB.clock); cpuB.clock -= minimum; smpB.clock -= minimum; } uint8 queueB[iterations]; uint offsetB; cothread_t resumeB = cpuB.handle; auto EnterB() -> void { correct(); offsetB = 0; co_switch(resumeB); } auto QueueB(uint value) -> void { queueB[offsetB++] = value; if(offsetB >= iterations) { resumeB = co_active(); co_switch(mainThread); } } auto CPUB::Enter() -> void { while(true) cpuB.main(); } auto CPUB::main() -> void { QueueB(1); step(cpuStep); synchronize(smpB); } auto SMPB::Enter() -> void { while(true) smpB.main(); } auto SMPB::main() -> void { QueueB(2); step(smpStep); synchronize(cpuB); } // #include <nall/main.hpp> auto nall::main(string_vector) -> void { mainThread = co_active(); uint masterCounter = 0; while(true) { print(masterCounter++, " ...\n"); auto A = clock(); EnterA(); auto B = clock(); print((double)(B - A) / CLOCKS_PER_SEC, "s\n"); auto C = clock(); EnterB(); auto D = clock(); print((double)(D - C) / CLOCKS_PER_SEC, "s\n"); for(uint n : range(iterations)) { if(queueA[n] != queueB[n]) return print("fail at ", n, "\n"); } } } ...and that's everything.] |
|
Tim Allen | 3a9c7c6843 |
Update to v099r09 release.
byuu says: Changelog: - Emulator::Interface::Medium::bootable removed - Emulator::Interface::load(bool required) argument removed [File::Required makes no sense on a folder] - Super Famicom.sys now has user-configurable properties (CPU,PPU1,PPU2 version; PPU1 VRAM size, Region override) - old nall/property removed completely - volatile flags supported on coprocessor RAM files now (still not in icarus, though) - (hopefully) fixed SNES Multitap support (needs testing) - fixed an OAM tiledata range clipping limit in 128KiB VRAM mode (doesn't fix Yoshi's Island, sadly) - (hopefully, again) fixed the input polling bug hex_usr reported - re-added dialog box for when File::Required files are missing - really cool: if you're missing a boot ROM, BIOS ROM, or IPL ROM, it warns you immediately - you don't have to select a game before seeing the error message anymore - fixed cheats.bml load/save location |
|
Tim Allen | ccd8878d75 |
Update to v099r07 release.
byuu says: Changelog: - (hopefully) fixed BS Memory and Sufami Turbo slot loading - ported GB, GBA, WS cores to use nall/vfs - completely removed loadRequest, saveRequest functionality from Emulator::Interface and ui-tomoko - loadRequest(folder) is now load(folder) - save states now use a shared Emulator::SerializerVersion string - whenever this is bumped, all older states will break; but this makes bumping state versions way easier - also, the version string makes it a lot easier to identify compatibility windows for save states - SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr] NOTE: Super Game Boy loading is currently broken, and I'm not entirely sure how to fix it :/ The file loading handoff was -really- complicated, and so I'm kind of at a loss ... so for now, don't try it. Everything else should theoretically work, so please report any bugs you find. So, this is pretty much it. I'd be very curious to hear feedback from people who objected to the old nall/stream design, whether they are happy with the new file loading system or think it could use further improvements. The 16-bit VRAM turned out to be a wash on performance (roughly the same as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The main reason for this was because Yoshi's Island was breaking horribly until I changed the vramRead, vramWrite functions to take uint15 instead of uint16. I suspect the issue is we're using uint16s in some areas now that need to be uint15, and this game is setting the VRAM address to 0x8000+, causing us to go out of bounds on memory accesses. But ... I want to go ahead and do something cute for fun, and just because we can ... and this new interface is so incredibly perfect for it!! I want to support an SNES unit with 128KiB of VRAM. Not out of the box, but as a fun little tweakable thing. The SNES was clearly designed to support that, they just didn't use big enough VRAM chips, and left one of the lines disconnected. So ... let's connect it anyway! In the end, if we design it right, the only code difference should be one area where we mask by 15-bits instead of by 16-bits. |
|
Tim Allen | 875f031182 |
Update to v099r06 release.
byuu says: Changelog: - Super Famicom core converted to use nall/vfs - excludes Super Game Boy; since that's invoked from inside the GB core This was definitely the major obstacle to test nall/vfs' applicability. Things worked out pretty great in the end. We went from 22.0KiB (cartridge) + 18.6KiB (interface) to 24.5KiB (cartridge) + 11.4KiB (interface). Or 40.7KiB to 36.0KiB. This removes a very large source of indirection. Before it was: "coprocessor <=> cartridge <=> interface" for loading and saving data, and now it's just "coprocessor <=> cartridge". And it may make sense to eventually turn this into just "cartridge -> coprocessor" by making each coprocessor class handle its own markup parsing. It's nice to have all the manifest parsing in one location (well, sans MSU1); but it's also nice for loading/unloading to be handled by each coprocessor itself. So I'll have to think longer about that one. I've also started handling Interface::save() differently. Instead of keeping track of memory IDs and filenames, and iterating through that vector of objects ... instead I now have a system that mirrors the markup parsing on loading, but handles saving instead. This was actually the reason the code size savings weren't more significant, but I like this style more. As before, it removes an extra level of indirection. So ... next up, I need to port over the GB, then GBA, then WS cores. These shouldn't take too long since they're all very simple with just ROM+RAM(+RTC) right now. Then get the SGB callbacks using vfs. Then after that, gut all the old stream stuff from nall and higan. Kill the (load,save)Request stuff, rename the load(Gamepak)Request to something simpler, and then we should be good. Anyway ... these are some huge changes. |
|
Tim Allen | 44a8c5a2b4 |
Update to v099r03 release.
byuu says: Changelog: - finished cleaning up the SFC core to my new coding conventions - removed sfc/controller/usart (superseded by 21fx) - hid Synchronize Video option from the menu (still in the configuration file) Pretty much the only minor detail left is some variable names in the SA-1 core that really won't look good at all if I move to camelCase, so I'll have to rethink how I handle those. It's probably a good area to attempt using BitFields, to see how it impacts performance. But I'll do that in a test branch first. But for the most part, this should be the end of the gigantic diffs (this one was 174KiB), at least for the SFC/WS cores. Still have the FC/GB/GBA cores to clean up more fully. Assuming we don't spot any new regressions, we should be ~95% out of the woods on code cleanups breaking things. |
|
Tim Allen | 1929ad47d2 |
Update to v098r03 release.
byuu says: It took several hours, but I've rebuilt much of the SNES' bus memory mapping architecture. The new design unifies the cartridge string-based mapping ("00-3f,80-bf:8000-ffff") and internal bus.map calls. The map() function now has an accompanying unmap() function, and instead of a fixed 256 callbacks, it'll scan to find the first available slot. unmap() will free slots up when zero addresses reference a given slot. The controllers and expansion port are now both entirely dynamic. Instead of load/unload/power/reset, they only have the constructor (power/reset/load) and destructor (unload). What this means is you can now dynamically change even expansion port devices after the system is loaded. Note that this is incredibly dangerous and stupid, but ... oh well. The whole point of this was for 21fx. There's no way to change the expansion port device prior to loading a game, but if the 21fx isn't active, then the reset vector hijack won't work. Now you can load a 21fx game, change the expansion port device, and simply reset the system to active the device. The unification of design between controller port devices and expansion port devices is nice, and overall this results in a reduction of code (all of the Mapping stuff in Cartridge is gone, replaced with direct bus mapping.) And there's always the potential to expand this system more in the future now. The big missing feature right now is the ability to push/pop mappings. So if you look at how the 21fx does the reset vector, you might vomit a little bit. But ... it works. Also changed exit(0) to _exit(0) in the POSIX version of nall::execute. [The _exit(0) thing is an attempt to make higan not crash when it tries to launch icarus and it's not on $PATH. The theory is that higan forks, then the child tries to exec icarus and fails, so it exits, all the unique_ptrs clean up their resources and tell the X server to free things the parent process is still using. Calling _exit() prevents destructors from running, and seems to prevent the problem. -Ed.] |
|
Tim Allen | 19e1d89f00 |
Update to v098r01 release.
byuu says: Changelog: - SFC: balanced profile removed - SFC: performance profile removed - SFC: code for handling non-threaded CPU, SMP, DSP, PPU removed - SFC: Coprocessor, Controller (and expansion port) shared Thread code merged to SFC::Cothread - Cothread here just means "Thread with CPU affinity" (couldn't think of a better name, sorry) - SFC: CPU now has vector<Thread*> coprocessors, peripherals; - this is the beginning of work to allow expansion port devices to be dynamically changed at run-time - ruby: all audio drivers default to 48000hz instead of 22050hz now if no frequency is assigned - note: the WASAPI driver can default to whatever the native frequency is; doesn't have to be 48000hz - tomoko: removed the ability to change the frequency from the UI (but it will display the frequency used) - tomoko: removed the timing settings panel - the goal is to work toward smooth video via adaptive sync - the model is broken by not being in control of the audio frequency anyway - it's further broken by PAL running at 50hz and WSC running at 75hz - it was always broken anyway by SNES interlace timing varying from progressive timing - higan: audio/ stub created (for now, it's just nall/dsp/ moved here and included as a header) - higan: video/ stub created - higan/GNUmakefile: now includes build rules for essential components (libco, emulator, audio, video) The audio changes are in preparation to merge wareya's awesome WASAPI work without the need for the nall/dsp resampler. |
|
Tim Allen | f1ebef2ea8 |
Update to v097r01 release.
byuu says: A minor WIP to get us started. Changelog: - System::Video merged to PPU::Video - System::Audio merged to DSP::Audio - System::Configuration merged to Interface::Settings - created emulator/emulator.cpp and accompanying object file for shared code between all cores Currently, emulator.cpp just holds a videoColor() function that takes R16G16B16, performs gamma/saturation/luma adjust, and outputs (currently) A8R8G8B8. It's basically an internal function call for cores to use when generating palette entries. This code used to exist inside ui-tomoko/program/interface.cpp, but we have to move it internal for software display emulation. But in the future, we could add other useful cross-core functionality here. |
|
Tim Allen | 47d4bd4d81 |
Update to v096r01 release.
byuu says: Changelog: - restructured the project and removed a whole bunch of old/dead directives from higan/GNUmakefile - huge amounts of work on hiro/cocoa (compiles but ~70% of the functionality is commented out) - fixed a masking error in my ARM CPU disassembler [Lioncash] - SFC: decided to change board cic=(411,413) back to board region=(ntsc,pal) ... the former was too obtuse If you rename Boolean (it's a problem with an include from ruby, not from hiro) and disable all the ruby drivers, you can compile an OS X binary, but obviously it's not going to do anything. It's a boring WIP, I just wanted to push out the project structure change now at the start of this WIP cycle. |
|
Tim Allen | 4e2eb23835 |
Update to v093 release.
byuu says: Changelog: - added Cocoa target: higan can now be compiled for OS X Lion [Cydrak, byuu] - SNES/accuracy profile hires color blending improvements - fixes Marvelous text [AWJ] - fixed a slight bug in SNES/SA-1 VBR support caused by a typo - added support for multi-pass shaders that can load external textures (requires OpenGL 3.2+) - added game library path (used by ananke->Import Game) to Settings->Advanced - system profiles, shaders and cheats database can be stored in "all users" shared folders now (eg /usr/share on Linux) - all configuration files are in BML format now, instead of XML (much easier to read and edit this way) - main window supports drag-and-drop of game folders (but not game files / ZIP archives) - audio buffer clears when entering a modal loop on Windows (prevents audio repetition with DirectSound driver) - a substantial amount of code clean-up (probably the biggest refactoring to date) One highly desired target for this release was to default to the optimal drivers instead of the safest drivers, but because AMD drivers don't seem to like my OpenGL 3.2 driver, I've decided to postpone that. AMD has too big a market share. Hopefully with v093 officially released, we can get some public input on what AMD doesn't like. |
|
Tim Allen | 29ea5bd599 |
Update to v092r09 release.
byuu says: This will be another massive diff from the previous version. All of higan was updated to use the new foo& bar syntax, and I also updated switch statements to be consistent as well (but not in the disassemblers, was starting to get an RSI just from what I already did.) phoenix/{windows, cocoa, qt} need to be updated to use "string foo" instead of "const string& foo", and after that, the major diffs should be finished. This archive is the first time I'm posting my copy-on-write, size+capacity nall::string class, so any feedback on that is welcome as well. |
|
Tim Allen | 5b4bbf5045 |
Update to v092r05 release.
byuu says: This release should be polished enough for a general release. This release should be polished enough for a general release. Anyone with a real, clean Mac up for posting compiled binaries? Preferably compile with "make profile=balanced" In fact, I'd like it if someone were willing to host a "higan for Mac" page, with binaries of each of the latest releases. Only really needed for major official releases, but it'd be preferable to have the builds updated as soon as possible after I post new builds. Changelog: - no more keyboard chimes when pressing keys - status bar added, fully functional - Label::minimumSize() takes frame into account (but note a few places hard-code raw Font::size(), so a few text labels are still clipped) - resizing the main window looks smooth regardless of whether a game is running or not - currently, resizing the window pauses the emulation. Allowing it to run the main loop was lagging out the window resize process too much to be worth it Additional OS X integration enhancements: - closing the main window unloads the current game, but does not quit the application (quit via the main menu or the dock menu) - clicking the icon in the dock will (re)display the main menu |
|
Tim Allen | bbc33fe05f |
Update to higan v092r01, ananke v02r01 and purify v03r01 releases.
byuu says: higan changelog: - compiler is set to g++-4.7, subst(cc,++) rule is gone, C files compile with $(compiler) -x c - make throws an error when you specify an invalid profile or compile on an unsupported platform (instead of hanging forever) - added unverified.png to resources (causes too big of a speed hit to actually check for folder/unverified file ... so disabled for now) - fixed default browser paths for Game Boy, Sufami Turbo and BS-X Satellaview (have to delete paths.cfg to see this) - browser home button seeks to configpath()/higan/library.cfg - settings->driver is now settings->advanced, and it adds game library path setting and profile information - emulation cores now load manifest files internally, manifest.bml is not required for a game folder to be recognized by higan as such - BS-X Satellaview and Sufami Turbo slot cartridge handling moved out of sfc/chip and into sfc/slot - Video::StartFullScreen only sets fullscreen when a game is specified on the command-line purify and ananke changelog: - library output path shown in purify window - added button to change library path - squelch firmware warning windows to prevent multi-threading crash, but only via purify (they show up in higan still) |
|
Tim Allen | d4751c5244 |
Update to v091r10 release.
byuu says: This release adds HSU1 support, and fixes the reduce() memory mapping function. |
|
Tim Allen | ef746bbda4 |
Update to v091r05 release.
[No prior releases were posted to the WIP thread. -Ed.] byuu says: Super Famicom mapping system has been reworked as discussed with the mask= changes. offset becomes base, mode is gone. Also added support for comma-separated fields in the address fields, to reduce the number of map lines needed. <?xml version="1.0" encoding="UTF-8"?> <cartridge region="NTSC"> <superfx revision="2"> <rom name="program.rom" size="0x200000"/> <ram name="save.rwm" size="0x8000"/> <map id="io" address="00-3f,80-bf:3000-32ff"/> <map id="rom" address="00-3f:8000-ffff" mask="0x8000"/> <map id="rom" address="40-5f:0000-ffff"/> <map id="ram" address="00-3f,80-bf:6000-7fff" size="0x2000"/> <map id="ram" address="70-71:0000-ffff"/> </superfx> </cartridge> Or in BML: cartridge region=NTSC superfx revision=2 rom name=program.rom size=0x200000 ram name=save.rwm size=0x8000 map id=io address=00-3f,80-bf:3000-32ff map id=rom address=00-3f:8000-ffff mask=0x8000 map id=rom address=40-5f:0000-ffff map id=ram address=00-3f,80-bf:6000-7fff size=0x2000 map id=ram address=70-71:0000-ffff As a result of the changes, old mappings will no longer work. The above XML example will run Super Mario World 2: Yoshi's Island. Otherwise, you'll have to write your own. All that's left now is to work some sort of database mapping system in, so I can start dumping carts en masse. The NES changes that FitzRoy asked for are mostly in as well. Also, part of the reason I haven't released a WIP ... but fuck it, I'm not going to wait forever to post a new WIP. I've added a skeleton driver to emulate Campus Challenge '92 and Powerfest '94. There's no actual emulation, except for the stuff I can glean from looking at the pictures of the board. It has a DSP-1 (so SR/DR registers), four ROMs that map in and out, RAM, etc. I've also added preliminary mapping to upload high scores to a website, but obviously I need the ROMs first. |
|
Tim Allen | 94b2538af5 |
Update to higan v091 release.
byuu says: Basically just a project rename, with s/bsnes/higan and the new icon from lowkee added in. It won't compile on Windows because I forgot to update the resource.rc file, and a path transform command isn't working on Windows. It was really just meant as a starting point, so that v091 WIPs can flow starting from .00 with the new name (it overshadows bsnes v091, so publicly speaking this "shouldn't exist" and will probably be deleted from Google Code when v092 is ready.) |