2012-03-19 11:19:53 +00:00
|
|
|
#include <gba/gba.hpp>
|
|
|
|
|
2012-03-21 11:08:16 +00:00
|
|
|
//pixel: 4 cycles
|
|
|
|
|
|
|
|
//hdraw: 240 pixels ( 960 cycles)
|
|
|
|
//hblank: 68 pixels ( 272 cycles)
|
|
|
|
//scanline: 308 pixels (1232 cycles)
|
|
|
|
|
|
|
|
//vdraw: 160 scanlines (197120 cycles)
|
|
|
|
//vblank: 68 scanlines ( 83776 cycles)
|
2012-03-29 11:58:10 +00:00
|
|
|
//frame: 228 scanlines (280896 cycles)
|
2012-03-21 11:08:16 +00:00
|
|
|
|
2012-04-26 10:51:13 +00:00
|
|
|
namespace GameBoyAdvance {
|
2012-03-19 11:19:53 +00:00
|
|
|
|
Update to v097r02 release.
byuu says:
Note: balanced/performance profiles still broken, sorry.
Changelog:
- added nall/GNUmakefile unique() function; used on linking phase of
higan
- added nall/unique_pointer
- target-tomoko and {System}::Video updated to use
unique_pointer<ClassName> instead of ClassName* [1]
- locate() updated to search multiple paths [2]
- GB: pass gekkio's if_ie_registers and boot_hwio-G test ROMs
- FC, GB, GBA: merge video/ into the PPU cores
- ruby: fixed ~AudioXAudio2() typo
[1] I expected this to cause new crashes on exit due to changing the
order of destruction of objects (and deleting things that weren't
deleted before), but ... so far, so good. I guess we'll see what crops
up, especially on OS X (which is already crashing for unknown reasons on
exit.)
[2] right now, the search paths are: programpath(), {configpath(),
"higan/"}, {localpath(), "higan/"}; but we can add as many more as we
want, and we can also add platform-specific versions.
2016-01-25 11:27:18 +00:00
|
|
|
PPU ppu;
|
2012-04-03 23:50:40 +00:00
|
|
|
#include "background.cpp"
|
2012-04-03 00:47:28 +00:00
|
|
|
#include "object.cpp"
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
#include "window.cpp"
|
2012-04-03 00:47:28 +00:00
|
|
|
#include "screen.cpp"
|
Update to v099r13 release.
byuu says:
Changelog:
- GB core code cleanup completed
- GBA core code cleanup completed
- some more cleanup on missed processor/arm functions/variables
- fixed FC loading icarus bug
- "Load ROM File" icarus functionality restored
- minor code unification efforts all around (not perfect yet)
- MMIO->IO
- mmio.cpp->io.cpp
- read,write->readIO,writeIO
It's been a very long work in progress ... starting all the way back with
v094r09, but the major part of the higan code cleanup is now completed! Of
course, it's very important to note that this is only for the basic style:
- under_score functions and variables are now camelCase
- return-type function-name() are now auto function-name() -> return-type
- Natural<T>/Integer<T> replace (u)intT_n types where possible
- signed/unsigned are now int/uint
- most of the x==true,x==false tests changed to x,!x
A lot of spot improvements to consistency, simplicity and quality have
gone in along the way, of course. But we'll probably never fully finishing
beautifying every last line of code in the entire codebase. Still,
this is a really great start. Going forward, WIP diffs should start
being smaller and of higher quality once again.
I know the joke is, "until my coding style changes again", but ... this
was way too stressful, way too time consuming, and way too risky. I'm
too old and tired now for extreme upheavel like this again. The only
major change I'm slowly mulling over would be renaming the using
Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as
easily confused with the (u)int_t types ... but we'll see. I'll definitely
continue to change small things all the time, but for the larger picture,
I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
|
|
|
#include "io.cpp"
|
2012-04-09 06:41:27 +00:00
|
|
|
#include "memory.cpp"
|
Update to v087r26 release.
byuu says:
Changelog:
- fixed FIFO[1] reset behavior (fixes audio in Sword of Mana)
- added FlashROM emulation (both sizes)
- GBA parses RAM settings from manifest.xml now
- save RAM is written to disk now
- added save state support (it's currently broken, though)
- fixed ROM/RAM access timings
- open bus should mostly work (we don't do the PC+12 stuff yet)
- emulated the undocumented memory control register (mirror IWRAM,
disable I+EWRAM, EWRAM wait state count)
- emulated keypad interrupts
- emulated STOP (freezes video, audio, DMA and timers; only breaks on
keypad IRQs)
- probably a lot more, it was a long night ...
Show stoppers, missing things, broken things, etc:
- ST018 is still completely broken
- GBC audio sequencer apparently needs work
- GBA audio FIFO buffer seems too quiet
- PHI / ROM prefetch needs to be emulated (no idea on how to do this,
especially PHI)
- SOUNDBIAS 64/128/256khz modes should output at that resolution
(really, we need to simulate PWM properly, no idea on how to do this)
- object mosaic top-left coordinates are wrong (minor, fixing will
actually make the effect look worse)
- need to emulate PPU greenswap and color palette distortion (no idea on
how do this)
- need GBA save type database (I would also LIKE to blacklist
/ patch-out trainers, but that's a discussion for another day.)
- some ARM ops advance the prefetch buffer, so you can read PC+12 in
some cases
2012-04-16 12:19:39 +00:00
|
|
|
#include "serialization.cpp"
|
2012-03-19 11:19:53 +00:00
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
auto PPU::blank() -> bool {
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
return io.forceBlank || cpu.stopped();
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
}
|
|
|
|
|
2015-11-16 08:38:05 +00:00
|
|
|
PPU::PPU() {
|
|
|
|
output = new uint32[240 * 160];
|
|
|
|
}
|
|
|
|
|
|
|
|
PPU::~PPU() {
|
|
|
|
delete[] output;
|
|
|
|
}
|
|
|
|
|
|
|
|
auto PPU::Enter() -> void {
|
2016-02-09 11:51:12 +00:00
|
|
|
while(true) scheduler.synchronize(), ppu.main();
|
2012-03-19 11:19:53 +00:00
|
|
|
}
|
|
|
|
|
2015-11-16 08:38:05 +00:00
|
|
|
auto PPU::step(uint clocks) -> void {
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
Thread::step(clocks);
|
|
|
|
synchronize(cpu);
|
2012-03-19 11:19:53 +00:00
|
|
|
}
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
auto PPU::main() -> void {
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
cpu.keypad.run();
|
Update to v087r26 release.
byuu says:
Changelog:
- fixed FIFO[1] reset behavior (fixes audio in Sword of Mana)
- added FlashROM emulation (both sizes)
- GBA parses RAM settings from manifest.xml now
- save RAM is written to disk now
- added save state support (it's currently broken, though)
- fixed ROM/RAM access timings
- open bus should mostly work (we don't do the PC+12 stuff yet)
- emulated the undocumented memory control register (mirror IWRAM,
disable I+EWRAM, EWRAM wait state count)
- emulated keypad interrupts
- emulated STOP (freezes video, audio, DMA and timers; only breaks on
keypad IRQs)
- probably a lot more, it was a long night ...
Show stoppers, missing things, broken things, etc:
- ST018 is still completely broken
- GBC audio sequencer apparently needs work
- GBA audio FIFO buffer seems too quiet
- PHI / ROM prefetch needs to be emulated (no idea on how to do this,
especially PHI)
- SOUNDBIAS 64/128/256khz modes should output at that resolution
(really, we need to simulate PWM properly, no idea on how to do this)
- object mosaic top-left coordinates are wrong (minor, fixing will
actually make the effect look worse)
- need to emulate PPU greenswap and color palette distortion (no idea on
how do this)
- need GBA save type database (I would also LIKE to blacklist
/ patch-out trainers, but that's a discussion for another day.)
- some ARM ops advance the prefetch buffer, so you can read PC+12 in
some cases
2012-04-16 12:19:39 +00:00
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
io.vblank = io.vcounter >= 160 && io.vcounter <= 226;
|
|
|
|
io.vcoincidence = io.vcounter == io.vcompare;
|
2012-03-31 08:17:36 +00:00
|
|
|
|
2017-06-06 01:39:27 +00:00
|
|
|
if(io.vcounter == 0) {
|
2012-03-31 08:17:36 +00:00
|
|
|
frame();
|
2012-04-07 08:17:49 +00:00
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
bg2.io.lx = bg2.io.x;
|
|
|
|
bg2.io.ly = bg2.io.y;
|
2012-04-07 08:17:49 +00:00
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
bg3.io.lx = bg3.io.x;
|
|
|
|
bg3.io.ly = bg3.io.y;
|
2012-03-31 08:17:36 +00:00
|
|
|
}
|
2012-03-29 11:58:10 +00:00
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
if(io.vcounter == 160) {
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
if(io.irqvblank) cpu.irq.flag |= CPU::Interrupt::VBlank;
|
Update to v099r13 release.
byuu says:
Changelog:
- GB core code cleanup completed
- GBA core code cleanup completed
- some more cleanup on missed processor/arm functions/variables
- fixed FC loading icarus bug
- "Load ROM File" icarus functionality restored
- minor code unification efforts all around (not perfect yet)
- MMIO->IO
- mmio.cpp->io.cpp
- read,write->readIO,writeIO
It's been a very long work in progress ... starting all the way back with
v094r09, but the major part of the higan code cleanup is now completed! Of
course, it's very important to note that this is only for the basic style:
- under_score functions and variables are now camelCase
- return-type function-name() are now auto function-name() -> return-type
- Natural<T>/Integer<T> replace (u)intT_n types where possible
- signed/unsigned are now int/uint
- most of the x==true,x==false tests changed to x,!x
A lot of spot improvements to consistency, simplicity and quality have
gone in along the way, of course. But we'll probably never fully finishing
beautifying every last line of code in the entire codebase. Still,
this is a really great start. Going forward, WIP diffs should start
being smaller and of higher quality once again.
I know the joke is, "until my coding style changes again", but ... this
was way too stressful, way too time consuming, and way too risky. I'm
too old and tired now for extreme upheavel like this again. The only
major change I'm slowly mulling over would be renaming the using
Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as
easily confused with the (u)int_t types ... but we'll see. I'll definitely
continue to change small things all the time, but for the larger picture,
I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
|
|
|
cpu.dmaVblank();
|
2012-03-29 11:58:10 +00:00
|
|
|
}
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
if(io.irqvcoincidence) {
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
if(io.vcoincidence) cpu.irq.flag |= CPU::Interrupt::VCoincidence;
|
2012-03-29 11:58:10 +00:00
|
|
|
}
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
if(io.vcounter < 160) {
|
|
|
|
uint y = io.vcounter;
|
|
|
|
bg0.scanline(y);
|
|
|
|
bg1.scanline(y);
|
|
|
|
bg2.scanline(y);
|
|
|
|
bg3.scanline(y);
|
|
|
|
objects.scanline(y);
|
|
|
|
for(uint x : range(240)) {
|
|
|
|
bg0.run(x, y);
|
|
|
|
bg1.run(x, y);
|
|
|
|
bg2.run(x, y);
|
|
|
|
bg3.run(x, y);
|
|
|
|
objects.run(x, y);
|
|
|
|
window0.run(x, y);
|
|
|
|
window1.run(x, y);
|
|
|
|
window2.output = objects.output.window;
|
|
|
|
window3.output = true;
|
|
|
|
uint15 color = screen.run(x, y);
|
|
|
|
output[y * 240 + x] = color;
|
|
|
|
step(4);
|
2012-04-09 06:19:32 +00:00
|
|
|
}
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
} else {
|
|
|
|
step(960);
|
2012-04-03 00:47:28 +00:00
|
|
|
}
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
io.hblank = 1;
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
if(io.irqhblank) cpu.irq.flag |= CPU::Interrupt::HBlank;
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
if(io.vcounter < 160) cpu.dmaHblank();
|
2012-03-31 08:17:36 +00:00
|
|
|
|
2012-04-07 08:17:49 +00:00
|
|
|
step(240);
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
io.hblank = 0;
|
|
|
|
if(io.vcounter < 160) cpu.dmaHDMA();
|
2012-03-31 08:17:36 +00:00
|
|
|
|
2012-04-07 08:17:49 +00:00
|
|
|
step(32);
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
if(++io.vcounter == 228) io.vcounter = 0;
|
2012-03-19 11:19:53 +00:00
|
|
|
}
|
|
|
|
|
2015-11-16 08:38:05 +00:00
|
|
|
auto PPU::frame() -> void {
|
2013-12-20 11:40:39 +00:00
|
|
|
player.frame();
|
2016-02-09 11:51:12 +00:00
|
|
|
scheduler.exit(Scheduler::Event::Frame);
|
2012-03-19 11:19:53 +00:00
|
|
|
}
|
|
|
|
|
Update to v098r06 release.
byuu says:
Changelog:
- emulation cores now refresh video from host thread instead of
cothreads (fix AMD crash)
- SFC: fixed another bug with leap year months in SharpRTC emulation
- SFC: cleaned up camelCase on function names for
armdsp,epsonrtc,hitachidsp,mcc,nss,sharprtc classes
- GB: added MBC1M emulation (requires manually setting mapper=MBC1M in
manifest.bml for now, sorry)
- audio: implemented Emulator::Audio mixer and effects processor
- audio: implemented Emulator::Stream interface
- it is now possible to have more than two audio streams: eg SNES
+ SGB + MSU1 + Voicer-Kun (eventually)
- audio: added reverb delay + reverb level settings; exposed balance
configuration in UI
- video: reworked palette generation to re-enable saturation, gamma,
luminance adjustments
- higan/emulator.cpp is gone since there was nothing left in it
I know you guys are going to say the color adjust/balance/reverb stuff
is pointless. And indeed it mostly is. But I like the idea of allowing
some fun special effects and configurability that isn't system-wide.
Note: there seems to be some kind of added audio lag in the SGB
emulation now, and I don't really understand why. The code should be
effectively identical to what I had before. The only main thing is that
I'm sampling things to 48000hz instead of 32040hz before mixing. There's
no point where I'm intentionally introducing added latency though. I'm
kind of stumped, so if anyone wouldn't mind taking a look at it, it'd be
much appreciated :/
I don't have an MSU1 test ROM, but the latency issue may affect MSU1 as
well, and that would be very bad.
2016-04-22 13:35:51 +00:00
|
|
|
auto PPU::refresh() -> void {
|
|
|
|
Emulator::video.refresh(output, 240 * sizeof(uint32), 240, 160);
|
|
|
|
}
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
auto PPU::power() -> void {
|
Update to v103r07 release.
byuu says:
Changelog:
- gba/cpu: massive code cleanup effort
- gba/cpu: DMA can run in between active instructions¹
- gba/cpu: added two-cycle startup delay between DMA activation and
DMA transfers²
- processor/spc700: BBC, BBC, CBNE cycle 4 is an idle cycle
- processor/spc700: ADDW, SUBW, MOVW (read) cycle 4 is an idle cycle
¹: unfortunately, this causes yet another performance penalty for the
poor GBA core =( Also, I think I may have missed disabling DMAs while
the CPU is stopped. I'll fix that in the next WIP.
²: I put the waiting counter decrement at the wrong place, so this
doesn't actually work. Needs to be more like
this:
auto CPU::step(uint clocks) -> void {
for(auto _ : range(clocks)) {
for(auto& timer : this->timer) timer.run();
for(auto& dma : this->dma) if(dma.active && dma.waiting) dma.waiting--;
context.clock++;
}
...
auto CPU::DMA::run() -> bool {
if(cpu.stopped() || !active || waiting) return false;
transfer();
if(irq) cpu.irq.flag |= CPU::Interrupt::DMA0 << id;
if(drq && id == 3) cpu.irq.flag |= CPU::Interrupt::Cartridge;
return true;
}
Of course, the real fix will be restructuring how DMA works, so that
it's always running in parallel with the CPU instead of this weird
design where it tries to run all channels in some kind of loop until no
channels are active anymore whenever one channel is activated.
Not really sure how to design that yet, however.
2017-07-05 05:29:27 +00:00
|
|
|
create(PPU::Enter, system.frequency());
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
|
|
|
|
for(uint n = 0x000; n <= 0x055; n++) bus.io[n] = this;
|
|
|
|
|
|
|
|
for(uint n = 0; n < 240 * 160; n++) output[n] = 0;
|
|
|
|
|
2017-06-06 01:39:27 +00:00
|
|
|
for(uint n = 0; n < 96 * 1024; n++) vram[n] = 0x00;
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
for(uint n = 0; n < 1024; n += 2) writePRAM(n, Half, 0x0000);
|
|
|
|
for(uint n = 0; n < 1024; n += 2) writeOAM(n, Half, 0x0000);
|
|
|
|
|
2018-05-28 01:16:27 +00:00
|
|
|
io = {};
|
2017-06-06 01:39:27 +00:00
|
|
|
for(auto& object : this->object) object = {};
|
|
|
|
for(auto& param : this->objectParam) param = {};
|
|
|
|
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
bg0.power(BG0);
|
|
|
|
bg1.power(BG1);
|
|
|
|
bg2.power(BG2);
|
|
|
|
bg3.power(BG3);
|
2017-06-06 01:39:27 +00:00
|
|
|
objects.power();
|
Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know\!\! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
|
|
|
window0.power(IN0);
|
|
|
|
window1.power(IN1);
|
|
|
|
window2.power(IN2);
|
|
|
|
window3.power(OUT);
|
|
|
|
screen.power();
|
|
|
|
}
|
|
|
|
|
2012-03-19 11:19:53 +00:00
|
|
|
}
|