Update to v102r19 release.
byuu says:
Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on
Windows (ugh ...) Now to await posts about this in four more threads
again ;)
Changelog:
- GBA: rewrote PPU from a scanline-based renderer to a pixel-based
renderer
- ruby: fixed video/gdi bugs
Note that there's an approximately 21% speed penalty compared to v102r18
for the pixel-based renderer.
Also, horizontal mosaic effects are not yet implemented. But they should
be prior to v103. This one is a little tricky as it currently works on
fully rendered scanlines. I need to roll the mosaic into the background
renderers, and then for sprites, well ... see below.
The trickiest part by far of this new renderer is the object (sprite)
system. Unlike every other system I emulate, the GBA supports affine
rendering of its sprites. Or in other words, rotation effects. And it
also has a very complex priority system.
Right now, I can't see any way that the GBA PPU could render pixels in
real-time like this. My belief is that there's a 240-entry buffer that
fills up the next scanline's row of pixels. Which means it probably also
runs on the last scanline of Vblank so that the first scanline has
sprite data.
However, I didn't design my object renderer like this just yet. For now,
it creates a buffer of all 240 pixels right away at the start of the
scanline. I know!! That's technically scanline-based. But it's only
for fetching object tiledata, and it's only temporary.
What needs to happen is I need a way to run something like a "mini libco
thread" inside of the main thread, so that the object renderer can run
in parallel with the rest of the PPU, yet not be a hideous abomination
of a state machine, yet also not be horrendously slow as a full libco
thread would be.
I'm envisioning some kind of stackless yielding coroutine. But I'll need
to think through how to design that, given the absence of coroutines
even in C++17.
2017-06-04 03:16:44 +00:00
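The "stackless yielding coroutine" idea from the note above can be sketched with a switch-based resume point. This is only a minimal illustration under stated assumptions: `ObjectFetcher`, `step`, and `drain` are hypothetical names that do not appear in higan, and the per-pixel work is a stand-in for real tile fetching. The resume point and the loop variable live in the struct, so no separate stack is needed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration; none of these names come from higan.
// A stackless coroutine: the resume point is stored as an integer and
// the function body is re-entered through a switch statement.
struct ObjectFetcher {
  int state = 0;           // saved resume point (0 = start from the top)
  unsigned x = 0;          // loop variable lives in the struct, not on the stack
  uint8_t line[240] = {};  // one scanline of placeholder pixel data

  // Produces one pixel per call; returns false once the scanline is done.
  bool step() {
    switch(state) {
    case 0:
      for(x = 0; x < 240; x++) {
        line[x] = uint8_t(x);  // stand-in for real tile fetching
        state = 1;
        return true;           // yield to the caller
    case 1:;                   // execution resumes here on the next call
      }
      state = 2;               // finished; later calls skip the switch body
    }
    return false;
  }
};

// Runs the coroutine to completion and returns how many pixels it yielded.
unsigned drain(ObjectFetcher& fetcher) {
  unsigned count = 0;
  while(fetcher.step()) count++;
  assert(fetcher.state == 2);
  return count;
}
```

A caller would invoke `step()` once per pixel from the main PPU loop. The cost per call is one switch dispatch, which is far cheaper than a context switch between full threads, at the price of keeping every local variable in the struct.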
auto PPU::Objects::scanline(uint y) -> void {
2017-06-06 01:39:27 +00:00
  mosaicOffset = 0;
  for(auto& pixel : buffer) pixel = {};
  if(ppu.blank() || !io.enable) return;

  for(auto& object : ppu.object) {
    uint8 py = y - object.y;
    if(object.affine == 0 && object.affineSize == 1) continue; //hidden
    if(py >= object.height << object.affineSize) continue; //offscreen

    uint rowSize = io.mapping == 0 ? 32 >> object.colors : object.width >> 3;
    uint baseAddress = object.character << 5;

    if(object.mosaic && io.mosaicHeight) {
      int mosaicY = (y / (1 + io.mosaicHeight)) * (1 + io.mosaicHeight);
      py = object.y >= 160 || mosaicY - object.y >= 0 ? mosaicY - object.y : 0;
    }

    int16 pa = ppu.objectParam[object.affineParam].pa;
    int16 pb = ppu.objectParam[object.affineParam].pb;
    int16 pc = ppu.objectParam[object.affineParam].pc;
    int16 pd = ppu.objectParam[object.affineParam].pd;

    //center-of-sprite coordinates
    int16 centerX = object.width >> 1;
    int16 centerY = object.height >> 1;
Update to v087r30 release.
byuu says:
Changelog:
- DMA channel masks added (some are 27-bit source/target and some are
14-bit length -- hooray, varuint_t class.)
- No more state.pending flags. Instead, we set dma.pending flag when we
want a transfer (fixes GBA Video - Pokemon audio) [Cydrak]
- fixed OBJ Vmosaic [Cydrak, krom]
- OBJ cannot read <=0x13fff in BG modes 3-5 (fixes the garbled tile at
the top-left of some games)
- DMA timing should be much closer to hardware now, but probably not
perfect
- PPU frame blending uses blargg's bit-perfect, rounded method (slower,
but what can you do?)
- GBA carts really unload now
- added nall/gba/cartridge.hpp: used when there is no manifest. Scans
ROMs for library tags, and selects the first valid one found
- added EEPROM auto-detection when EEPROM size=0. Forces disk/save state
size to 8192 (otherwise states could crash between pre and post
detect.)
- detects first read after a set read address command when the size
is zero, and sets all subsequent bit-lengths to that value, prints
detected size to terminal
- added nall/nes/cartridge.hpp: moves iNES detection out of emulation
core.
Important to note: long-term goal is to remove all
nall/(system)/cartridge.hpp detections from the core and replace with
databases. All in good time.
Anyway, the GBA workarounds should work for ~98.5% of the library, if my
pre-scanning was correct (~40 games with odd tags. I reject ones without
numeric versions now, too.)
I think we're basically at a point where we can release a new version
now. Compatibility should be relatively high (at least for a first
release), and fixes are only going to affect one or two games at a time.
I'd like to start doing some major cleaning house internally (rename
NES->Famicom, SNES->SuperFamicom and such.) Would be much wiser to do
that on a .01 WIP to minimize regressions.
The main problems with a release now:
- speed is pretty bad, haven't really optimized much yet (not sure how
much we can improve it yet, this usually isn't easy)
- sound isn't -great-, but the GBA audio sucks anyway :P
- couple of known bugs (Sonic X video, etc.)
2012-04-22 10:49:19 +00:00

    //origin coordinates (top-left of sprite)
    int28 originX = -(centerX << object.affineSize);
    int28 originY = -(centerY << object.affineSize) + py;

    //fractional pixel coordinates
    int28 fx = originX * pa + originY * pb;
    int28 fy = originX * pc + originY * pd;

    for(uint px : range(object.width << object.affineSize)) {
      uint sx, sy;
      if(!object.affine) {
        sx = px ^ (object.hflip ? object.width - 1 : 0);
        sy = py ^ (object.vflip ? object.height - 1 : 0);
      } else {
        sx = (fx >> 8) + centerX;
        sy = (fy >> 8) + centerY;
      }

      uint9 bx = object.x + px;
      if(bx < 240 && sx < object.width && sy < object.height) {
        uint offset = (sy >> 3) * rowSize + (sx >> 3);
        offset = offset * 64 + (sy & 7) * 8 + (sx & 7);

        uint8 color = ppu.readObjectVRAM(baseAddress + (offset >> !object.colors));
        if(object.colors == 0) color = sx & 1 ? color >> 4 : color & 15;
        if(color) {
          if(object.mode & 2) {
            buffer[bx].window = true;
          } else if(!buffer[bx].enable || object.priority < buffer[bx].priority) {
            if(object.colors == 0) color = object.palette * 16 + color;
            buffer[bx].enable = true;
            buffer[bx].priority = object.priority;
            buffer[bx].color = ppu.pram[256 + color];
            buffer[bx].translucent = object.mode == 1;
            buffer[bx].mosaic = object.mosaic;
          }
        }
      }

      fx += pa;
      fy += pc;
    }
}
}
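The affine path in `scanline` steps `fx` and `fy` by `pa` and `pc` for each pixel. The same mapping can be written in closed form for a single pixel, which may make the 8.8 fixed-point arithmetic easier to follow. This is a sketch only: `Texel` and `affineTexel` are hypothetical names, plain `int` replaces the sized integer types used above, and the right shift of a negative value is assumed to be arithmetic (true for common compilers, guaranteed from C++20).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not higan code.
struct Texel { int sx, sy; };

// Maps a pixel (px, py) inside an object's bounding box to a texel using
// 8.8 fixed-point affine parameters. doubleSize is 0 or 1 and doubles the
// bounding box without changing the texture size.
// Returns false when the texel falls outside the sprite's texture.
bool affineTexel(int px, int py, int width, int height, int doubleSize,
                 int16_t pa, int16_t pb, int16_t pc, int16_t pd, Texel& out) {
  assert(width > 0 && height > 0);
  int centerX = width >> 1;
  int centerY = height >> 1;
  // offsets measured from the center of the (possibly doubled) bounding box
  int dx = px - (centerX << doubleSize);
  int dy = py - (centerY << doubleSize);
  // multiply by the 8.8 matrix, then drop the 8 fractional bits
  int fx = dx * pa + dy * pb;
  int fy = dx * pc + dy * pd;
  out.sx = (fx >> 8) + centerX;
  out.sy = (fy >> 8) + centerY;
  return out.sx >= 0 && out.sx < width && out.sy >= 0 && out.sy < height;
}
```

With the identity matrix (`pa = pd = 0x100`, `pb = pc = 0`) every pixel maps to itself. With `pa = pd = 0x80` and `doubleSize = 1`, an 8x8 texture is stretched across a 16x16 box.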

auto PPU::Objects::run(uint x, uint y) -> void {
  output = {};
  if(ppu.blank() || !io.enable) {
    mosaic = {};
    return;
  }

  output = buffer[x];

  //horizontal mosaic
  if(!output.mosaic || ++mosaicOffset >= 1 + io.mosaicWidth) {
    mosaicOffset = 0;
    mosaic = output;
  }
}
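The vertical mosaic in `scanline` snaps `y` to the first line of its block with `(y / (1 + size)) * (1 + size)`. The sketch below applies that same snapping arithmetic along a row, as one way to picture what a horizontal mosaic produces. It is not the counter-based approach used in `run`, and `mosaicSnap` and `mosaicRow` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: snaps a coordinate to the start of its mosaic block.
// size is the raw register value, so a block spans (1 + size) pixels.
unsigned mosaicSnap(unsigned coordinate, unsigned size) {
  unsigned block = 1 + size;
  return (coordinate / block) * block;
}

// Applies a horizontal mosaic to one row by repeating the first pixel
// of each block across the whole block.
std::vector<int> mosaicRow(const std::vector<int>& row, unsigned size) {
  assert(size < 16);  // the GBA mosaic size fields are 4 bits wide
  std::vector<int> out(row.size());
  for(unsigned x = 0; x < row.size(); x++) out[x] = row[mosaicSnap(x, size)];
  return out;
}
```

A size of 0 leaves the row unchanged, since each block is then a single pixel wide.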

auto PPU::Objects::power() -> void {
2018-05-28 01:16:27 +00:00
  io = {};
  for(auto& pixel : buffer) pixel = {};
  output = {};
  mosaic = {};
  mosaicOffset = 0;
}