bsnes/higan/gba/ppu/background.cpp

184 lines
4.3 KiB
C++
Raw Normal View History

//I/O settings shared by all background layers
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint3 PPU::Background::IO::mode;
uint1 PPU::Background::IO::frame;
uint5 PPU::Background::IO::mosaicWidth;
uint5 PPU::Background::IO::mosaicHeight;
auto PPU::Background::scanline(uint y) -> void {
mosaicOffset = 0;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
}
auto PPU::Background::run(uint x, uint y) -> void {
output = {};
if(ppu.blank() || !io.enable) {
mosaic = {};
return;
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
switch(id) {
case PPU::BG0:
if(io.mode <= 1) { linear(x, y); break; }
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG1:
if(io.mode <= 1) { linear(x, y); break; }
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG2:
if(io.mode == 0) { linear(x, y); break; }
if(io.mode <= 2) { affine(x, y); break; }
if(io.mode <= 5) { bitmap(x, y); break; }
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG3:
if(io.mode == 0) { linear(x, y); break; }
if(io.mode == 2) { affine(x, y); break; }
break;
}
//horizontal mosaic
if(!io.mosaic || ++mosaicOffset >= 1 + io.mosaicWidth) {
mosaicOffset = 0;
mosaic = output;
}
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
auto PPU::Background::linear(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
vmosaic = y;
}
fx = io.hoffset;
fy = vmosaic + io.voffset;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
}
uint6 tx = fx >> 3;
uint6 ty = fy >> 3;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint3 px = fx;
uint3 py = fy;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint offset = (ty & 31) << 5 | (tx & 31);
if(io.screenSize.bit(0) && (tx & 32)) offset += 32 << 5;
if(io.screenSize.bit(1) && (ty & 32)) offset += 32 << 5 + io.screenSize.bit(0);
offset = (io.screenBase << 11) + (offset << 1);
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint16 tilemap = ppu.readVRAM(Half, offset);
uint10 character = tilemap.bits( 0, 9);
uint4 palette = tilemap.bits(12,15);
if(tilemap.bit(10)) px ^= 7;
if(tilemap.bit(11)) py ^= 7;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
if(io.colorMode == 0) {
offset = (io.characterBase << 14) + (character << 5) + (py << 2) + (px >> 1);
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
if(uint4 color = ppu.readVRAM(Byte, offset) >> (px & 1 ? 4 : 0)) {
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[palette << 4 | color];
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
}
} else {
offset = (io.characterBase << 14) + (character << 6) + (py << 3) + (px);
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
if(uint8 color = ppu.readVRAM(Byte, offset)) {
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[color];
}
}
fx++;
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
auto PPU::Background::affine(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
hmosaic = io.lx;
vmosaic = io.ly;
}
fx = hmosaic;
fy = vmosaic;
}
uint screenSize = 16 << io.screenSize;
uint screenWrap = (1 << (io.affineWrap ? 7 + io.screenSize : 20)) - 1;
uint cx = (fx >> 8) & screenWrap;
uint cy = (fy >> 8) & screenWrap;
uint tx = cx >> 3;
uint ty = cy >> 3;
uint3 px = cx;
uint3 py = cy;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
if(tx < screenSize && ty < screenSize) {
uint8 character = ppu.vram[(io.screenBase << 11) + ty * screenSize + tx];
if(uint8 color = ppu.vram[(io.characterBase << 14) + (character << 6) + (py << 3) + px]) {
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[color];
}
}
fx += io.pa;
fy += io.pc;
if(x == 239) {
io.lx += io.pb;
io.ly += io.pd;
}
}
auto PPU::Background::bitmap(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
hmosaic = io.lx;
vmosaic = io.ly;
}
fx = hmosaic;
fy = vmosaic;
}
uint1 depth = io.mode != 4; //0 = 8-bit (mode 4); 1 = 15-bit (mode 3, mode 5)
uint width = io.mode == 5 ? 160 : 240;
uint height = io.mode == 5 ? 128 : 160;
uint mode = depth ? Half : Byte;
uint baseAddress = io.mode == 3 ? 0 : 0xa000 * io.frame;
uint px = fx >> 8;
uint py = fy >> 8;
if(px < width && py < height) {
uint offset = py * width + px;
uint15 color = ppu.readVRAM(mode, baseAddress + (offset << depth));
if(depth || color) { //8bpp color 0 is transparent; 15bpp color is always opaque
if(depth == 0) color = ppu.pram[color];
output.enable = true;
output.priority = io.priority;
output.color = color;
}
}
fx += io.pa;
fy += io.pc;
if(x == 239) {
io.lx += io.pb;
io.ly += io.pd;
}
}
auto PPU::Background::power(uint id) -> void {
this->id = id;
memory::fill(&io, sizeof(IO));
output = {};
mosaic = {};
mosaicOffset = 0;
hmosaic = 0;
vmosaic = 0;
fx = 0;
fy = 0;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
}